Regional spread of chikungunya fever in the Americas

Chikungunya fever (CHIKV), a viral disease transmitted by mosquitoes, is currently affecting several territories in the Caribbean. The vector is found in the Americas from South Florida to Brazil and the Caribbean is a highly connected region. There is therefore a significant risk for the epidemic to quickly expand to a wide area in the Americas, and of introduction to Europe where competent vectors are present. Here, we describe the spread of CHIKV in the first three territories to report cases and between territories of the region. Local transmission of CHIKV in the Caribbean is found to be very effective (mean number of cases generated by a human case in the range 2 to 4). There is a strong spatial signature in the regional epidemic, with the risk of transmission between territories estimated to be inversely proportional to the distance rather than being driven by air transportation. So far, this simple model has successfully predicted observed patterns of spread. The spatial structure allows ranking territories according to their current risk of invasion. This characterization may help national and international agencies to optimize resource allocation for monitoring and control and encourage territories with elevated risks to act. This is important to reduce the risk of dissemination to other parts of the world, including Europe.


Serial Interval distribution
The serial interval distribution is the time distribution between symptoms in the index case and that of its secondary cases. To account for the extrinsic period of transmission, it was described as a mixture of distributions based on the between bites duration (i.e. the gonotrophic cycle) and mortality rate in the mosquito 1 . In summary, secondary cases infections take place at time given by multiples of the gonotrophic cycle, starting with the infecting bite in the mosquito, with a weight decreasing due to mosquito mortality. On the basis of viremia time profile, a human host was assumed to be infectious for 5 to 8 days, with infectivity starting 1 to 2 days before symptoms. The gonotrophic cycle duration is largely determined by temperature (Otero) and is about 4 days in Saint-Martin with temperatures in the range around 26°C at this time of the year. This is also in accordance with infections in the same households, where about successive cases in the same household occurred roughly at multiple of 4 days ( Figure S1). To account for the time it takes before the virus is found in the salivary glands of the mosquito, we considered that the first gonotrophic cycle after mosquito infection would not lead to transmission. The mortality rate of A. aegypti mosquitoes in the Caribbean was reported between 0.2 and 0.1 per day in Puerto Rico 2 -we selected 0.1/day. Finally, we only considered the first 10 gonotrophic cycles after infection, as few mosquitoes would survive more than 40 days (= 10 gonotrophic cycles). Overall, this led to a generation time distribution with mean 23 days and SD 6 days ( Figure S2). We aggregated the serial interval distribution over week-long intervals, with the first interval only 4 days long so that the average duration was preserved (i.e. days 0-3, 4-11,…).

Reproduction number estimates in the three French islands
The reproduction ratio was estimated by the Exponential Growth method, i.e. ∫ ( ) ( ) where g is the probability density function of the serial interval distribution and r is the exponential growth rate (see Figure S2). To determine R, we fitted a Poisson regression to observed weekly incidence as a function of time and obtained r, then applied the transformation above. A large part of the variability in these estimates came from the time window selected for analysis, as exemplified in Figure S3. Indeed, the growth rate estimate changes and the R estimates are, from left to right, 1.6, 3.6 and 1.4. To overcome and illustrate this uncertainty, we reported the estimates obtained on the 10 periods where the exponential growth provided the best fit. The goodness of fit of the exponential growth regression was estimated by the deviance R-squared residual, computed as R 2 = (Dev(Null) -Dev(Exponential)) / Dev(Null) where Dev(Null) is the deviance of the null model computed as Dev(Null) = ∑ ̅ and Dev(Exponential) is the deviance of the exponential model computed as Dev(Exponential) = = ∑ ( ̂ ) ̂ with ̂ the fitted Poisson prediction. The results of this approach can be found in Table S1.
Sensitivity analyses were performed regarding the choice of the serial interval distribution. Using a shorter duration for the gonotrophic cycle (3 days vs. 4 days), led to little change in the serial interval distribution (2 days) and less than 5% variation on the estimates of R. With higher daily mortality in mosquitoes (15% instead of 10%), the generation interval was shorter, and the estimates of R were reduced by ~20%.

Model for a fully observed epidemic in the Caribbean
Data for the analysis of the spatial expansion of the chikungunya epidemic in the Caribbean are presented in Table S2.
Let's first assume that both the invasion status of territories and the times of invasion are known (we will show in the following section how to handle uncertainty about these variables).
We denote T the time when the analysis is done. We denote if a territory has not yet been infected at time T.
We denote ij   the instantaneous risk of transmission from territory i to territory j as defined in Table 1.
The force of invasion exerted on territory j at time t is: The contribution of territory j to the likelihood is

Model for an imperfectly observed epidemic in the Caribbean
In practice, invasion statuses are imperfectly observed. Times of invasion are unobserved and there is a delay between invasion and detection. We develop here a data augmentation strategy [3][4][5] to tackle these problems.
Denote d i the (observed time) of the first report indicating that territory i is affected. Let The first term on the right hand side of equation (2) corresponds to the likelihood of the invasion process, which is given in equation (1).
The second term corresponds to the likelihood of the reporting process. For a territory that has been invaded, it is equal to    

Inference in a context of incomplete data
In practice, the exact time of invasion t j is unobserved and the invasion status I j is imperfectly observed. We have the following relationships: -For a territory j that has been reported as being affected, the invasion status is known (I j =1). Besides, the time t j of invasion satisfies the constraint: t j <d j . -For a territory j that has not been reported as being affected, we could be in one of the following situations: o The territory was not invaded: I j =0, t j =T. o The territory has already been invaded, but has not been reported yet: I j =1, t j <T, d j ≥T.
To tackle the problem of inference in this context of missing data, we implement a standard Bayesian data augmentation strategy 6 (2).
We then develop a Reversible Jump Markov chain Monte Carlo strategy to explore the joint posterior distribution of parameters and augmented data 3,4,7 . The algorithm allows the following steps: -Update of the parameters with a standard Metropolis-Hastings step; -For territories that were invaded (I j =1), update of the time of invasion with a standard Metropolis-Hastings step; -For territories that were not reported as being affected, Reversible Jump steps were developed to jump between one of these two options: o The territory was not invaded (I j =0); o The territory was invaded (I j =1, t j <T) but was not reported.

Measure of goodness of fit
We measured goodness of fit by assessing how well the models agreed with the set of territories officially affected up to the time of analysis. To that end, for each model k, we simulated 10,000 epidemics up to the time of analysis under the sole assumption that the epidemic was seeded in Saint-Martin. We then computed the predicted probability of territory i being officially affected up to the time of analysis. Our measure of goodness of fit was the log-likelihood for the set of territories officially affected where I i =1 if the territory was officially affected; I i =0 otherwise. The larger LL TOA , the better the fit.

Model comparison and impact of reporting delays
In our baseline scenario, we assume that the mean reporting delay is equal to 30 days. In a sensitivity analysis, we also consider a mean reporting delay of 0 and 15 days. We were confronted with convergence issues in the MCMC when the mean reporting delay was assumed to be 60 days. Table S3 shows the measure of goodness of fit LL TOA for the different models and the different reporting delays. Overall, the fit does not appear to be much affected by assumptions about delay and the ordering of models remains unchanged.
  k i p Figure S4 presents the monthly probability of transmission as a function of distance between territories and for different reporting delays. The spatial kernel remains robust to a change in the reporting delay with transmission estimated to be only slightly more localized when the assumed reporting delay increases. Increasing the mean reporting delay increases the chance that a territory is already invaded or will be invaded soon but does not change the relative ordering of territories by risk of invasion (Figures S5). Table S3: Summary of the models used to study the determinants of regional spread. The covariates being considered to model the risk of transmission from territory i to territory j include: passenger flows (T ij ) or distance (D ij ) between the territories, population size of the origin territory (P i ) or of the destination territory (P j ). We also provide a measure of goodness of fit ( Table S4 presents the fits of the null model, the distance model and the air transportation model at different times of the epidemic. It shows that throughout the epidemic, the distance model was preferred to both the null and the air transportation model.   Figure S6 presents the estimated transmission kernel at different time points of the epidemic. Earlier estimates favoured slightly more short range transmission when compared with later estimates.