Genotypic anomaly in Ebola virus strains circulating in Magazine Wharf area, Freetown, Sierra Leone, 2015

SL Smits 1, SD Pas 1, CB Reusken 1, BL Haagmans 1, P Pertile 2, C Cancedda 2, K Dierberg 2, I Wurie 3, A Kamara 3, D Kargbo 3, SL Caddy 4, A Arias 4, L Thorne 4, J Lu 4, U Jah 5, I Goodfellow 4, MP Koopmans 1,6
1. Department of Viroscience, Erasmus Medical Center, Rotterdam, Netherlands
2. Partners in Health, Boston, United States
3. Sierra Leone Ministry of Health and Sanitation, Sierra Leone
4. University of Cambridge, Department Virology, Cambridge, United Kingdom
5. University of Makeni, School of Public Health, Sierra Leone
6. Virology Division, Centre for Infectious Diseases Research, Diagnostics and Screening, National Institute for Public Health and the Environment, Bilthoven, Netherlands

The Magazine Wharf area, Freetown, Sierra Leone was a focus of ongoing Ebola virus transmission from late June 2015. Viral genomes linked to this area contain a series of 13 T to C substitutions in a 150 base pair intergenic region downstream of the viral protein 40 open reading frame, similar to the Ebolavirus/H.sapiens-wt/SLE/2014/Makona-J0169 strain (J0169) detected in the same town in November 2014. This suggests that recently circulating viruses from Freetown descend from a J0169-like virus.
In Sierra Leone, two new Ebola virus (EBOV) cases were reported from the densely populated Magazine Wharf area of Freetown in the Western Area Urban district after a period of two weeks in June 2015 with no cases in the district. The Magazine Wharf area was subsequently a focus of transmission for several weeks (http://apps.who.int/ebola/current-situation/ebola-situation-report-15-july-2015) up to 12 August 2015 (http://apps.who.int/ebola/current-situation/ebola-situation-report-12-august-2015), after which no new cases were reported from the area (http://apps.who.int/ebola/current-situation/ebola-situation-report-30-september-2015). In this study, whole genomes were sequenced of viruses from patient samples originating from the Western Area Urban district and other districts of the country (i.e. Kenema, Kono, and Tonkolili) between January and July 2015. Genomes derived from samples collected from 30 June onwards in the Western Area Urban district have a particular anomaly consisting of a series of 13 T to C substitutions in a 150 bp intergenic region downstream of the viral protein 40 (VP40) open reading frame (ORF). This anomaly is also present in a viral strain, Ebolavirus/H.sapiens-wt/SLE/2014/Makona-J0169 (J0169), which was detected in Freetown in November 2014. The finding suggests that viruses retrieved in June and July 2015 from the Western Area Urban district are direct descendants of a J0169-like virus. Near real time application of whole EBOV genome sequencing and the identification of lineage signatures can be used to monitor the ongoing outbreak and test whether newly infected patients are part of an identified transmission chain.

Ebola virus disease epidemic in West Africa
An epidemic of EBOV (a negative-sense RNA virus, family Filoviridae) disease has been ongoing in West Africa since December 2013, affecting mainly Guinea, Liberia and Sierra Leone [1]. As of 9 September 2015, the cumulative number of suspected, probable and confirmed cases stands at 28,183, including 11,306 deaths (http://apps.who.int/ebola/current-situation/ebola-situation-report-9-september-2015). EBOV cases continue to be detected and shifts in foci of transmission are observed. One of the pillars of the emergency response to the epidemic in the area has been the deployment of temporary (mobile) laboratories by the international community in collaboration with local authorities, among which are the Dutch Mobile Laboratories (http://dutchebolalabs.nl/rapport-implementatie-dutchmobile-labs-in-sierra-leone-en-liberia/). These laboratories provide rapid testing capacity for EBOV and malaria in support of clinical triage of (suspected) patients. In addition, the international community is currently working together in sequencing EBOV genomes in order to study EBOV evolution [2][3][4][5][6][7].

Sampling and whole genome sequencing
A total of 49 samples from EBOV-positive patients who had been tested by Dutch Mobile Laboratories (Table) located in the Western Area Urban and Kono districts were included in the study. The samples were collected from patients residing in Kenema, Kono, Tonkolili and Western Area Urban districts between January and July 2015. Nucleic acids were extracted from the samples using EZ Advanced XL automated RNA extraction (Qiagen). Isolated nucleic acids were subjected to reverse transcription/polymerase chain reaction (PCR) amplification using the Ion AmpliSeq Ebola Panel Assay (Thermo Fisher Scientific) and sequenced on the Ion Torrent sequencing platform at a local sequencing facility established at the Mateneh Ebola Treatment Centre in Makeni, Bombali district, Sierra Leone (European Nucleotide Archive Study: PRJEB10265). Reads were extracted from unfiltered BAM files using CLC Genomics Workbench 7.5.1 and trimmed based on quality with an ambiguous limit of 2 and quality limit of 0.05. Reads longer than 1,000 nucleotides (nt) or shorter than 15 nt were discarded. Trimmed reads were mapped to Zaire ebolavirus isolate H.sapiens-wt/GIN/2014/Makona-Gueckedou-C05, complete genome (GenBank accession number: KJ660348), using the CLC Genomics Workbench 7.5.1 'map reads to reference' beta module, with the following parameters: no masking, mismatch cost 2, insertion and deletion open cost 7, insertion and deletion extend cost 3, length fraction 0.95, similarity fraction 0.9. Consensus sequences were extracted and regions where depth of coverage was less than 2 were called as 'N'. All generated genomes were manually inspected for accuracy, including the presence of intact ORFs and whether low coverage regions allowed adequate base calling, resulting in 48 near full-length EBOV genomes (Table).
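The read-length filter and the low-coverage masking rule described above can be illustrated with a minimal Python sketch. It is not the CLC Genomics Workbench implementation: the simplified per-position pileup input, the function names and the majority-vote base call are assumptions made for illustration. Only the thresholds (15 nt, 1,000 nt, depth below 2 called as 'N') come from the text.

```python
from collections import Counter

MIN_READ_LEN = 15    # reads shorter than this were discarded (from the text)
MAX_READ_LEN = 1000  # reads longer than this were discarded (from the text)
MIN_DEPTH = 2        # positions with lower depth are called 'N' (from the text)


def keep_read(read: str) -> bool:
    """Return True if the read length lies within the accepted range."""
    return MIN_READ_LEN <= len(read) <= MAX_READ_LEN


def call_consensus(pileup: list[str]) -> str:
    """Call a consensus sequence from a simplified pileup.

    pileup[i] holds every base observed at reference position i.
    This input format is hypothetical; real pipelines read BAM files.
    The majority vote is an assumption, not the documented CLC rule.
    """
    consensus = []
    for column in pileup:
        if len(column) < MIN_DEPTH:
            consensus.append("N")  # too little coverage to call a base
            continue
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)
```

With such a rule, a position covered by a single read is masked rather than called, which is why low coverage regions needed manual inspection before a genome was accepted.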

Detection of evolutionary lineages and mutations
A median-joining haplotype network was constructed in PopART version 1.7 (http://popart.otago.ac.nz) using 563 EBOV genomes from Sierra Leone, including those determined in this study ([3][4][5][7]; Figure A; Table). In this network, the 48 genome sequences determined here appeared to belong to multiple distinct evolutionary lineages. Most viruses from the Western Area Urban district grouped together, as did viruses from Kono.
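As a rough illustration of the input to such a network, the sketch below collapses aligned genomes into distinct haplotypes and counts the differences between them. It is not PopART's median-joining algorithm, which additionally infers unsampled intermediate nodes. The function names are hypothetical, and skipping positions with 'N' or gaps is an assumption.

```python
from collections import defaultdict


def collapse_haplotypes(aligned: dict[str, str]) -> dict[str, list[str]]:
    """Group sample IDs by identical aligned sequence (one group per haplotype)."""
    groups = defaultdict(list)
    for sample_id, seq in aligned.items():
        groups[seq.upper()].append(sample_id)
    return dict(groups)


def hamming(a: str, b: str) -> int:
    """Count differing positions between two aligned sequences.

    Positions where either sequence has 'N' or a gap are skipped
    (an assumption; tools differ in how they treat missing data).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for x, y in zip(a.upper(), b.upper())
        if x != y and x not in "N-" and y not in "N-"
    )
```

Each key returned by `collapse_haplotypes` would correspond to one vertex, and the length of its sample list to the vertex size.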
Viruses EBOV_DML14077_SLe_WesternUrban_20150630, EBOV_DML14163_SLe_WesternUrban_20150703, and EBOV_DML14366_SLe_WesternUrban_20150711 (DML14077, DML14163, DML14366), isolated from patients from Freetown at the end of June and in July (Table), were highly similar to each other and were clearly different from viruses isolated in Freetown between January and March 2015 (Figure A). Two of these viruses were linked to the Magazine Wharf area in Freetown, a hotspot for EBOV transmission in June and July 2015.
The DML14077, DML14163, and DML14366 viruses were most closely related to Ebolavirus/H.sapiens-wt/SLE/2014/Makona-J0169 (GenBank accession number: KP759706), isolated on 9 November 2014 in Freetown (Figure A). The J0169 virus had a striking genotypic anomaly: a series of 13 T to C substitutions within a 150 bp stretch of an intergenic region downstream of the viral protein 40 (VP40) ORF. This anomaly was also present in DML14077, DML14163 and DML14366 (Figure B). An excessive accumulation of T to C mutations has been previously observed in a variety of virus genomes, including negative-sense RNA viruses and EBOV in infected cells in vitro, and is a hallmark of host adenosine deaminases acting on RNA (ADARs) [8][9][10]. DML14077, DML14163, and DML14366 had 19, 17, and 18 additional single nt variants, respectively, compared with the J0169 virus, consistent with current estimates of evolutionary rate in the Ebola virus outbreak [5], suggesting that DML14077, DML14163 and DML14366 are direct descendants of a J0169-like virus (Figure A). The same stretch of T to C substitutions was found in a partial genome sequence from another patient at the end of June, who was also linked to the Magazine Wharf area (DML13828; Table; data not shown), and it serves as a lineage signature allowing identification and tracking of transmission chains.
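One way to screen an aligned genome for such a signature is to list the T to C differences relative to a reference and look for the densest cluster within a 150 bp window. The following is a minimal sketch under stated assumptions: the sequences are already aligned to equal length, the function names are hypothetical, and the choice of reference is left to the user. It is not the procedure used by the authors.

```python
def t_to_c_positions(reference: str, query: str) -> list[int]:
    """Return 0-based positions where the reference has T and the query has C."""
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (r, q) in enumerate(zip(reference.upper(), query.upper()))
        if r == "T" and q == "C"
    ]


def densest_window(positions: list[int], window: int = 150):
    """Find the window of the given width containing the most positions.

    positions must be sorted in ascending order.
    Returns (count, first_position, last_position); (0, None, None) if empty.
    """
    best = (0, None, None)
    left = 0
    for right, pos in enumerate(positions):
        # Shrink from the left until the span fits inside the window.
        while pos - positions[left] >= window:
            left += 1
        count = right - left + 1
        if count > best[0]:
            best = (count, positions[left], pos)
    return best
```

A genome would be flagged as carrying a J0169-like signature if the densest window holds a count close to 13 in the region downstream of the VP40 ORF; the exact threshold is a judgment call and is not specified in the text.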

Discussion and conclusions
At present only a handful of Ebola virus genomes sequenced from the current outbreak contain stretches of T to C mutations, and these can be found at different locations in the genome [5,7]. Shared nt variants, such as the conserved T to C mutations observed between J0169 [7] and the three viral genomes newly characterised in this study, are unlikely to have arisen from mutations induced recurrently in individual patients. Rather, the T to C substitutions could have been fixed in viral lineages in the past and these lineages may be capable of efficient human-to-human transmission. This would explain how a J0169-like virus with a characteristic series of T to C mutations could have further spread in the Magazine Wharf area of Freetown.
It is unlikely that the T to C mutation stretch observed in the J0169-like viruses conferred a fitness advantage over other spreading EBOV lineages: in the ca 6-7 months of transmission between the first identification of this virus lineage (November 2014) and the present, this EBOV lineage did not become dominant over others. The J0169-like virus lineage seemingly lingered on at sufficiently low prevalence that it was not captured in genomic surveillance until now. This is not surprising given that the number of whole genome sequences available is only a fraction of the 28,183 confirmed, probable and suspected cases (http://apps.who.int/ebola/current-situation/ebola-situation-report-9-september-2015) and that whole genome sequencing started to build momentum relatively late in the outbreak. It shows, however, that near real time application of whole EBOV genome sequencing and the identification of lineage signatures can be used to monitor the ongoing outbreak and test whether newly infected patients are part of an existing known transmission chain in the area.

Figure. A) Median-joining haplotype network constructed from an alignment of 563 Ebola virus sequences derived from clinical samples and B) alignment of a sequence region where four Ebola virus strains present a genotypic anomaly, Sierra Leone, 2015

Each vertex represents a sampled viral haplotype. The size of each vertex is relative to the number of sampled isolates. Coloured vertices indicate determined viral genomes from Western Area Urban (red), Kono (green), Tonkolili (blue), and Kenema (purple) districts. The larger font numbers depicted in red represent viral genomes derived from Freetown patients' samples retrieved in June and July 2015 (two of them linked to Magazine Wharf area) with a genotypic anomaly consisting of a series of 13 T to C mutations in a 150 bp long sequence, which is located in an intergenic region downstream of the VP40 open reading frame.

Table. Characteristics of Ebola virus positive specimens, which were subjected to whole genome sequencing, Sierra Leone, January-July 2015 (n=49)

ID: identity; Western_Urban: Western Area Urban district. In the column with specimen IDs, entries with the same number of asterisks indicate specimens derived from the same patient.
a District of patient residence.
b Less reliable consensus sequence due to low coverage regions.
c Not shown in Figure, unreliable consensus sequence due to low coverage regions.