Genomic analyses of Francisella tularensis strains confirm disease transmission from drinking water sources, Turkey, 2008, 2009 and 2012

A Karadenizli1, M Forsman (mats.forsman@foi.se)2, H Şimşek3, M Taner3, C Öhrman2, K Myrtennäs2, A Lärkeryd2, A Johansson4, L Özdemir1, A Sjödin2
1. Department of Medical Microbiology, Kocaeli University, Kocaeli, Turkey
2. Division of CBRN Security and Defence, FOI Swedish Defence Research Agency, Umeå, Sweden
3. Public Health Institution of Turkey, National Tuberculosis Reference Laboratory, Ankara, Turkey
4. The Laboratory for Molecular Infection Medicine Sweden (MIMS), Department of Clinical Microbiology, Umeå University, Umeå, Sweden


Introduction
Tularaemia is a bacterial zoonosis caused by Francisella tularensis, a small Gram-negative coccobacillus. The dose required for human infection depends on the transmission route; fewer than 10 bacteria may be sufficient for infection via inhalation or the subcutaneous route, while infection via the oral route requires more than 10^6 bacteria [1,2]. Several clinical forms of tularaemia (including ulceroglandular/glandular, oculoglandular, oropharyngeal and respiratory) may develop, depending on the infection route. Although ulceroglandular tularaemia is most frequently reported in the literature, the clinical form most frequently seen in Turkey is oropharyngeal tularaemia [3,4]. Epidemiological surveys have revealed a seasonal pattern of the disease in Turkey, where it is endemic in large areas. Most reported cases occur during the winter (December to March), frequently resulting from ingestion of infected food or contaminated water [3,5].
Two main subspecies of F. tularensis are clinically important: the most virulent (subsp. tularensis) is found only in North America, while the less virulent (subsp. holarctica) occurs in most of Eurasia and North America [6,7]. F. tularensis subsp. holarctica can be further divided into four major genetic clades (B.4, B.6, B.12 and B.16) based on clade-specific canonical single nucleotide polymorphism (canSNP) markers [8,9]. These genetic clades appear to be widely distributed among the waterborne outbreaks in Turkey [10].
Four documented outbreaks of tularaemia occurred in Turkey between 1936 and 1953; then, following an epidemiologically silent period, the disease re-emerged in 1988 and its incidence has strongly increased in the past 20 years [11,12]. From 2005, when tularaemia became a notifiable disease in Turkey, to 2012, several outbreaks and a total of 4,773 human infections were reported [13,14]. During that time, the three outbreaks of oropharyngeal tularaemia studied here occurred in the Kocaeli (n=35), Çorum (n=54) and Sivas (n=89) areas. Epidemiological investigations indicated that drinking water was the source of infection, but the genotypes and precise sources remained uncertain. In this study, we investigated possible genetic links between F. tularensis subsp. holarctica strains isolated from patients and from drinking water in the same areas and at the same time.

Tularaemia diagnostics, isolation of bacteria and DNA preparation from clinical and water samples
The outbreaks in the Çorum, Sivas and Kocaeli provinces (Figure 1) commenced on 21 January 2008 (duration approximately two months), 22 February 2009 (duration approximately three months) and 24 March 2012 (duration approximately 40 days), respectively. All tonsil swab specimens collected from index patients were subjected to culturing, RT-PCR and serology. The samples were cultured for 4 to 10 days on cysteine heart agar supplemented with VCNT inhibitor (6 mg/L vancomycin, 15 mg/L colistin, 10 mg/L trimethoprim and 25 mg/L nystatin, Becton Dickinson, Sparks, United States (US)). Later during the outbreaks, laboratory verification of tularaemia was based on microagglutination (F. tularensis antigen, Becton Dickinson, Sparks, US) using patient sera [15]. In total, 33 tonsil swab samples and 221 serum samples from patients were analysed. All samples and cultures were handled in a Class III safety cabinet (Labconco, US).
In addition, early during the outbreaks, 51, 34 and 55 water specimens (0.3 to 1.5 L each) were collected from unchlorinated drinking water sources or village fountains in the Çorum, Sivas and Kocaeli areas, respectively. They were transported at +4 °C to Kocaeli University and filtered through cellulose acetate membranes (pore size, 0.22 μm), and the filters were placed on cysteine heart agar supplemented with VCNT inhibitor for culturing. Water filtration was carried out in a Class II safety hood and cultivation was performed in a separate laboratory under Class III biological safety conditions.
All analyses, including serology, culturing and RT-PCR of clinical samples and culturing of filtered water, were performed during the outbreaks and without delay upon arrival of the samples at the laboratory.

PCR amplification, detection and identification of Francisella tularensis strains
For identification of colonies from clinical tonsil swabs, DNA was extracted from the colonies and PCR with primer and probe sets targeting ISFtu2 was performed as previously described [13]. Negative and positive controls (the latter consisting of 10-fold dilutions of F. tularensis subsp. holarctica LVS; NCTC 10857) were amplified in parallel [13]. Colonies cultivated from the filters were identified as F. tularensis by agglutination test using specific antibody (F. tularensis Antisera, Becton Dickinson, Sparks, US). A PCR assay targeting RD1 was used to identify the subspecies of F. tularensis isolates from both the tonsil swab and water samples [16].

Whole genome sequencing
Genome sequencing of DNA extracted from the three pairs of patient and water isolates was performed using the Nextera XT DNA preparation kit (Illumina, San Diego, US) and 150 bp paired-end libraries on an Illumina MiSeq instrument. The reads were mapped against the F. tularensis subsp. holarctica FSC200 [17] genome using Bowtie2 [18] to evaluate the fraction of Francisella in the samples. Samples containing a mixture of other bacteria were filtered using MIRA v4.0 mirabait [19]. The filtered reads were subsequently assembled de novo using ABySS [20], and the overall genome coverage was 130-fold (95 contigs). The GenBank accession numbers for the sequences are: FDC200: JPPL00000000, FDC201: JPMT00000000, FDC202: JPPM00000000, FDC203: JPMU00000000, FDC204: JPSV00000000 and FDC205: JPSW00000000.
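The two quantities mentioned above, the fraction of reads mapping to the Francisella reference and the fold coverage, follow from simple arithmetic. The Python sketch below is illustrative only and is not the pipeline used in the study: the function names are hypothetical, and the genome size of 1.9 Mb is an assumed round figure for a F. tularensis genome rather than a value taken from the text.

```python
import math


def mapped_fraction(mapped_reads, total_reads):
    """Fraction of sequenced reads that aligned to the reference genome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return mapped_reads / total_reads


def mean_coverage(n_reads, read_length, genome_size):
    """Average fold coverage: total sequenced bases divided by genome size."""
    return n_reads * read_length / genome_size


def read_pairs_for_coverage(target_coverage, read_length, genome_size):
    """Smallest number of read pairs (two reads each) reaching the target coverage."""
    return math.ceil(target_coverage * genome_size / (2 * read_length))


# ASSUMPTION: a round 1.9 Mb genome size, used only for illustration.
ASSUMED_GENOME_SIZE = 1_900_000

# Read pairs of 2 x 150 bp needed for roughly 130-fold coverage
pairs_needed = read_pairs_for_coverage(130, 150, ASSUMED_GENOME_SIZE)
```

Under these assumptions, roughly 0.8 million read pairs of 2 x 150 bp would correspond to 130-fold coverage, provided nearly all reads derive from Francisella.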

In silico screening of Francisella strains
Genome sequences were initially screened in silico by CanSNPer software [21] using published canSNP markers. The markers B.4, B.6, B.10, B.11, B.12, B.13 [22], B.20, B.21 [9], B.26 [23] and B.33 [24] were needed to describe the genotypes in a schematic (canonical SNP) phylogenetic tree. Multiple genome alignment with 507 archived F. tularensis strains was used to identify unique SNPs and develop new specific canSNPs for the two clusters detected (B.67 and B.68, see Results), as previously described [25]. The archival genomes represent the available genetic and geographical diversity within F. tularensis subsp. holarctica from Asia, North America and Europe.

Genome alignment and phylogenetic tree
A multiple genome alignment of the six Francisella genomes obtained in the outbreaks and the 74 most closely related archived Francisella genomes was generated by concatenation of a number of pairwise alignments where each strain was aligned against the reference strain FSC200 [17] using progressive MAUVE [26]. The neighbour-joining phylogenetic tree with complete deletion was constructed in MEGA6 [27].
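To illustrate what "complete deletion" means for the distances that feed a neighbour-joining tree, the following Python sketch counts pairwise SNP differences after discarding every alignment column in which any genome has a gap or an ambiguous base. It is a minimal illustration under stated assumptions and not the MEGA6 implementation: the function names, the toy sequences and the use of raw difference counts as the distance are all assumptions made here.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def complete_deletion_columns(alignment):
    """Indices of columns where every sequence has an unambiguous base."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all aligned sequences must have equal length")
    return [
        i for i in range(length)
        if all(s[i].upper() in VALID_BASES for s in seqs)
    ]


def pairwise_snp_distances(alignment):
    """Number of differing positions per pair, over the retained columns only."""
    cols = complete_deletion_columns(alignment)
    distances = {}
    for a, b in combinations(sorted(alignment), 2):
        sa, sb = alignment[a], alignment[b]
        distances[(a, b)] = sum(
            1 for i in cols if sa[i].upper() != sb[i].upper()
        )
    return distances


# Toy alignment (hypothetical sequences, for illustration only)
toy = {
    "water_isolate": "ACGTACGTAC",
    "patient_isolate": "ACGTACGTAT",
    "reference": "ACGAAC-TAC",
}
toy_distances = pairwise_snp_distances(toy)
```

In the toy alignment, the column containing the gap in the reference is dropped for all three sequences, so a difference at that position would not contribute to any pairwise distance.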

Epidemiology
A total of 178 patients were diagnosed with laboratory-verified tularaemia in the three outbreaks (Figure 1). Pharyngotonsillitis and cervical lymphadenopathy were the most frequent clinical findings in these patients. Epidemiological investigations indicated that drinking water was the likely source of infection. Accordingly, F. tularensis subsp. holarctica was successfully isolated from cultures of throat swabs collected from 12 of the 33 index cases investigated. Among the 140 drinking water samples, growth of F. tularensis subsp. holarctica colonies was identified in five. One sample pair (one patient and one water isolate) from each of the three outbreaks was selected for phylogenetic analysis.

Whole genome sequencing
Six strains were sequenced in total, obtained from one sample pair (one patient and one F. tularensis-contaminated water sample) for each of the three outbreaks. Whole genome multiple alignment and CanSNPer analysis showed that all of them grouped in the B.12 clade [8] (Figure 2A). In comparative analyses with whole genome sequences of F. tularensis strains of global origin, an isolate from the tonsil of the patient in Kocaeli (FDC204) and the corresponding isolate from the contaminated water source (FDC205) formed a separate new cluster with identical genome sequences (Figure 2B). The strain most closely related to this cluster was a Swedish strain isolated in 2003 (FSC374), differing at 20 SNPs from the FDC204-5 group (Figure 2B). Other close relatives were isolated in 1967 in Slovakia and in 2009 in Hungary (Figure 2B). The other two patient-water pairs, from Sivas and Çorum, were assigned to another new cluster separated by 26 SNPs from its most closely related strain (FSC930, isolated in 1961 in Bulgaria). The water isolates from Sivas and Çorum were identical. However, the patient isolate from Sivas (FDC201) differed from the corresponding water isolate (FDC200) at one SNP at the whole genome level, while the patient isolate from Çorum (FDC203) differed at seven SNPs from the Çorum water isolate (FDC202).

Identification of unique canonical SNPs (canSNPs)
Nineteen and 15 SNPs, respectively, uniquely identified the FDC200-3 and FDC204-5 clusters in the multiple alignment with 507 archived F. tularensis genomes. One synonymous SNP was chosen as a new canSNP for each cluster (B.67 and B.68, respectively). The genomic positions of the B.67 and B.68 SNPs, relative to the F. tularensis subsp. tularensis SCHU S4 genome (GenBank ID AJ749949.2) [28], were 975,050 and 857,235, respectively. The ancestral base for both B.67 and B.68 was C, and the derived bases were A and T, respectively. It was not possible to resolve the two patient-water pairs in the FDC200-3 cluster, representing strains associated with the outbreaks in Çorum and Sivas, because the sequenced water isolates from these areas (FDC202 and FDC200, respectively) were identical at all aligned positions.
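As an illustration of how the two new markers could be applied to a genome, the Python sketch below classifies the base observed at each canSNP position as ancestral or derived. It is not the CanSNPer implementation. The function names are hypothetical, and the sketch assumes that the observed bases have already been extracted at the SCHU S4 coordinates given above and are reported on the same strand as the bases stated in the text.

```python
# Marker definitions taken from the text; positions are SCHU S4 coordinates.
CANSNPS = {
    "B.67": {"position": 975050, "ancestral": "C", "derived": "A"},
    "B.68": {"position": 857235, "ancestral": "C", "derived": "T"},
}


def call_cansnp(marker, base):
    """Classify an observed base at a canSNP position."""
    spec = CANSNPS[marker]
    base = base.upper()
    if base == spec["derived"]:
        return "derived"
    if base == spec["ancestral"]:
        return "ancestral"
    # Any other base (or a missing call such as N) is left unresolved.
    return "undetermined"


def derived_markers(bases_by_marker):
    """List the markers for which a genome carries the derived state."""
    return [
        marker for marker in sorted(CANSNPS)
        if call_cansnp(marker, bases_by_marker.get(marker, "N")) == "derived"
    ]


# Hypothetical genome carrying the derived state at B.67 only
example_calls = derived_markers({"B.67": "A", "B.68": "C"})
```

A genome that is derived at B.67 would be placed in the FDC200-3 cluster and one that is derived at B.68 in the FDC204-5 cluster; a genome that is ancestral at both positions belongs to neither new cluster.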

Discussion
Oropharyngeal tularaemia is a significant waterborne disease in Turkey, where more cases were recorded in 2011 than in all European Union countries combined [5]. Better epidemiological tools can help identify the most important transmission chains resulting in human infections in the region and may ultimately aid in preventing the disease.
Because F. tularensis exhibits very little genetic variability, high-resolution analytical methods such as whole genome sequencing are appropriate for genetic typing in epidemiological investigations of tularaemia outbreaks and for source-tracing [29,30]. Initial canSNP analysis of the six included isolates revealed that they all belonged to the F. tularensis genetic clade B.12, which dominates in Europe between Scandinavia and the Black Sea [7,8,23-25]. The higher resolution provided by whole genome sequencing showed that the analysed isolates formed two novel genetic clusters separated by at least 20 and 26 SNPs from all archived F. tularensis genome sequences. Their distinctness was also confirmed by in silico screening of a large reference collection of F. tularensis genomes using the canSNPs (B.67 and B.68) developed in this work to define the two new genetic clusters. The Kocaeli cluster contained two strains with identical genome sequences (isolated from a patient and a water sample), strongly suggesting a common source. The other cluster contained the paired patient-water isolates from Çorum and Sivas. Surprisingly, the genome sequences of the two water isolates from Çorum and Sivas were identical, implying that caution is needed when using genetic data to infer direct epidemiological links. In addition, the human isolate from Çorum represented a subgroup differing at seven SNPs from the water isolates from Çorum and Sivas. Assuming that the source of the human infection was the water sampled in Çorum, the water must have been contaminated by multiple genetically distinct F. tularensis strains, possibly from different sources, e.g. several dead rodents.
Isolating F. tularensis from environmental specimens by cultivation is very difficult, mainly because the bacterium is slow-growing and fastidious, so overgrowth by environmental background flora is a major problem. In this study the bacterium was isolated by filtering samples of cold spring water collected during the winter through cellulose acetate membranes before cultivation on selective media. Very low background growth was recorded, probably because the sampled water was pure spring water, which is oligotrophic and thus supports low background bacterial populations.
The isolates analysed in this study were obtained from three locations, 280 to 798 km apart, in central Anatolia and north-western Turkey (Figure 1). No tularaemia outbreaks were reported in these locations before 2004 [1,8]. Ingestion of contaminated water appears to have been the main cause of the focal outbreaks, as in previously reported Turkish outbreaks [13,14]. The patients were using water from the same reservoir (which was unchlorinated at the time of the outbreaks) for both drinking and other needs. The water inlets supplying the reservoirs in use in these areas were not piped and the canals supplying drinking water were not covered, leaving the water open to contamination from environmental sources. Thus, it is reasonable to assume that upstream water became contaminated by small rodents killed by tularaemia [5].
The results presented here suggest that although the required infectious dose of F. tularensis in humans by the oral route may be high, contaminated drinking water poses a substantial risk to human health. Thus, the often-cited bioterrorism potential of F. tularensis may not be restricted to aerosol distribution, and intentional contamination of drinking water may be an underestimated risk.
In conclusion, whole genome sequencing of outbreak strains confirmed an epidemiological link between drinking water and three outbreaks of human tularaemia in Turkey. The genomic epidemiology approach is particularly powerful for genetically monomorphic bacterial pathogens such as F. tularensis.