Molecular identification of the source of an uncommon tularaemia outbreak, Germany, autumn 2016

Background In 2016, an uncommon outbreak of oropharyngeal tularaemia involving six human cases occurred in Germany, caused by drinking contaminated fresh must after a grape harvest. Aim We describe the details of laboratory investigations leading to identification of the outbreak strain, its characterisation by next generation sequencing (NGS) and the finding of the possible source of contamination. Methods We incubated wine samples in different media and on agar plates. NGS was performed on DNA isolated from young wine, sweet reserve and an outbreak case’s lymph node. A draft genome of the outbreak strain was generated. Vertebrate-specific PCRs using primers targeting the mitochondrial cytochrome b gene and product analyses by blast search were used to identify the putative source of must contamination. Results No bacterial isolate could be obtained. Analysis of the draft genome sequence obtained from the sweet reserve attributed this sequence to Francisella tularensis subsp. holarctica, belonging to the B.12/B.34 phylogenetic clade (erythromycin-resistant biovar II). In addition, the DNA sequence obtained from the case’s isolate supported our hypothesis that infection was caused by drinking contaminated must. The vertebrate-specific cytochrome b sequence derived from the young wine and the sweet reserve could be assigned to Apodemus sylvaticus (wood mouse), suggesting that a wood mouse infected with F. tularensis may have contaminated the must. Conclusion The discovered source of infection and the transmission scenario of F. tularensis in this outbreak have not been observed previously and suggest the need for additional hygienic precautionary measures when processing and consuming freshly pressed must.


Introduction
Francisella tularensis, a facultative intracellular Gramnegative bacterium, is the causative agent of tularaemia, a zoonotic disease. Outbreaks in humans are often associated with exposure to infected animals, contaminated water or aerosols, and different arthropod vectors [1][2][3][4][5]. The clinical manifestation mainly depends on the route of infection, and the two main subspecies-F. tularensis subsp. tularensis and subsp. holarctica-are clinically relevant [1,6].
On 2 October 2016, there was an unusual outbreak of oropharyngeal tularaemia involving six cases in a group of 29 persons attending a grape harvest in Rhineland-Palatine, Germany [16]. Grapes collected by a mechanical harvester were pressed at the winery and participants had the opportunity to consume the fresh must at the end of the harvest. Because tularaemia was not initially considered as a possible differential diagnosis, a delay of about 5 weeks occurred in confirming the diagnosis of tularaemia. One of the six serologically confirmed tularaemia cases had complicated tularaemia and was hospitalised with pharyngitis and cervical abscess-forming lymphadenopathy.
In order to identify and characterise the causative agent of the outbreak, we analysed the contaminated must-derived products, sweet reserve (SR) and young wine (YW), as the contaminated must was no longer available for investigation. The SR is fumigated must that already contains low amounts of ethanol and the YW is must with added yeasts to start the fermentation process. In addition, we used lymph node material (PL) from the patient with complicated tularaemia for further investigation [16].
The aim of this report was to describe details of the laboratory investigations leading to the finding of the likely source of contamination of the must.
Moreover, by using next generation sequencing (NGS), we further characterised the outbreak strain and confirmed the presence of its DNA in the clinical material of one of the patients.

DNA extraction
We obtained an aspirate lymph node sample from one of the six patients of the must-associated outbreak who had a protracted clinical course with abscessforming lymphadenitis (Wetzstein N, Wolf T, personal communication, December 2016). The PL sample was collected directly after suspicion of the outbreak, i.e. 6 weeks after the event leading to the outbreak, which had not been recognised earlier. The contaminated must-derived products (sort 1A [16]), sweet reserve (RKI -sample number SR; A-856/3 (from sort 1A) and young wine (YW; A-856/2; from sort 1A), were collected by public authorities 3 weeks after the outbreak and were investigated. An additional YW (A-856/1; from sort 1B, pressed directly after sort 1A [16]) was also collected and studied. A total of 200 mL of each specimen were concentrated by centrifugation (45 min, 4,500 × g). Each pellet (SR, very thin pellet; YW, more pellet material, mainly yeast cells) was resuspended in 2 mL supernatant and further centrifuged at 20,000 × g for 5 minutes. Each pellet was again resuspended in 100 µL. DNA extractions were performed from the resuspended pellets (100 µL) and from 100 µL of aspirate (lymph node fluid) from the patient's left neck lymph node (PL; A-877/1), according to the protocol for Gramnegative bacteria in the manufacturer's instructions, using either the MoBio Power Soil DNA isolation kit (MoBio Laboratories, Carlsbad, California (CA), United States (US)) or the Qiagen DNeasy Blood and Tissue kit (Qiagen, Hilden, Germany), respectively. During the extraction, at the point of the proteinase K digestion, a plasmid with an artificial sequence insert was added to each sample as an internal extraction and amplification control (pKoMa2 [17]). Elution of DNA was performed twice with 75 µL (SR, YW) and 50 µL (PL) of Aqua bidest, respectively.

Singleplex and multiplex real-time PCRs, RD1-PCR
Multiplex real-time PCR (5' nuclease assay, TaqMan technology) targeting fopA and tul4 specific for F. tularensis, in combination with the extraction and amplification control targeting KoMa2, were performed with oligonucleotides and probes as described in Table 1.
A singleplex real-time PCR assay was performed from the clinical human sample for the detection of c-myc as a process control (oligonucleotide and probe; Table  1). Both real-time PCR assays were run in a total volume of 25 µL, including 5 µL of DNA of the samples to be analysed. Samples were analysed in duplicate in each run. The reaction mix components were 6.25 µL TaqMan Environmental MasterMix 2.0 (ThermoFisher, Henningsdorf, Germany), 10 pmol/µL primers (0.75 µL each) and probes (0.25 µL each). Amplification was performed in an Applied Biosystems 7500 Real-Time PCR System (ThermoFisher Scientific, Langenselbold, Germany), each run with an initial denaturation step at 95 °C for 10 minutes, followed by 40 cycles containing a denaturation step at 95 °C for 15 seconds and a combined primer annealing and elongation step at 60 °C for 60 seconds.
The calculation of genome equivalents for tul4 and c-myc was done for the SR, YW and PL by using the plasmids of the TOPO TA vector cloning kit (Invitrogen, Karlsruhe, Germany), containing the respective target region for tul4 of F. tularensis or c-myc as quantitative standards. For this purpose, in each real-time PCR run, standard control plasmids at  the concentration of 10 2 , 10 4 and 10 6 copies/25 µL for the different targets (tul4, fopA and c-myc) were added to generate target-specific standard curves that allow the calculation of the quantity of samples.

PCR for Francisella tularensis subspecies differentiation
The block PCR of the region of difference 1 (RD1-PCR) was used for the subspecies differentiation of F. tularensis. The PCR was carried out using the DreamTaq Polymerase (ThermoFisher, Hennigsdorf, Germany) with 15-100 ng of template DNA, according to the protocol described by Broekhuijsen et al. [18].

Mapping and generation of consensus sequence
Sequence quality assessment and trimming was performed with our quality control pipeline QCumber, developed in house [19], which combines the external tools Trimmomatic [20] and FastQC [21]. Furthermore, the pipeline uses the tools Bowtie2 [22] and Kraken [23] to analyse the origin of sequenced reads. Results of the taxonomic classification performed by Kraken on a Kraken-customised database consisting of bacterial, archaeal, viral and fungal genomes were visualised with Krona [24].
Trimmed reads were then mapped to the reference genomes of the expected background organisms (for SR and YW: yeast genome (GCA_000146045.2) and grape genome (GCA_000003745.2); for the patient sample: human genome (GRCh38) using Bowtie2).
Reads not mapping to the background organisms' genomes were then mapped to the Fthlive vaccine strain (LVS) (NC_007880.1) genome and a consensus sequence (draft genome sequence) was generated using Geneious (version R9.1.3 [25]; with a threshold of 75% and following the IUPAC code for ambiguities). The draft genome sequence generated from DNA isolated from SR (Fth-Must) has been submitted to GenBank (CP024807) and raw reads have been uploaded to the Sequence Read Archive (SRA, https://www.ncbi.nlm. nih.gov/sra; BioProject PRJNA417909).
The phylogenetic tree was constructed by Geneious (Geneious 10.0.5) using the neighbour-joining method (Tamura-Nei, outgroup: OSU18; 100 bootstrap replicates). The canSNP analysis was performed using published canSNP positions [26,28] to confirm the clade and subclade determination of the draft genome sequence of Fth-Must. Sequences used are given in Table 2.

PCR detecting vertebrate cytochrome b and DNA cloning
A vertebrate-specific PCR assay was used to identify mammalian DNA within all SR and YW samples. As described by Kent

Analyses of must products and patient samples
We incubated samples of SR and YW in medium T, on cystein-heart-blood-agar (CHAB) plates and on  CHAB agar plates containing selective antibiotics (CHAB-PACCV) [27], but we were not able to obtain a Francisella strain. Fth DNA was detected in samples of SR and YW by real-time PCR targeting tul4 and fopA. The subspecies was identified and confirmed by RD1-PCR analysis in both YW and SR ( Figure 1A, lanes 1-3), as well as in the PL aspirate sample ( Figure 1B, lane 6). The obtained DNA sample was used for NGS analysis to identify the outbreak strain. The PL sample and the SR and YW samples (sort 1A and 1B) were analysed by quantitative PCR analysis and revealed the following genome equivalents per mL: SR, 1.0 x 10 2 (sort 1A, sample A-856/3); YW, 1.7 x 10 4 (sort 1A, sample A-856/2); YW 4.4 x 10 2 (sort 1B, sample A-856/1)(data from [16]); and PL 1.7 x 10 6 (A-877/1).  Figure  2). The Fth-Must DNA was found to cluster with another human Fth isolate (A810-1) from Germany in the phylogenetic subclade B.34 ( Figure 2). In addition, we performed an in silico analysis of the canSNP analysis scheme, confirming that isolate Fth-Must corresponded to the B.12 clade (erythromycin-resistant strains) and to subclade B.34 ( Table 2). The affiliation to the B.12 clade was further confirmed by the identification of two SNPs within the 23S rRNA gene rrl that were recently found to be specific for strains belonging to this clade [30].
The alignment was performed using progressive Mauve alignment (MUSCLE 3.6) and the phylogenetic tree was built with Geneious (Geneious 10.0.5) (Tamura-Nei, Neighbour-Joining, 100 bootstrap replicates), setting Fth OSU18 strain as outgroup. The branch labels indicate the posterior probability and the branch length corresponds to the evolutionary distance (substitutions per site). For details of canSNP analysis B.X clusters and subclusters, see Table 2and references [26][27][28].  However, 691 and 690 reads (covering ca 107,073 bp of the whole Fth LVS genome sequence) mapped to the genome of Fth LVS and Fth-Must, respectively. We analysed the obtained reads for the presence of canSNPs used for subtyping and found one read mapping to the B.12 canSNP and two reads mapping to two respective SNPs only found in strains of subclade B.33 (Table 2). In addition, three reads mapped to two of three copies (present at the genome of Francisella) of SNP rrl-2 in the rrl gene ( Table 2). The results confirmed that the isolated Francisella DNA belonged to a Francisella strain of clade B.12 associated with erythromycin resistance [30] and at least to sub-clade B.33 (Figure 2), although the low coverage did not allow a clear identification. Further analyses could not be performed due to the restricted amount of DNA. Altogether, the results suggested that this patient was infected with the same Fth strain that was identified in the SR and served as must during the grape harvest. Tularaemia was also confirmed for all other diseased participants of the grape harvest [16].

Identification of the putative contamination source
Despite the identification of the outbreak strain DNA, the question of how the must had been contaminated with Francisella was still not answered. Since small rodents are occasionally found in mechanically harvested grapes [16], we used the unmapped reads of the last mapping step (using DNA from the SR, see above) and performed an additional mapping to a mouse genome (GCA_000001635.7). About 2,600 reads were found mapping to this mouse genome, but a definite identification of the species was not successful. However, based on the obtained results, we analysed the samples for the presence of vertebrate DNA to identify the species that may have contaminated the must with infectious Francisella.
For this purpose, we used a vertebrate-specific primer pair to amplify specifically the mitochondrial cytochrome b gene if present in the must. The different sample DNAs (A-856/1-3) were used in the PCR reaction. A PCR band of the expected size (623 bp) could be detected in YW and SR samples of sort 1A The alignment of the obtained cytochrome b consensus sequence of 578bp (cytochrome b PCR DNA product without primer sequences) with cytochrome b genes of different Apodemus species revealed that the obtained DNA sequence was 99.8% identical (one R at position 279) to the cytochrome b gene of a wood mouse (also called long-tailed field mouse) (Apodemus sylvaticus (Haplotype Germany-1/Haplotype France)). The phylogenetic tree of this alignment is shown in Figure 4. The DNA region of the cytochrome b gene of haplotype Germany-1 and -2 exhibited only one SNP at position 329 (Germany-1: T, Germany-2: C). These results suggested that a wood mouse infected by Fth might have been the contamination source of the must consumed by participants in the grape harvest.
The alignment was performed using progressive Mauve alignment (MUSCLE 3.6) and the phylogenetic tree was built with Geneious (Geneious 10.0.5) (Tamura-Nei, neighbour-joining, 100 bootstrap replicates), setting Mus musculus as the outgroup. The branch length corresponds to the evolutionary distance (substitutions per site). cyt b: cytochrome b gene; Hap.: haplotype.

Discussion
Freshly pressed must was served to some of the 29 participants in the grape harvest [16]. Wine yeast was added to a large portion of the must for the production of young wine and a smaller portion was fumigated and served as sweet reserve. In the SR and YW of sort 1A, Francisella DNA was detected at a high concentration and was confirmed to be Fth specific by PCR analysis (Figure 1). In sort 1B, pressed directly after sort 1A, only 440 genome equivalents per mL of F. tularensis was detected, suggesting a cross-contamination of this wine in the winepress [16].
NGS sequencing of DNA isolated from the YW (sort 1A) and SR revealed, as expected, more reads mapping to the yeast genome in the YW (8.6%) than in the SR (0.5%).
In both unmapped reads, an equal but small proportion (ca 1%) of reads was mapped to the grape genome, suggesting that there was a low amount of grape DNA in the YW and SR. In addition, in the obtained unmapped reads, 1.9% and 9.6% of the YW and SR reads mapped to the Fth LVS genome, respectively, demonstrating that there was still a lot of Francisella DNA present in these samples. Half of the unmapped reads from the SR did not map to any known sequences in the Krakencustomised database, and some reads mapped to different plant-and soil-associated bacteria, probably environmental and grape-associated bacteria.
After NGS sequencing of DNA from the SR, we were able to generate a consensus sequence (draft genome, Fth-Must) covering nearly the whole genome of Fth LVS. This draft genome may contain regions shared between different organisms in the sample. However, the distribution of mapped NGS reads, as well as the phylogenetic tree and canSNP analysis, demonstrated a good quality of the generated consensus sequence. The phylogenetic analysis of the Fth-Must DNA sequence revealed that the DNA belonged to a strain that clustered into subclade B.34 ( Figure 2). Although the obtained reads of the NGS analysis of PL-DNA covered only a small part of the Fth-Must sequence, the results confirmed the hypothesis that the patient from the outbreak had been infected by drinking must. Further, in 2016, we identified a Francisella isolate (A-810/1, data not shown) from a patient in Lower Saxony who had contracted tularaemia, confirming the presence of the subclade B.34 in Germany ( Figure 2).
To identify the source that contaminated the must with Fth, we performed a vertebrate-specific cytochrome b gene PCR, a conserved mitochondrial gene used for phylogenetic investigations [31]. Sequencing of the obtained PCR products revealed a nucleotide sequence that was 99.8% identical to the cytochrome b gene of A. sylvaticus haplotype Germany-1/France, a wood mouse. A. sylvaticus is known to consume fruit, and it has been reported that this species can be infected with Francisella [32]. Different studies in Croatia, Germany, Hungary and Spain demonstrated that different small rodents, like Myodes (bank voles), Sorex (common shrew), Microtus (common vole), Muscardinus(common dormouse) and various species of Apodemus (A. flavicollis, A. agrarius and A. sylvaticus) can be infected by or can be carriers of Francisella tularensis [5,10,[32][33][34]. The findings from these studies support our hypothesis of an infected wood mouse as the source that contaminated the must. Unfortunately, the late suspicion of the tularaemia outbreak did not allow the isolation of the outbreak-causing strain from the must, the patient or the suspected mouse for further functional investigation. It should be emphasised that modern laboratory techniques made it possible that all genomic characterisation could be obtained from DNA only. More data on the occurrence of F. tularensis in the region, including in rodents and other wild animals, would be helpful for further risk assessment and greater awareness of tularaemia during wine production.

Conclusion
Analysing this uncommon tularaemia outbreak, we were able to determine a draft genome sequence of the responsible Francisella strain, although no isolate could be obtained. Using this draft genome, a phylogenetic analysis was successful. Some reads exhibiting specific canSNPs identified in the DNA extracted from a patient's lymph node supported the finding of our previous cohort study that the patients were infected by consuming the fresh must. In addition, through the identification of the putative source of the contamination, we could propose a most likely route of transmission for this outbreak: The automatic harvester may have collected a wood mouse (or its carcass) infected with a high dose of Fth, then transferred it to the mash car, contaminating the mash, the press and finally 730 L of must-an infectious dose for humans. Subsequently, this must was served to a group of participants in the grape collection and six people contracted tularaemia. Based on our results, it was suggested that additional hygienic precautions should be undertaken during wine harvesting and production. For example, rodent control should be put into practice throughout all steps of wine production and freshly pressed must for tasting should be produced from hand-picked instead of mechanically harvested wine grapes, since the latter is more difficult to control. As raw food products can be associated with a risk for infectious agents, pasteurisation before consumption is also recommended [35]. Further, our investigation shows that tularaemia should be considered when individuals fall ill with relevant symptoms after a grape harvest event.