Integrated genomic surveillance enables tracing of person-to-person SARS-CoV-2 transmission chains during community transmission and reveals extensive onward transmission of travel-imported infections, Germany, June to July 2021

Background: Tracking person-to-person SARS-CoV-2 transmission in the population is important to understand the epidemiology of community transmission and may contribute to the containment of SARS-CoV-2. Neither contact tracing nor genomic surveillance alone, however, is typically sufficient to achieve this objective. Aim: We demonstrate the successful application of the integrated genomic surveillance (IGS) system of the German city of Düsseldorf for tracing SARS-CoV-2 transmission chains in the population as well as detecting and investigating travel-associated SARS-CoV-2 infection clusters. Methods: Genomic surveillance, phylogenetic analysis, and structured case interviews were integrated to elucidate two genetically defined clusters of SARS-CoV-2 isolates detected by IGS in Düsseldorf in July 2021. Results: Cluster 1 (n = 67 Düsseldorf cases) and Cluster 2 (n = 36) were detected in a surveillance dataset of 518 high-quality SARS-CoV-2 genomes from Düsseldorf (53% of total cases, sampled mid-June to July 2021). Cluster 1 could be traced back to a complex pattern of transmission in nightlife venues following a putative importation by a SARS-CoV-2-infected return traveller (IP) in late June; 28 SARS-CoV-2 cases could be epidemiologically directly linked to IP. Supported by viral genome data from Spain, Cluster 2 was shown to represent multiple independent introduction events of a viral strain circulating in Catalonia and other European countries, followed by diffuse community transmission in Düsseldorf. Conclusion: IGS enabled high-resolution tracing of SARS-CoV-2 transmission in an internationally connected city during community transmission and provided infection chain-level evidence of the downstream propagation of travel-imported SARS-CoV-2 cases.

Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a worldwide pandemic with > 593 million cases and > 6.4 million associated deaths up to August 2022 [1]. SARS-CoV-2 vaccines have greatly contributed to reductions in coronavirus disease (COVID-19)-associated morbidity and mortality in many countries; however, non-pharmaceutical interventions (NPIs) to limit viral spread and reduce the healthcare burden of SARS-CoV-2 remain important in many contexts. Such contexts include instances of low vaccine availability or high rates of vaccine hesitancy in some countries, the potential for vaccine breakthrough infections and, more generally, the emergence of novel viral variants.
As the aim of NPIs is to interrupt or prevent pathogen transmission chains, a comprehensive understanding of these transmission chains in the population (who infected whom, and in which epidemiological context) could be greatly beneficial. Contact tracing regimes, which typically employ structured case interviews and which are operated by many countries, are an important data source on pathogen transmission in the population. Contact tracing is generally recognised as an important element of SARS-CoV-2 mitigation strategies [2-5]. However, the ability of classical contact tracing regimes to reliably track transmission chains in the population is limited and a substantial number of infections typically remain unexplained. For example, in the German city of Düsseldorf, an international economic and air travel hub of ca 600,000 inhabitants, ca 45% of SARS-CoV-2 infections remained unexplained in 2021 (Düsseldorf Health Department internal data), despite the operation of a well-staffed and comprehensive contact tracing effort. Similar numbers have been reported from other localities [6]. Genomic surveillance, another potential data source on the structure of population transmission chains, has also emerged as an important element of SARS-CoV-2 mitigation strategies [7,8]. However, due to the relatively low mutation rate of SARS-CoV-2 [9] and the fact that many genomic surveillance systems only sample a limited proportion of total cases, genomic surveillance by itself is typically not sufficient to enable reconstruction of transmission chains at the person-to-person level in the population outside of confined outbreak scenarios. Integrated genomic surveillance (IGS) is an emerging approach that refers to the integrated analysis of genetic and complementary epidemiological data. As we and others have shown [10-13], IGS can contribute to identification of otherwise unrecognised SARS-CoV-2 transmission chains in the general population even under conditions of high-incidence community transmission and thus provide important complementary information for the design and implementation of NPIs.
Here we use the IGS system of Düsseldorf (IGSD) to investigate person-to-person transmission chains in this city in late June and July 2021.

Integrated genomic surveillance in Düsseldorf
The IGSD has been described elsewhere [10]. Briefly, when fully operational, the system operates as follows. First, a large proportion of SARS-CoV-2 from local cases is rapidly sequenced. Viral genomes (Z* samples) are primarily generated by the Centre for Medical Microbiology, Hospital Hygiene, and Virology of Heinrich Heine University Düsseldorf as part of a dedicated local sequencing effort. Viral genome sequences of local cases generated under the national German SARS-CoV-2 surveillance programme by a collaborating large diagnostic laboratory (N* samples) are also integrated. In 2021, the achieved sequencing rate typically varied between 40 and 60% of new cases on a weekly basis; the 'routine Düsseldorf surveillance dataset' described below consists of sequence data obtained as described here (Z* and N* samples).
Second, putative infection clusters are identified with a search algorithm for groups of pairwise-identical samples ('cliques').
Third, the generated sequencing data and identified putative infection clusters are displayed in a visual form ('dashboard'; available at https://covgen.hhu.de); this visualisation is continuously updated. This system is used as the main information exchange mechanism with the Düsseldorf Health Authority.
Fourth, the identified putative infection clusters are investigated at the Düsseldorf Health Authority. The investigation combines (i) routine data collected as part of Düsseldorf Health Authority's contact tracing activities, including on symptom onset, travel history and contact persons and (ii) information obtained from structured case interviews ('deep backward contact tracing'; see below) to elucidate potential case connections not captured by standard contact tracing.
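The second step above, the search for groups of pairwise-identical samples ('cliques'), can be illustrated with a minimal sketch. The IGSD search algorithm itself is not specified in this text; the sketch below assumes that samples at pairwise distance 0 are linked in a graph and that maximal cliques are enumerated with the basic Bron-Kerbosch recursion. The function names and the toy distances are hypothetical.

```python
def identical_pairs_graph(dist):
    """Link every pair of samples whose pairwise distance is 0."""
    adj = {}
    for (a, b), d in dist.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if d == 0:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj, min_size=2):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-processed vertices
        if not p and not x:
            if len(r) >= min_size:
                found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


# Toy distances between hypothetical samples (not data from the study).
toy_dist = {
    ("A", "B"): 0, ("A", "C"): 0, ("B", "C"): 0,
    ("C", "D"): 0, ("A", "D"): 1, ("B", "D"): 1,
    ("A", "E"): 4,
}
cliques = maximal_cliques(identical_pairs_graph(toy_dist))
print(sorted(sorted(c) for c in cliques))
```

In the toy data, sample C is identical to D as well as to A and B, while D differs from A and B. Identity need not be transitive when ambiguity codes are treated as matches, which is why overlapping cliques can arise.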
Of note, the IGSD does not comprise the routine collection of clinical metadata, and case severity is not used as a sample selection criterion.
To investigate the applicability of the developed system beyond Düsseldorf, a trial run of the IGS system was carried out in the nearby smaller city of Solingen in July and August 2021; the Solingen data were processed and analysed separately from the Düsseldorf data and only integrated with the Düsseldorf data during the phylogenetic cluster refinement analysis (see below).
Legend to Figure 1: The phylogenetic tree, generated with Geneious (see Methods) and visualised with iTOL [19], shows 699 sequences. Nine sequences were found by manual inspection to exhibit low-quality alignments in the multiple sequence alignment of the 708 input sequences and were removed from the alignment before construction of the tree. Cluster 1 and Cluster 2 are highlighted (areas of the tree shaded in blue and red, respectively). The nodes serving as the two clusters' root nodes, I361 and I584, are displayed as small boxes with a green background. The presence of T14064C and C18744T, mutations used in the process of defining Cluster 1 and Cluster 2, is indicated by blue circles and red triangles, respectively.

Inter-sample distance metric
Inter-sample genetic distances were calculated as defined previously [10], with one modification (point (iv) below). Briefly, a multiple sequence alignment (MSA) of all sequences was built with MAFFT [14], following GISAID [15] instructions. The distance d(x, y) between two samples x and y was defined as the number of differences between the MSA entries of x and y, (i) ignoring leading or trailing gap characters, (ii) counting matches and mismatches according to International Union of Pure and Applied Chemistry (IUPAC) ambiguity codes, (iii) counting a run of consecutive non-matching gap columns as a difference of 1, (iv) ignoring deletions aligned to 'N' regions in the other genome, and (v) ignoring any mismatches in the MSA regions between the beginning of the MSA and the 20th ACGT character of either sequence, and between the 20th-last ACGT character of either sequence and the end of the MSA.

Structured case interviews
A specialised team of interviewers within Düsseldorf Health Authority conducted structured case interviews. These covered (i) occupation and place of work; (ii) utilisation of public transport; (iii) social, household and family contacts; (iv) utilisation of medical services; (v) supermarket and retailer visits; (vi) gastronomy and nightlife; (vii) travel history. Before a case was classified as unavailable, a minimum of three contact attempts were carried out using available landline or mobile phone numbers; participation in the structured case interviews was voluntary.

Phylogenetic cluster refinement analysis
To refine the definition of two large groups of genetically related SARS-CoV-2-infected cases, which are referred to as Cluster 1 and Cluster 2, phylogenetic analysis of an 'extended' dataset (see Results section) was carried out using the neighbour-joining method with the Tamura-Nei genetic distance model as implemented in Geneious version 10.2.6 (Figure 1). The samples previously flagged by the IGSD routine cluster analysis algorithms were located in the phylogenetic tree. Once the presence of two large clusters of genetically related isolates was confirmed, an analysis of the mutational patterns observed downstream of the putative cluster-associated root nodes was carried out. FASTA files and the phylogenetic tree are publicly available (see Data availability).

Cluster 1 strain-of-origin analysis background dataset
To investigate potential origins of the viral strain of Cluster 1, a background dataset was assembled by combining (i) a random sample of non-Cluster 1 Düsseldorf sequences (n = 30); (ii) the set of all SARS-CoV-2 sequences from the Balearic Islands sampled between 15 June and 01 July 2021 available on GISAID (n = 173); (iii) the sequence of EPI_ISL_2710175, a viral genome from the Balearic Islands sampled on 14 June 2021 that was identified using the GISAID Audacity Instant Search [15]; (iv) 61 GISAID sequences related to Cluster 1. This GISAID set was assembled by carrying out a tree neighbourhood search in the GISAID 'Global Phylogeny' tree from August 2021 (GISAID-hCoV-19-phylogeny-2021-08-16; representing 624,052 sequences). Specifically, two Cluster 1 sequences (N1501, N1506) with genetic distance 0 to an individual SARS-CoV-2-infected traveller returning to Düsseldorf from the island of Mallorca (IP), who was retrospectively identified as a likely Cluster 1 index case in the city, were located in the tree. The identities of all leaves with a tree distance (defined as the cumulative length of the edges along the shortest path between two nodes) of ≤ 3/29,903 to either of the two Cluster 1 sequences and sampling date ≤ 15 July 2021 were extracted.
The corresponding viral genome sequences were obtained from the GISAID MSA, and details and acknowledgements are provided in Supplementary Table 1.
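The tree neighbourhood search described above can be sketched as a path-length traversal from each query leaf. The sketch assumes the tree is available as an undirected edge list with branch lengths in substitutions per site; the helper names, the toy tree and the toy sampling dates are hypothetical, and exact fractions are used only to avoid floating-point issues at the 3/29,903 threshold.

```python
from datetime import date
from fractions import Fraction

GENOME_LENGTH = 29_903                       # denominator used in the text
MAX_DIST = Fraction(3, GENOME_LENGTH)        # tree distance threshold
CUTOFF = date(2021, 7, 15)                   # latest sampling date considered


def build_tree(edges):
    """Undirected adjacency map {node: {neighbour: branch length}}."""
    adj = {}
    for u, v, length in edges:
        adj.setdefault(u, {})[v] = length
        adj.setdefault(v, {})[u] = length
    return adj


def path_lengths(adj, source):
    """Cumulative edge length from source to every node (paths in a tree are unique)."""
    dist = {source: Fraction(0)}
    stack = [source]
    while stack:
        node = stack.pop()
        for nbr, length in adj[node].items():
            if nbr not in dist:
                dist[nbr] = dist[node] + length
                stack.append(nbr)
    return dist


def neighbourhood(adj, queries, sampled, max_dist=MAX_DIST, cutoff=CUTOFF):
    """Leaves within max_dist of any query leaf and sampled on or before cutoff."""
    hits = set()
    for q in queries:
        dist = path_lengths(adj, q)
        for leaf, when in sampled.items():
            if leaf not in queries and dist[leaf] <= max_dist and when <= cutoff:
                hits.add(leaf)
    return hits


def subs(k):
    """Branch length equivalent to k substitutions across the genome."""
    return Fraction(k, GENOME_LENGTH)


# Toy tree: q1/q2 are query leaves, g1..g4 are candidate leaves (all hypothetical).
toy_tree = build_tree([
    ("n1", "q1", subs(0)), ("n1", "q2", subs(0)), ("n1", "g1", subs(1)),
    ("n1", "n2", subs(2)), ("n2", "g2", subs(1)), ("n2", "g3", subs(2)),
    ("n2", "g4", subs(0)),
])
toy_dates = {
    "g1": date(2021, 7, 1), "g2": date(2021, 7, 20),
    "g3": date(2021, 7, 1), "g4": date(2021, 7, 10),
}
related = neighbourhood(toy_tree, {"q1", "q2"}, toy_dates)
print(sorted(related))
```

In the toy tree, g2 lies exactly at the distance threshold but is excluded by its sampling date, and g3 is excluded by distance.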

Detection and refinement of two large clusters in July 2021
Between 15 June and 01 August 2021, 541 SARS-CoV-2 surveillance genome sequences from Düsseldorf were registered within the IGSD, of which 518 were high-quality sequences (< 5,000 undefined nt; Supplementary Table 2); this set is referred to as the 'routine Düsseldorf surveillance dataset'. Over the same period, 976 new SARS-CoV-2 cases were registered in Düsseldorf, i.e. a high-quality viral genome sequence was available for ca 50% of cases.
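The quality threshold and the resulting coverage can be illustrated with a short sketch. It assumes that 'undefined nt' corresponds to 'N' characters in the consensus sequence, which this text does not state explicitly; the function names and toy sequences are hypothetical, while the counts are those reported above.

```python
def undefined_count(seq):
    """Number of undefined positions, assumed here to be encoded as 'N'."""
    return seq.upper().count("N")


def is_high_quality(seq, max_undefined=5000):
    """High quality means fewer than max_undefined undefined nucleotides."""
    return undefined_count(seq) < max_undefined


# Toy sequences (hypothetical), roughly genome-sized.
good = "ACGT" * 7000
borderline = "N" * 5000 + "ACGT" * 6000      # exactly 5,000 N: not high quality

# Counts reported in the text for 15 June to 01 August 2021.
registered, high_quality, cases = 541, 518, 976
low_quality = registered - high_quality
coverage_pct = round(100 * high_quality / cases)
print(is_high_quality(good), is_high_quality(borderline), low_quality, coverage_pct)
```

The computed coverage of 53% agrees with the figure given in the abstract, and the 23 remaining sequences correspond to the lower-quality surveillance sequences later added to the extended dataset.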
In mid-July 2021, the emergence of multiple overlapping putative infection clusters ('cliques' in the pairwise isolate distance matrix, i.e. groups of multiple pairwise-identical viral isolates) was detected by the system's routine cluster analysis algorithms in the routine Düsseldorf surveillance dataset and indicated the presence of two novel large groups of closely related viral isolates (Delta variant; Phylogenetic Assignment of Named Global Outbreak (Pango) lineage designation: B.1.617.2 [16]).
Phylogenetic analysis (see Methods section; Figure 1) was carried out based on an expanded dataset, referred to as the 'extended dataset', with 708 viral genome sequences sampled between 15 June and 01 August 2021 (Supplementary Table 2) that comprised the original Düsseldorf routine surveillance dataset (n = 518); lower-quality Düsseldorf surveillance sequences (n = 23); Düsseldorf University Hospital patient sequences from the same period (n = 27); and available sequences from the nearby city of Solingen, where a trial run of the IGS system took place in July and August 2021 (n = 140). In this analysis, quality thresholds were applied after the construction of the phylogenetic tree; the rationale for including sequences from Solingen was to investigate potential transmission beyond Düsseldorf.
The phylogenetic analysis identified the mutation T14064C (blue circles in Figure 1) as associated with Cluster 1, and C18744T (red triangles) as associated with Cluster 2. The internal nodes I361 and I584 of the constructed phylogenetic tree (see Figure 1 and 'Data availability') were chosen as the root nodes for Cluster 1 and Cluster 2, respectively. The isolate clusters Cluster 1 and Cluster 2 were provisionally defined as the sets of leaf-level descendants of these nodes, including Z4116, a sample carrying an isolated undefined genotype ('N') at position 14064 with distance 0 to other Cluster 1 samples (e.g. IP).
Subsequent to the creation of the phylogenetic tree (Figure 1) with the extended dataset, phylogenetic outliers, repeat samples from the same individual, sequences with > 5,000 undefined nt, and non-surveillance Düsseldorf University Hospital sequences sampled after 09 July were removed (see Supplementary Table 3 for a full list of included and removed samples).
For Cluster 1 (n = 71 leaf-level descendants of node I361 in the tree), this meant removing 11 sequences (two outlying, including a sequence from a case named KP1_5; four redundant; two with > 5,000 undefined nt, including one from a case named KP5_2_1; and three non-surveillance). For Cluster 2 (n = 49 leaf-level descendants of node I584 in the tree), this meant removing seven sequences (one with > 5,000 undefined nt; four redundant; and two non-surveillance).
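The sample bookkeeping of the refinement step can be checked with a few lines of arithmetic. The sketch below only tallies the counts reported in the text; the variable names are hypothetical.

```python
# Composition of the extended dataset, as reported in the text.
extended = {
    "routine surveillance (high quality)": 518,
    "lower-quality surveillance": 23,
    "University Hospital patients": 27,
    "Solingen": 140,
}
extended_total = sum(extended.values())
in_tree = extended_total - 9                 # nine low-quality alignments removed

# Sequences removed from each cluster after tree construction.
cluster1_removed = {"outlying": 2, "redundant": 4,
                    "> 5,000 undefined nt": 2, "non-surveillance": 3}
cluster2_removed = {"> 5,000 undefined nt": 1, "redundant": 4,
                    "non-surveillance": 2}

cluster1_size = 71 - sum(cluster1_removed.values())
cluster2_size = 49 - sum(cluster2_removed.values())
print(extended_total, in_tree, cluster1_size, cluster2_size)
```

The totals reproduce the 708 input sequences, the 699 sequences shown in the tree, and the phylogenetics-based cluster sizes of 60 (Cluster 1) and 42 (Cluster 2).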
The phylogenetics-based definition of Cluster 1 comprised 60 viral sequences with an average pairwise genetic distance of 0.91 (59 from Düsseldorf and one from Solingen). The FASTA file used for the phylogenetic analysis, comprising all analysed sequences, as well as the constructed tree in Newick format are publicly available (see 'Data availability'). To further investigate Cluster 1 and Cluster 2, integration of routine contact-tracing data and structured case interviews was carried out (see Methods section).

Legend to Figure 2:
A. Bar plot showing (i) the number of cases who had a direct contact-tracing link to IP (i.e. first-, second- and third-order contacts), (ii) the number of cases with no contact-tracing link to IP but with an epidemiological link to other Cluster 1 cases, who were also not related to IP and (iii) the number of cases with no contact-tracing links to any Cluster 1 cases.
B. Visualisation of the reconstructed epidemiological structure of Cluster 1. Each node in the transmission chain graph represents one case. Nodes shaded in grey represent epidemiologically linked, but non-sequenced cases (see text). Unshaded nodes represent epidemiologically linked and sequenced cases. Of note, the sequence of IP was not used for Figure 1, as this case and respective sequence data were found later in the investigation. While KP1_5 and KP5_2_1 cases had sequence data, these were initially not included in the phylogenetics-based definition of Cluster 1: KP1_5 exhibited an increased genetic distance to the other samples in the cluster (and is therefore shown with a dashed border; see Supplementary Note for a discussion) and KP5_2_1 had > 5,000 undefined nt. Upon contact tracing findings, however, these two cases were included as part of Cluster 1. Test dates of assumed index cases, who were epidemiologically unconnected to IP but for whom contact tracing suggested that they further transmitted the Cluster 1 strain to other IP-unconnected cases, are shown above the respective nodes. The inset to the right of the transmission-chain graph shows the complex patterns of visits of IP and IP's first- and second-order contacts to two bars in the Old Town District of Düsseldorf. The two cases shown with red boxes also visited Bar A, but their sequenced viral isolates group with Cluster 2 in phylogenetic analysis. Cases KP9 and KP10 participated in a pub crawl in the Düsseldorf Old Town area around Bar A. The text inset details the precise nature of the identified case relationships. Transmission chain graphs were plotted with Graphviz [20].

Cluster 1 was associated with nightlife spreading events following a putative travel-associated importation
The emergence of Cluster 1 in Düsseldorf could be traced back to multiple nightlife spreading events following putative importation of the Cluster 1-associated strain by IP, an individual SARS-CoV-2-infected traveller returning to Düsseldorf from the island of Mallorca on 28 June (the sequence of IP was not included to compile the phylogenetic tree in Figure 1 as this case was identified during the epidemiological investigation).
The identified epidemiological links between Cluster 1 cases are visualised in Figure 2. Transmission of the imported viral strain (Delta variant) in Düsseldorf was likely initiated on 30 June during encounters between IP and eight first-order contacts (KP1-KP8) in two bars ('Bar A', 'Bar B') in the Old Town District of Düsseldorf, a popular area for nightlife activities with narrow streets and more than 200 bars. Additional transmissions took place in a complex pattern of additional visits of the first-order contacts to Bar A on 02 July (KP2 and KP1 were present in the bar at the same time as KP1_1-KP1_7) and 03 July (Figure 2B Inset); during a likely encounter between the first-order contacts and KP9 and KP10, who were on a pub crawl in the area around Bar A on 03 July (where KP5 and KP6 were present, as well as KP5_1-KP5_4); and from the second-order contacts into the local population via private meetings, family and household contacts (Figure 2B). Contact tracing and structured interviews also uncovered links between an additional 15 cases without direct links to IP (Figure 2B); these likely represented ongoing community transmission of the introduced viral strain or secondary introduction events (see below). Apart from IP, the other cases had no recorded travel histories.
Of note, the investigation of a potential link between IP and Cluster 1 began after it emerged during routine contact tracing that IP had frequented Düsseldorf Old Town nightlife venues on 30 June; when this link started to be explored, the investigation of Cluster 1 was already under way and had identified the Old Town and 30 June as focal points for Cluster 1-related viral transmission. The positive PCR test of IP was carried out in a laboratory not located in Düsseldorf and therefore not covered by the IGSD; the viral genome of IP (available on GISAID under EPI_ISL_3044996), however, was sequenced under Germany's SARS-CoV-2 national genomic surveillance programme and could be requested by Düsseldorf Health Authority after identification of IP. Analysis of the viral genome sequence of IP confirmed that it was highly related to Cluster 1, carrying the T14064C mutation and exhibiting a genetic distance of 0 to 34 of the 60 phylogenetically defined Cluster 1 sequences (Supplementary Table 4).
The assignment of IP as the likely Cluster 1 index case was based on the reconstructed pattern of likely infection events as well as on the dates of symptom onset (Supplementary Table 4) of IP (01 July) and KP1-8 (04-07 July for all cases but KP1, who reported symptom onset on 02 July). IP's symptom onset on 01 July rendered an infection on 30 June unlikely and favoured Mallorca or the return flight to Düsseldorf as infection contexts. In addition, apart from IP, only KP4 and KP5 were present in both Bar A and Bar B (Bar B was where KP7 and KP8 were likely infected) on 30 June, and KP4 and KP5 reported symptom onset on 04 July, consistent with an infection transmitted by IP on 30 June.
To further investigate potential origins of the viral strain of Cluster 1, we analysed the sequences of Cluster 1 against a background dataset of other contemporaneous sequences from Düsseldorf, the Balearic Islands, and GISAID samples related to Cluster 1 (see Methods section). Consistent with an assumed infection of IP on Mallorca, phylogenetic analysis (Supplementary Figure 1) showed that the sequences of Cluster 1 and a small number of isolates from the Balearic Islands and GISAID formed a distinct cluster. Furthermore, an analysis of genetic distances (Supplementary Table 6) showed that the Cluster 1-related sequences from the Balearic Islands were as closely related (genetic distance 1) to the sequence of IP as any of the GISAID sequences up to a sampling date of 07 July, approximately 1 week after the initiation of Cluster 1 transmission in Düsseldorf. IP-identical viral isolates started appearing in the GISAID dataset with sampling dates from 07 July onwards; the 'originating laboratory' record of the earliest three such isolates, however, indicated a likely sampling location in the area around Düsseldorf and thus a likely connection to Cluster 1.

Legend to Figure 3: The figure shows daily new registered SARS-CoV-2 cases in Düsseldorf in July 2021, and how many of these were associated with Cluster 1 or Cluster 2. For Cluster 1, 'directly linked to index case' refers to uninterrupted, contact tracing-supported putative transmission chains between the linked cases and IP; 'genotype' refers to respective case samples that were identified as belonging to Cluster 1 in the phylogenetic tree analysis (see text). For Cluster 2, 'directly linked to travel from Catalonia' refers to cases who had recently returned from Catalonia and to cases who were directly linked to Catalonia returnees in a manner supported by contact tracing; 'genotype' refers to respective case samples that were identified as belonging to Cluster 2 in the phylogenetic tree analysis (see text).
The first IP-identical isolates in the GISAID dataset from another German state were collected from 13 July onwards; these, as well as earlier Cluster 1-related sequences from June and July with genetic distance 1, may reflect wider circulation of Cluster 1-related strains in Europe and highlight the possibility of independent introduction events, as well as, in particular for the GISAID samples collected from mid-July onwards, potential export of the Cluster 1 viral strain from the Düsseldorf area.
Including IP; two individuals (KP2, KP5) who were in the company of KP1, KP3, KP4, and KP6 when they were likely infected by IP; three individuals (KP1_1, KP1_6, KP1_7) who were with KP1_4 and KP1_5 when they were likely infected by KP1 or KP2; KP5_2_1 (the viral genome of whom had more than 5,000 undefined nt and who was therefore removed from the initial results of the phylogenetic analysis); and KP1_5 (the viral genome of whom exhibited an increased genetic distance to the other samples in the cluster; see Supplementary Note), 28 SARS-CoV-2 cases in Düsseldorf could be directly linked to IP (defined as the identification of an uninterrupted, contact tracing-supported putative transmission chain between the linked cases and IP; Figure 2A), with a median serial interval of 3 days. With these cases included, Cluster 1 comprised 67 Düsseldorf cases, or 8% of new SARS-CoV-2 cases registered in Düsseldorf in July (Figure 3), and one Solingen case. Of note, two cases belonging to Cluster 2, Z4187 and Z4145, also visited Bar A on 02 July (without recorded direct contacts to other Cluster 1 cases); despite this potential epidemiological link, the genetic data clearly showed that these belonged to a different infection cluster. For 24 Düsseldorf cases and S88, the only Solingen sample in Cluster 1, no links to other Cluster 1 samples or other putative infection sources were identified.
Investigations by the Düsseldorf Health Authority and the discovery of a video posted to social media channels showed limited adherence to mandatory SARS-CoV-2 infection prevention measures in Bars A and B in force at the time, including dancing and non-compliance with indoor masking rules.In addition, the investigations demonstrated insufficient tracking of customer contact details, also in violation of mandatory German pandemic regulations in place at the time.

Cluster 2 represents multiple independent importation events linked to return travel
For Cluster 2, detected from 30 June onwards and comprising 42 cases, no clear index case could be identified. While the integration of contact tracing data and structured case interviews enabled delineating relationships for 25 cases (e.g. household contacts), the overall size of the identified transmission chains was limited compared with Cluster 1 (Figure 4A). Examination of the travel history of the cases, however, showed that almost a quarter of the 42 Cluster 2 cases could be linked to return travel from Catalonia (seven returnees from Catalonia and three associated downstream infections in Düsseldorf); an additional five cases had been travelling to France before testing positive for SARS-CoV-2 (Figure 4B, Supplementary Table 5). Analysis of symptom onset (Supplementary Table 5) in relation to return travel dates suggested that eight of 15 return travellers in Cluster 2 were likely infected during their stay abroad because their symptoms had started either when abroad or within the first 24 hours after return. For some of these cases, exposure in Düsseldorf could be ruled out with certainty (e.g. Z4106, with symptom onset on the same day as the return flight). The viral sequences of Z4077 and Z4076, which belonged to some of the earliest Cluster 2 cases in Düsseldorf, exhibited a genetic distance of 3. The cases corresponding to these two sequences were likely infected during a joint trip to Barcelona (symptom onset 1 and 3 days after return to Düsseldorf, respectively), and likely represented two independent infection events in Barcelona.

Discussion
Increased understanding of SARS-CoV-2 transmission chains in the population is important to support improved containment strategies. Here, we used IGS, an emerging approach integrating genetic and classical epidemiological data, to investigate two large clusters of genetically related SARS-CoV-2 isolates occurring in Düsseldorf. Taken together, isolates from these two clusters accounted for more than 10% of SARS-CoV-2 cases in the city during the considered period (Figure 3). We show that IGS allowed tracing of complex SARS-CoV-2 transmission chains in both clusters, each involving cases who had journeyed abroad. IGS was a necessary approach for the investigation of the two clusters; in many instances, links between cases were only uncovered by the structured case interviews carried out after genetic links had been identified. On the other hand, Clusters 1 and 2 may have been considered as connected without the additional information gathered through genomic surveillance. Indeed, two cases from Cluster 2 (Z4187 and Z4145) were present in one of the bars where Cluster 1-related transmissions took place. Furthermore, the genomic data collected during the investigation of Cluster 2 could be analysed together with genomic data from the city of Barcelona to trace infection chains beyond Germany. The joint effort between researchers in Düsseldorf and Barcelona, which contributed to understanding the spread of the virus in Cluster 2, also demonstrates the potential of pan-European collaboration.
This study has multiple potential limitations. First, relevant cases may be missing from the analysis of the two clusters. Reasons for this may include undetected infections in asymptomatic individuals or failure to identify relevant cases during contact tracing or based on genetic data. Second, high-quality viral genomes were only available for ca 50% of SARS-CoV-2 cases in Düsseldorf during the considered period, so relevant cases in the genetic analysis may have been missed (see previous point). Uncertainty also remains with respect to cases for whom no sequencing data were available, such as the epidemiologically linked additional KP cases of Cluster 1. In addition, the assignment of KP1_5 to Cluster 1 remained ambiguous even with genetic data. Third, while the assignment of IP as the index case of Cluster 1 was supported by dates of symptom onset, the reconstructed pattern of putative transmission events in different bars, and the detection of related viral isolates from the Balearic Islands, a degree of uncertainty remained, as there was no way to rule out the presence of undetected cases or independent importation events at the beginning of the detected transmission chains. The detection of related sequences in other regions highlights the possibility of additional independent introduction events, in particular for Cluster 1 cases from mid-July onwards for which no link to IP could be identified, as well as the possibility of export of viral strains from the Düsseldorf area, as inferring the directionality of transmission from genetic data alone is generally not possible. In addition, it is possible that the structured case interviews failed to uncover relevant case travel histories. Fourth, the possibility of multiple exposures to SARS-CoV-2 during high-incidence periods, likely observed e.g. for two Cluster 2 cases in this study, represents a general challenge for the accurate tracing of transmission chains; inferred links that are not supported by both genomic and epidemiological evidence should be interpreted with caution. Fifth, the current study was carried out on a timescale of weeks and was retrospective in nature; in the future, the sequencing speeds achievable with modern single-molecule sequencing technologies (from 'swab to sequence' in < 72 hours [10]) may enable implementations of IGS that support 'real-time' containment efforts. Sixth, the IGSD does not comprise the routine collection of clinical metadata, and case severity is not used as a sample selection criterion; inclusion of clinical metadata may further increase the utility of IGS.
Due to the emerging nature of IGS, there are many remaining open questions with respect to how to best design and implement an IGS system. For example, it is unclear at which level (national, regional, or at the level of a city) the integration between genetic and contact tracing data should be carried out, and how structured case interviews for backward contact tracing should best be conducted. The two clusters presented here clearly demonstrate the benefits of integrating local knowledge acquired 'on-the-ground' with data gathered by surveillance systems in different cities or countries; in the future, integration of local systems with larger national or European networks may enable the improved characterisation of introduction events and viral strain flow across cities and states.
Our work demonstrates the feasibility of tracing SARS-CoV-2 infection chains through a locally implemented system and during the later phases of the pandemic with high-incidence community transmission in an internationally connected city. This study complements existing studies from earlier phases of the pandemic [12,18] or from national or state-level genomic surveillance systems [7,11,13]. While the developed IGS system is currently limited to the tracing of SARS-CoV-2, its future potential applications include other emerging pathogens or multi-resistant bacterial pathogens.
Any supplementary material referenced in the article can be found in the online version.
This article is copyright of the authors or their affiliated institutions, 2022.

Figure 3. Distribution of SARS-CoV-2 cases according to time, their cluster, and epidemiological or genomic basis for inclusion in a cluster, Düsseldorf, Germany, 01 July-31 July 2021 (n = 850 cases)