Improving the surveillance of tuberculosis (TB) is especially important for multidrug-resistant (MDR) and extensively drug-resistant (XDR) TB. The large amount of publicly available whole-genome sequencing (WGS) data for TB offers the opportunity to re-use data and to perform additional analyses at a large scale.


We assessed the usefulness of raw WGS data of global MDR/XDR isolates available from public repositories to improve TB surveillance.


We extracted raw WGS data and the related metadata of isolates available from the Sequence Read Archive. We compared this public dataset with WGS data and metadata of 131 MDR and XDR isolates from Germany in 2012 and 2013.


We aggregated a dataset that included 1,081 MDR and 250 XDR isolates, among which we identified 133 molecular clusters. In 16 clusters, the isolates were from at least two different countries. For example, Cluster 2 included 56 MDR/XDR isolates from Moldova, Georgia and Germany. When comparing the WGS data from Germany with the public dataset, we found that 11 clusters contained at least one isolate from Germany and at least one isolate from another country. We could, therefore, connect TB cases despite missing epidemiological information.
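
The cross-country screening described above can be illustrated with a minimal sketch. It assumes that every isolate has already been assigned to a molecular cluster (the clustering method itself is not specified here) and that a country is available in the metadata. The function name `cross_border_clusters`, the record layout and the toy records are hypothetical and serve only as illustration; they are not the study's data or pipeline.

```python
from collections import defaultdict


def cross_border_clusters(records, focus_country="Germany"):
    """Flag clusters that span several countries.

    records: iterable of (isolate_id, cluster_id, country) tuples.
    This layout is an assumption made for the sketch.
    Returns (multi_country, with_focus):
      multi_country - cluster ids with isolates from >= 2 countries
      with_focus    - the subset that also contains focus_country
    """
    countries_by_cluster = defaultdict(set)
    for _isolate_id, cluster_id, country in records:
        # Skip unclustered isolates and isolates without country metadata.
        if cluster_id is None or not country:
            continue
        countries_by_cluster[cluster_id].add(country)

    multi_country = {
        cid for cid, countries in countries_by_cluster.items()
        if len(countries) >= 2
    }
    with_focus = {
        cid for cid in multi_country
        if focus_country in countries_by_cluster[cid]
    }
    return multi_country, with_focus


# Hypothetical toy records, not taken from the study.
toy_records = [
    ("iso1", "A", "Germany"),
    ("iso2", "A", "Moldova"),
    ("iso3", "B", "Georgia"),
    ("iso4", "B", "Georgia"),
    ("iso5", "C", "Moldova"),
    ("iso6", "C", "Georgia"),
    ("iso7", None, "Germany"),  # unclustered isolate
    ("iso8", "A", ""),          # country missing in metadata
]

multi_country, with_focus = cross_border_clusters(toy_records)
print(sorted(multi_country))
print(sorted(with_focus))
```

Isolates without a country are skipped rather than guessed, so a cluster is only flagged when the available metadata supports at least two distinct countries.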


We demonstrated the added value of using raw WGS data from public repositories to contribute to TB surveillance. By comparing the German dataset with the public dataset, we identified potential international transmission events. This approach might therefore support the interpretation of national surveillance results in an international context.



