High impact of molecular surveillance on hepatitis A outbreak case detection in Sweden: a retrospective study, 2009 to 2018

Background Swedish hepatitis A surveillance includes sequence-based typing, but its contribution to outbreak detection in relation to epidemiological investigations has not been fully evaluated. Aim To evaluate the role of sequence-based typing in hepatitis A outbreak detection and to describe the hepatitis A epidemiology in Sweden to improve surveillance. Methods We retrospectively investigated hepatitis A virus sequences of 447 cases notified in Sweden 2009–18. We performed a phylogenetic analysis of evolutionary distances to identify cases with similar virus sequences (≥ 459/460 identical nt in the VP1/P2A junction). Unique sequences, dyads and sequence-based clusters (SBCs) were identified. We linked non-sequenced cases by epidemiological information and retrospectively assessed the value of typing for outbreak identification. Results Fifty-five percent (n = 542/990) of the notified hepatitis A cases were referred to the Public Health Agency of Sweden for typing and 447 (45%) were sequenced successfully. Subgenotypes included IA (42.5%, n = 190), IB (42.7%, n = 191) and IIIA (14.8%, n = 66). Phylogenetic analysis identified 154 unique sequences, 33 dyads (66 cases) and 34 SBCs (227 cases). The combination of molecular and epidemiological data revealed 23 potential outbreaks comprising 201 cases. Cases were linked by sequence (59%, n = 118), epidemiological data (11%, n = 23) or both (30%, n = 60). Typing was needed to identify 15 of 23 potential outbreak signals. Conclusion Sequence-based typing contributed substantially to detecting clustering cases and identifying outbreaks in Sweden. The results show routine sequence-based typing detects outbreaks, promotes timely outbreak investigations and facilitates international collaboration.


Introduction
Hepatitis A virus (HAV) infections have gained increased awareness in Europe in recent years. This is mainly due to large and prolonged European-wide outbreaks [1][2][3][4][5][6][7][8][9][10] and an interrupted trend of decreasing hepatitis A notification rates [11]. In Sweden, the number of yearly reported hepatitis A cases is low [11], but outbreaks occur at the local/regional and national levels and Swedish cases are repeatedly identified as part of European outbreaks.
HAV is primarily transmitted through the faecal-oral route; sources of transmission include contaminated water or food products, close contacts (e.g. at school, shared household or intimate contacts), contaminated blood products and shared equipment for intravenous drug use. The incubation period is ca 28 days (range: 15-50 days) [12,13]. Six HAV genotypes are known, of which I-III infect humans; each of these can be further divided into subgenotypes A and B [5,12]. Genotype I is most prevalent globally, genotype III appears predominantly in south Asia and Eastern Europe and genotype II is rarely reported [14]. The mutation rate of the HAV genome is low [15]. During hepatitis A outbreak investigations, clusters are commonly defined from sequences that are identical or show small genetic variations. In international and prolonged outbreak situations, larger variations of up to 3 nucleotides (nt) difference in a ca 460 nt fragment (0.7%) have been reported [1,4,7,16].
It is essential to rapidly and comprehensively identify outbreak cases in order to understand the extent and likely source of the outbreak. Sequence-based typing is critical for connecting cases and determining the source of the outbreak, especially for cross-border outbreaks [1,[4][5][6][7]9,10,17,18]. With sequence-based typing, a HAV strain of a contaminated food item may be linked to the outbreak strain and identified as the source [1]. Yet, as per the last systematic assessment done in 2016, ca 50% of European countries do not perform sequence-based typing, in particular those countries with high endemicity [19]. Therefore, the number of domestic outbreaks and cases linked to European outbreaks are likely underestimated.
In Sweden, hepatitis A is a notifiable disease by law and is monitored by a national passive surveillance system. Since 2006, sequence-based typing of clinical samples has been carried out by the Public Health Agency of Sweden (PHAS) as part of the national microbiological surveillance programme. Changes in the programme's focus on domestic versus travel-associated cases have led to varying proportions of samples being referred for sequencing. Obtained sequences are stored and continuously analysed.
This study aimed to retrospectively describe the epidemiology of hepatitis A in Sweden from 2009 to 2018 and investigate the role of HAV sequence-based typing on outbreak identification during this period. Furthermore, the value of sequence-based typing is discussed and suggestions for improved HAV molecular surveillance are presented.

Study design and data source
All confirmed hepatitis A cases from 2009-18 mandatorily notified in the Swedish electronic system for communicable diseases surveillance (SmiNet) were included in the study. SmiNet is a common platform shared between the regional and national levels of public health agencies in Sweden.
For a case to be confirmed, a blood sample from a patient with clinically suspected hepatitis A is referred to a regional microbiological laboratory for anti-HAV IgM and/or HAV RNA detection. If positive, the treating physician and the laboratory notify the confirmed case to SmiNet. The regional County Council Departments of Communicable Disease Control and Prevention (CDC departments) work largely autonomously: they carry out case management and contact tracing, evaluate data at regional level and implement control measures. PHAS evaluates the SmiNet surveillance data at national level, has a coordinating role in national outbreak investigations and provides the sequence-based typing service (Sanger sequencing). This service is free of charge and PHAS reports the results in SmiNet, where they are visible to the CDC departments. For sequence-based typing, the regional microbiological laboratories refer samples to PHAS as part of an agreement between the respective regional CDC departments and PHAS, depending on the focus of the molecular surveillance programme on domestic and travel-associated HAV infections (Figure 1). The genomic HAV region sequenced at PHAS covers [...] to other cases, details on symptoms and suspected route of transmission) communicated between PHAS and the CDC departments during the study period was also included in the evaluation. Swedish population data were obtained from the national statistics office, Statistics Sweden [21].

Swedish case definition
Sweden applies the European case definition for hepatitis A [22]. A possible hepatitis A case is defined as having a clinical picture consistent with a hepatitis A infection and with an epidemiological link to another case or source. Confirmation of a case requires detection of a HAV-specific antibody reaction indicating an acute infection (IgM) and/or the detection of HAV RNA.

Definition of sequence-based clusters and outbreaks
Sequences were analysed for similarity of ≥ 99.7% (≥ 459nt/460nt) and identified as (i) HAV sequences of sporadic cases if no other similar sequence was found; (ii) dyads if two similar sequences were found; (iii) sequence-based clusters (SBCs) if three or more similar sequences or one or more sequences matching a European outbreak case definition were found. We added further cases to the SBCs if they had an epidemiological link to a SBC case. An epidemiological link was considered if a case had contact with a case in a cluster (e.g. lived in the same household, visited the same school or work place or leisure activities or was a close or intimate contact) or had been exposed to the same suspected or confirmed vehicle of transmission.
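As an illustration of the grouping rule above, the following minimal Python sketch sorts sequences into sporadic cases, dyads and SBCs. It assumes pre-aligned sequences of equal length and groups them by single linkage (connected components of sequences differing by at most 1 nt). This is a simplifying assumption and not the phylogenetic workflow used in the study, and it omits the rule that a single sequence matching a European outbreak case definition also counts as an SBC. All function names and toy sequences are hypothetical.

```python
from __future__ import annotations

from itertools import combinations


def nt_differences(a: str, b: str) -> int:
    """Count mismatching positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def classify_sequences(seqs: dict[str, str], max_diff: int = 1) -> dict[str, list[list[str]]]:
    """Group sequences linked by <= max_diff differences (single linkage, an assumption)."""
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Join every pair of sequences that is similar enough.
    for a, b in combinations(seqs, 2):
        if nt_differences(seqs[a], seqs[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups: dict[str, list[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)

    result: dict[str, list[list[str]]] = {"sporadic": [], "dyad": [], "sbc": []}
    for members in groups.values():
        key = "sporadic" if len(members) == 1 else "dyad" if len(members) == 2 else "sbc"
        result[key].append(sorted(members))
    return result


# Toy 460 nt sequences (hypothetical, not real HAV data).
base = "ACGT" * 115
toy = {
    "s1": base,
    "s2": "T" + base[1:],                         # 1 nt from s1
    "s3": base[:10] + "A" + base[11:],            # 1 nt from s1
    "s4": "GGGGGGGG" + base[8:],                  # 6 nt from s1
    "s5": base[:200] + "AAAAAAAA" + base[208:],   # 6 nt from s1
}
toy["s6"] = toy["s5"][:300] + "C" + toy["s5"][301:]  # 1 nt from s5
groups = classify_sequences(toy)
```

With single linkage, s2 and s3 end up in the same cluster even though they differ from each other by 2 nt, because both are within 1 nt of s1. A stricter rule (all pairs within the threshold) would be an equally defensible reading of the definition.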
A potential outbreak was considered if one or more cases in the SBC matched a European outbreak strain, or if three or more cases in a SBC were infected less than 8 weeks apart (twice the mean incubation time) and/or shared an epidemiological link. Cases in the same household or family were counted as one case in the assessment of outbreaks. Investigations of European HAV outbreaks were coordinated by the European Centre for Disease Prevention and Control (ECDC) and updates were published in rapid risk assessments (RRA).
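The temporal part of this criterion can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: symptom onset dates stand in for infection dates, "less than 8 weeks apart" is read as consecutive onsets being less than 56 days apart, and the epidemiological-link alternative is left out. The function name, parameters and example cases are hypothetical.

```python
from datetime import date, timedelta

MAX_GAP = timedelta(weeks=8)  # twice the mean incubation time of ca 28 days


def is_potential_outbreak(cases, matches_european_strain=False, min_cases=3):
    """Check one SBC; `cases` is a list of (household_id, onset_date) tuples."""
    # A single case matching a European outbreak strain is sufficient.
    if matches_european_strain and cases:
        return True
    # Count each household once, using its earliest onset date.
    earliest = {}
    for household, onset in cases:
        if household not in earliest or onset < earliest[household]:
            earliest[household] = onset
    onsets = sorted(earliest.values())
    # Longest run of cases whose consecutive onsets are < 8 weeks apart.
    run = best = 1 if onsets else 0
    for prev, cur in zip(onsets, onsets[1:]):
        run = run + 1 if cur - prev < MAX_GAP else 1
        best = max(best, run)
    return best >= min_cases


# Three households within 8 weeks of each other: meets the criterion.
cluster_a = [("h1", date(2014, 1, 1)), ("h1", date(2014, 1, 10)),
             ("h2", date(2014, 2, 1)), ("h3", date(2014, 3, 15))]
# Three cases but only two households: does not meet it.
cluster_b = [("h1", date(2014, 1, 1)), ("h1", date(2014, 1, 5)),
             ("h2", date(2014, 1, 20))]
# Three households, but the third case is more than 8 weeks later.
cluster_c = [("h1", date(2014, 1, 1)), ("h2", date(2014, 1, 15)),
             ("h3", date(2014, 6, 1))]
```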
We evaluated the contribution of sequence-based typing to outbreak identification by defining each outbreak case as linked by (i) sequence only; (ii) epidemiological information only; (iii) both types of information. We investigated how often sequence information contributed to cluster sizes meeting or exceeding our predefined outbreak criteria of three cases and international outbreak threshold of one case linked to an international outbreak strain in the European Union/European Economic Area (EU/EEA).

Data analysis
HAV sequences were aligned using CLC Main Workbench version 7.9 (Qiagen, Aarhus, Denmark). MEGA7 software version 7.0.14 was used to compute a phylogenetic tree from this alignment based on the 460 nt HAVNet region using the neighbour-joining method [23]. Bootstrap analysis was computed with 1,000 replicates. Evolutionary distances were computed using the Tamura-Nei method [24]. Data management and statistical analysis were performed using Stata version 15.0 (StataCorp, Texas, United States).
Categorical variables were described as proportions and 95% confidence intervals (CI), and continuous variables were described using means and standard deviations (SD), or medians, ranges or interquartile ranges (IQR). Variables were compared using chi-squared tests for categorical variables, and t-tests or the nonparametric Wilcoxon rank-sum test for continuous variables. Trends were assessed using an extension of the Wilcoxon rank-sum test. Observations with missing values for variables under comparison were excluded from the respective analyses. Time to sampling was calculated as the time between the date of symptom onset and date of sampling. We used an alpha level of 0.05 for all statistical tests. Stata outputs of p < 0.000 were reported as p < 0.001.
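As a small worked example of a proportion with a 95% CI, the sketch below computes an interval for the share of notified cases referred for typing (542 of 990, as reported in the Results). The text does not state which interval method was used in Stata, so the Wilson score interval here is an assumption chosen for illustration; the function name is hypothetical.

```python
from math import sqrt


def wilson_ci(successes: int, n: int, z: float = 1.959964):
    """Wilson score interval for a binomial proportion (z = 1.96 for 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


referred, notified = 542, 990
proportion = referred / notified
ci_low, ci_high = wilson_ci(referred, notified)
print(f"{proportion:.1%} (95% CI: {ci_low:.1%}-{ci_high:.1%})")
```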

Ethical statement
This study only included anonymised surveillance data without personal identifiers; therefore, no ethical approval was needed according to Swedish national regulations. This study did not affect the diagnosis or the therapeutic strategy.

Study population and hepatitis A epidemiology characteristics
During the 10-year study period, 990 confirmed hepatitis A cases were notified to SmiNet with a median of 92 reported cases per year (range: 54-154) (Figure 1A). This reflects a median notification rate of 0.9 cases per 100,000 inhabitants per year (range: 0.6-1.7). Approximately half the study population was male (52%, n = 518), and 52% of cases were travel-associated (n = 513).

Sequence-based typing of confirmed hepatitis A cases
Of the 990 notified hepatitis A cases, samples from 542 cases (55%) were referred to PHAS for typing and 447 (45%) were successfully sequenced (Figure 2A). The majority of identified subgenotypes were IA (42.5%; n = 190 cases) and IB (42.7%; n = 191 cases), with a minority of cases infected with subgenotype IIIA (14.8%; n = 66 cases) (Table 1). The frequency of referral for typing differed among age groups and increased with age (trend test, p < 0.001) (Table 2). No difference was observed between male and female cases in the frequency of referral for typing (p = 0.911). Symptomatic cases were more likely to be referred for sequencing than asymptomatic cases (p < 0.001), and domestic cases were more likely to be referred than travel-associated cases (p < 0.001).

Outbreak identification and contribution of sequence-based typing
A phylogenetic tree was computed, and all 447 obtained sequences (Figure 2B) were analysed for similarity. We identified 154 sequences as belonging to sporadic cases (34%), 66 sequences clustered in 33 dyads (15%) and 227 sequences clustered in 34 SBCs (51%). The SBCs varied in size and contained up to 28 sequences. The proportions of sequences from travel-associated cases within these three categories were 65% (100/154), 45% (30/66) and 32% (73/227), respectively, and the categories had a similar age group distribution (p < 0.537).
An additional 23 non-sequenced cases could be linked to SBCs by epidemiological information, which led to a total of 250 clustering cases. Forty-nine cases were discarded as they neither occurred within 8 weeks of another case in the cluster, nor shared a known epidemiological link to another case in the cluster. The remaining 201 potential outbreak cases were defined as 23 independent potential outbreaks (Figure 2B). These were spread across the whole study period and the majority were of subgenotype IA (n = 13; 57%) followed by subgenotypes IB (n = 7; 30%) and IIIA (n = 3; 13%) (Table 3). Sweden is divided into 21 counties and potential outbreak cases were distributed over three counties on average (SD ± 2 counties; range: 1-8). Moreover, for 13 of the 23 national outbreaks, we found strains that were detected elsewhere in Europe at the time (Table 3).
Strains of eight SBCs reappeared an additional 1-5 times up to 6.5 years apart without causing another potential outbreak cluster. Only one HAV IA strain appeared on three occasions in the 10-year period and generated potential outbreaks on two of these occasions, 4 years apart (outbreak numbers 3 and 12 in Table 3).
Each potential outbreak case was evaluated with respect to the type of information needed to link it to that potential outbreak. Cases were linked by typing information only (n = 118; 59%), epidemiological information only (n = 23; 11%) or both types of information (n = 60; 30%) (Figure 2A, Table 3).
For 15 of the 23 potential outbreaks, the predefined outbreak criteria would not have been met had typing information not been available. Therefore, typing was considered essential for the identification of these potential outbreak clusters (Table 3).
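The shares reported above follow directly from the counts. The short Python check below recomputes them from the figures given in the text; the variable names are hypothetical and no data beyond the reported counts are used.

```python
# Counts of potential outbreak cases by the type of information linking them.
link_counts = {"sequence only": 118, "epidemiological only": 23, "both": 60}

total_cases = sum(link_counts.values())
shares = {kind: round(100 * n / total_cases) for kind, n in link_counts.items()}

# Potential outbreaks that would not have met the criteria without typing.
typing_needed, outbreaks = 15, 23
typing_share = round(100 * typing_needed / outbreaks)

for kind, pct in shares.items():
    print(f"{kind}: {link_counts[kind]}/{total_cases} = {pct}%")
print(f"typing essential: {typing_needed}/{outbreaks} = {typing_share}%")
```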

Discussion
We observed a median yearly notification rate for HAV infections of 0.9 per 100,000 inhabitants during the 10-year study period in Sweden, comparable to other European countries [25,26]. Vaccination confers immunity against HAV infection but is not implemented free of charge in the childhood vaccination programmes of many European countries, including Sweden [25,26].
In the absence of vaccine- or infection-derived immunity, population susceptibility is presumably high and outbreaks are more likely to occur [25,26], as shown by the prolonged outbreaks of previous years in Sweden and Europe [1][2][3][4][5][6][7][8][9][10]. Swedish hepatitis A surveillance and typing data were sporadically published as part of European RRAs coordinated by the ECDC or through joint European outbreak reports [1][2][3][4][5][6][7][10], and sequences were also shared in HAVNet. In the present study, all data from the Swedish national molecular hepatitis A surveillance from 2009-18 were compiled and analysed for the first time.
Table note: All cases in household clusters are listed. These were counted as one case in the outbreak assessment if typing was necessary to identify the respective potential outbreak cluster.
In our combined approach of performing a phylogenetic analysis of sequence similarity, integrating epidemiological data and applying outbreak detection criteria, we identified 23 potential outbreaks. Twenty-two of these had also been investigated at the time the outbreak occurred and 10 were described in peer-reviewed literature (Table 3) [1][2][3][4][5][6][7][8][10]. One outbreak was not recognised at the time even though a similar strain was detected in another European country at the same time, as this was only communicated several years later (outbreak number 7 in Table 3). This suggests that a sequence-based approach is successful at identifying outbreak signals. The published outbreak strains were linked to food vehicles [1][2][3][5][6][7][8], sexual transmission (men who have sex with men (MSM)) [4] and travelling to a HAV endemic country [10]. During the study period the majority of HAV infections were of subgenotype IA or IB (42.5% and 42.7%, respectively), which fits well with the generally detected subgenotypes in Europe [14].
One IA strain was detected in Sweden three times and caused known outbreaks on two occasions (outbreak numbers 3 and 12 in Table 3). This strain was likely reintroduced to Sweden on each occasion since it was circulating in central Europe at the same time. We did not detect any strain with sustained endemic circulation. Only eight strains were detected occasionally in sporadic cases, with long time spans in between. The majority of clustering cases (59%) required typing to link them to a potential outbreak cluster. Moreover, we found that about two of three potential outbreak clusters (15 of 23) required typing to be identified. This clearly underlines the relevance of HAV typing to detect outbreak cases. Typing offers a fast and unambiguous conclusion on relatedness of cases' strains, which is especially important considering the long incubation time of hepatitis A of about one month. This long delay from exposure to case notification makes it difficult to draw immediate conclusions on relatedness of cases by patient interviews or epidemiological information only. Typing is a useful tool for vigilant surveillance to detect outbreaks early, before they become apparent by the sheer number of cases. Moreover, it is essential to detect clusters of cases that are too small to raise suspicion by case count only, as these may slowly spread over time and between countries in Europe. Typing is also useful for distinguishing circulating strains present at the same time, allowing unrelated cases to be excluded from an epidemiological investigation. This was seen several times during the study period in Sweden, in particular in 2014, when an independent transmission of a HAV strain was recognised during a HAV outbreak in the same municipality.
During 2012-14, typing was necessary to determine the different strains during multi-country outbreaks linked to frozen berries (outbreak numbers 8, 10 and 11 in Table 3) even though epidemiological information was sufficient in two of the three outbreaks to trigger an outbreak signal.
National outbreaks in Sweden were found to be distributed between counties and over time. Similarly, international outbreaks affecting several European countries were also connected to cases detected in Sweden. Wide geographical and temporal distribution of cases is usually linked to a widely distributed food source with a long shelf-life, or to travelling [19,27]. Specifically, contaminated frozen berries [1][2][3][5][6][7] and circulation of strains among MSM [4] have been implicated in European HAV outbreaks in recent years. While local outbreaks come to the attention of the regional CDC department, mostly as increased case counts, and response measures are implemented quickly, a widespread outbreak will still benefit from linking cases by sequencing. Often, case linking will trigger an investigation. In Sweden, PHAS has a national responsibility for overseeing all HAV cases, and is therefore able to assist with coordinating between regions; this collaboration is important to link cases and identify outbreaks across regional borders. Likewise, on a European level, HAVNet and European agencies (the ECDC's EPIS-FWD network and the European Food Safety Authority (EFSA)) facilitate collaborations among European public health agencies and the food sector, and the sharing of sequence data. Such cooperation has proven key to quickly identifying the extent of the geographical spread of outbreaks in Europe, identifying outbreak sources and introducing targeted control measures [1,17]. These measures are necessary to prevent the spread of the pathogen in the population, especially when facing increasing population mobility and a shared European/global food market [17,28].
Overall, samples from 55% of all notified cases were referred for typing. Incomplete typing was due in part to a previous typing policy at PHAS, which mainly focused on domestic cases. Under this policy it was not deemed necessary to type samples from all cases involved in HAV transmissions in local community settings with epidemiologically well-linked clusters, for example in a day-care centre or in a family. Moreover, according to a previous outbreak investigation [29], not all cases are sampled in these settings, which likely explains the low proportion of asymptomatic cases in the 0-4 years age group found in this study (34%) compared to other reports [12]. Increased testing and typing in outbreak settings among children may be advisable in order to detect more cases and provide comprehensive typing information. Increased typing would contribute to improved outbreak detection and facilitate the implementation of preventive measures. From 2020, PHAS aims to type all notified hepatitis A cases in Sweden. This target has not yet been reached; a typing fraction of 66.7% (38/57) was achieved in 2020.
Ninety-five samples could not be typed successfully during the study period. These samples were associated with a longer time delay between symptom onset and sampling, which likely affected the amount of viral RNA in the samples. Within 1-2 weeks after infection, viraemia occurs and may last for several more weeks, albeit at quickly declining levels [15]. Therefore, earlier sampling may lead to a further increase in positive typing frequency. Sampling of stools, where virus shedding is observed for a longer time period, could be an alternative.
To our knowledge, this is the first comprehensive evaluation of the contribution of sequence-based typing to hepatitis A outbreak identification in Sweden. Study limitations include a possible underestimation of the number of outbreak cases due to a large number of cases that were not sequenced, as well as varied completeness of the epidemiological information available for each case. It was not in the scope of this study to improve or collect those data retrospectively. We do not expect a selection bias of samples based on budget considerations since the costs for typing lay with PHAS. For a number of potential and known outbreaks, the source was unknown or suspected but not confirmed, which prevented us from analysing the impact of typing on identifying outbreak sources. Our thresholds for number of cases needed to identify potential outbreak clusters remained low and are therefore sensitive. In particular, the threshold of one case with a matching strain to an international EU/EEA outbreak cluster is very low, but was necessary to detect single cases of cross-border events, which are otherwise categorised as sporadic cases. A national outbreak criterion of at least three non-household cases was feasible to use in a very low incidence setting where similar HAV strains are seldom encountered unrelatedly. If used prospectively, it is also valuable to detect common clusters early, trigger an investigation and prevent larger outbreaks. Such low thresholds cannot be applied in high incidence settings with endemic circulation of highly similar or identical strains.

Conclusions
This study confirmed a high impact of sequence-based typing on the detection of HAV outbreak cases and outbreaks in Sweden, and advocates for microbiological hepatitis A surveillance programmes to include sequencing of HAV strains. Importantly, sequence analysis should be used to support outbreak investigations together with epidemiological data, as has been demonstrated in previous outbreaks in Sweden. In our experience, prompt sequencing, timely analysis, the maintenance of a national database and consulting the international database HAVNet strongly facilitate data interpretation. Sharing sequences internationally, in HAVNet and by other means, ensures recognition of circulating strains and facilitates collaboration. While comprehensive sampling and typing would allow the most complete overview of HAV strains in a country, limited resources likely demand selection of strains to be sequenced, especially in a medium to high incidence setting. In such settings, studies evaluating the cost-effectiveness of sequence-based typing are needed in order to recommend such programmes from a societal perspective.