Research articles Standardisation of multilocus variable-number tandem-repeat analysis (MLVA) for subtyping of Salmonella enterica

Salmonella enterica serovar (S.) Enteritidis is an important cause of food-borne infection in Europe and the United States. Further subtyping of isolates is necessary to support epidemiological data for the detection of outbreaks and identification of the vehicle of infection. Multilocus variable-number tandem-repeat analysis (MLVA) is reportedly more discriminatory and produces data that are easier to share via databases than other molecular subtyping methods. However, lack of standardisation of the methodology and interpretive criteria for data analysis has meant that comparison of data between laboratories can be problematic. On the basis of MLVA profiles of 298 S. Enteritidis isolates received at the Health Protection Agency’s Salmonella Reference Unit and sequence analysis of selected isolates, we propose an MLVA scheme for S. Enteritidis based on five loci (SENTR4, SENTR5, SENTR6, SENTR7 and SE-3) that have been selected from previously published S. Enteritidis MLVA schemes. A panel of reference strains has been developed that can be used by laboratories to normalise their raw fragment data to actual fragment sizes. We also provide recommendations for analysing and interpreting MLVA data. We urge laboratories to consider implementing these guidelines, thereby allowing direct comparison of data between laboratories irrespective of the platform used for fragment analysis, to facilitate international surveillance and investigation of international outbreaks.


Introduction
Non-typhoidal Salmonella enterica cause a considerable disease burden, with an estimated 93.8 million cases of infection worldwide every year, resulting in 155,000 deaths [1]. In the European Union, this pathogen is ranked second among the causes of bacterial gastrointestinal disease; S. enterica serovar (S.) Enteritidis is responsible for approximately 60% of salmonella infections in humans, making it the leading cause of salmonellosis [2]. Further subtyping of isolates is needed to support classical epidemiological data for the detection of outbreaks and identification of the vehicle of infection.
Phage typing is a phenotypic method traditionally used for surveillance and subtyping of salmonellae but is performed in only a few laboratories due to the requirement for standardised phage panels [3]. Although useful in detecting outbreaks caused by isolates exhibiting less common phage reaction patterns, the technique may otherwise lack discriminatory capacity: 20% of salmonellosis cases reported in Europe in humans in 2006 were caused by S. Enteritidis phage type (PT) 4 [2]. Subsequently, DNA fingerprinting techniques such as pulsed-field gel electrophoresis (PFGE), which is currently the gold standard for subtyping of salmonellae, are used in outbreak investigations to supplement phage typing data where further strain discrimination is required. However, the lack of genetic variation within the serovar Enteritidis population can make discrimination beyond phage type difficult. In a multicentre European study conducted between 2001 and 2004, over half of the serovar Enteritidis strains produced PFGE profile SENTXB.0001 regardless of phage type [4]. Strong associations between phage type and a particular PFGE pattern also make strain differentiation within certain phage types challenging [4].
Multilocus variable-number tandem-repeat analysis (MLVA) targets rapidly evolving genomic elements known as tandem repeats (TRs). The application of MLVA to a variety of bacterial species including several Salmonella serovars has led to the conclusion that the technique is more discriminatory than other molecular methods, and is reproducible, quicker, easier to perform and produces data that are easier to analyse and share via databases. To date, several schemes for MLVA subtyping of S. Enteritidis have been published [5-10]. However, the use of different loci in each protocol (or different primers for the same loci), different sequencer platforms, dye chemistries and size standards used for fragment analysis, differences in interpretation of loci where incomplete TRs or TRs of heterogeneous sequence occur, and different ways of assigning allele numbers mean that comparison of data between laboratories can be problematic. In addition, with few data available on the stability of the loci, it is uncertain whether TRs may evolve so rapidly that variation leading to multiple types could emerge during an outbreak caused by a single ancestral isolate. Such concerns threaten to diminish the utility of MLVA for S. Enteritidis outbreak detection unless specific guidance is developed for performing MLVA and a consensus reached for the interpretation of MLVA data, as has been proposed for S. Typhimurium [11].
In this study we analysed the DNA sequence of the TR regions at the nine loci used in the MLVA method proposed by Malorny et al. to determine the actual number, stability and heterogeneity of the TRs [8]. This scheme was chosen due to its emphasis on discrimination within PTs 4 and 8, which rank in the top three most prevalent PTs in several European countries [2]. We also examined other published MLVA schemes for S. Enteritidis to determine whether other loci could be used to supplement the scheme.
Here we propose a standardised MLVA typing scheme for S. Enteritidis targeting five loci, where profiles are assigned based on the number of TRs at each locus. In addition, we developed a panel of reference strains that can be used by laboratories to normalise their raw fragment data to actual fragment sizes, thereby allowing direct comparison of data between laboratories irrespective of the platform used for fragment analysis.

S. Enteritidis strains
To evaluate the MLVA method, 298 S. Enteritidis strains were selected from the Health Protection Agency (HPA) Salmonella Reference Unit culture collection, which consists of strains originating from human clinical specimens, animals, food and the environment. The strain panel comprised 91 strains from the phage typing 'type strain' panel (the first recorded isolations of each phage type), 88 strains isolated in England and Wales (30 strains of PT14b, 21 strains of PT4, 15 strains of PT8, nine strains of PT42 and one to three strains of PTs 1, 1b, 2, 3, 5c, 6, 21, 22, 59 and 14c) and 15 PT14b strains from University Hospital Galway, Ireland. In addition, strains isolated during well-characterised outbreaks in 2009 and 2010 associated with S. Enteritidis PTs 4 (n=23), 8 (n=26), 14 (n=11) and 14b (n=40, representing 12 geographically distinct outbreaks, A-L), were analysed to determine TR stability during an outbreak and therefore the utility of MLVA for outbreak detection (Table 1). A total of 16 strains that cover the range of alleles seen at each locus were selected as the reference strain panel.

Table 1
Outbreaks due to infection with Salmonella enterica serovar Enteritidis characterised in this study
MLVA: multilocus variable-number tandem-repeat analysis; NA: no amplification at this locus.
a Numbers in bold indicate locus variants.
7* refers to an allele with seven tandem repeats (TRs), but TR2 lacks a 6 base pair (bp) insert and TR6 is missing 21 bp.
Strain NCTC 13349 from the National Collection of Type Cultures was included in our fragment analysis protocol as a positive control.

Comparison of published MLVA schemes for S. Enteritidis strains
Published MLVA schemes for S. Enteritidis other than that of Malorny et al. [8] were examined using the Tandem Repeats Database [12] to determine whether other loci [5-7,9,10] could be used to supplement the scheme. We compared the loci on the basis of the TR region targeted and the length and sequence similarity of the TR units.

MLVA typing
MLVA was performed using previously described primers [8]. Forward primers for loci SENTR1, SENTR4, SE-3 and SE-7 were labelled with the fluorescent dye VIC, SENTR2, SENTR5 and SENTR7 with 6-FAM, and SENTR3 and SENTR6 with NED. The nine loci were amplified in one multiplex PCR (12.5 μl volume) using a Multiplex PCR Kit (Qiagen, United Kingdom), 0.5 pmol of the primers amplifying loci SENTR4 and SENTR7, 1 pmol of primers targeting SENTR5, SENTR6 and SE-3, 5 pmol of primers targeting SENTR1, 7.5 pmol of primers targeting SENTR3 and SE-7, 10 pmol of primers targeting SENTR2 and 1 μl cell lysate prepared by emulsifying one colony in 100 μl of sterile distilled water and boiling for 10 minutes. PCR cycling conditions were as previously described [8]. Amplification products were diluted 1:40 in sterile distilled water and 1 μl aliquots of this dilution were mixed with 10 μl Hi-Di formamide (Applied Biosystems, United Kingdom) and 0.5 μl GeneScan1200 LIZ Size Standard (Applied Biosystems) before being subjected to capillary electrophoresis using POP7 polymer on an ABI 3730 DNA Analyzer (Applied Biosystems) spectrally calibrated to run filter set G5.
Data were imported into Peak Scanner software (Applied Biosystems) where each fragment was identified according to colour and size. Naming of profiles was based on a string of allele numbers (in order of SENTR7-SENTR5-SENTR2-SENTR6-SENTR3-SENTR4-SE3-SENTR1-SE7) showing the actual number of repeats at each locus. We assigned 'NA' (no amplification) to loci that failed to amplify, in accordance with guidance issued by Larsson et al., to distinguish between strains where the locus is absent and those where there are no TRs but the flanking regions are present [11].
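The naming convention described above can be illustrated with a minimal Python sketch. The function name format_profile and the dictionary layout are assumptions made for illustration only and are not part of the published protocol; the example repeat counts reproduce the nine-locus profile 3-10-9-5-4-4-1-8-8 that is reported in the Discussion.

```python
# Locus order of the nine-locus profile string, as given in the text.
LOCUS_ORDER = ["SENTR7", "SENTR5", "SENTR2", "SENTR6", "SENTR3",
               "SENTR4", "SE3", "SENTR1", "SE7"]


def format_profile(repeat_counts, locus_order=LOCUS_ORDER):
    """Join per-locus repeat counts into a hyphen-separated profile string.

    A locus mapped to None (or absent from the dict) is written as 'NA'
    (no amplification). This stays distinct from 0, which means the
    flanking regions amplified but no repeats were present.
    """
    parts = []
    for locus in locus_order:
        count = repeat_counts.get(locus)
        parts.append("NA" if count is None else str(count))
    return "-".join(parts)


example = {"SENTR7": 3, "SENTR5": 10, "SENTR2": 9, "SENTR6": 5, "SENTR3": 4,
           "SENTR4": 4, "SE3": 1, "SENTR1": 8, "SE7": 8}
print(format_profile(example))  # 3-10-9-5-4-4-1-8-8
```

Keeping 'NA' and 0 as separate values in the string preserves the distinction between an absent locus and a locus with no repeats.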

Validation of MLVA results
A combination of agarose gel electrophoresis (for loci SENTR1-3 and SE-7) and DNA sequencing (for loci SENTR4 to SENTR7 and SE-3) was used to determine the number of TRs in the first 100 strains tested. Thereafter DNA sequencing was used only to confirm the number of TRs in novel amplified fragments identified by capillary electrophoresis. Amplification was performed on cell lysates in a monoplex PCR with the same primer sequences used for MLVA but with unlabelled forward primers. Sequencing was performed in one direction only using the forward primer for all loci except SENTR7, which was sequenced with the reverse primer. Loci of the 16 strains chosen as the reference strain panel were sequenced in both directions to determine the sequence of the TRs and flanking regions.
Table footnote g: The Tandem Repeats Database was unable to identify a TR at this locus, presumably as there is only one copy in NCTC 13349. The TR sequence reportedly located at this locus in serovar Typhimurium [11] is not found in serovar Enteritidis.
Sequencing data were imported into Bionumerics version 6.1 (Applied Maths, Belgium) as categorical data and the numbers of TRs at each locus calculated; only complete TRs were included in the analysis. Standard minimum spanning trees generated in Bionumerics using the single and double locus variance priority rules were used to visualise the relationships between strains. Alignments of TR sequences were performed and degree of sequence identity between copies of each TR calculated using BioEdit version 7.0.9.0 [13].

Comparison of published MLVA schemes for S. Enteritidis
The majority of loci targeted in published MLVA schemes for S. Enteritidis do not contain true TRs, as the repeat sequence varied in length and sequence within NCTC 13349; only loci SENTR4 to SENTR7 contained TRs that were 100% conserved (Tables 2 and 3). There was considerable overlap in the loci targeted between the scheme of Malorny et al. [8] and other published schemes, though with the exception of loci SE-3 and SE-7, different primers were used. No additional loci were therefore identified that could enhance the discriminatory capacity of the Malorny et al. scheme.

MLVA validation using DNA sequencing
Agarose gel electrophoresis and DNA sequencing were used to determine the number of TRs at each locus in the first 100 isolates analysed. This confirmed that fragment size always correlated with multiples of TRs. Subsequent strains were analysed by fragment analysis and sequencing was used to determine the number of TRs in any novel-sized fragments. The 16 reference strains were selected from the 298 strains examined to represent most of the alleles observed for each locus and the amplicons from each locus were sequenced (Table 4). Sequencing confirmed the number of TRs and alignments revealed that loci SENTR1, SENTR2, SENTR3 and SE-7 exhibit variation in the sequence of the TR unit within a strain and between strains (Table 3, Figure 1). Single nucleotide polymorphisms (SNPs) were identified in the TRs of reference strains HPA002, HPA005, HPA015 and HPA016 compared with NCTC 13349 (Figure 1, panels A-C), while reference strains HPA004 and HPA015, and HPA003, HPA014 and HPA016 shared the same number of TRs at locus SE-7 (seven and nine TRs, respectively) but the order of the TRs was different in each strain (Figure 1, panel D). The TR units at locus SENTR5 were identical in all strains examined except for reference strain HPA016 and strain H101440321; in these two strains the first three repeats were GACCAC-GACCAC-GGCCAT instead of the expected GACCAT. On the basis of sequencing data for locus SE-3, we propose that the TR unit sequence at locus SE-3 should be changed from <T>WATTG<G>BTTTCCW (where W=A or T, B=C, G or T and '< >' indicates nucleotides inserted into the TR) to TTTTCCATATTG. This TR unit is consistently 12 bp long and 100% conserved in strains carrying multiple copies, unlike the TR unit previously proposed (Table 2). The TRs at loci SENTR4, SENTR5, SENTR6, SENTR7 and SE-3 were identical in all strains and all copies were the same within a strain. The flanking regions were generally conserved, with SNPs being identified only in the 5' flanking region at locus SENTR3 in HPA016 and the 5' and 3' flanking regions of locus SENTR5 in HPA016. Fragment sizes obtained by capillary electrophoresis were consistently smaller than actual fragment sizes as determined by sequencing, ranging from a one base pair (bp) to a 19 bp difference. This difference was most pronounced in fragments amplified from loci harbouring the longer TR units (SENTR1, SENTR2, SENTR3 and SE-7).

Table 4
Reference strains for MLVA of Salmonella enterica serovar Enteritidis

Discussion
The ability to identify isolates belonging to an outbreak and differentiate them from concurrent sporadic isolates is essential for the investigation of communicable diseases. Without a discriminatory typing method coupled with good classical epidemiological data, it would be extremely difficult to identify the source and route of transmission of infection, thereby making it almost impossible to implement appropriate intervention strategies. This is particularly important for highly clonal bacterial groupings such as S. Enteritidis, where the heterogeneity between isolates is limited. However, lack of standardisation of the methodology and interpretive criteria for data analysis has meant that comparison of MLVA data between laboratories can be problematic, thereby hindering attempts at international surveillance and investigation of outbreaks involving more than one country. Given the multinational distribution of some food products, collaboration between countries can be crucial in identifying cases and in tracing the source of infection.
Previously published S. Enteritidis MLVA schemes have named profiles based on allele numbers that may or may not reflect the number of TRs at each locus; in addition, the schemes vary in their treatment of partial TRs. This lack of congruence means not only that data cannot be easily compared between laboratories using different naming schemes, but also that it is difficult to assess the true relationship between isolates exhibiting variation in their MLVA profiles where allele numbers do not accurately reflect the number of TRs. We chose to follow the same principle adopted by MLVA schemes for S. Typhimurium and other bacterial pathogens by naming alleles based on the number of TRs at each locus [11,16,17]. We also recommend that only whole TRs are included and that partial TRs are rounded down to the nearest integer to simplify reporting of MLVA data. The length of partial TRs as determined by sequencing was constant; therefore including them in data analysis did not improve the discriminatory capacity of the technique.
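The recommendation to count only whole TRs can be sketched as a floor division of the repeat region by the unit length. The following Python example is illustrative only: the function name and the 150 bp flanking length are hypothetical, whereas the 12 bp unit length corresponds to the TR unit proposed for locus SE-3.

```python
def count_whole_repeats(fragment_bp, flank_bp, unit_bp):
    """Return the number of complete tandem repeats in a sequenced fragment.

    fragment_bp: actual (sequenced) fragment length in base pairs
    flank_bp:    combined length of both flanking regions, primers included
    unit_bp:     length of one repeat unit
    """
    if unit_bp <= 0:
        raise ValueError("unit_bp must be positive")
    repeat_region_bp = fragment_bp - flank_bp
    if repeat_region_bp < 0:
        raise ValueError("fragment is shorter than its flanking regions")
    # Floor division discards any partial repeat at the end of the array.
    return repeat_region_bp // unit_bp


# Hypothetical locus: 150 bp of flanking sequence and a 12 bp repeat unit.
# A 191 bp fragment leaves 41 bp, i.e. three whole units plus a 5 bp partial.
print(count_whole_repeats(191, flank_bp=150, unit_bp=12))  # 3
```

The calculation assumes the fragment length has already been normalised to the actual (sequenced) size rather than taken directly from capillary electrophoresis.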
Comparison of the published MLVA schemes for S. Enteritidis did not identify any further loci that could be added to the scheme of Malorny et al. [8] to increase the discriminatory capacity. In addition, we recommend that loci SENTR1, SENTR2, SENTR3 and SE-7, which harbour the longest TR units in the scheme, are excluded due to the observed variation within TR unit sequence, which could not be detected reliably by capillary electrophoresis. Unless TR regions are sequenced, this variation would be overlooked and strains would be clustered incorrectly. Removal of primers targeting SENTR1 and SENTR2 would also prevent amplification of a 346 bp VIC-labelled fragment, which results from amplification with primers SENTR1-F and SENTR2-R, in a small number of strains (Table 4). Loci SENTR1, SENTR2 and SENTR3 previously showed low Nei's diversity indices (0.07), while SE-7 exhibited a higher diversity index of 0.63 [8]. Removing these four loci before cluster analysis revealed that this had little effect on the number of profiles detected (65 profiles compared with 71 with nine loci) and had no major effect on clustering in the minimal spanning tree (Figure 2). Boxrud et al. also observed that removing low-diversity loci from their analyses had little effect on the discriminatory capacity of MLVA [5]. Longer TR units have been suggested to serve as a molecular clock, leading to the possibility that sequencing of these loci could be used to determine phylogenetic relationships within S. Enteritidis, as has been proposed for Mycobacterium tuberculosis [5,18]. We also propose that the TR unit sequence at locus SE-3 should be changed to TTTTCCATATTG; this TR unit is consistently 12 bp long and 100% conserved in strains carrying multiple copies, unlike the TR unit previously proposed (Table 2) [5,7,8]. The MLVA scheme would therefore consist of five loci (allele string reported as SENTR7-SENTR5-SENTR6-SENTR4-SE3), which are of consistent length and 100% conserved.
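For readers unfamiliar with Nei's diversity index cited above, it is commonly calculated per locus as one minus the sum of squared allele frequencies. The sketch below shows this uncorrected form; some authors additionally apply a small-sample correction factor of n/(n-1), and which variant was used in reference [8] is not stated here. The function name and the allele counts are hypothetical.

```python
from collections import Counter


def nei_diversity(alleles):
    """Uncorrected Nei's diversity index for one locus: 1 - sum(p_i ** 2).

    alleles: one allele value (e.g. repeat count) per strain.
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("at least one allele is required")
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical locus where nine of ten strains carry the same allele.
low_diversity_locus = [4] * 9 + [5]
print(round(nei_diversity(low_diversity_locus), 2))  # 0.18
```

A value near 0 indicates that almost all strains share one allele, so the locus contributes little to discrimination; values approaching 1 indicate many alleles at similar frequencies.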
MLVA fragment sizes obtained by capillary electrophoresis frequently differ from the sequenced length due to variations in the sequencer model, size marker and primer fluorophores used. DNA composition of the fragment also plays a part, as demonstrated in this study by locus SE-7 of reference strains HPA003, HPA014 and HPA016, which differed in size by up to 3 bp despite each having nine TRs (data not shown). A panel of 31 S. Typhimurium strains with fragment sizes verified by sequencing was compiled by the Statens Serum Institut, Denmark, and made available to allow laboratories with different set-ups to normalise their S. Typhimurium MLVA data to the actual fragment sizes [11]. We therefore recommend that laboratories use the set of 16 reference strains described herein (Table 4) to ensure compatibility of S. Enteritidis MLVA data between laboratories.
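One possible way to apply a reference panel is sketched below in Python: each raw fragment size is matched to the nearest size measured for a reference strain on the same platform and replaced by that strain's sequenced size. This nearest-reference approach, the function name, the 1.5 bp tolerance and all fragment sizes shown are assumptions for illustration; they are not values from Table 4 or a procedure prescribed by this study.

```python
def normalise_fragment(observed_bp, reference, tolerance_bp=1.5):
    """Map a raw fragment size to the sequenced size of the nearest reference.

    reference: list of (size_measured_on_this_platform, sequenced_size) pairs
               for one locus, obtained by running the reference strains.
    Returns None when no reference fragment lies within tolerance_bp, which
    flags a possible novel allele that should be confirmed by sequencing.
    """
    nearest = min(reference, key=lambda pair: abs(pair[0] - observed_bp))
    if abs(nearest[0] - observed_bp) > tolerance_bp:
        return None
    return nearest[1]


# Hypothetical measurements for one locus with a 12 bp repeat unit.
reference_panel = [(181.2, 183), (193.1, 195), (205.0, 207)]
print(normalise_fragment(193.4, reference_panel))  # 195
print(normalise_fragment(199.0, reference_panel))  # None
```

Returning None rather than forcing a match reflects the approach described in this study, in which novel-sized fragments were sequenced to confirm the number of TRs.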
In this study, 71 different MLVA profiles were identified among the 298 strains, indicating that MLVA shows promise as a subtyping method for S. Enteritidis. MLVA was capable of subdividing isolates within a phage type, and in most instances multiple isolations of a phage type tended to cluster together by MLVA, as has previously been shown [8,19]. However, isolates of different phage types may also share the same MLVA profile, as was shown here by 41% of isolates sharing profile 3-10-9-5-4-4-1-8-8 despite belonging to 39 different phage types. This is perhaps not surprising considering the two subtyping methods are determining strain diversity using two very different approaches, but highlights the importance of not relying on a single subtyping method and of combining laboratory data with accurate and meaningful epidemiological data when defining relationships between strains. We suggest that a combination of phage typing (where available) and MLVA may be useful for characterisation of S. Enteritidis isolates, as has previously been suggested [19].
We were concerned that TRs may evolve so rapidly that multiple types could emerge during the course of an outbreak. Isolates from 15 different outbreaks belonging to four different phage types were subtyped by MLVA to determine stability of the TRs. Previous studies have found that MLVA profiles remain stable during the course of an S. Enteritidis outbreak [5,8]. The data presented here suggest that, as with S. Typhimurium [20], single-locus variants (SLVs) may occur sporadically during an outbreak (Figure 2). A double-locus variant (DLV) was identified among isolates from an outbreak caused by an unusual phage type, PT14, with strong epidemiological evidence to link this isolate to the outbreak. Only 16 isolates of PT14 have been identified since 1981, with the last report of two cases in 1997 (HPA Salmonella dataset). This indicated that DLVs may also be detected during an outbreak. Outbreak L was unusual in that two distinct MLVA profiles differing at six of the nine loci were identified, suggesting involvement of two different PT14b strains. This observation was confirmed by the two MLVA profiles belonging to strains with distinct PFGE profiles and exhibiting different antimicrobial resistance phenotypes (data not shown). On the basis of these data, the cut-off to allow classification of S. Enteritidis isolates as part of an outbreak could be defined as a difference of one TR at no more than two loci, with the analysis of more outbreaks needed to confirm this.
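The tentative cut-off suggested above (a difference of one TR at no more than two loci) can be expressed as a simple comparison of two profile strings. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, and treating a locus scored 'NA' in only one of the two profiles as exceeding the cut-off is our own choice, since the text does not specify how such loci should be handled.

```python
def locus_differences(profile_a, profile_b):
    """List the repeat-count difference at every locus where two profiles differ.

    A locus scored 'NA' in only one profile yields None, because no
    repeat-count difference can be calculated for it.
    """
    alleles_a = profile_a.split("-")
    alleles_b = profile_b.split("-")
    if len(alleles_a) != len(alleles_b):
        raise ValueError("profiles must cover the same loci")
    diffs = []
    for a, b in zip(alleles_a, alleles_b):
        if a == b:
            continue
        if "NA" in (a, b):
            diffs.append(None)
        else:
            diffs.append(abs(int(a) - int(b)))
    return diffs


def within_outbreak_cutoff(profile_a, profile_b, max_loci=2, max_repeat_diff=1):
    """True if the profiles differ by at most one TR at no more than two loci."""
    diffs = locus_differences(profile_a, profile_b)
    return len(diffs) <= max_loci and all(
        d is not None and d <= max_repeat_diff for d in diffs
    )


# Five-locus profiles in the order SENTR7-SENTR5-SENTR6-SENTR4-SE3.
print(within_outbreak_cutoff("3-10-5-4-1", "3-11-5-4-1"))  # True  (SLV)
print(within_outbreak_cutoff("3-10-5-4-1", "3-11-6-4-1"))  # True  (DLV)
print(within_outbreak_cutoff("3-10-5-4-1", "3-12-5-4-1"))  # False (two TRs apart)
```

As noted in the text, this cut-off is provisional, so the thresholds are kept as parameters rather than fixed constants.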
In conclusion, we propose an MLVA scheme for S. Enteritidis based on five loci (SENTR7, SENTR5, SENTR6, SENTR4 and SE-3) that show little or no variation in sequence length and diversity. A panel of reference strains has been developed that can be used by laboratories to normalise their raw fragment data to actual fragment sizes. Since this study was completed, two novel alleles have been identified at loci SENTR4 and SENTR6. These loci will be sequenced, the strains added to the reference panel and made available to laboratories on request. In addition, we encourage laboratories that have identified novel alleles to send us the strains, to add to the reference panel. We also provide here recommendations for analysing and interpreting data. We urge laboratories to consider implementing these guidelines, thereby allowing direct comparison of data between laboratories irrespective of the platform used for fragment analysis. MLVA profiles identified during outbreaks of S. Enteritidis may then be reported via the Epidemic Intelligence Information System (EPIS) of the European Centre for Disease Prevention and Control (ECDC) to public health laboratories.

Introduction
Avian influenza (AI) has received public attention since 1997, when human infections, six of them fatal, due to the highly pathogenic avian influenza A(H5N1) virus strain were confirmed in Hong Kong [1,2] and the pandemic potential of AI viruses was recognised [3]. Since 2003, when avian influenza A(H5N1) reappeared, the World Health Organization (WHO) has reported 526 human infections with avian influenza A(H5N1), of which 311 were fatal, from Central Asian, European and African countries [4]. In several areas, highly pathogenic AI in poultry has become endemic, with implications for human health, as exposure to sick or dead poultry is a risk factor for AI in humans [2,5-8]. Because of the pandemic potential of avian influenza A(H5N1), there is a great need for joint risk assessments and, as a prerequisite, for rapid international sharing of biological materials, reference reagents, epidemiologic data and other information when available, e.g. between WHO member states and WHO [9].
Unique efforts were made to share information on AI infections in humans, domestic poultry and wild birds [10,11], e.g. through the reporting of confirmed human cases under the International Health Regulations (2005), supported by the WHO Global Alert and Response System (GAR) [12]. Case-based reports irrespective of the confirmation status have been mainly circulated by the Program for Monitoring Emerging Diseases of the International Society for Infectious Diseases (ProMED) [13]. News agencies such as Reuters Alertnet [14], and public health authorities, including the European Centre for Disease Prevention and Control (ECDC) [15], the World Organisation for Animal Health (OIE)/Food and Agriculture Organization (FAO) network on animal influenza (OFFLU) [16], and the Global Initiative on Sharing Avian Influenza Data (GISAID) [17], have contributed by compiling and publishing updates on AI in humans and birds online. However, a uniform, case-based and thus statistically analysable epidemiological database of all human AI cases is not yet publicly available.
Germany, in need of timely information on the AI situation when Europe faced its first avian influenza A(H5N1) cases in birds in 2005, established an AI monitoring system at the Robert Koch Institute (RKI) in October 2005, which captures case-based information on AI infections in humans, as well as animal cases with zoonotic potential, worldwide. This system proved particularly useful for situation updates, risk assessments and national risk communication from February 2006 onwards, when avian influenza A(H5N1) was detected in wild birds in Germany [18]. Although the body of literature has since grown continuously, namely through WHO situation updates [2,19-23] and virological or epidemiological studies [5,24-27], the RKI AI monitoring system has been maintained to have a flexible database available for epidemiological evaluations.
With the aim to examine whether a systematic line list based on publicly available information on human AI cases would contribute to the understanding of the epidemiology of human AI, we assessed case characteristics, case fatality, and potential risk factors based on our established line list.

Monitoring system
The system, established in October 2005, consists of a database collecting events and reports in chronological order and a line list of human cases. The present analysis is based exclusively on the line list and covers information on human AI cases reported between September 2006 and August 2010 and with a symptom onset date not earlier than September 2006. The monitoring followed a standardised operating procedure, defining information sources, intervals for screening the data and for the database management (as described below), and was maintained in Excel (version 11, Microsoft Corporation, Redmond, Washington, USA).

Information sources
All screened information sources for human AI cases were publicly accessible. They included WHO [12], ECDC [15], ProMED [13], as well as Reuters Alertnet [14]. This range of sources was accessed to anticipate the extent of non-confirmed human AI and to assess the loss of information when ignoring them. All sources were screened on a daily basis (weekdays only). If an event was reported simultaneously by more than one source, and if there was conflicting information, WHO reports were ranked highest, followed by ECDC and ProMED. If an event was reported prior to a WHO report by another source, both the WHO and the initial report were recorded.
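The ranking rule for conflicting simultaneous reports can be sketched as a lookup of source priority. The Python example below is a hypothetical illustration: the record layout and function name are assumptions, and placing any source not named in the ranking (such as Reuters Alertnet) after ProMED is our own assumption, as the text does not state its rank.

```python
# Priority stated in the text: WHO highest, then ECDC, then ProMED.
SOURCE_RANK = {"WHO": 0, "ECDC": 1, "ProMED": 2}


def preferred_report(reports):
    """Pick the report from the highest-ranked source among conflicting reports.

    Sources without a stated rank are placed last (an assumption).
    """
    lowest_priority = len(SOURCE_RANK)
    return min(reports, key=lambda r: SOURCE_RANK.get(r["source"], lowest_priority))


# Hypothetical conflicting reports of the same event.
conflicting = [
    {"source": "ProMED", "age_years": 25},
    {"source": "Reuters Alertnet", "age_years": 27},
    {"source": "ECDC", "age_years": 26},
]
print(preferred_report(conflicting)["source"])  # ECDC
```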

Line list
The line list covered demographic case information, namely the country to which the cases were assigned in the initial reports, the patients' age (in years) and sex, date of symptom onset, date of hospitalisation, disease outcome, date of death, exposure to potentially infected poultry, as well as possible contact with infected individuals. Time intervals from symptom onset to hospitalisation, from hospital admission to outcome, the duration of hospitalisation, and the duration of illness were captured in days. The line list and a description of the variable set are provided online (http://www.rki.de/avian-influenza-linelist).

Case definitions
Cases were classified into three groups (confirmed, non-confirmed probable, and suspected cases), a simpler scheme than that used by WHO. Confirmed cases comprised avian influenza A(H5N1) human cases reported by WHO and with WHO confirmation, i.e. persons with defined clinical signs, epidemiological links and laboratory confirmation by an influenza laboratory accepted by WHO, as specified in the WHO case definition [28].
Other reported cases were (irrespective of their clinical presentation) considered as probable if they had exposure to WHO confirmed human cases, or to sick or dead poultry, or if the AI virus infection was confirmed by the country or local institutions without meeting WHO criteria. All other non-confirmed cases were defined as suspected cases.
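The three-level case classification described above can be summarised as a short decision function. The following Python sketch is illustrative; the function and parameter names are hypothetical and do not correspond to variable names in the published line list.

```python
def classify_case(who_confirmed,
                  exposed_to_confirmed_case=False,
                  exposed_to_sick_or_dead_poultry=False,
                  locally_confirmed=False):
    """Classify a reported case as 'confirmed', 'probable' or 'suspected'.

    who_confirmed:      reported by WHO with WHO confirmation
    locally_confirmed:  infection confirmed by the country or a local
                        institution without meeting WHO criteria
    """
    if who_confirmed:
        return "confirmed"
    if (exposed_to_confirmed_case
            or exposed_to_sick_or_dead_poultry
            or locally_confirmed):
        return "probable"
    return "suspected"


print(classify_case(who_confirmed=True))                                          # confirmed
print(classify_case(who_confirmed=False, exposed_to_sick_or_dead_poultry=True))  # probable
print(classify_case(who_confirmed=False))                                         # suspected
```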

Data analyses
The line list records were compared to the cumulative number of confirmed human cases of avian influenza A(H5N1) published by WHO [12]. The delay (in days) between the date of WHO reporting, and the date of the first report by another source than WHO, was calculated for WHO confirmed cases.
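The delay calculation described above amounts to a date difference per case followed by a median. A minimal Python sketch is given below; the function name and all dates are hypothetical and serve only to show the arithmetic.

```python
from datetime import date
from statistics import median


def lead_time_days(first_report, who_report):
    """Days by which the earliest non-WHO report preceded the WHO report."""
    return (who_report - first_report).days


# Hypothetical (first non-WHO report, WHO report) dates for three cases.
cases = [
    (date(2008, 1, 10), date(2008, 1, 12)),
    (date(2008, 2, 3), date(2008, 2, 6)),
    (date(2008, 3, 1), date(2008, 3, 10)),
]
lead_times = [lead_time_days(first, who) for first, who in cases]
print(lead_times, median(lead_times))  # [2, 3, 9] 3
```

The median is used rather than the mean because a few very late reports would otherwise dominate the summary.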

Reported cases
In the study period, we captured 294 human AI cases in 12 different countries of which 235 (80%) were WHO confirmed, 35 (12%) were classified as probable, and 24 (8%) as suspected. The proportion of confirmed cases was highest in Egypt (98/99, 99%) and lowest in Indonesia (82/126, 65%). Numbers of reported WHO confirmed cases in our line list were largely congruent with cumulative case numbers published by WHO, except for Indonesia with 82 versus 102 cases, respectively (Table 1). This allowed for a close reproduction of WHO graphs on avian influenza A(H5N1) human cases by date of symptom onset and country, which reveal highest case numbers in the winter and spring season of the northern hemisphere (Figure 1).
The median delay from symptom onset to the initial report by any source was 11 days among 201 cases with available information (Table 1). Egypt had the shortest median delay of seven days. Fifty-two percent of the confirmed cases (123/235) were initially reported by another source than WHO in a median of three days prior to the WHO report (Table 1). The shortest median delay between the initial report and the WHO report was two days in China and Indonesia, whereas the longest median delay was nine days in Vietnam and the grouped remaining countries.
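The proportions quoted above follow directly from the reported case counts, as the short Python check below shows. Only the counts are taken from the text; the variable names are our own.

```python
# Case counts by classification, as reported for the study period.
counts = {"confirmed": 235, "probable": 35, "suspected": 24}
total = sum(counts.values())

# Percentage of all captured cases, rounded to the nearest whole number.
percentages = {label: round(100 * n / total) for label, n in counts.items()}
print(total, percentages)
# 294 {'confirmed': 80, 'probable': 12, 'suspected': 8}

# Country-level proportion of confirmed cases (confirmed / all captured).
print(round(100 * 98 / 99), round(100 * 82 / 126))  # 99 65
```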

Demographic characteristics
Fifty-seven percent of confirmed cases (132/233 with available information) were women and 43% (101/233) men, corresponding to a men-to-women ratio of 0.8. This ratio ranged from 0.6 to 1.3, with 0.6 in Indonesia, 0.8 in Egypt, 1.0 in the grouped remaining countries, 1.1 in Vietnam, and 1.3 in China.
The cases' median age was 18 years but was significantly higher in women than in men (21 years in women vs 14 years in men, p=0.04, Table 2). The median age differed markedly across countries. The lowest median age of six years was found in Egypt with significant difference between women and men (16.5 vs 4 years, respectively, p=0.002). In Egypt, the youngest age group (0 to 9 years) accounted for the highest number of cases with 53 of 98 cases (54%) and had the highest incidence of 284 cases per 10 million population of the same age group, over the four-year study period.
In contrast, Indonesia, China, and Vietnam had highest case numbers and incidences in the age group of 20 to 29 years (Figure 2).

Exposure to poultry
Ninety

Hospitalisation
All 228 cases with available information had been hospitalised. Patients were admitted to hospital in a median of four days after symptom onset (N=197, Table 3). The median time from symptom onset to hospitalisation ranged from two to five days, with two days in Egypt, two and a half days in the grouped remaining countries, four days in China and five days in Indonesia and Vietnam. No significant sex-specific differences were found in this delay (p=0.706).
A significant difference in time from symptom onset to hospitalisation between survivors and fatal cases was only found in Egypt (one day vs four and a half days respectively, p=0.001, Table 3). All 19 cases worldwide hospitalised eight days after symptom onset or later had died.

Discussion and conclusions
With this study, we summarised the current global AI situation in humans. It is, to our knowledge, the first study that not only analysed human AI cases worldwide on the basis of a line list collected over several years but in addition made these case-based data available online. We found that a longer delay from symptom onset to hospital admission and belonging to older age groups were associated with higher mortality in AI patients, and that the situation in Egypt differed markedly from other countries, with the highest AI incidence in children and the lowest CFR.
With our line list, cumulative case numbers published by WHO [4] could be largely reproduced: 235 of 256 WHO confirmed cases (92%) and an additional 59 unconfirmed cases were captured between September 2006 and August 2010. The identified median reporting delay of 11 days after symptom onset may partly be explained by a deferred presentation to healthcare facilities as well as by the time needed for pathogen confirmation. About 52% of confirmed cases had been reported elsewhere a median of three days before the WHO report. Because delays in availability of information could hamper investigations of the source of infection and of clusters of human cases [30], it could be beneficial to report and document probable cases in parallel with confirmed ones [31].
Confirmed cases had a median age of 18 years, which is consistent with earlier findings, although investigation periods and affected countries varied [2,19,21]. The identified predominance of female cases in Indonesia and Egypt and the low median age among Egyptian cases support findings from previous studies [2,23-25]. Schroedl [32] examined the mean age of cases in Egypt over four seasons between August 2006 and July 2009 and found a declining age-based pattern over time, but did not address sex-specific differences. We found, in line with other studies, a significantly older age of female cases than male cases, whose proportion had increased since 2008 in Egypt [24,25]. Chen et al., analysing AI cases worldwide before June 2006, also identified sex-specific differences in the age groups of 4 to 6 years (95% male) and 25 to 30 years (83% female) [33]. They assumed particularly high levels of exposure in pre-school boys playing outdoors and housewives taking care of fowl and frequenting live markets. Fasina et al. suggested a similar explanation for the situation in Egypt [25].
Ninety-six percent of the cases reportedly had direct or indirect contact with potentially infected poultry, recognised as the most important risk factor for human AI [8,34]. The WHO Clinical Case Summary Form [35], where e.g. "poultry" can be checked as "most likely source of infection", has enhanced the systematic collection of information since 2007. However, currently reported information yields little insight into the actual source of infection and the intensity and quality of exposure needed to infect humans [36-38].
The median time from symptom onset to hospitalisation was four days, which is remarkably stable when compared to earlier studies [19,21]. If time to hospital admission is regarded as an indicator for monitoring case management and patients' awareness [31], no progress would be evident from a global perspective so far.
The cases' average CFR was 56%, which is broadly consistent with findings from earlier investigation periods [2,19,23]. Using a 19-month rolling CFR, we found a clear decrease in case fatality, which persisted when stratifying for Egypt and Indonesia. It could thus not simply be explained by a predominance of Egyptian cases since 2009. Regarding the decreasing CFR in Egypt, Schroedl [32] suggested that the circulating AI virus strain may have become less virulent and more apt to spread among children.
Analytical results revealed the lowest odds of dying for Egyptian cases, even when adjusted for age, sex and time to hospitalisation. Thus, the high proportion of survivors in Egypt cannot be entirely explained, as is often assumed, by sex-specific differences in CFR [21,24], the high proportion of children among AI patients in Egypt [5], or short delays from symptom onset to hospitalisation [25].
It cannot be ruled out that different virus clades circulating in Egypt (clade 2.2) and Asia (clades 2.1 and 2.3) shape the country-specific epidemiological features [2,23]. Differences in CFR across countries and changes over time might also partly be explained by differences in intensity and quality of exposure, health-seeking behaviour, reporting attitudes, overall performance of the surveillance system, and access to diagnostics and medical care [23,27,39,40], such as the time to start of oseltamivir treatment, the antiviral recommended by WHO for human infections with AI virus [2]. However, country-specific details on its administration are largely unknown, and it remains controversial up to how many days after symptom onset administration of the antiviral reduces mortality [30,41]. In our study, all patients hospitalised eight or more days after symptom onset died. This suggests a rather narrow time window for antiviral drug administration.
Our study was solely based on data from publicly available case reports and is subject to several limitations. Our monitoring instrument was only fully implemented in August 2006, and thus trend analyses were not exploited to their full extent. Within the reports used, negative values, e.g. "case not hospitalised", were not systematically mentioned, which may lead to biases. Time specifications, e.g. on dates of exposure or hospitalisation, needed for time-to-event analyses, were often incomplete. Case reports did not systematically contain details on medical care and specific antiviral treatment. Therefore, analyses were restricted to "hospitalisation" as a general indicator of access to medical care. Given the sparse information on possible contact with infected individuals and clusters of human AI cases available from the serial reports within the investigated period, clusters could not be evaluated as initially planned. Other studies reporting on clustered cases had mostly accessed additional case-investigation reports and patient interviews [23,30]. We based our analyses on WHO confirmed cases, although unconfirmed cases had been recorded in our line list, because information on probable and suspected cases was lacking. Including probable cases in our analyses did not, however, change the cases' sex ratio or CFR substantially when compared to confirmed cases only.
Our study shows that data extracted from the public domain already yield pertinent epidemiological information for assessing the current situation and developments of AI in humans. A line list format as provided would enhance the analysability of key data, their updating, and the evaluation of variables needed.
Several countries monitor the global AI situation, whether they currently face human AI cases, e.g. Egypt [25], or not, e.g. France [27]. This indicates a common interest in the data; if they were provided directly in such a format, this would help to save time and resources for public health authorities and researchers.
A line list needs to be flexible in view of potential new information to be entered. New variables and parameter values might come up when the minimum dataset suggested by Bird and Farrar [31] on direct and indirect exposures to avian influenza A(H5N1) confirmed and non-confirmed poultry and human exposures is implemented, or when results from prospective studies involving exposed and unexposed individuals as designed by Kayali et al. [34] become available.
Unconfirmed cases would ideally be recorded as systematically as confirmed cases, either in a common or separate database as suggested by Bird and Farrar [31].
Presenting cases in the format of a line list is not a goal in itself, but a prerequisite for targeting surveillance and identifying risk factors, as well as a starting point for prospective studies, e.g. investigating potential human-to-human transmission, the transmissibility of avian influenza viruses, and host-related factors including age-dependent immunity in humans [33,42].
We would like to encourage that an anonymised case-based database for AI in humans be made publicly available and continuously updated, e.g. by an internationally renowned organisation such as WHO. Open access to analysable data might accelerate the identification and implementation of research questions and surveillance priorities and thus enhance our understanding of AI in humans, which is still mostly fatal, and permit the rapid detection of epidemiological changes with implications for human health.

Introduction
Accurate estimates of the number of individuals living with human immunodeficiency virus (HIV) infection are essential for the planning and monitoring of HIV prevention and care programmes. Studies of HIV prevalence in sentinel populations are one of the key strategies to monitor the epidemic [1], and one of the methods that has been widely used in sentinel populations is unlinked anonymous testing (UAT) [2]. By 1987, the United States and the United Kingdom (UK) had already put in place UAT programmes to improve the understanding of the evolving epidemic in their countries. Over the years, UAT in pregnant women has been replaced by regular antenatal screening programmes in most European and North American countries, and only a few countries such as the UK and Spain still maintain this surveillance approach.
The UAT to monitor trends of HIV infection in women giving birth in Catalonia is performed annually on blood samples collected from newborns. The presence of HIV antibodies in the newborn reflects maternal infection due to the passive transfer of maternal antibodies to the infant. Since this testing is unlinked (prior to HIV testing the link between the specimen and the personal identifying information is removed) and anonymous (the health staff cannot identify an individual's test result), it is impossible to inform the women of the test results.
The use of sentinel populations to estimate prevalence is common practice, and UAT in these populations has been regarded since its introduction as a good tool to prevent the participation bias associated with populations at risk (the higher the risk, the lower the willingness to participate) [2]. In Catalonia, UAT has proven to be an easy and cost-effective tool to monitor prevalence because of its association with other screening programmes that provide very good coverage of the population of women of childbearing age. The objective of this study was to describe the HIV epidemic and trends in women giving birth and those terminating pregnancy as an estimation of the HIV prevalence in pregnant women in Catalonia.

Methods
In the period from 1994 to 2009, we used samples from newborns of women living in Catalonia collected as part of an annual cross-sectional study. In addition, we analysed blood samples from women voluntarily terminating their pregnancy in three selected clinics in Catalonia in the period from 1999 to 2006.

Women giving birth
The Catalan Neonatal Early Detection Programme (NEDP) has been collecting blood spot samples from all newborns since 1994. These samples are used to screen newborns for hypothyroidism, phenylketonuria and cystic fibrosis. This screening is carried out annually by the Institute of Clinical Biochemistry (Institut de Bioquímica Clínica, IBC) and covers 99% of all infants born in Catalonia [3].
For 1994, we obtained samples for HIV antibody detection from this pool of the NEDP for the period between August and December. For all subsequent years until the end of 2009, we selected samples from every second month. The total sample obtained represents half of the yearly newborns in Catalonia [4].
Before determination of HIV antibody status, the samples from women giving birth were screened for neonatal metabolic disease. The remaining dried blood spots were used for the HIV antibody detection. This is a UAT programme to estimate HIV prevalence in pregnant women. Although this meant that the women could not be informed of the result, all of them were offered HIV testing as part of their routine screening during pregnancy, and women testing positive there were offered treatment. The annual number of samples needed to estimate a prevalence of between 1.8 and 2.8 per 1,000 with a 95% confidence interval and a precision of 0.06% is around 35,000 samples. The yearly mean of samples obtained during our period of study was 34,391 [5].

Women terminating pregnancy
The second source of information to monitor HIV prevalence in pregnant women was blood samples taken from women attending three specialised medical centres to terminate their pregnancies. Informed consent was required to obtain these samples. All dried blood spots from women terminating pregnancy were sent to the IBC for HIV antibody detection.
There were at least 11,000 voluntary interruptions of pregnancy annually in the three centres participating in the study. Testing all samples from these centres, we can therefore estimate a prevalence of 2 per 1,000 with a 95% confidence interval and a precision of 0.08%.
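The stated precision can be checked with the usual normal-approximation formula for the half-width of a 95% confidence interval, d = z * sqrt(p(1 - p) / n). The sketch below is an illustration only: the source does not say which formula was used, and the helper name `ci_half_width` is hypothetical.

```python
import math

def ci_half_width(prevalence: float, n: int, z: float = 1.96) -> float:
    """Half-width of a normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(prevalence * (1.0 - prevalence) / n)

# Expected prevalence of 2 per 1,000 and about 11,000 samples per year,
# as quoted in the text for women terminating pregnancy.
d = ci_half_width(0.002, 11_000)
print(f"precision: {d * 100:.2f}%")
```

Under these assumptions the half-width comes out at roughly 0.08 percentage points, in line with the figure quoted above.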
In women terminating their pregnancy, information on age was available for those sampled in the years 1999 to 2006. Mean ages of women giving birth and those terminating pregnancy were compared for this period. Information about country of origin was poor and was excluded from the analysis of this set of samples.

Sample analysis
Sample collection and HIV antibody detection were done using dried blood spots. Two drops of blood were collected on filter paper discs (Schleicher and Schuell no. 903TM, Dassel, Germany) and stored at 4 °C until used. HIV antibodies were determined using a modified Serodia IgG antibody-capture particle agglutination test (GACPAT) for HIV-1 (Fujirebio Diagnostics) [6]. Positive samples were sent to the Microbiological Service of the University Hospital Germans Trias i Pujol (HUGTiP) to confirm the results using an IgG antibody capture ELISA for HIV-1 and HIV-2. Until 2001 this was done using the GACELISA test (Murex, UK) [7]. In 2002 this confirmatory test was replaced with the Pasteur HIV-1/2 GenElavia Mixt ELISA (BioRad, Spain) after checking that normal and external valid values were similar for both tests [8].
Variables collected in the study were HIV status of the pregnant women, age and country or region of origin. Confidentiality for both data sets (women giving birth and those terminating pregnancy) was ensured by using a computer-aided coding process at the NEDP.
The results of HIV antibody testing could not be correlated with any patient identification number.
The annual HIV prevalence among women of childbearing age was computed as the number of HIV-positive samples divided by the total number of HIV-positive and HIV-negative samples tested each year, with 95% confidence intervals. Trends were analysed using the Cochran-Armitage test. Data were analysed using Stata SE 8. For the age variable, a comparison between women giving birth and those terminating pregnancy was done by non-parametric Mann-Whitney U-test.
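As a minimal sketch of the prevalence calculation described above, the following computes a prevalence per 1,000 with a 95% confidence interval. The source does not say which interval method was applied in Stata; the Wilson score interval used here is an assumption, and `prevalence_per_1000` is a hypothetical helper name. The counts are the overall totals given in the Results.

```python
import math

def prevalence_per_1000(positives: int, tested: int, z: float = 1.96):
    """Return (prevalence, lower, upper) per 1,000 using a Wilson score interval."""
    p = positives / tested
    denom = 1.0 + z * z / tested
    centre = (p + z * z / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested + z * z / (4 * tested * tested)) / denom
    return 1000 * p, 1000 * (centre - half), 1000 * (centre + half)

# Overall totals from the Results: 1,081 HIV-positive of 581,593 samples.
prev, lower, upper = prevalence_per_1000(1_081, 581_593)
print(f"{prev:.2f} per 1,000 (95% CI {lower:.2f} to {upper:.2f})")
```

The same function could be applied year by year to obtain the annual series on which a trend test would then be run.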

Results
Among the 581,593 blood spot samples analysed, 549,689 were from infants born during the years 1994 to 2009 and 31,904 from women terminating their pregnancy during the years 1999 to 2006. We obtained 1,081 HIV-positive results, representing a global prevalence of 1.85 per 1,000. Overall, we tested 54% of all women giving birth in Catalonia, ranging from 53% in
We observed an increasing trend in HIV prevalence between 2007 (1.6 per 1,000) and 2009 (3 per 1,000) among women born abroad, compared to lower prevalence rates and a decreasing trend from 1.3 per 1,000 to 1.1 per 1,000 among Spanish women in the same period. Prevalence was particularly high among those from Sub-Saharan Africa, reaching 6.9 per 1,000 in 2004 and 5.4 per 1,000 in 2009 (Figure 3).

HIV prevalence trends in women terminating pregnancy versus those giving birth
Information on women terminating pregnancy was available only for the period 1999 to 2006.We analysed samples from 31,904 women who interrupted their pregnancy in the three participating centres, representing 27% of all women who legally interrupted pregnancy in Catalonia.

Figure 2
HIV prevalence in women giving birth, by age, Catalonia, 1994-2009 (n=549,689)
HIV: human immunodeficiency virus.
[Figure residue: y-axis, HIV prevalence per 1,000; x-axis, year; series, women giving birth and women terminating pregnancy]

HIV prevalence during this time period did not differ significantly between women terminating pregnancy and women giving birth (p=0.06), with 42 of 31,904 (1.3 per 1,000) and 522 of 293,120 (1.8 per 1,000) HIV-positive samples, respectively. HIV-positive women terminating pregnancy were younger than those giving birth (average age 26.6 versus 30.6 years; p<0.0001) for the same time period. A non-significant decreasing trend in HIV prevalence was observed in women who voluntarily interrupted pregnancy (p=0.066), from 2.3 per 1,000 in 1999 to 1.0 per 1,000 in 2006 (Figure 4).
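The comparison of the two groups can be reproduced approximately from the reported counts. The source does not name the test used for this comparison; the pooled two-proportion z-test below (equivalent to a chi-square test without continuity correction) is an assumption, and `two_proportion_z_test` is a hypothetical helper.

```python
import math

def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# 42 of 31,904 women terminating pregnancy vs 522 of 293,120 women giving birth.
z, p_value = two_proportion_z_test(42, 31_904, 522, 293_120)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

Under these assumptions the p-value comes out close to the reported 0.06.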

Discussion and conclusion
Unlinked anonymous surveillance of newborns and women interrupting pregnancy allowed us to estimate the HIV prevalence among pregnant women as a surrogate for HIV infection prevalence in women of childbearing age. We found this method to be feasible and reliable in Catalonia. Our study provides 16 years of meaningful information, although it is limited to the variables age and country of origin.
Data from women voluntarily interrupting pregnancy were included with the objective of identifying any potential bias due to voluntary interruption of pregnancy among women with higher rates of HIV infection [9]. However, their HIV prevalence was similar to the one found in women giving birth. Nevertheless, the small sample studied cannot guarantee representativeness for all interrupted pregnancies in Catalonia, because important hospitals did not contribute data. The HIV prevalence rates followed a decreasing trend between 1994 and 2002, rose in the following three years (2003 to 2005), dropped in 2006 and then increased again in the years up to 2009. This rise was observed not only in mothers from Sub-Saharan Africa but also in those from other European countries and Latin America. As expected, the seroprevalence observed in this study reflected the prevalence in the regions where the study population originated. For the decade 2000 to 2010, the HIV prevalence is reported as around 50 per 1,000 in Sub-Saharan Africa, around 5 per 1,000 in Latin America and around 2 per 1,000 in other European countries [10,11].
Compared to other autonomous regions of Spain for which data are available, Catalonia has had one of the highest HIV prevalence rates since the early 1990s [12,13], after the Canary and Balearic Islands. Over the period from 1995 to 1998, the prevalence rates we observed in Catalonia decreased from 3.1 to 1.7 per 1,000. Other European countries such as Germany, Italy and the UK, where UAT has been used since the early 1990s, had different experiences in the same time period. Rates did not change significantly in Italy [14,15], Scotland [15] or Germany [15]. Information available from the years 1999 to 2004 shows that HIV prevalence estimates from UAT in Catalonia followed a different trend than, for example, those in the UK [15], where the prevalence was systematically increasing over the years (Table).
HIV prevalence among pregnant women in the World Health Organization European Region [16] has been monitored using three methods: seroprevalence studies based on UAT of either newborns or pregnant women, seroprevalence studies based on multiple data sources (for other sexually transmitted diseases such as syphilis or hepatitis), and systematic collection and reporting of the results of diagnostic testing carried out among pregnant women in antenatal care or at delivery.Most of these countries are nowadays prioritising the third method because of increased accessibility to testing through antenatal care and the establishment of national registers of pregnant women, thus making UAT potentially redundant.
In Catalonia, UAT of neonatal dried blood spots taken for metabolic screening has been carried out since 1994 and the policy of universal antenatal HIV screening was introduced in 1996 [17]. However, to obtain prevalence rates through antenatal HIV screening, we would need information on the number of pregnant women tested for HIV, and in our country the systems to obtain this information are not yet in place. Therefore, UAT has been continued, mainly because data and sample collection are simple, cheap and have the added advantage of providing unbiased prevalence rates. On the other hand, UAT of blood taken from women voluntarily interrupting their pregnancy was stopped in 2007 due to small samples and low representativeness.
As in other regions of Spain, pregnant women in Catalonia are offered HIV screening in the first trimester of pregnancy and, if they are at risk of exposure, also during the third trimester of pregnancy [18]. A survey of HIV testing coverage conducted in Catalonia in the year 2000 found that 89% of women were tested during pregnancy, which at the time was assessed as good coverage [19,20]. Current policy aims at 100% coverage, and there is concern regarding subpopulations that never reach antenatal care because of low educational level, low interest or arrival in the country at the time of delivery. It is worth noting that between the years 2000 and 2009, the foreign population in Catalonia increased from 2.9% to 15.9% of the total population [21]. Targeted efforts to include foreign mothers are not in place or are of dubious efficacy. Strengthening surveillance and promoting testing at voluntary counselling and testing sites may support the already existing and well-functioning antenatal care programme. Another important use of the UAT data is to produce estimates of HIV infections in order to plan and monitor the HIV prevention and care programmes.
In conclusion, since routine HIV surveillance does not provide data on undiagnosed infections and there is evidence that immigrants may not have access to prenatal care until delivery, data from UAT in Catalonia are still useful to complement the epidemiological data on this infection. Moreover, UAT among pregnant women is still the best available surrogate for HIV prevalence among the sexually active female population.

Figure 2
Minimum spanning trees of MLVA of 298 Salmonella enterica serovar Enteritidis isolates based on data from (A) nine loci and (B) five loci

Figure 2
Incidences and number of confirmed avian influenza A(H5N1) human cases by age group and country, September 2006-August 2010 (n=230)

Figure 3
Time from confirmed avian influenza A(H5N1) human cases' symptom onset to hospitalisation and case fatality rate stratified for Egypt and Asian countries, September 2006-August 2010 (n=197)

Table 2
Characteristics of tandem-repeat loci in published Salmonella enterica serovar Enteritidis MLVA schemes
bp: base pair; MLVA: multilocus variable-number tandem-repeat analysis; NCTC: National Collection of Type Cultures; TR: tandem repeat.
a Nomenclature according to Malorny et al. [8]. b In NCTC 13349 (GenBank accession number AM933172); only complete repeats are included. c Second TR is 45 bp. d Third TR is 14 bp. e 29 bp conserved 5'-sequence, together with a 32 bp variable 3'-sequence. f Second TR is 174 bp.

Table 3
Characteristics of Salmonella enterica serovar Enteritidis loci targeted by the MLVA scheme described by Malorny et al.
a Only the 10 bases adjacent to the start and finish of the TR region in NCTC 13349; incomplete TRs are excluded from the TR region. X is the amplicon length as determined by sequencing, which may differ from the size determined by capillary electrophoresis. e The number of TRs may need to be rounded up if TR2 is lacking the 6 bp insert. f Includes a null variant where no fragment is amplified by PCR. g Differs from the TR sequence in the published scheme based on observations made during this study.

Table 1
Status and cumulative number of avian influenza human cases reported by the World Health Organization and captured by the Robert Koch Institute monitoring system, and delay in reporting confirmed cases, September 2006-August 2010
a Reported as cumulative numbers by the WHO [2]. b Only WHO confirmed cases; the initial report is by any source. c Data only available for cases reported initially by a different source than the WHO. d Number of cases with available information. e Bangladesh, Cambodia, Laos, Myanmar, Nigeria, Pakistan, Republic Korea, Thailand. f Bangladesh (N=1), Cambodia (N=4), Laos (N=2), Myanmar (N=1), Nigeria (N=1), Pakistan (N=3). g Nigeria (N=1), Pakistan (N=1), Republic Korea (N=1), Thailand (N=1).

Number of confirmed avian influenza A(H5N1) human cases by date of symptom onset and country, as well as cumulative case fatality rate and 19-month rolling case fatality rates, September 2006-August 2010 (n=213)
cCFR: cumulative case fatality rate. rCFR: 19-month rolling case fatality rate.
Fifty-six percent (132/235) of confirmed cases died. The CFR differed across countries, ranging from 28% (27/98) in Egypt to 87% (71/82) in Indonesia. The cCFR and the 19-month rCFR indicated a decline in case fatality over the study period (Figure 1). Whereas the cCFR was little affected by the outcome of new cases and had only slightly decreased, the rCFR had steeply declined in the period from April 2008 to April 2009. Until mid-2008, a large proportion of cases occurred in Indonesia (country with highest CFR) and shifted thereafter to Egypt (country with lowest CFR). Accordingly, the declines in the country-specific rCFRs for Indonesia and Egypt were less steep than that of the overall rCFR. The 19-month rCFR was preferred as it was less affected by case-free periods than rCFRs calculated over shorter periods (not shown).
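To illustrate why a rolling case fatality rate reacts faster to recent outcomes than a cumulative one, the sketch below computes both from a list of cases. The data are invented for illustration, the function names are hypothetical, and the window definition (the current month plus the 18 preceding months) is an assumption, since the source does not spell out how the 19-month window was aligned.

```python
from typing import List, Optional, Tuple

Case = Tuple[int, bool]  # (month index of symptom onset, died?)

def cumulative_cfr(cases: List[Case], month: int) -> Optional[float]:
    """CFR over all cases with onset up to and including `month`."""
    outcomes = [died for m, died in cases if m <= month]
    return sum(outcomes) / len(outcomes) if outcomes else None

def rolling_cfr(cases: List[Case], month: int, window: int = 19) -> Optional[float]:
    """CFR over cases with onset in the `window` months ending at `month`."""
    outcomes = [died for m, died in cases if month - window < m <= month]
    return sum(outcomes) / len(outcomes) if outcomes else None

# Invented example: early cases mostly fatal, later cases mostly survived.
cases = [(0, True), (1, True), (2, True), (3, False),
         (20, False), (21, False), (22, True), (23, False)]

print(cumulative_cfr(cases, 23))  # uses all eight cases
print(rolling_cfr(cases, 23))     # uses only cases from months 5 to 23
```

In this invented series the cumulative rate still carries the early fatal cases, while the rolling rate reflects only the recent window.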
Figure 3 shows the CFR as a function of the time from symptom onset to hospitalisation, stratified by Egypt and Asian countries (grouped).
The multivariable logistic regression revealed that odds of fatal outcome increased by 33% with each day that passed from symptom onset until hospitalisation (OR: 1.33, 95% CI: 1.11-1.60, p=0.002). In relation to children of 0-9 years, odds of fatal outcome were more