Development and application of MLVA methods as a tool for inter-laboratory surveillance

C A Nadon1, E Trees2, L K Ng1, E Møller Nielsen3, A Reimer1, N Maxwell2, K A Kubota4, P Gerner-Smidt (plg5@cdc.gov)2, the MLVA Harmonization Working Group5
1. National Microbiology Laboratory, Public Health Agency of Canada, Winnipeg, Manitoba, Canada
2. Division of Foodborne, Waterborne, and Environmental Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia, United States
3. Statens Serum Institut, Unit of Gastrointestinal Infections, Copenhagen, Denmark
4. Association of Public Health Laboratories, Silver Spring, Maryland, United States
5. Members of the working group are listed at the end of the article


Introduction
Multiple-locus variable-number of tandem-repeats analysis (MLVA) has recently emerged as a powerful method for the subtyping of food-borne bacterial pathogens. The method is based on repetitive DNA elements organised in tandem (Figure). DNA replication errors, such as slipped-strand mispairing, generate diversity in the number of tandem repeats observed among strains of the same species [1,2]. MLVA determines the number of tandem repeats, or copy units, at multiple variable-number tandem repeat (VNTR) loci within the genome. Typically, multiplex PCR amplification of the repeat and flanking regions is followed by amplicon sizing using capillary electrophoresis. The number of repeat copy units, or allele number, at each location is calculated from the measured amplicon size. The string of alleles from multiple loci forms the MLVA profile.
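The relationship between amplicon size and allele number described above can be illustrated with a minimal Python sketch. The function name, the flank and repeat lengths and the example sizes are hypothetical, and the sketch assumes an exact (for example sequenced) amplicon size; sizes measured by capillary electrophoresis can deviate from it.

```python
def copy_number(amplicon_bp: int, flank_bp: int, repeat_bp: int) -> int:
    """Number of complete repeat units in an amplicon.

    amplicon_bp: total amplicon length (assumed exact, e.g. from sequencing)
    flank_bp:    combined length of both flanking regions, primers included
    repeat_bp:   length of one repeat unit
    """
    array_bp = amplicon_bp - flank_bp
    if repeat_bp <= 0 or array_bp < 0:
        raise ValueError("inconsistent sizes")
    return array_bp // repeat_bp  # whole repeat units only


# Hypothetical three-locus scheme: (amplicon, flanks, repeat unit) in bp
loci = [(198, 120, 6), (251, 160, 7), (330, 150, 9)]

# The MLVA profile is the string of allele numbers across the loci
profile = "-".join(str(copy_number(*locus)) for locus in loci)
print(profile)
```

The integer division keeps only complete repeat units; how partial repeats are reported is a nomenclature decision rather than a property of the calculation.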
The recent development of MLVA protocols for subtyping food-borne bacterial pathogens, including Salmonella enterica serotypes Typhimurium and Enteritidis, and Shiga toxin-producing Escherichia coli (STEC) O157:H7 has facilitated the implementation and application of MLVA for the successful detection and investigation of a wide variety of food-borne disease outbreaks all over the world [3-6]. The early promise and success of MLVA triggered the independent development of multiple protocols by many different laboratories, leading to many different schemes for each organism. For example, six protocols have been described for STEC O157 [3,7-11], six for S. Enteritidis [1,12-16], and four for S. Typhimurium [17-20]. Differences in the choice of loci, nomenclature, amplicon sizing due to primer, platform and/or chemistry differences, and interpretation of incomplete or partial repeats have stymied and continue to stymie inter-laboratory comparisons and thus surveillance. A lack of standards for the development, validation and quality control/quality assurance of MLVA further contributes to problems in the comparison and interpretation of MLVA results.
The goal of any subtyping method is to characterise bacteria beyond the species (or subspecies) level and to group individual isolates together in a meaningful way. The ability to do this quickly and reliably is the cornerstone of laboratory-based surveillance [21]. Isolates that have indistinguishable subtypes are more likely to have originated from a common source than those with different subtypes. This concept forms the
To be suitable for laboratory-based surveillance and outbreak detection, a subtyping method should be assessed against several key performance criteria [21]: typeability, reproducibility, discriminatory power and epidemiological concordance. These criteria must be assessed using an epidemiologically relevant panel of isolates from geographically as diverse a region as where the method is to be applied. Additional criteria to assess method feasibility include speed, throughput, cost, ease of use, objectivity, versatility and portability. The importance of these criteria is further emphasised for the successful application of a subtyping method to inter-laboratory surveillance. While no single method will have perfect performance when assessed against all criteria, MLVA performs well overall. It scores high in its performance against several key criteria including discriminatory power, robustness, portability, objectivity and throughput [21,22], but scores low in versatility, since most protocols are species or serotype specific. Comparatively, pulsed-field gel electrophoresis (PFGE), the current gold standard method for the subtyping of food-borne bacterial pathogens, scores high in discriminatory power and versatility, but medium in robustness and low in portability, objectivity and throughput [22].
The historical success of PFGE for the inter-laboratory surveillance of food- and waterborne bacterial pathogens was based on the standardisation of methodology and interpretation through an internationally coordinated approach. The future success of emerging technologies such as MLVA for inter-laboratory surveillance similarly hinges on the coordinated harmonisation of the methodology, nomenclature and interpretation.
In this paper, we describe an international consensus for the development, validation, nomenclature, and quality control for MLVA-based inter-laboratory surveillance based on a review of the current state of science. These consensus guidelines were developed following an expert consultation in Copenhagen, Denmark, in May 2011, organised by the United States (US) Centers for Disease Control and Prevention (CDC), the European Centre for Disease Prevention and Control (ECDC), the Association of Public Health Laboratories in the United States, the Public Health Agency of Canada and the Statens Serum Institut, Denmark.

Selection of potential loci
Box 1
Standardised VNTR locus nomenclature for an MLVA protocol
A VNTR locus is named according to its location on the chromosome of the prototype genome, rounded to the closest kilobase (kb). If located on a plasmid, the name of the plasmid is used instead of the prototype genome.
Example: the standardised name of the Salmonella enterica serovar Typhimurium VNTR locus STTR6 [18] would be STM2730, i.e. STM is the designation for the Typhimurium prototype genome LT2 and 2730 is the closest kb location for the locus STTR6 on the LT2 genome.
MLVA: multiple-locus variable-number of tandem-repeats analysis; STEC: Shiga toxin-producing Escherichia coli; VNTR: variable-number tandem repeat.

The first step in the development of an MLVA method involves the selection of potential loci for inclusion in the protocol. Initial VNTR locus finding and identification is performed by querying whole genome sequences using specialised software. Some VNTR-finding software is available free of charge on the Internet, including Tandem Repeats Finder [23] and TredD [24]. Commercial software is also available and includes GeneQuest (DnaStar Lasergene, Madison, WI, US) and CodonCode (CodonCode Corp., Dedham, MA, US). The Tandem Repeats database [25] is a public repository of information on tandem repeats and also contains a variety of tools for their analysis.
There is no standardised naming of loci used in MLVA schemes. In order to create uniformity in this context, it is proposed to name the loci in relation to their positions in the prototype genome. The proposed standardised locus naming (Box 1) and its correlation with existing nomenclature for loci that overlap between most published protocols for STEC O157, S. Typhimurium and S. Enteritidis are outlined in Tables 1-3, respectively.
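The naming rule in Box 1 can be expressed as a short Python sketch. The function name and the base-pair positions used in the example are hypothetical (the text gives only the resulting name STM2730, not the exact coordinate), and the absence of zero-padding in the kilobase number is an assumption.

```python
def locus_name(designation: str, position_bp: int) -> str:
    """Build a locus name from a genome (or plasmid) designation and the
    locus position rounded to the closest kilobase.

    designation: prototype genome designation, e.g. "STM", or a plasmid name
    position_bp: position of the locus in base pairs (hypothetical input)
    """
    closest_kb = (position_bp + 500) // 1000  # round half up to nearest kb
    return f"{designation}{closest_kb}"


# Hypothetical coordinate that rounds to 2730 kb on the prototype genome
print(locus_name("STM", 2_730_400))
```

Integer arithmetic is used instead of `round()` so that positions exactly halfway between two kilobases are always rounded up.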
When selecting loci (Box 2), as a rule of thumb, the shorter the repeat unit, the more variation is detected in terms of copy numbers [26]. However, repeat units shorter than five bp should not be included in a subtyping system due to the limitations in sizing reproducibility in capillary electrophoresis platforms. It is critical to avoid repeat units with insertions and deletions (indels) in order to facilitate consistent sizing and allele naming using copy numbers. Low-level base variation between repeat units does not usually have a negative impact as long as the unit length is consistent. However, perfectly homogeneous repeats are always preferable and will usually also increase polymorphism through the effect of polymerase slippage [26]. Furthermore, only loci with 100% conserved flanking sequences in the target organism should be included.

Primer design
Once loci have been identified, primers for their PCR amplification need to be designed (Box 2). There are multiple choices for primer design software, both  such checking is not available in the free software. At the very least, primer design software should be used to verify that no secondary structures, such as hairpins or self- and cross-dimers, are formed between any of the primers intended to be multiplexed in the same reaction.
When designing primers, a number of issues need to be considered. Firstly, primers should be placed as close to the VNTR array as possible since the projected fragment size should not exceed 600 bp, which is the upper limit of reproducible sizing in most capillary electrophoresis platforms. This is particularly critical for VNTR arrays with long repeat units and for arrays with shorter repeat units combined with high diversity, in which scenario dozens of repeat units may be possible.
If only a few prototype genomes are available, we suggest sequencing the flanking regions of each locus in 20 strains representative of the genetic diversity of the target organism in order to ensure that the primers are placed in conserved sequence. Secondly, the intended site of the primer should be targeted so that it falls in the most accurate region of the sequence, i.e. 80-150 bp away from the sequencing primer. Thirdly, the primers for all loci should have the same annealing temperature in order to facilitate easy multiplexing of targets in the same PCR reaction. Relatively high annealing temperatures of 55 °C to 65 °C should be aimed for to enable stringent amplification conditions for specific amplification. Generally, the melting temperature for primers should be 5 °C higher than the desired annealing temperature.
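The numeric design constraints above (fragments no longer than 600 bp, annealing at 55 °C to 65 °C, primer melting temperature about 5 °C above the annealing temperature) can be checked with a small Python sketch. The function name, the 1 °C tolerance and the example values are assumptions made for illustration; melting temperatures are taken as inputs rather than computed.

```python
MAX_FRAGMENT_BP = 600             # upper sizing limit quoted in the text
ANNEALING_RANGE_C = (55.0, 65.0)  # recommended annealing window
TM_OFFSET_C = 5.0                 # Tm should sit this far above annealing
TM_TOLERANCE_C = 1.0              # assumed tolerance, not from the text


def check_locus(flank_bp, repeat_bp, max_copies, primer_tms_c, annealing_c):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []

    # Largest projected fragment: flanks plus the longest expected array
    largest = flank_bp + repeat_bp * max_copies
    if largest > MAX_FRAGMENT_BP:
        problems.append(f"largest fragment {largest} bp exceeds limit")

    low, high = ANNEALING_RANGE_C
    if not low <= annealing_c <= high:
        problems.append(f"annealing {annealing_c} C outside {low}-{high} C")

    target_tm = annealing_c + TM_OFFSET_C
    for tm in primer_tms_c:
        if abs(tm - target_tm) > TM_TOLERANCE_C:
            problems.append(f"primer Tm {tm} C not near {target_tm} C")
    return problems


# Hypothetical locus: 33 bp repeat unit, up to 15 copies, 150 bp of flanks
print(check_locus(150, 33, 15, [65.0, 58.0], 60.0))
```

Returning a list of problems rather than a single pass/fail flag makes it easier to see which primer or locus needs to be re-designed.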

Assay optimisation
Once potential loci have been selected and primers designed, it is time to optimise the assays in the laboratory setting. This process includes testing the diversity of the loci selected and optimisation of the PCR reactions. This is an iterative process that is repeated until a set of loci with appropriate diversity has been selected and PCR conditions to amplify the loci reliably have been developed. Firstly, the VNTR loci should be screened for diversity using singleplex PCR reactions against a limited panel of 10 to 20 strains that are not related to each other and have been shown to be genetically diverse using other subtyping methods. At this stage, loci showing no diversity or minimal diversity are excluded from the assay. Loci with poor amplification, multiple amplification products or background noise should also either be excluded or have their primers re-designed at this stage.
After the initial screen, the promising VNTRs are tested against a larger panel (100-150) of isolates. This panel should contain both outbreak-related (information about patient exposures required) and epidemiologically unrelated (sporadic, i.e. different geographical locations, no temporal associations) isolates. This second screen will focus the selection process on VNTRs that generate epidemiologically relevant data. It also gives the assay developer an idea of the fragment size ranges in each locus, which is information that is needed for designing multiplex assays. Representative alleles in each locus, i.e. the smallest allele, the largest allele and at least every third in between, should be sequenced at the development phase in order to verify the copy number and to ensure that the size differences observed between different strains are due to differences in repeat unit copy numbers and not due to other genetic events.

Design of multiplex PCR reactions
Once the set of VNTR loci has passed the initial screening process, multiplex PCR reactions must be designed to enable efficient amplification of all loci in as few reactions as possible. Since the multiplex PCR reactions should be as robust as possible, no more than four or five targets should be amplified in the same reaction. Targets with overlapping fragment sizes can be differentiated using different fluorescent labels. The same label can be used multiple times in the same PCR reaction as long as there is no overlap in fragment sizes. One of the dyes is always reserved for the DNA size standard. Since it is highly desirable that protocols can be easily converted from one platform to another simply by re-labelling the forward primers, use of more than three fluorescent labels for targets in the same reaction is not recommended.
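The labelling rules above (at most five targets per reaction, at most three target dyes, the same dye reused only for non-overlapping fragment size ranges) can be sketched as a first-fit assignment in Python. The dye names, locus names and size ranges are placeholders, and the greedy strategy is an illustrative choice rather than a procedure taken from the text.

```python
def assign_dyes(size_ranges, dyes=("dye1", "dye2", "dye3"), max_targets=5):
    """Assign a dye to each locus so that loci sharing a dye never overlap.

    size_ranges: {locus: (smallest_bp, largest_bp)} of expected fragments
    dyes:        dyes available for targets (size-standard dye excluded)
    """
    if len(size_ranges) > max_targets:
        raise ValueError("too many targets for one multiplex reaction")

    used = {dye: [] for dye in dyes}  # size ranges already given to each dye
    assignment = {}
    # Process loci from the smallest fragments upwards (first-fit colouring)
    for locus, (low, high) in sorted(size_ranges.items(), key=lambda kv: kv[1]):
        for dye in dyes:
            if all(high < a or low > b for a, b in used[dye]):
                used[dye].append((low, high))
                assignment[locus] = dye
                break
        else:
            raise ValueError(f"no dye available for {locus}")
    return assignment


# Hypothetical fragment size ranges (bp) for four loci in one reaction
ranges = {"A": (150, 220), "B": (200, 300), "C": (320, 400), "D": (380, 460)}
print(assign_dyes(ranges))
```

If no assignment is possible with three dyes, the error signals that the loci should be split over more than one multiplex reaction.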
Important parameters to consider when designing the multiplex PCR reactions are the annealing temperature, MgCl2 concentration and primer concentration. Practical tips for approaches to optimise multiplex PCR reactions can be found in the literature [28].
All targets in the multiplex reaction should be easily detectable. The desired fluorescence intensity for PCR products on the Beckman Coulter platform is 5,000-80,000 units, on the Applied Biosystems 3130 platform 1,000-7,000 units and on the Applied Biosystems 3500 and 3730 platforms 2,000-9,000 units. Fluorescence intensity below the desirable level will result in unreliable detection of targets. Too high fluorescence intensity will cause fluorescence carry-over from one channel to another resulting in non-specific peaks that can interfere with the data analysis in downstream applications. If the same protocol is used in multiple laboratories, each laboratory typically needs to optimise the primer concentrations for their own laboratory since there are several laboratory-specific factors, such as the age of the primer stocks, the type and the calibration status of the thermocycler, which affect the amplification efficiency. Additionally, as the primer stocks age, there is a gradual drop in the fluorescence intensity, requiring further optimisation of primer concentrations over time, even within the same laboratory.
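The platform-specific intensity windows quoted above lend themselves to a simple lookup, sketched here in Python. The dictionary keys and the function name are illustrative; only the numeric ranges come from the text.

```python
# Desired fluorescence intensity ranges (units) per platform, from the text
INTENSITY_RANGES = {
    "Beckman Coulter": (5_000, 80_000),
    "Applied Biosystems 3130": (1_000, 7_000),
    "Applied Biosystems 3500": (2_000, 9_000),
    "Applied Biosystems 3730": (2_000, 9_000),
}


def classify_peak(platform: str, intensity: float) -> str:
    """Classify a peak as 'too low', 'ok' or 'too high' for a platform."""
    low, high = INTENSITY_RANGES[platform]
    if intensity < low:
        return "too low"   # detection of the target becomes unreliable
    if intensity > high:
        return "too high"  # risk of carry-over into other channels
    return "ok"


print(classify_peak("Applied Biosystems 3130", 8_500))
```

A check of this kind could be run on control strains over time to notice the gradual intensity drop that accompanies ageing primer stocks.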

Internal validation
When a prototype of the MLVA protocol has been established, it needs to go through internal validation (Box 3). The purpose is to test the robustness and reproducibility and to establish the discriminatory power of the method when used in the laboratory (or laboratories) that developed it.
The internal validation should comprise two phases, which may be performed simultaneously: (i) testing of additional isolates by the protocol developers; (ii) testing of the protocol by other laboratories/individuals within the developers' institutions for technical performance. The number of isolates to be tested during internal validation depends on the genetic diversity of the target organism, i.e. the higher the diversity, the more isolates are needed for adequate validation. Optimally 250 to 500 isolates, in addition to those that were tested during the development phase, should be tested. If the developing laboratory does not have access to such a large culture collection, the isolates must be acquired from collaborating laboratories. Insufficiently validated protocols should not be published in the scientific literature since they almost invariably will need further optimisation by future users. By analysing a large number of isolates using the proposed protocol, the robustness of the assay can be tested, along with its ability to consistently produce profiles from all strains and generate data that are epidemiologically relevant and easy to analyse. The strains used for the validation should include well-defined sets of both outbreak-associated isolates and sporadic isolates. The outbreak-associated isolates should also include 20 to 30 isolates from the same outbreak and ideally from multiple outbreaks of different types (monoclonal vs polyclonal, short lasting vs long lasting). Multiple isolates obtained through serial passaging of the same strain may also be included to test the reproducibility of the method and in vitro stability of the loci. If desired, the sporadic isolates and one representative from each outbreak can be used to calculate the diversity index for the method [29]. If the protocol is intended for global use, geographically representative isolates from around the globe should be included in the validation set.
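As an illustration of the diversity index mentioned above, the following Python sketch computes Simpson's index of diversity, which is widely used to express the discriminatory power of typing methods. That this is the index intended by reference [29] is an assumption, and the profiles in the example are invented.

```python
from collections import Counter


def simpsons_diversity(profiles):
    """Simpson's index of diversity: D = 1 - sum(n_j*(n_j-1)) / (N*(N-1)).

    profiles: one MLVA profile string per epidemiologically unrelated
              isolate (sporadic isolates plus one representative per outbreak)
    """
    total = len(profiles)
    if total < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(profiles)  # number of isolates sharing each profile
    same_type_pairs = sum(n * (n - 1) for n in counts.values())
    return 1 - same_type_pairs / (total * (total - 1))


# Hypothetical panel of four unrelated isolates, two sharing a profile
panel = ["3-8-13", "3-8-13", "3-9-13", "4-8-12"]
print(round(simpsons_diversity(panel), 3))
```

The index is the probability that two isolates drawn at random from the panel have different profiles, so values close to 1 indicate high discriminatory power.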
Data generated with the proposed MLVA method should be compared with the epidemiological data in order to determine concurrence. Comparisons with the gold-standard method should also be made, if a gold standard exists for the target organism. In order to determine the technical performance, the protocol should be tested using multiple different equipment brands (thermocyclers, capillary electrophoresis instruments), different lots of reagents and by multiple individuals. All null alleles (= no amplification) should be confirmed using singleplex PCR reactions in order to rule out suboptimal multiplex conditions as a cause for amplification failure.

Box 3
Internal validation of an MLVA prototype protocol

Calibration set and allele nomenclature
Inter-laboratory comparability, as mentioned before, is of critical importance if the subtyping method is to be used for international surveillance. Determining the number of repeats using different detection platforms without sequencing all amplicons is not reliable because the use of different reagents, chemistries and detection platforms may yield slightly but sufficiently different fragment sizing results to hamper inter-laboratory comparisons [30,31]. Using different primers for amplification of the same loci will also invariably lead to lack of comparability of results generated in different laboratories. We propose to solve this problem by introducing an organism-specific set of strains with well-characterised copy numbers at each locus that each laboratory implementing the method may use to calibrate the output of the protocol and detection platform they use (Boxes 4 and 5).
These strain calibration sets should be created both for existing MLVA protocols and for those developed in the future. The validation of such a calibration set for use with S. Typhimurium protocols is described in this issue of Eurosurveillance [32]. Each laboratory will use the calibration set to create a correlation table between the sequenced copy number and the observed fragment size for each allele at each locus using their preferred protocol and fragment-sizing platform. This way, the same allele type will always be assigned to the same fragment regardless of the primer sequences, reagents or capillary electrophoresis platform used to generate and size the fragment. The calibration should be repeated each time a laboratory changes any parameter in its MLVA set-up, such as using a different fluorescent dye for a primer or different type of polymer for capillary electrophoresis. The calibration set should cover representative alleles for all loci included in the new protocol, and in the case of the existing protocols, for those loci that overlap between the protocols that are already widely used. All VNTR loci should be sequenced for all isolates included in the calibration set in order to determine the actual copy number. All alleles should be included in the calibration set if the VNTR locus contains four or fewer alleles. If the VNTR locus contains five or more alleles it is proposed that at least the smallest and the largest alleles and every third allele in between should be included in the calibration set. All new alleles with unexpected fragment sizes (fragment sizes that do not fall within predicted sizes for new alleles based on the calibration set) must be sequenced, and, if needed, the calibration set should be amended.
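The calibration approach described above can be sketched in Python as two steps: choosing which alleles to include in the calibration set, and building a per-locus correlation table that maps observed fragment sizes to sequenced copy numbers. The function names, the linear interpolation between calibrated alleles, the 1 bp tolerance and the example sizes are all assumptions made for illustration, and "every third allele" is read here as every third entry of the sorted allele list.

```python
def select_calibration_alleles(alleles):
    """All alleles if four or fewer; otherwise the smallest, the largest
    and every third allele in between."""
    ordered = sorted(set(alleles))
    if len(ordered) <= 4:
        return ordered
    chosen = ordered[::3]
    if chosen[-1] != ordered[-1]:
        chosen.append(ordered[-1])
    return chosen


def correlation_table(calibration):
    """Expected fragment size for every copy number between the calibrated
    alleles. calibration: {sequenced copy number: observed size in bp}."""
    points = sorted(calibration.items())
    table = dict(points)
    for (c0, s0), (c1, s1) in zip(points, points[1:]):
        for copies in range(c0 + 1, c1):
            # Assumed linear interpolation between neighbouring calibrators
            table[copies] = s0 + (s1 - s0) * (copies - c0) / (c1 - c0)
    return table


def assign_allele(observed_bp, table, tolerance_bp=1.0):
    """Copy number whose expected size is closest to the observation, or
    None if nothing is within tolerance (such a fragment needs sequencing)."""
    copies, size = min(table.items(), key=lambda kv: abs(kv[1] - observed_bp))
    return copies if abs(size - observed_bp) <= tolerance_bp else None


# Hypothetical locus calibrated with alleles 2, 5 and 8 on one platform
table = correlation_table({2: 131.2, 5: 149.0, 8: 166.8})
print(assign_allele(143.3, table), assign_allele(152.0, table))
```

A result of `None` corresponds to an unexpected fragment size, which according to the text must be sequenced and may lead to an amended calibration set.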
If multiple peaks are detected in the same locus, the PCR needs to be repeated using a fresh DNA template made from a culture derived from a single colony in order to exclude the possibility of contamination, since this is the most common explanation for this phenomenon. If contamination is not the cause of the problem and the result with multiple peaks is reproducible, with the same peak always having the highest fluorescence intensity, then the allele type should be designated based on the most intense peak and the other peaks should be ignored if the locus cannot be excluded from the assay. If upon repeating the PCR the same peak does not always present with the highest fluorescence intensity, 10 colony picks should be tested from the culture. In this case, the allele type should be assigned based on the peak that has the highest fluorescence intensity in the majority of the colony picks.
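The rule for resolving loci with multiple peaks by testing colony picks can be written as a short Python sketch. The data layout (one dictionary of allele to fluorescence intensity per colony pick) and the function names are hypothetical, and "majority" is interpreted here as more than half of the picks.

```python
from collections import Counter


def most_intense(peaks):
    """Allele with the highest fluorescence intensity in one PCR result.
    peaks: {allele: intensity}"""
    return max(peaks, key=peaks.get)


def allele_from_picks(picks):
    """Allele that is the most intense peak in the majority of colony
    picks, or None if no allele reaches a majority."""
    winners = Counter(most_intense(peaks) for peaks in picks)
    allele, count = winners.most_common(1)[0]
    return allele if count > len(picks) / 2 else None


# Hypothetical results for 10 colony picks at one locus
picks = [{13: 4200, 14: 3100}] * 7 + [{13: 2900, 14: 3600}] * 3
print(allele_from_picks(picks))
```

Returning `None` when no allele wins a majority leaves that case to the analyst, since the text does not specify how it should be handled.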

Box 4
Proposed standardised allele nomenclature and reporting of allele profiles for an MLVA protocol
Proposed standardised allele nomenclature for homogeneous VNTRs
• The allele name is the actual sequenced copy number
• Incomplete repeats: the copy number rounded down to the nearest complete copy number
• Null alleles: the designated allele type '−2.0'
• VNTR array missing, but the flanking region with the primer-annealing sequences present and amplifies: the designated allele type '0'
Proposed standardised allele nomenclature for heterogeneous VNTRs
• Inclusion of loci with heterogeneous repeat units is discouraged in new protocols
• Some existing protocols include heterogeneous loci, such as the locus STTR3 in the Salmonella enterica serovar Typhimurium protocol by Lindstedt et al. [19]. STTR3 consists of 27 bp and 33 bp repeat units.
• Allele type should indicate copy numbers of all different length repeat units.
  o Example: for STTR3, the allele type 0208 corresponds to two copies of the 27 bp repeat unit and eight copies of the 33 bp repeat unit [36].
Proposed standardised reporting of allele profiles
• New protocols: reported in the order the loci are located in the genome. Loci located on plasmids reported last.
• Existing protocols: the currently most widely accepted reporting order for loci will be continued.
  o Example: the S. Typhimurium MLVA profile reported in the locus order STTR9-STTR5-STTR6-STTR10-STTR3: 3-8-13-14-0411
bp: base pair; MLVA: multiple-locus variable-number of tandem-repeats analysis; VNTR: variable-number tandem repeat.
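The nomenclature rules in Box 4 can be illustrated with a minimal Python sketch. The function names are hypothetical, the two-digit zero-padding for heterogeneous loci is inferred from the examples 0208 and 0411, and an ASCII hyphen-minus stands in for the minus sign of the null allele.

```python
import math

NULL_ALLELE = "-2.0"  # designated allele type when nothing amplifies


def homogeneous_allele(copies, amplified=True):
    """Allele name for a homogeneous VNTR locus.

    copies:    sequenced copy number; may be fractional for incomplete
               repeats, or 0 if the array is missing but the flanks amplify
    amplified: False for a null allele (no amplification)
    """
    if not amplified:
        return NULL_ALLELE
    return str(math.floor(copies))  # incomplete repeats are rounded down


def heterogeneous_allele(*copies_per_unit):
    """Allele name listing the copy number of each repeat-unit length,
    assumed to be written as two digits each (e.g. 2 and 8 -> '0208')."""
    return "".join(f"{copies:02d}" for copies in copies_per_unit)


def profile(alleles):
    """Join allele names in the agreed locus order."""
    return "-".join(alleles)


# Locus order STTR9-STTR5-STTR6-STTR10-STTR3, as in the Box 4 example
alleles = [homogeneous_allele(c) for c in (3, 8, 13, 14)]
alleles.append(heterogeneous_allele(4, 11))
print(profile(alleles))
```

Keeping allele names as strings avoids losing the leading zero of heterogeneous alleles and lets the null allele be stored alongside ordinary copy numbers.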

Box 5
Calibration strain set for developing an MLVA protocol

External validation
When the method has passed the internal validation, it needs to be validated by the future external users. The purpose of external validation is to determine the robustness and performance of the methodology and thereby the feasibility of implementing it in multiple laboratories of end users (Box 6).
It is important that results from different laboratories in diverse geographical locations and with different skill levels are compatible and reproducible for international surveillance and outbreak detection and investigations. It is expected that different laboratories may use reagents from different suppliers. Often equipment in different laboratories is made by different manufacturers or different models from the same manufacturer are used. Although MLVA results are less prone to variability arising from subjective interpretation by trained laboratory staff, it is nevertheless important to take proficiency of data interpretation into consideration. In particular, the consistency of person-to-person interpretation of partial repeats and null alleles should be assessed, as should unpredicted results. In order to maintain consistency of results over time, quality assurance processes should also be considered after the external validation.
In selecting suitable laboratories to participate in the external validation, a survey containing questions in regard to testing capacity could be distributed to reference laboratories that have been performing PFGE or other molecular typing methods for cluster detection. Such a survey will also explore the global interest in using the method.
The aim of inter-laboratory comparison is to determine the variability of the results obtained by different laboratories using identical samples. Six to eight laboratories should be selected from different geographical locations that may have different endemic or outbreak strains with profiles determined using the gold-standard method and have the capacity to perform MLVA. These laboratories should cover the range of equipment platforms (including different manufacturers, models and analytical software) and reagents from different suppliers. It is preferable that the participating laboratories have trained microbiologists available who are knowledgeable in capillary electrophoresis for troubleshooting and interpretation of results.
The selected laboratories should initially test the calibration set of strains using the same procedures that have been internally validated to create the calibration table for standardised reporting. In addition, for comparing inter-laboratory compatibility, each laboratory needs to subtype a blinded set of at least 20 well-characterised strains supplied by the organising laboratory and covering the full spectrum of alleles at all loci, including alleles that are not present in the calibration set. The results from all the participating laboratories should be distributed and shared by the organising laboratory. The concordance is calculated for the study overall and for each individual laboratory. Discordant results must be resolved and recommendations on corrective actions to improve concordance be made. These corrective actions should be provided to future participants as part of quality assurance of the method. If the concordance was poor initially (discordant results generated for more than 5% of the isolates in more than 20% of the participating laboratories), the external validation may need to be repeated with any corrections to the protocol.
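The concordance calculation and the criterion for poor initial concordance (discordant results for more than 5% of the isolates in more than 20% of the participating laboratories) can be sketched in Python as follows. The data layout, the function names and the example laboratories are hypothetical, and a missing result is counted as discordant by assumption.

```python
def discordance_by_lab(reference, submissions):
    """Fraction of isolates with a discordant profile, per laboratory.

    reference:   {isolate: expected profile} from the organising laboratory
    submissions: {laboratory: {isolate: reported profile}}
    """
    return {
        lab: sum(reported.get(iso) != prof for iso, prof in reference.items())
        / len(reference)
        for lab, reported in submissions.items()
    }


def overall_concordance(reference, submissions):
    """Fraction of all reported results that match the reference."""
    rates = discordance_by_lab(reference, submissions)
    return 1 - sum(rates.values()) / len(rates)


def concordance_is_poor(rates, isolate_limit=0.05, lab_limit=0.20):
    """True if more than lab_limit of the laboratories have more than
    isolate_limit of their isolates discordant."""
    failing = sum(rate > isolate_limit for rate in rates.values())
    return failing / len(rates) > lab_limit


# Hypothetical blinded panel of 20 isolates tested by five laboratories
reference = {f"iso{i}": "3-8-13" for i in range(20)}
labs = {name: dict(reference) for name in "ABCDE"}
labs["B"]["iso0"] = labs["B"]["iso1"] = "3-9-13"  # two discordant results
rates = discordance_by_lab(reference, labs)
print(rates["B"], concordance_is_poor(rates))
```

In this example one laboratory out of five exceeds the 5% limit, which is exactly 20% of the laboratories and therefore does not meet the "more than 20%" criterion.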
When good concordance has been achieved between the laboratories, each participant should test additional strains selected from its own culture collection that has been well characterised, ideally using the same gold-standard method, typically PFGE. These strains should be from diverse locations and epidemiological backgrounds. The number of strains will typically be between 50 and 100, depending on the diversity of the target organism. This panel should be well defined to evaluate typeability, i.e. the ability to amplify each locus, the discriminatory power and epidemiological concordance of the method [21]. It must include strains from human and non-human sources, and contain a mix of epidemiologically unrelated and related isolates. The MLVA testing should be evaluated for these criteria in comparison with the gold standard, if such a method exists.
If new alleles are encountered during the external validation, strains with these alleles should be shared with the developing laboratory for confirmation by sequencing. If necessary, the calibration set should be revised to ensure that the copy number of the new alleles can be determined reliably. The external validation laboratories should also test the strains thus added to the calibration set, to update their correlation tables.

Box 6
External validation of an MLVA prototype protocol

Quality assurance
The final step before an MLVA protocol may be implemented in routine surveillance in multiple laboratories is the establishment of a quality assurance programme for future users (Box 7). Quality assurance is divided into internal and external sections.
Internal quality assurance includes the use of appropriate controls for PCR and fragment analysis, quality control of new primer lots, maintenance and calibration of instruments, such as thermocyclers and pipettors, and appropriate record keeping for monitoring reagent lots, instrument performance and run-to-run accuracy of sizing. An internal training programme should be in place as part of the human resource succession or continuity plan and for surge capacity. Newly trained personnel should be assessed for proficiency prior to assuming routine testing and then assessed annually internally. Each laboratory should also participate in external quality assurance (EQA), if available.
EQA includes initial and annual quality checks performed by a laboratory/institute that has agreed to serve as a coordinating quality assurance body for the protocol in question. When a protocol is used in an international surveillance network such as PulseNet, new participants are certified for the laboratory procedure and the correct data analysis and reporting of the results for a limited set of well-characterised strains as part of the initial quality check. Once certified, each laboratory needs to pass a proficiency test at least annually to keep its certification status [22]. Valid certification is required from each laboratory in order to be able to upload data to the PulseNet databases. In PulseNet International, the coordinating laboratory in each region is responsible for the EQA in its respective region and the US CDC performs the EQA for the coordinating laboratories. ECDC has funded an external voluntary EQA scheme for MLVA of S. Typhimurium for the public health laboratories in the European Union and European Economic Area countries. This is a new quality assessment scheme in Europe that does not provide a formal certification status but serves as a 'self-check' for the participants. The first results are expected to be available in 2013.
The developing laboratory typically selects a set of strains to be used for certification and proficiency testing. The number of strains used for certification of new users and proficiency testing of current users depends on the clonality of the organism. PulseNet US's certification sets for MLVA include eight isolates, and proficiency testing is performed by testing only a single isolate in the same test run with each laboratory's routine isolates. The generated data are evaluated not only for correct patterns but also for the overall quality of data, e.g. non-specific peaks, primer-dimers and optimisation of PCR reactions.
Successful implementation of a new MLVA protocol may be facilitated through training of new users. This training needs to include the use of the detection platform the participants will use in their own laboratory, to make them familiar with the protocol in a setting as close as possible to the one they will use in the future.

Concluding remarks
It is our hope that the guidelines and recommendations presented here will help solve some of the problems hampering the inter-laboratory comparisons of MLVA subtyping results, provide clarification of the relationships between the multiple protocols currently available for STEC O157, S. Enteritidis and S. Typhimurium, and facilitate the development and validation of new MLVA protocols for organisms not covered by currently available protocols.

Box 7
Quality assurance and proficiency testing of an MLVA prototype protocol
Quality assurance
• Purpose: to ensure consistent high quality of the results generated
• Control strains should be included for PCR and fragment analysis in each run
• Multiple reference strains should be run as a quality control check when new primer lots are introduced or after any major maintenance or repair of the instrument
• Records of reagent lots and accuracy of fragment sizing for control strains should be maintained for each run
• An internal training programme should be in place for new personnel
Proficiency testing
• If available, participation in an external quality assurance programme is mandatory
• Newly trained personnel must pass an initial test for proficiency and be tested annually thereafter
• Assessment of proficiency includes generation of correct allele profiles and overall quality of data, e.g. presence of non-specific peaks, primer-dimers and other PCR artifacts
MLVA: multiple-locus variable-number of tandem-repeats analysis; PCR: polymerase chain reaction.