Meningococcal vaccine antigen diversity in global databases

The lack of an anti-capsular vaccine against serogroup B meningococcal disease has necessitated the exploration of alternative vaccine candidates, mostly proteins exhibiting varying degrees of antigenic variation. Analysis of variants of antigen-encoding genes is facilitated by publicly accessible online sequence repositories, such as the Neisseria PubMLST database and the associated Meningitis Research Foundation Meningococcus Genome Library (MRF-MGL). We investigated six proposed meningococcal vaccine formulations by deducing the prevalence of their components in the isolates represented in these repositories. Despite high diversity, a limited number of antigenic variants of each of the vaccine antigens were prevalent, with strong associations of particular variant combinations with given serogroups and genotypes. In the MRF-MGL and globally, the highest levels of identical sequences were observed with multicomponent/multivariant vaccines. Our analyses further demonstrated that certain combinations of antigen variants were prevalent over periods of decades in widely differing locations, indicating that vaccine formulations containing a judicious choice of antigen variants have potential for long-term protection across geographic regions. The data further indicated that formulations with multiple variants would be especially relevant at times of low disease incidence, as relative diversity was higher. Continued surveillance is required to monitor the changing prevalence of these vaccine antigens.


Introduction
Neisseria meningitidis, the meningococcus, a Gram-negative diplococcus, is a globally important causative agent of meningitis and septicaemia (severe sepsis), accounting for a significant amount of morbidity and mortality worldwide. However, it is frequently carried harmlessly in the human nasopharynx and can be considered part of the normal human commensal microbiota. Currently, no comprehensive vaccine exists against meningococcal disease, due in large part to the structural similarity of serogroup B polysaccharide to polysaccharides associated with the human neural cell adhesion molecule (NCAM). This is thought to account for the poor human immune response against group B polysaccharide and also raises safety concerns [1]. Many subcapsular vaccine antigen candidates, especially proteins, have therefore been investigated, with the intention of producing serogroup B substitute formulations. Several such antigens have been incorporated into vaccine formulations that are in various stages of development.
First developed in the 1980s, outer membrane vesicle (OMV) vaccines were created to counter higher levels of disease incidence caused by particular serogroup B meningococci. These OMV vaccines contained the respective epidemic antigen variants of the outer membrane protein PorA of these meningococci as major immunogens and were successfully deployed in Norway (MenBvac), Cuba (VA-MENGOC-BC), and New Zealand (MeNZB) [2][3][4]. In the last decade, however, many high-income regions such as Europe and North America have experienced a period of relatively low incidence of serogroup B meningococcal disease [5,6]. In such periods, when disease incidence is lower but caused by more diverse meningococci, vaccines should ideally contain several components in order to attain the widest possible strain coverage. An example is the proposed NonaMen (RIVM, the Netherlands) vaccine formulation comprising nine PorA variants corresponding to the most prevalent disease-associated strains [7]. Alternatively, Bexsero, developed by Novartis, is a supplemented OMV vaccine which contains four components: PorA P1.7-2,4; fHbp subvariant 1.1; NHBA variant 2; and the NadA-3.8 subvariant [8]. This was licensed in Europe in 2013 and in the United States (US) in 2015 and has been included in the infant immunisation schedule in the United Kingdom (UK) since September 2015. The rLP2086 vaccine, Trumenba, developed by Pfizer, which was licensed by the Food and Drug Administration in the US in 2014, is a bivalent recombinant vaccine based on two fHbp antigens from subfamilies A and B (subvariants 3.45 and 1.55, respectively) [9].
Over the past two decades, sequence-based molecular typing has become an intrinsic part of meningococcal disease surveillance, and standardised typing methods and schemes have allowed for more comparability across reference and research laboratories in different countries [10][11][12]. For example, the European surveillance system (TESSy) of the European Centre for Disease Prevention and Control (ECDC) (http://www.ecdc.europa.eu/en/activities/surveillance/Tessy) and the European Meningococcal Epidemiology in Real Time (EMERT) database (http://emgm.eu/emert/) include two typing antigens which are also vaccine candidates, PorA and FetA. Following the advent of whole genome sequencing (WGS) and its rapidly reducing costs, comprehensive investigations of the likely and actual impact of available or potential interventions may be made more easily. Publicly accessible online resources such as the Neisseria PubMLST database (http://pubmlst.org/neisseria/) and the Meningitis Research Foundation Meningococcus Genome Library (MRF-MGL) (http://www.meningitis.org/research/genome), which contain molecular typing information from single genes up to many hundreds, and for many thousands of isolates, allow fine-scaled analyses, including investigation of the distribution of vaccine components. In this study, the PubMLST and MRF-MGL databases were used to investigate the distribution of the vaccine components of Bexsero, Trumenba, NonaMen, MenBvac, MeNZB and VA-MENGOC-BC.

Methods
This study made use of the public Neisseria PubMLST database (http://pubmlst.org/neisseria/) and the MRF-MGL (http://www.meningitis.org/research/genome), which is hosted within it. The databases were accessed in August 2014. The MRF-MGL contained 1,344 N. meningitidis isolates, all from England and Wales, covering all culture-confirmed cases of invasive meningococcal disease (IMD) from the epidemiological years 2010/11 to 2012/13. From the PubMLST database, we included 1,717 N. meningitidis isolates which had assembled sequence data of at least 0.5 Mbp. This is the minimum amount of assembled sequence data that allows as complete an analysis of vaccine antigen distribution as possible.
The presence of components of each of the fHbp-containing vaccines was analysed in the collections [8,9]. Before the development of a unified nomenclature scheme in which each unique allele is assigned a unique numerical identifier [13], separate schemes were developed which divided fHbp into either two subfamilies (subfamily A and B) or three variant families (variant families 1, 2 and 3), depending on the nomenclature system [14,15]. These schemes can be cross-referenced online (http://pubmlst.org/neisseria/fHbp/). Briefly, subfamily B is equivalent to variant family 1, and subfamily A incorporates both variant families 2 and 3. Peptides are then numbered with the variant family/subfamily name, e.g. fHbp 1.1 is variant family 1 peptide 1 [13]. This is the fHbp nomenclature used throughout this paper.
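The cross-reference between the two fHbp naming schemes can be illustrated with a short sketch. It encodes only the mapping stated above (subfamily B is variant family 1; subfamily A covers variant families 2 and 3) and the "family.peptide" label form; the names `FHbpPeptide` and `parse_fhbp` are hypothetical and are not part of PubMLST or of the original analysis.

```python
from typing import NamedTuple

# Mapping stated in the text: subfamily B = variant family 1,
# subfamily A = variant families 2 and 3.
VARIANT_FAMILY_TO_SUBFAMILY = {1: "B", 2: "A", 3: "A"}


class FHbpPeptide(NamedTuple):
    """Hypothetical record for one fHbp peptide label such as '1.55'."""
    variant_family: int
    peptide_id: int
    subfamily: str


def parse_fhbp(label: str) -> FHbpPeptide:
    """Split a 'family.peptide' label and attach the equivalent subfamily."""
    family_text, peptide_text = label.split(".", 1)
    family = int(family_text)
    if family not in VARIANT_FAMILY_TO_SUBFAMILY:
        raise ValueError(f"unknown fHbp variant family: {family}")
    return FHbpPeptide(family, int(peptide_text), VARIANT_FAMILY_TO_SUBFAMILY[family])
```

Under these assumptions, `parse_fhbp("3.45")` and `parse_fhbp("1.55")` place the two Trumenba antigens in subfamilies A and B respectively, in agreement with the description in the Introduction.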
Invasive meningococcal disease isolates from England and Wales from epidemiological years 2010/11 to 2012/13 inclusive of all genogroups and genogroup B only.
For the purposes of this analysis, it was assumed that such meningococci either had the capacity to express a capsule or had a very closely related ancestor which could [18]. It should be noted that OMV vaccines include other potentially immunogenic proteins not assessed here, although WGS allows for such analyses.
Simpson's index of diversity (D) was used to determine the diversity of each Bexsero vaccine antigen by age group in the MRF-MGL. The value of the index ranged from 0 to 1, with values nearer to 1 indicating greater diversity. Calculation of the index was performed as described previously [19]. The 95% confidence intervals (CIs) for the index were calculated as described previously [20].
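As an illustration of the diversity calculation, the sketch below computes Simpson's index and an approximate 95% CI from a list of antigen variant labels. The exact formulas of references [19] and [20] are not reproduced in this paper, so the sketch assumes the widely used finite-sample form D = 1 - Σ n_j(n_j - 1) / (N(N - 1)) and a large-sample variance approximation σ² = (4/N)[Σ π_j³ - (Σ π_j²)²] with CI = D ± 2σ; the function name and input layout are hypothetical.

```python
from collections import Counter
from math import sqrt


def simpsons_diversity(variants):
    """Return (D, ci_lower, ci_upper) for a list of variant labels.

    Assumed formulas (not taken verbatim from the paper):
      D      = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1))
      sigma2 = (4 / N) * (sum(p_j**3) - sum(p_j**2)**2)
      CI     = D +/- 2 * sqrt(sigma2), clipped to [0, 1]
    """
    counts = Counter(variants)
    n_total = sum(counts.values())
    if n_total < 2:
        raise ValueError("at least two isolates are needed")

    # Probability that two isolates drawn without replacement differ.
    same_pairs = sum(c * (c - 1) for c in counts.values())
    d = 1.0 - same_pairs / (n_total * (n_total - 1))

    # Large-sample variance approximation based on variant frequencies.
    freqs = [c / n_total for c in counts.values()]
    sum_sq = sum(p ** 2 for p in freqs)
    sum_cube = sum(p ** 3 for p in freqs)
    variance = (4.0 / n_total) * (sum_cube - sum_sq ** 2)
    half_width = 2.0 * sqrt(max(variance, 0.0))

    return d, max(0.0, d - half_width), min(1.0, d + half_width)
```

Applied per age group, the function would take, for example, the list of fHbp peptide labels of all isolates in that group; a collection in which every isolate carries the same variant gives D = 0, and one in which every isolate differs gives D = 1.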

Cross-protection among fHbp variant family 1 and NadA-1, NadA-2 and NadA-3 variant family members has been described [22,23], prompting an analysis of their distribution. There were 873 fHbp variant family 1 peptide variants and these were found across the whole time span of the collection (77 years) from 1937 to 2014 and on all continents. Of these, 349 (40.0%) were ST-11cc, 98 (11.2%) were ST-41/44cc, and unassigned sequence types (STs) accounted for 92 (7.4%) of the isolates. Of the 873, 241 were serogroup W, 159 were serogroup B, 113 were serogroup C and 76 were serogroup A. There were 709 isolates that were NadA-1, 2 and 3 variant family members. These spanned 51 years of the collection from 1963 to 2014 and were found on all continents. The majority were ST-11cc (n = 568; 80.1%), 368 were serogroup W and 135 were serogroup C.

Discussion
Since its introduction in the 1990s, sequence-based molecular typing has established a role in the clinical microbiology laboratory, replacing or complementing existing phenotypic typing methods. WGS is the latest sequencing technology and, as costs continue to decrease, will become more commonplace in clinical and reference laboratories [24][25][26]. WGS provides definitive sequence-level resolution with widespread applications including molecular epidemiology, surveillance, vaccine design and vaccine implementation monitoring [27]. To be useable by physicians and public health specialists, databases will need to use uniform nomenclature and be interoperable and compatible with other databases such as those that contain phenotypic information [28].
As WGS databases such as PubMLST are generic and scalable, they enable detailed deduction of potential coverage and preliminary assessment of the impact of given vaccine formulations on the meningococcal population, thus informing further work such as phenotypic assays [29]. This requires the assembly of representative collections of meningococcal isolate genomes. The MRF-MGL is an exemplar of a representative, contemporary, curated, publicly accessible database containing many hundreds of genomes and was expressly established as a resource for the meningococcal research and public health communities. It is embedded within the PubMLST database, which has been running for many years as a community resource and repository for isolate and characterisation data.
The MRF-MGL is the most comprehensive epidemiological sample of meningococcal WGS currently available, allowing an assessment of vaccine antigen distribution among disease cases.
Molecular epidemiology using phenotypic or genotypic data has been used to inform vaccine design: tailor-made OMV vaccines were designed to contain the respective outbreak strain PorA variant [2][3][4]; and the broad-spectrum multivalent HexaMen/NonaMen formulation was based on the most prevalent PorA serosubtypes documented in the Netherlands at the time of its development [30]. One of the earliest uses of genome data in vaccine design was in the discovery of novel meningococcal vaccine candidates by 'reverse vaccinology' based on a single isolate genome [31].
Several of these genome-derived antigens are components of vaccines at various stages of clinical development and deployment at the time of writing [8,9].
The temporal and geographic spread of antigen variants and combinations of variants, with some existing for long time periods and across several continents, demonstrated the stability of antigen-clonal complex relationships and therefore the potential longevity of appropriate vaccine formulations [32,33]. Previous work using the PubMLST database collection [32] demonstrated the longevity of PorA VR types and their association with clonal complex, which we here extend to other antigens available from WGS. For example, the most frequent strain type (PorA:FetA:cc) P1.5,2:F3-6:cc11 had a minimum lifespan of 49 years and was found on three continents (Europe, North America and South America) [32]. In the present analysis, multicomponent vaccines exhibited more potential to protect against isolates represented in the MRF-MGL than vaccine formulations containing one or a few components, although this did not take into account any potential cross-protection that may be offered by any particular vaccine antigen. Europe and North America are experiencing low rates of meningococcal disease at present [5,6], and in the absence of a dominant epidemic clone accounting for disease, a multicomponent vaccine formulation would be required to cover most disease [34,35]. A differential distribution of clonal complexes and antigens has been demonstrated among different age groups, and those at highest risk (infants under one year of age) are less at risk from the lineages that affect the next peak of disease incidence (late adolescents); multicomponent vaccines are therefore likely to be most appropriate, especially in a period of low incidence [36]. Comprehensive molecular epidemiology and surveillance are thus required in order to maximise the coverage of a given vaccine formulation. Continuous surveillance will be required to track changes in epidemiology that may necessitate vaccine reformulation.
While genotypic data can provide valuable information on the potential utility of vaccines, the evaluation of antigen expression and potential cross-reactivity is fundamental to gauging the actual success of a given formulation. Assays have been developed and expression studies carried out that attempt to predict the coverage of various meningococcal vaccine antigens in the population [14,37]. One assay, the ELISA-based meningococcal antigen typing system (MATS), was developed to predict the strain coverage of the Bexsero vaccine [37]. Based on a panel of invasive serogroup B meningococcal isolates from several European countries, the vaccine was estimated to protect against 78% of serogroup B cases; for a panel of invasive serogroup B isolates from Greece, the estimate was up to 90% [38,39]. One of the features of the PubMLST database is that phenotypic information such as MATS data can be added to isolate or allele data, so that phenotypic and genotypic information may be associated, allowing further analyses.

Table 3
Characteristics of Bexsero and Trumenba antigens in PubMLST Neisseria database

Conclusion
Highly variable pathogens require detailed characterisation to appropriately tailor clinical and public health responses such as treatment, immunisation, outbreak control, and novel vaccine design. This is especially true for organisms such as meningococci, in particular those that express the serogroup B polysaccharide, given that a universal capsular vaccine is unavailable.
Well-characterised isolate collections can easily be investigated for any number of vaccine formulations and vaccine candidates when they are housed within databases embedded with analysis tools which can handle phenotype and genotype data including WGS. This high level of characterisation and molecular epidemiology provides a foundation for further phenotypic analyses so that a fuller picture of potential vaccine effectiveness can be seen. Detailed characterisation and monitoring are particularly relevant in periods of low incidence, such as experienced in high-income regions at present, as multivalent vaccines may be most appropriate and also most adaptable should changes in the meningococcal population occur. This rationale for vaccine formulation using molecular epidemiology may be applied to any pathogen and will become more readily applicable as well-characterised datasets like the MRF-MGL and PubMLST become increasingly available. A combination of detailed genotypic characterisation and phenotypic investigation offers the best hope of producing vaccines with the widest possible coverage.

Figure 1
Distribution of clonal complexes in meningococcal disease isolates in the MRF-MGL collection with exact peptide matches to at least one antigen of various vaccines, England and Wales, 2010/11-2012/13 (n = 1,344)

Figure 2
Percentage distribution of age groups in meningococcal disease isolates in the MRF-MGL collection with exact peptide matches to at least one antigen of various vaccines, England and Wales, 2010/11-2012/13 (n = 1,344)

The recombinant vaccine Trumenba contains two fHbp antigen subvariants: peptide 45 (Pfizer nomenclature A05, subfamily A/variant family 3) and peptide 55 (Pfizer nomenclature B01, subfamily B/variant family 1). As well as Bexsero and Trumenba, other vaccine formulations analysed were: NonaMen, which contains nine ...
... complexes and patient age groups in the MRF-MGL was based on an exact match of deduced peptide sequences ('sequence match') to at least one component of each vaccine formulation investigated. The analysis was carried out on isolates of all genogroups (organisms with a cps region, encoding a capsule) and genogroup B isolates only (those containing a cps region encoding the group B polysaccharide capsule).
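The 'sequence match' criterion, an exact match of a deduced peptide to at least one component of a vaccine formulation, can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the isolate record layout (a dictionary keyed by antigen name), the variant label strings and the function names are assumptions, and only the Bexsero and Trumenba components named in the text are listed.

```python
# Vaccine components as named in the text; the label strings and the
# isolate record layout are hypothetical choices for this sketch.
VACCINES = {
    "Bexsero": {("PorA", "P1.7-2,4"), ("fHbp", "1.1"), ("NHBA", "2"), ("NadA", "3.8")},
    "Trumenba": {("fHbp", "3.45"), ("fHbp", "1.55")},
}


def matches_vaccine(isolate, components):
    """True if the isolate carries an exact match to at least one component."""
    return any(isolate.get(antigen) == variant for antigen, variant in components)


def sequence_match_prevalence(isolates, vaccines=VACCINES):
    """Proportion of isolates with a sequence match, per vaccine formulation."""
    if not isolates:
        raise ValueError("no isolates supplied")
    total = len(isolates)
    return {
        name: sum(matches_vaccine(isolate, components) for isolate in isolates) / total
        for name, components in vaccines.items()
    }
```

Restricting the input list to genogroup B isolates, or grouping it by clonal complex or age group before calling `sequence_match_prevalence`, would give the kind of stratified proportions summarised in the figures. Potential cross-protection between related variants is deliberately not modelled here.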