
Eurosurveillance, Volume 14, Issue 21, 28 May 2009
Rapid communications
Cluster analysis of the origins of the new influenza A(H1N1) virus
  1. Physics Department, Princeton University, Princeton, United States
  2. Center for Infection and Immunity, Mailman School of Public Health, Columbia University, New York, United States
  3. Department of Biomedical Informatics, Center for Computational Biology and Bioinformatics, Columbia University College of Physicians and Surgeons, New York, United States

Citation style for this article: Solovyov A, Palacios G, Briese T, Lipkin WI, Rabadan R. Cluster analysis of the origins of the new influenza A(H1N1) virus. Euro Surveill. 2009;14(21):pii=19224. Available online: http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=19224
Date of submission: 27 May 2009

In March and April 2009, a new strain of influenza A(H1N1) virus was isolated in Mexico and the United States. Since the initial reports, more than 10,000 cases from around the world have been reported to the World Health Organization. Several hundred isolates have already been sequenced and deposited in public databases. We have studied the genetics of the new strain and identified its closest relatives through a cluster analysis approach. We show that the new virus combines genetic information related to different swine influenza viruses. Segments PB2, PB1, PA, HA, NP and NS are related to swine H1N2 and H3N2 influenza viruses isolated in North America. Segments NA and M are related to swine influenza viruses isolated in Eurasia.


Introduction

Influenza A virus is a single-stranded RNA virus with a segmented genome. When different influenza viruses co-infect the same cell, progeny viruses can be released that contain a novel mix of segments from both parental viruses. Since the first reported pandemic in 1918, there have been two other pandemics in the 20th century. In both cases, the pandemic strains presented a novel reassortment of genome segments derived from human and avian viruses [1-3]. The origins of the 1918 strain are less clear, although different analyses suggest that this virus had an avian origin [4,5].

When and where pandemic reassortments happen remains a mystery. Avian viruses often undergo reassortment events among different subtypes. Several reports suggest that reassortments are also frequent between human viruses [6,7]. Swine have frequently been found with co-infections, and reassortment of swine, human, and avian viruses has been reported [3,8-10]. In addition, cell surface oligosaccharide receptors of the swine trachea present both an N-acetylneuraminic acid-alpha2,3-galactose (NeuAcalpha2,3Gal) linkage, preferred by most avian influenza viruses, and a NeuAcalpha2,6Gal linkage, preferred by human viruses [11]. Co-infection, combined with co-habitation of swine and poultry on small family farms all over Asia and the presence of avian as well as human receptor types in pigs, has led to the “mixing vessel” conjecture [12,13], which suggests that most inter-host reassortments are produced in pigs.

Recently, a new A(H1N1) subtype strain was identified, initially in Mexico, and then rapidly reported in all continents. As of 27 May, 12,954 cases of the new influenza A(H1N1) virus infection, including 92 deaths, have been reported to the World Health Organization [14,15]. Several approaches have been used to understand the origins of this strain. Searches in public databases containing influenza A genomes using sequence alignment tools indicated that the closest relatives for each of the eight genomic segments are from viruses circulating in swine for the past decade [16-19]. These include genome segments derived from “triple reassortant” swine viruses that, in the late 1990s, combined genome segments from viruses previously identified in humans, birds, and swine [20]. Similar conclusions were drawn by the application of phylogenetic techniques [16,21].

Here we present a cluster analysis using Principal Component Analysis and unsupervised clustering. Clustering methods are particularly robust under changes in the underlying evolutionary models. Our results substantiate previous reports [16,21] and demonstrate that, for each of the genome segments of the new influenza A(H1N1) virus, the closest relative was most recently identified in swine, compatible with a reassortment of Eurasian and North American swine viruses (Figure 1).

Figure 1. Origins of the new influenza A(H1N1) virus

Materials and methods

Influenza sequences were obtained from the National Center for Biotechnology Information (NCBI) [22] in the United States. We performed a search using the Basic Local Alignment Search Tool (BLAST) for each of the eight A/California/04/2009(H1N1) segments separately, recording the 50 best matches. We then constructed the union of all these matches, taking the sequences for all of their segments available in the database. We aligned these sequences using the stretcher algorithm as implemented in the EMBOSS package.
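The union step can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the authors' pipeline: the BLAST searches themselves are not shown, and the names `union_of_hits`, `collect_segments`, `hits_by_segment` and `database`, as well as the strain labels, are hypothetical.

```python
def union_of_hits(hits_by_segment, top_n=50):
    """Union of the best-matching strains over all query segments.

    hits_by_segment maps a segment name to a list of strain names,
    assumed to be already ranked from best to worst match.
    """
    strains = set()
    for hits in hits_by_segment.values():
        strains.update(hits[:top_n])
    return strains


def collect_segments(strains, database):
    """For every strain in the union, take all segments the database holds."""
    return {name: database[name] for name in sorted(strains) if name in database}


# Hypothetical toy input: two query segments with two hits each.
hits_by_segment = {
    "HA": ["strain_A", "strain_B"],
    "NA": ["strain_B", "strain_C"],
}
database = {
    "strain_A": {"HA": "ATGA", "NA": "ATGC"},
    "strain_B": {"HA": "ATGG"},
    "strain_C": {"NA": "ATGT", "M": "ATGA"},
}

strains = union_of_hits(hits_by_segment)
collected = collect_segments(strains, database)
```

Note that a strain found through one segment contributes all of its available segments, so `strain_C`, matched here only on NA, also brings its M segment into the data set.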

After the alignment, we translate the sequences into binary data by comparing them with the reference sequence site by site. A mutation maps to 1, while a nucleotide identical to that in the reference sequence maps to 0. Whenever there are masks, they map to the corresponding fractional numbers. Gaps are not counted as polymorphisms. Therefore, S sequences restricted to P polymorphic sites translate to an S×P matrix. Each row of this matrix represents one of the sequences and can be thought of as a vector in a P-dimensional space.
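A minimal Python sketch of this encoding follows. It assumes that all sequences have already been aligned to the same length as the reference and that gaps are written as `-`; masked positions and their fractional values are not handled. The function name `encode_binary` and the toy sequences are illustrative only.

```python
def encode_binary(reference, sequences, gap="-"):
    """Encode aligned sequences as 0/1 rows relative to a reference.

    Returns (sites, matrix): the indices of the P polymorphic sites and
    an S x P list of rows, one row per sequence.
    """
    length = len(reference)
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must be aligned to the reference length")

    def is_mutation(seq, i):
        # Gaps are not counted as polymorphisms (assumed here for a gap
        # in either the sequence or the reference).
        if seq[i] == gap or reference[i] == gap:
            return False
        return seq[i] != reference[i]

    # A site is polymorphic if at least one sequence differs from the reference.
    sites = [i for i in range(length)
             if any(is_mutation(seq, i) for seq in sequences)]
    matrix = [[1 if is_mutation(seq, i) else 0 for i in sites]
              for seq in sequences]
    return sites, matrix


reference = "ACGTAC"
aligned = ["ACGTAC", "ATGTAC", "ACG-AA", "ATGTAA"]
sites, matrix = encode_binary(reference, aligned)
```

In this toy example only positions 1 and 5 (counting from 0) are polymorphic; the gap at position 3 of the third sequence does not create a polymorphic site.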

We perform Principal Component Analysis (PCA) to determine the most significant coordinates in this P-dimensional space. We then retain the principal components that capture 85% of the total variance, discard the remaining ones, and project the data onto this reduced set of coordinates.
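The projection step might look as follows in Python with NumPy. This is a sketch under stated assumptions: the data are centred column by column before the decomposition (the text does not say how the data were preprocessed), the principal axes are obtained from a singular value decomposition, and the function name `pca_project` and the toy matrix are hypothetical.

```python
import numpy as np


def pca_project(X, variance_fraction=0.85):
    """Project rows of X onto the leading principal components.

    Keeps the smallest number of components whose cumulative share of
    the total variance reaches variance_fraction.
    """
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes; squared singular values are
    # proportional to the variance captured by each axis.
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    total = variance.sum()
    if total == 0:
        raise ValueError("data have no variance")
    explained = variance / total
    cumulative = np.cumsum(explained)
    k = int(np.searchsorted(cumulative, variance_fraction)) + 1
    k = min(k, len(s))  # guard against rounding in the cumulative sum
    return centered @ Vt[:k].T, explained[:k]


# Toy S x P binary matrix (5 sequences, 4 polymorphic sites).
X_demo = np.array([
    [0, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
])
projected, explained = pca_project(X_demo)
```

`projected` has one row per sequence and one column per retained component, and `explained` gives the share of variance captured by each retained component.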

This procedure is followed by consensus K-means clustering. Namely, if one targets K clusters, one repeats the K-means clustering procedure N times and forms the matrix n whose elements n_ij (i,j=1,…,S) represent the number of times out of the N trials that the i-th and j-th sequences were clustered together. In our analysis we set N≥100. The matrix of the distances between the samples is:

$$d_{ij} = 1 - \frac{n_{ij}}{N}$$

One then performs standard hierarchical clustering with this matrix, targeting K clusters. This procedure does not depend on any assumptions made by the phylogenetic models. Note that these techniques can be used for inferring phylogenies as well [23], though this is beyond the scope of the present note.
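The consensus procedure can be sketched in Python with NumPy as below. Several choices here are assumptions rather than details taken from the text: the distance between two sequences is taken to be 1 - n_ij/N, K-means is a plain Lloyd iteration started from randomly chosen data points, and the hierarchical step uses average linkage. All function names and the toy data are hypothetical.

```python
import numpy as np


def kmeans_labels(X, k, rng, n_iter=50):
    """Plain Lloyd's K-means from a random initialisation; returns labels."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = centers.copy()
        for c in range(k):
            members = X[labels == c]
            if len(members):  # keep the old centre if a cluster empties
                new_centers[c] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def consensus_distance(X, k, n_trials=100, seed=0):
    """Assumed distance 1 - n_ij / N from co-clustering counts over N runs."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    counts = np.zeros((len(X), len(X)))
    for _ in range(n_trials):
        labels = kmeans_labels(X, k, rng)
        counts += labels[:, None] == labels[None, :]
    return 1.0 - counts / n_trials


def hierarchical_clusters(D, k):
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters.pop(b))
    labels = np.empty(len(D), dtype=int)
    for label, members in enumerate(clusters):
        labels[members] = label
    return labels


# Toy data: two well-separated groups of three points each.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
D = consensus_distance(points, k=2, n_trials=100)
final_labels = hierarchical_clusters(D, k=2)
```

Because the final partition is derived from how often pairs of sequences co-cluster over many runs, it is less sensitive to the random initialisation of any single K-means run.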

Results

Comparison of the available sequences of the new A(H1N1) virus (as of 27 May 2009) did not identify significant sequence variation, except for a few point mutations. Hence A/California/04/2009(H1N1) was chosen as the representative for further analyses. There are many different phylogenetic techniques, each with its own assumptions about evolutionary models, which vary in the way they compute genetic distances, probabilities, etc. Unlike phylogenetic techniques, cluster methods do not require the evaluation of a tree, which is a more complicated structure than a set of clusters. Clustering techniques do not provide a detailed phylogenetic structure because they analyse group features of the sequence data. That is why the clustering analysis is more robust to the assumptions we make, for instance the choice of genetic distance. Unsupervised methods provide a way of identifying clusters without relying on previous information about the origin, host and time of isolation.

Figures 2a-2h show the data projected onto the first two principal components with the corresponding percentage of variation. The figures clearly show that in all cases the new virus sequences clustered with those of swine viruses.  The closest matches for each of the segments are summarised in the Table.

Our analyses support the hypothesis that the 2009 pandemic influenza A(H1N1) virus derives from one or multiple reassortment(s) between influenza A viruses circulating in swine in Eurasia and in North America. This is illustrated schematically in Figure 1.

Figures 2a-h. Cluster analysis of the new influenza A(H1N1) virus

Table. Closest clusters to the new influenza A(H1N1) virus

Supplementary Tables 1 to 8 show the results of the clustering for each of the eight segments (PB2, PB1, PA, HA, NP, NA, M, NS):

http://www.eurosurveillance.org/public/public_pdf/Table_1_Cluster_analysis_HA.pdf
http://www.eurosurveillance.org/public/public_pdf/Table_2_Cluster_analysis_NA.pdf
http://www.eurosurveillance.org/public/public_pdf/Table_3_Cluster_analysis_PB2.pdf
http://www.eurosurveillance.org/public/public_pdf/Table_4_Cluster_analysis_PB1.pdf
http://www.eurosurveillance.org/public/public_pdf/Table_5_Cluster_analysis_PA.pdf
http://www.eurosurveillance.org/public/public_pdf/Table_6_Cluster_analysis_NP.pdf
http://www.eurosurveillance.org/public/public_pdf/Table_7_Cluster_analysis_MP.pdf
http://www.eurosurveillance.org/public/public_pdf/Table_8_Cluster_analysis_NS.pdf

Acknowledgements

The work of T. Briese, G. Palacios and W. I. Lipkin was supported by National Institutes of Health awards HL083850 and AI57158 (Northeast Biodefense Center - Lipkin). The work of A. Solovyov has been supported by grant NSF PHY-0756966.


References

1. Webster RG, Laver WG. Studies on the origin of pandemic influenza. I. Antigenic analysis of A 2 influenza viruses isolated before and after the appearance of Hong Kong influenza using antisera to the isolated hemagglutinin subunits. Virology. 1972;48(2):433–444.
2. Kawaoka Y, Krauss S, Webster RG. Avian-to-human transmission of the PB1 gene of influenza A viruses in the 1957 and 1968 pandemics. J Virol. 1989;63(11):4603–4608.
3. Scholtissek C, von Hoyningen V, Rott R. Genetic relatedness between the new 1977 epidemic strains (H1N1) of influenza and human influenza strains isolated between 1947 and 1957 (H1N1). Virology. 1978;89(2):613–617.
4. Taubenberger JK, Reid AH, Lourens RM, Wang R, Jin G, Fanning TG. Characterization of the 1918 influenza virus polymerase genes. Nature. 2005;437(7060):889-93.
5. Rabadan R, Levine AJ, Robins H. Comparison of avian and human influenza A viruses reveals a mutational bias on the viral genomes. J Virol. 2006;80(23):11887-91.
6. Rabadan R, Levine AJ, Krasnitz M. Non-random reassortment in human influenza A viruses. Influenza Other Respi Viruses. 2008;2(1):9-22.
7. Nelson MI, Viboud C, Simonsen L, Bennett RT, Griesemer SB, St George K, et al. Multiple reassortment events in the evolutionary history of H1N1 influenza A virus since 1918. PLoS Pathog. 2008 Feb 29;4(2):e1000012.
8. Zhou NN, Senne DA, Landgraf JS, Swenson SL, Erickson G, Rossow K, et al. Genetic reassortment of avian, swine, and human influenza A viruses in American pigs. J Virol. 1999;73(10):8851-6.
9. Webby RJ, Swenson SL, Krauss SL, Gerrish PJ, Goyal SM, Webster RG. Evolution of swine H3N2 influenza viruses in the United States. J Virol. 2000;74(18):8243-51.
10. Lindstrom SE, Cox NJ, Klimov A. Genetic analysis of human H2N2 and early H3N2 influenza viruses, 1957-1972: evidence for genetic divergence and multiple reassortment events. Virology. 2004;328(1):101-19.
11. Ito T, Couceiro JN, Kelm S, Baum LG, Krauss S, Castrucci MR, et al. Molecular basis for the generation in pigs of influenza A viruses with pandemic potential. J Virol. 1998;72(9):7367-73.
12. Scholtissek C. Pigs as the ‘mixing vessel’ for the creation of new pandemic influenza A viruses. Med Princip Prac. 1990;2:65–71.
13. Van Reeth K. Avian influenza in swine: a threat for the human population. Verh K Acad Geneeskd Belg. 2006;68(2):81-101.
14. World Health Organization (WHO).  Influenza A(H1N1). Available from: http://www.who.int/csr/disease/swineflu/en/index.html
15. Centers for Disease Control and Prevention (CDC). H1N1 Swine flu. Available from: http://www.cdc.gov/h1n1flu/
16. Trifonov V, Khiabanian H, Greenbaum B, Rabadan R. The origin of the recent swine influenza A(H1N1) virus infecting humans. Euro Surveill. 2009;14(17):pii=19193. Available from: http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=19193 
17. Garten RJ, Davis CT, Russell CA, Shu B, Lindstrom S, Balish A, et al. Antigenic and Genetic Characteristics of Swine-Origin 2009 A(H1N1) Influenza Viruses Circulating in Humans. Science. 22 May 2009 [Epub ahead of print] DOI: 10.1126/science.1176225
18. Novel Swine-Origin Influenza A (H1N1) Virus Investigation Team. Emergence of a Novel Swine-Origin Influenza A (H1N1) Virus in Humans. N Engl J Med. 22 May 2009. [Epub ahead of print].
19. Trifonov V, Khiabanian H, Rabadan R. Geographic Dependence, Surveillance, and Origins of the 2009 Influenza A (H1N1) Virus. N Engl J Med. 27 May 2009. [Epub ahead of print] DOI: 10.1056/NEJMp0904572.
20. Shinde V, Bridges CB, Uyeki TM, Shu B, Balish A, Xu X, et al. Triple-Reassortant Swine Influenza A (H1) in Humans in the United States, 2005-2009. N Engl J Med. 22 May 2009. [Epub ahead of print]
21. Rambaut A. Human/Swine A/H1N1 Influenza Origins and Evolution. 3 May 2009. Available from: http://tree.bio.ed.ac.uk/groups/influenza/
22. National Center for Biotechnology Information. Influenza virus resource, information, search and analysis. Available from: http://www.ncbi.nlm.nih.gov/genomes/FLU/FLU.html
23. Alexe G, Satya RV, Seiler M, Platt D, Bhanot T, Hui S, et al. PCA and clustering reveal alternate mtDNA phylogeny of N and M clades. J Mol Evol. 2008;67(5):465-87.

 


