The spatiotemporal characteristics of influenza A and B in the WHO European Region: can one define influenza transmission zones in Europe?

We aimed to assess the epidemiology and spatiotemporal patterns of influenza in the World Health Organization (WHO) European Region and evaluate the validity of partitioning the Region into five influenza transmission zones (ITZs) as proposed by the WHO. We used the FluNet database and included over 650,000 influenza cases from 2000 to 2015. We analysed the data by country and season (from July to the following June). We calculated the median proportion of cases caused by each virus type in a season, compared the timing of the primary peak between countries and used a range of cluster analysis methods to assess the degree of overlap between the WHO-defined and data-driven ITZs. Influenza A and B caused, respectively, a median of 83% and 17% of cases in a season. There was a significant west-to-east and non-significant (p = 0.10) south-to-north gradient in the timing of influenza activity. Typically, influenza peaked in February and March, with influenza A peaking earlier than influenza B. Most countries in the WHO European Region would fit into two ITZs: ‘Western Europe’ and ‘Eastern Europe’; countries bordering Asia may be better placed into extra-European ITZs. Our findings have implications for the presentation of surveillance data and prevention and control measures in this large WHO Region.


Introduction
The World Health Organization (WHO) European Region includes 53 countries covering a total population of nearly 900 million inhabitants. Influenza imposes a substantial medical and economic burden on the Region every season [1][2][3][4], and the reduction of influenza-related morbidity and mortality has long been recognised as a priority health objective in Europe.
Influenza viruses spread rapidly and their transmission can be favoured by anthropogenic factors such as the increase in international travel and commuters' mobility [5][6][7][8]. The WHO European Region has become increasingly interconnected, especially since the end of the Cold War in 1991 and the eastward enlargement of the European Union (EU), and it is widely accepted that efficient and timely influenza surveillance must be coordinated at national and supranational levels. Countries in the west of Europe have been sharing epidemiological and virological data via the European Influenza Surveillance Scheme (EISS) since 1996 [9,10]; in 2008, this collaborative project became the European Influenza Surveillance Network (EISN), coordinated by the European Centre for Disease Prevention and Control (ECDC). The WHO Regional Office for Europe extended the surveillance activities of EISS to all countries of the WHO European Region in 2008 [10].
Influenza epidemics typically peak during the northern hemisphere winter (November to March) in the WHO European Region [11,12]. Earlier research found that the timing of influenza activity moves across Europe, frequently travelling from west to east and, less frequently, from south to north [13,14], suggesting that there may be some heterogeneity in the timing of influenza epidemics among countries of the WHO European Region.
The ITZs were defined as large supranational areas encompassing "countries, areas or territories with similar influenza transmission patterns" [15]. As far as we know, no study has been conducted to verify whether the partitioning of the WHO European Region is justified from an epidemiological and/or virological standpoint. The aim of the present study was therefore to assess the epidemiology and spatiotemporal patterns of influenza A and B in the WHO European Region, and evaluate the validity of its partitioning into five ITZs as proposed by WHO.

Source of data
FluNet is a publicly available, web-based database maintained by the WHO Global Influenza Surveillance and Response System since 1995, in which National Influenza Centres (NICs) from countries around the world enter epidemiological and virological data on influenza on a weekly basis [16]. On 12 October 2015, we downloaded the weekly number of laboratory-confirmed influenza cases reported to the national surveillance systems of all countries in the WHO European Region from week 1/1999 onwards, broken down by virus type (influenza A, B), subtype (H1N1, H1N1pdm09, H3N2, not subtyped) and lineage (Victoria, Yamagata, not characterised).
Because seasonal influenza epidemics typically occur in the winter months in temperate countries of the northern hemisphere, we opted to use 'country season' as the unit of analysis, defined as the period between 1 July of a year and 30 June of the following year in a given country. Therefore, we finally included in the analyses the data from week 27/1999 to week 26/2015.
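The season definition above can be illustrated with a short Python sketch. It is not the code used for the analysis: the function name is hypothetical, and it assumes that week 27 marks the start of a season, in line with the week range given above (week 27/1999 to week 26/2015) as an approximation of the 1 July to 30 June period.

```python
def season_of_week(year: int, week: int) -> str:
    """Return the season label (e.g. '2010/11') for a reporting week.

    Assumption: a season runs from week 27 of one year to week 26 of the
    following year, approximating the 1 July to 30 June definition.
    """
    if not 1 <= week <= 53:
        raise ValueError("week must be between 1 and 53")
    # Weeks 27 and later open a new season; earlier weeks belong to the
    # season that started in the previous calendar year.
    start_year = year if week >= 27 else year - 1
    return f"{start_year}/{(start_year + 1) % 100:02d}"
```

For example, week 27 of 2010 and week 26 of 2011 both map to the 2010/11 season under this rule.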

Descriptive analysis
For each country and season, we determined the proportion of laboratory-confirmed influenza cases that were caused by each virus type, subtype and lineage, and calculated the corresponding median value for all countries in the WHO European region. We then calculated the percentage of country seasons in which a given virus type, subtype and lineage accounted for 50% or more of all reported influenza cases. In order to increase the reliability of the results, country seasons with fewer than 100 overall reported influenza cases were excluded from this analysis. This number was chosen as a trade-off between the necessity to have a sufficiently high number of cases to estimate key epidemic parameters (including the timing of the epidemic peak), and the requirement to include as many countries as possible, which was important given the main objective of our analysis.
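As a rough illustration of these descriptive statistics, the Python sketch below computes, for one virus type, the median share across country seasons and the fraction of country seasons in which that virus accounted for 50% or more of cases. The input layout (one dictionary of case counts per country season) and the function name are assumptions made for the example; only the 100-case threshold and the 50% cut-off come from the text.

```python
from statistics import median

MIN_CASES = 100  # exclusion threshold stated in the text


def summarise_virus(country_seasons, virus):
    """Return (median share, fraction of country seasons with share >= 50%).

    country_seasons: list of dicts mapping a virus label to a case count,
    one dict per country season (hypothetical input layout).
    """
    shares = []
    for counts in country_seasons:
        total = sum(counts.values())
        if total < MIN_CASES:
            continue  # too few cases for a reliable proportion
        shares.append(counts.get(virus, 0) / total)
    if not shares:
        raise ValueError("no country season reaches the case threshold")
    dominant = sum(share >= 0.5 for share in shares) / len(shares)
    return median(shares), dominant
```

Country seasons below the threshold are dropped before any proportion is computed, so they affect neither the median nor the dominance percentage.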

Spatiotemporal patterns of influenza epidemics
Data before 2009 were not used for the study of spatiotemporal patterns of influenza epidemics in the WHO European Region because they were not complete for most countries. The pandemic season 2009/10 was also excluded as it was an atypical season with the introduction of a novel pandemic strain (influenza A(H1N1)pdm09) [17], hence not suitable for seasonal analyses. Spatiotemporal analyses were therefore performed from week 27/2010 to week 26/2015.

Based on [15]. Blue: Northern Europe; orange: South West Europe; green: Eastern Europe; red: Central Asia; violet: Western Asia.

For each site, we first de-trended the time series with a quadratic polynomial. We then generated a periodic annual function (PAF) of each time series by summing up the annual, semi-annual and quarterly harmonics as obtained by Fourier decomposition [18,19]. The timing of the primary peaks of the PAF was extracted and compared between sites, as based on their latitude (defined as the latitude of the NIC; in countries with more than one NIC, we chose the one situated in the largest city). The timing of the primary peak indicates the period when the maximum intensity of disease burden usually takes place. Primary peaks were also extracted separately for influenza A and B.
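The de-trending and harmonic steps can be sketched in plain Python as follows. This is an illustrative reconstruction, not the EPIPOI implementation: it assumes a 52-week year, a series spanning a whole number of years (so that the harmonics are orthogonal and can be estimated by simple projection), and an ordinary least-squares quadratic fit. All function names are hypothetical.

```python
import math


def quadratic_detrend(y):
    """Subtract a least-squares quadratic trend from the series y."""
    n = len(y)
    xs = [2.0 * t / (n - 1) - 1.0 for t in range(n)]  # time scaled to [-1, 1]
    sums = [sum(x ** p for x in xs) for p in range(5)]
    rhs = [sum(v * x ** p for x, v in zip(xs, y)) for p in range(3)]
    # Augmented 3x3 normal equations, solved by Gauss-Jordan elimination.
    m = [[sums[i + j] for j in range(3)] + [rhs[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    c = [m[i][3] / m[i][i] for i in range(3)]
    return [v - (c[0] + c[1] * x + c[2] * x * x) for x, v in zip(xs, y)]


def periodic_annual_function(y, period=52, harmonics=(1, 2, 4)):
    """Sum of annual, semi-annual and quarterly harmonics of the series.

    harmonics holds cycles per year: 1 (annual), 2 (semi-annual), 4 (quarterly).
    """
    n = len(y)
    if n < period or n % period != 0:
        raise ValueError("series must span a whole number of years")
    resid = quadratic_detrend(y)
    paf = [0.0] * period
    for k in harmonics:
        w = 2.0 * math.pi * k / period
        a = 2.0 / n * sum(r * math.cos(w * t) for t, r in enumerate(resid))
        b = 2.0 / n * sum(r * math.sin(w * t) for t, r in enumerate(resid))
        for i in range(period):
            paf[i] += a * math.cos(w * i) + b * math.sin(w * i)
    return paf


def primary_peak(paf):
    """Zero-based position within the year at which the PAF is highest."""
    return max(range(len(paf)), key=paf.__getitem__)
```

The position returned by `primary_peak` is relative to the first week of the input series, so it has to be shifted by that week number to obtain a calendar week.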

Influenza transmission zones
We chose a cluster analysis approach to obtain data-driven ITZs using the country season as the unit of analysis. Several algorithms are available to group objects into a set of mutually exclusive and exhaustive clusters. Here, there was no a priori reason to prefer any specific cluster analysis technique; therefore, we selected a procedure whereby the outputs of several cluster models were compared with one another in order to identify groups of countries that were consistently (i.e. across different models) assigned to the same cluster. We used a multiple cluster approach to draw robust conclusions and not be dependent on a single clustering methodology or a set of inputted parameters.
A common requirement for most cluster analysis algorithms is that there must not be any missing values for the variables that are used for the analysis. In our analysis, this implies that all included countries must have influenza surveillance data for all seasons. Because of this requirement, we limited the database used for the cluster models to data from four consecutive seasons (2011/12 to 2014/15) in 37 countries (see below for details). This selection was made as there was a substantial reduction in the number of countries (from  In each season and country, we calculated the start and the peak of the influenza season, defined retrospectively as the first week in which at least 10% of all reported cases had occurred [20], and, respectively, the week with the highest number of reported cases. As the epidemics caused by the different influenza virus types and subtypes may differ in timing in the same country and season, the week of start and peak were calculated for all influenza cases taken together and separately for influenza A(H1N1)pdm09, A(H3N2) and B (if there were fewer than 100 cases for a given virus (sub)type, the start and peak of the epidemic caused by that virus (sub)type were assumed to coincide with those of the overall influenza season). We then calculated the median start and peak week of the influenza season (overall and by virus (sub)type) across seasons.
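The start and peak definitions can be illustrated with the following Python sketch. It reads the start definition cumulatively (the first week by which at least 10% of the season's cases have been reported), which is our interpretation of the wording above; the function name is hypothetical and ties for the peak are resolved in favour of the earliest week, which is an assumption.

```python
def season_start_and_peak(weekly_cases, threshold=0.10):
    """Return (start, peak) as zero-based positions within the season.

    weekly_cases: case counts for the consecutive weeks of one country season.
    start: first week by which the cumulative count reaches `threshold`
           of the season total (cumulative reading of the definition).
    peak:  week with the highest count (earliest week in case of ties).
    """
    total = sum(weekly_cases)
    if total <= 0:
        raise ValueError("season has no reported cases")
    peak = max(range(len(weekly_cases)), key=weekly_cases.__getitem__)
    cumulative = 0
    for week, cases in enumerate(weekly_cases):
        cumulative += cases
        if cumulative >= threshold * total:
            return week, peak
    raise ValueError("threshold must not exceed 1")
```

Because the start is defined relative to the season total, it can only be determined retrospectively, once the whole season has been reported.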
We fitted several cluster analysis models by varying the statistical method and the variables inputted in the model. In terms of the clustering algorithm, we used complete linkage, average linkage and k-means clustering. Concerning the model parameters, we hypothesised that the ITZs may differ with regard to the timing of the influenza season, the influenza virus mixing by season, or both. Accordingly, we initially fitted cluster models that included 'timing' parameters only (season-specific or median start and/or peak of the influenza season, for all influenza cases or by virus (sub)type), 'virus mixing' parameters only (percentage of influenza cases caused by each virus (sub)type), or both sets of parameters. We present here results generated by the models that included the timing parameters only, as the other models did not yield consistent and epidemiologically meaningful results (i.e. the geographical clusters were too diverse or varied).
Overall, the cluster analysis was repeated 18 times using different models. As there was no a priori criterion to prefer any single model over the others, we opted to summarise the results by calculating, for each pair of countries, the proportion of the 18 cluster models in which both countries fell into the same cluster ('proportion of agreement') and used the following algorithm to identify data-driven ITZs.
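The proportion of agreement can be expressed compactly in Python. The sketch below is illustrative only: it assumes that each cluster model is available as a mapping from country to cluster label, and the function name is hypothetical. Cluster labels only need to be comparable within a model, not across models.

```python
from itertools import combinations


def proportion_of_agreement(models):
    """Share of models in which each pair of countries shares a cluster.

    models: list of dicts mapping country -> cluster label, one per model.
    Returns a dict keyed by frozenset({country_a, country_b}).
    """
    if not models:
        raise ValueError("at least one cluster model is required")
    countries = sorted(models[0])
    agreement = {}
    for a, b in combinations(countries, 2):
        same = sum(model[a] == model[b] for model in models)
        agreement[frozenset((a, b))] = same / len(models)
    return agreement
```

With 18 models, a pair of countries placed together in 15 of them would have a proportion of agreement of about 83%.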

Definition of the core cluster of countries in an influenza transmission zone
A core cluster was defined when it included at least three countries and was identified according to two criteria: (i) The first (or 'internal') criterion states that all core countries of an ITZ must have a proportion of agreement of 80-100% between one country and another. This criterion ensures that all countries in the ITZ fit in the same cluster. (ii) The second (or 'external') criterion states that all core countries of an ITZ must have a proportion of agreement < 70% with all countries belonging to a different ITZ. This criterion ensures that none of the countries in the ITZ fit into another ITZ. Together, these two criteria ensure that the ITZs are mutually exclusive, i.e. they rule out the possibility that a country may belong to more than one ITZ. Importantly, the separation of clusters was enhanced by our decision to impose a 10% buffer between the inclusion (≥ 80%) and the exclusion (< 70%) criterion.
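The two criteria can be written as a simple check on a candidate set of countries. The Python sketch below is an illustration under stated assumptions rather than the procedure actually coded for the study: pairwise agreement is assumed to be stored in a dictionary keyed by unordered country pairs, and the function and constant names are hypothetical.

```python
from itertools import combinations

INCLUDE = 0.80  # internal criterion: agreement of 80-100% within the zone
EXCLUDE = 0.70  # external criterion: agreement < 70% with other zones


def is_core_cluster(candidate, other_zone_countries, agreement):
    """Check the internal and external criteria for a candidate core cluster.

    candidate: countries proposed as the core of one ITZ.
    other_zone_countries: countries belonging to a different ITZ.
    agreement: dict keyed by frozenset({a, b}) -> proportion of agreement.
    """
    def agree(a, b):
        return agreement[frozenset((a, b))]

    if len(candidate) < 3:
        return False  # a core cluster needs at least three countries
    internal = all(
        agree(a, b) >= INCLUDE for a, b in combinations(candidate, 2)
    )
    external = all(
        agree(a, b) < EXCLUDE
        for a in candidate
        for b in other_zone_countries
    )
    return internal and external
```

A pair with agreement between 70% and 80% satisfies neither threshold, which is the 10% buffer described above.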

Figure 4
Partitioning of countries of the WHO European Region into two cluster models-derived influenza transmission zones, WHO FluNet database, July 2011-June 2015
WHO: World Health Organization.
Blue: 'Western' zone; green: 'Eastern' zone; grey: countries not assigned to any influenza transmission zone; red: countries not included in the analysis.

Expansion of existing influenza transmission zones by adding non-core countries
The attribution of the remaining countries to an existing ITZ was made according to a relaxed version of the two criteria above. Namely, each remaining country was assigned to an existing ITZ if its proportion of agreement was 70-100% with all countries in that ITZ, and < 70% with all countries in a different ITZ.

Countries not allocated to an influenza transmission zone
All of the remaining countries were considered not allocated to any ITZ.
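The relaxed assignment rule for non-core countries, including the case in which a country remains unallocated, can be sketched in Python as follows. The data layout (zones as a mapping from zone name to member countries, agreement keyed by unordered country pairs) and the function name are assumptions made for the example.

```python
def assign_non_core(country, zones, agreement, threshold=0.70):
    """Return the name of the ITZ a non-core country joins, or None.

    zones: dict mapping zone name -> list of countries already in that zone.
    agreement: dict keyed by frozenset({a, b}) -> proportion of agreement.
    A country joins a zone if its agreement is >= threshold with every
    member of that zone and < threshold with every member of other zones.
    """
    def agree(other):
        return agreement[frozenset((country, other))]

    for name, members in zones.items():
        outsiders = [
            m for other, ms in zones.items() if other != name for m in ms
        ]
        fits = all(agree(m) >= threshold for m in members)
        excluded = all(agree(m) < threshold for m in outsiders)
        if fits and excluded:
            return name
    return None  # not allocated to any ITZ
```

Under this rule, a country that agrees well with one zone but also reaches 70% agreement with a single country of another zone stays unallocated.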
As the countries in the WHO European Region fall into five ITZs according to the WHO, we initially set the number of clusters in the models to five. As the results were not satisfactory (see below), we then modified the model's settings by progressively reducing the number of clusters to four, three and two.

Statistical software
The EPIPOI software [19] was used to study the spatiotemporal patterns of influenza epidemics. We used Stata version 14 (Stata Corp, College Station, United States) to conduct the cluster analysis. Maps were prepared using freely available software (http://mapchart.net/).

Results
The  Figure  2 and Figure 3.
We found a notable coincidence in peak times: all countries (except the UK) had their primary peaks in February and March. Influenza epidemics usually peaked at the end of January in the UK, earlier than in the remaining countries. There was a non-significant longitudinal gradient in the timing of the primary peak, with countries in the west peaking earlier than those in the east (the typical timing of the primary peak fell at the end of January in the UK, and in mid-March in Ukraine). The p value was 0.125 when regressing the timing of the primary peak against the country's longitude; however, the gradient became statistically significant (p = 0.001) when ignoring Kazakhstan, Uzbekistan and Iceland, which behaved as highly influential points in the model because of their geographical position. There appeared to be a slight, non-significant (p = 0.100) latitudinal gradient as well, with southern countries peaking slightly earlier than countries at higher latitudes (for instance, mid-February in Spain and early March in Sweden). The time period between the earliest and latest country-specific influenza peaks was two months in the WHO European Region (Figure 2). Considering that influenza viruses circulate for two weeks before and after the peak is reached in any given country [21], it appears that a typical influenza season lasts an average of three months in the WHO European Region. Influenza A peaked earlier than influenza B in most countries (Table 2): the average time period between the peak of influenza A and B was 1.6 weeks.

Influenza transmission zones
For the cluster analysis, we included 290,915 influenza cases reported from July 2011 to June 2015 in 37 countries in the WHO European Region (all those included in the previous analyses except Bosnia and Herzegovina, Czech Republic, Kyrgyzstan, Malta, Slovakia and Uzbekistan).
The output of models with a five-cluster setting was largely inconsistent both between models and with respect to the ITZs proposed by the WHO. Results were highly dependent on the methodology used to derive the clusters and the parameters inputted into the model. Upon calculating the proportion of agreement and applying the algorithm described above, it was possible to identify a single ITZ, which included only seven countries (Austria, Denmark, Estonia, Georgia, Greece, Hungary and Republic of Moldova) that were largely non-contiguous with each other.
The models' outputs became progressively more consistent with one another when the number of clusters was reduced to four and three, although the ITZs were still small and not entirely sensible from a geographical standpoint as they were partly formed by non-neighbouring countries. Models assuming two clusters led to the identification of two data-driven ITZs which we have named 'Western Europe' and 'Eastern Europe' (Figure 4), although these labels were to some extent inaccurate: Albania, Bulgaria and Israel were assigned to the Western Europe ITZ and Denmark was assigned to Eastern Europe. The non-core countries were Ireland, Norway and the UK in the Western ITZ, and Estonia and Ukraine in the Eastern ITZ. The assignment of Greece and Poland to the 'Eastern Europe' ITZ, and of Slovenia to the 'Western Europe' ITZ, was not possible because their proportion of agreement with one country in the other ITZ was ≥ 70%. For the other non-assigned countries (Croatia, Georgia, Germany, Kazakhstan, Netherlands and Turkey), the inclusion and exclusion criteria were not met in two or more cases.
Influenza epidemics started and peaked 2-3 weeks earlier in the Western than in the Eastern Europe ITZ (median week of start: 2 vs 4; median week of peak: 5 vs 8). There were no statistically significant differences in the median percentage of influenza cases that were caused by each virus (sub)type in countries belonging to the two ITZs (data not shown). Nine countries could not be assigned to either ITZ, some of which (the Netherlands, Germany, Slovenia, Croatia, Greece and Turkey) form a border or line between the two zones in a direction from the north-west to the south-east. Ireland in Western Europe, Georgia in the Caucasus region, Kazakhstan in Central Asia, and Poland were also non-classified countries.

Discussion
We investigated the epidemiology and spatiotemporal patterns of influenza in the WHO European Region and evaluated whether the allocation of countries of this large world region into five ITZs (as proposed by WHO) could be confirmed from an epidemiological standpoint. Influenza A(H3N2) was most frequently the dominant virus in the study period, followed by influenza A(H1N1) and influenza B. Epidemic peaks were distributed over a period of two months, with longitudinal (west-to-east) and latitudinal (south-to-north) gradients of timing. The peak of influenza B epidemics typically occurred later than those for influenza A, in agreement with earlier findings [22][23][24][25]. [22,26] and highlight the role of influenza B as an important contributor to the total burden of disease of influenza.
We investigated the seasonal patterns of influenza circulation across a large range of latitudes and longitudes in Europe and part of western Asia. Because all countries are in the temperate region of the northern hemisphere, they all share the same winter timing and their seasonal patterns of influenza circulation were similar. The differences in the timing of influenza epidemics appeared to be smoothly distributed along a continuum in this large world area, without any clean break between countries or groups of countries. The overall period of influenza activity can be estimated at about three months in a typical season, and there were longitudinal (west-to-east, significant) and latitudinal (south-to-north, not significant) patterns in the timing of seasonal peaks. Our findings suggest that the WHO European Region is not homogeneous with regard to the spread of influenza epidemics, though probably not so fragmented as to justify its partitioning into five ITZs. however, the limited availability of data for these countries does not allow definitive conclusions. The two data-driven ITZs differ from one another in the timing of epidemics but not in terms of circulating virus (sub)types; therefore, the term 'influenza transmission zone' does not appear to be entirely appropriate and might be reconsidered.
We believe that establishing data-driven ITZs in the WHO European Region has important public health implications and can serve multiple purposes. Information on the course of influenza seasons could be developed and communicated at ITZ level in addition to the national level. Preparedness planning of seasonal influenza activity could be coordinated among countries included in the same ITZ. Also, the distribution of sentinel sites on the territory of countries within each ITZ could be redesigned so as to optimise the influenza surveillance activities in the ITZ as a whole. Because of the west-to-east and south-to-north gradients of spread, it may be worthwhile to evaluate whether, and to what extent, countries in the south-west of Europe could serve as sentinel sites for the rest of the WHO European Region (or at least for the Western Europe ITZ). Finally, by merging countries with similar patterns of influenza transmission, and in particular, with synchronised timing of influenza epidemics, the ITZs could be also seen as vaccination zones [27], i.e. groups of countries for which the timing of influenza vaccination campaigns could benefit from harmonisation.
A major strength of our study is the use of surveillance data from most countries in the WHO European Region for several consecutive influenza seasons. We used a range of complementary statistical techniques to study spatiotemporal patterns of influenza epidemics. As far as we know, this is the first study to assess the validity of the WHO-defined ITZs in a WHO Region. We chose to average the outputs from cluster analysis models with varying model specifications to increase the robustness of our results. However, because of the exploratory nature of our analytical approach, the limited number of seasons included in the cluster analysis, and some inconsistencies in the results, further analyses using alternative methods are warranted to confirm or refute our findings. For instance, this is the first study that has tried to define ITZs by averaging models from multiple clustering techniques, and we had no guidance on what thresholds we should use to define an ITZ. Also, different definitions to determine the start of an influenza season are available [28] and these may lead to different results. In addition, taking into account other parameters of an influenza season may help improve the partition of the WHO European Region into different ITZs.
We recommend that our cluster analysis for the WHO European Region is repeated within 3-4 years (with twice the amount of data) and the investigation is extended to bordering Regions. By including, for instance, Northern Africa and the Middle East, one may be able to categorise countries such as Turkey and Georgia that were not assigned to an ITZ in our analysis. This will allow a better definition of the ITZs for the WHO European Region and world-wide.