
Eurosurveillance, Volume 7, Issue 12, 01 December 2002
Surveillance report
Real-time modelling of influenza outbreaks - a linear regression analysis

Citation style for this article: Mooney JD, Holmes E, Christie P. Real-time modelling of influenza outbreaks - a linear regression analysis. Euro Surveill. 2002;7(12):pii=390. Available online: http://www.eurosurveillance.org/ViewArticle.aspx?ArticleId=390

J.D. Mooney1, E. Holmes2, P. Christie1

1 Scottish Centre for Infection & Environmental Health, Glasgow, United Kingdom
2 University of Strathclyde, Glasgow, United Kingdom


Seasonal outbreaks of influenza exert a considerable burden on health services, and are notorious for their variability from year to year. Making use of historical data from the Scottish sentinel surveillance scheme since 1972, a candidate model has been derived based on simple linear regression. It was applied with a measure of success in the 1999–2000 winter season.
 

Introduction

Influenza outbreaks are notoriously difficult to predict, even when a seasonal outbreak is underway, both in their likely time course and in their severity at an individual and a population level (1). Even two successive annual outbreaks caused by an identical strain of the virus can differ markedly in both the timing and the level of resulting illness in the population.

The major real-time indicator of influenza activity in Scotland comes from the sentinel network of volunteer general practices (2). This spotter scheme currently involves 90 practices in 12 health board areas covering a total of 10% of the Scottish population. Participating practices submit weekly totals for the approximate number of consultations for ‘flu-like illness’, from which a consultation rate per 100 000 can be derived, based on population projections from the reporting sample. Although essentially a voluntary set-up, in which not all health boards are represented, the flu-spotter network has proven to be a consistently reliable early indicator of the onset of seasonal influenza illness since the scheme’s inception in 1972.

As well as illustrating the wide between-season variability of influenza outbreaks in both timing and magnitude, an examination of cumulative plots for spotter data from past seasons reveals classic sigmoid curves (see Figure 1). It was postulated that the rate of increase at the midpoint of the outbreak, where the rise in reporting for flu-like illness is greatest, may be used to predict the likely total number of cases for that season as estimated by the cumulative flu spotter totals. From the cumulative plot, the best approximation that can be measured for the midpoint of a seasonal outbreak would be the maximum rate of increase between any two consecutive weeks.

Methods

Data on consultations for influenza-like illness were available from Scottish GP spotter practices for the years 1972 to 1999. Estimates for the total numbers of cases seen in each week were derived during the flu spotter season (week 40 to week 20 of the following year) by multiplying the overall Scottish rate per 100 000 by 51.2 (population 5.12 million).

The differences between the numbers of cases one week and the next week were calculated for each week of the season and the maximum increase for each year was noted.
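The two steps above (scaling a rate to an estimated number of cases, then taking the largest week-on-week rise) can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names are hypothetical, and the sample values are the estimated weekly cases for weeks 51, 52, 53, 1 and 2 of the 1999–2000 season as listed in Annex 3.

```python
from typing import Sequence

# 5.12 million population / 100 000, the multiplier stated in the Methods.
SCALE_PER_100K = 51.2


def rate_to_cases(rate_per_100k: float, scale: float = SCALE_PER_100K) -> float:
    """Convert a consultation rate per 100 000 into an estimated number of cases."""
    return rate_per_100k * scale


def max_weekly_increase(weekly_cases: Sequence[float]) -> float:
    """Return the largest rise in estimated cases between two consecutive weeks."""
    if len(weekly_cases) < 2:
        raise ValueError("need at least two weeks of data")
    return max(later - earlier
               for earlier, later in zip(weekly_cases, weekly_cases[1:]))


# Estimated weekly cases for weeks 51, 52, 53, 1 and 2 of 1999-2000 (Annex 3).
cases_1999_2000 = [6228, 15942, 31327, 44107, 29586]
print(max_weekly_increase(cases_1999_2000))  # 15385 (the rise from week 52 to week 53)
```

Falling weeks give negative differences, so they never displace the maximum once a rise has been observed.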

The dataset was log-transformed and a linear regression model was fitted to the total number of cases vs. the maximum increase seen for each season between any two consecutive weeks. A 95% prediction interval was calculated for the expected total numbers of cases dependent on the maximum increase. The resulting model was then used to provide weekly estimates of the total numbers of expected cases by week 20 during the ongoing flu seasons of 1999–2000 and 2000–2001.
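A minimal sketch of the fitting step, assuming ordinary least squares on the log-transformed values and a residual standard deviation with n − 2 degrees of freedom. The function name `fit_loglog` is hypothetical, and the sample data are synthetic values chosen only to show that the routine recovers a known line; they are not the Scottish data.

```python
import math


def fit_loglog(max_increase, total_cases):
    """Fit log(total) = a + b * log(max_increase) by ordinary least squares.

    Returns (a, b, s), where s is the residual standard deviation
    about the fitted line on the log scale.
    """
    x = [math.log(d) for d in max_increase]
    y = [math.log(t) for t in total_cases]
    n = len(x)
    if n < 3 or n != len(y):
        raise ValueError("need at least three paired observations")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                  # slope
    a = y_bar - b * x_bar          # intercept
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(rss / (n - 2))   # n - 2 degrees of freedom
    return a, b, s


# Synthetic check: data generated exactly from log(total) = 7.5 + 0.5 * log(d).
d_values = [5000, 10000, 20000, 40000]
totals = [math.exp(7.5) * d ** 0.5 for d in d_values]
a, b, s = fit_loglog(d_values, totals)
print(round(a, 4), round(b, 4))  # 7.5 0.5
```

Applied to the season totals and maximum increases in Annex 2, the same routine would give the fitted intercept, slope and residual scatter; whether it reproduces the published coefficients exactly depends on details of the original analysis that are not stated here.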

Results

Performing a simple linear regression of the total estimated cumulative cases for each season on the maximum increase (d) (corresponding to the sharpest rise in the rate), both log-transformed, gives a significant positive correlation (p < 0.005, R² = 72%), which can be described as follows, with 95% prediction interval* (Figure 2):

log (expected total) = 7.5134 + 0.4693 × log (max. increase)

Giving

Expected total = exp(7.5134) × (max. increase)^0.4693

Upper / lower PI = exp(7.5134 ± 1.96 × 0.1998) × (max. increase)^0.4693

[*95% PI based on the residual standard deviation about the fitted line].
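As a worked check of the expressions above, the following sketch evaluates the expected total and its 95% prediction interval for a given maximum increase. The coefficients and the residual standard deviation (0.1998) are taken from the text; the function name `predict_total` is hypothetical.

```python
import math

INTERCEPT = 7.5134  # fitted intercept on the log scale (from the text)
SLOPE = 0.4693      # fitted slope on the log scale (from the text)
RESID_SD = 0.1998   # residual standard deviation used for the 95% PI (from the text)


def predict_total(max_increase: float, z: float = 1.96):
    """Return (lower, expected, upper) season totals for a maximum weekly increase."""
    if max_increase <= 0:
        raise ValueError("max_increase must be positive to take its logarithm")
    log_pred = INTERCEPT + SLOPE * math.log(max_increase)
    half_width = z * RESID_SD
    return (math.exp(log_pred - half_width),
            math.exp(log_pred),
            math.exp(log_pred + half_width))


# Maximum increase observed in 1999-2000 (week 52 to week 53, Annex 3).
lower, expected, upper = predict_total(15385)
print(round(lower), round(expected), round(upper))
```

For a maximum increase of 15 385 the result should come out close to the published figures of 114 277, 169 057 and 250 096. Because the interval is symmetric on the log scale, the upper limit lies further from the expected total than the lower limit does.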

Application of the model

The utility of the model was then investigated for the winter flu season of 1999/2000. The sharpest increase in the GP spotter rates occurred between weeks 52 and 53 and gave rise to an expected total of 169 057 consultations (95% prediction interval 114 277 to 250 096) for the whole season. At the end of the season, the actual estimate based on cumulative figures from week 40 to week 20 was 175 787, less than 5% above the predicted total. Since there is no way of knowing in advance what the maximum change will be, the estimate of total likely consultations was revised weekly throughout the season, based on the largest week-on-week increase observed to date (see Annex 3). By week 53, the continuously revised estimate indicated, with a probability of 84% (based on the standard deviation of the prediction interval), that 1999–2000 was likely to be more severe than the 1998–99 season, for which the total estimated cases at the end of the season was 137 336.
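The weekly revision described above can be sketched as a running maximum: after each new week, the largest increase seen so far is fed into the model. The sketch below uses the published coefficients and the estimated weekly cases for weeks 40 to 44 of 1999–2000 from Annex 3. The function names are hypothetical, and the probability calculation is an assumption about how a figure such as the reported 84% could be obtained (a normal distribution on the log scale with standard deviation 0.1998); the authors' exact method is not stated.

```python
import math
from statistics import NormalDist

INTERCEPT = 7.5134  # from the text
SLOPE = 0.4693      # from the text
RESID_SD = 0.1998   # from the text


def weekly_revised_predictions(weekly_cases):
    """Predicted season total after each week, using the maximum increase to date."""
    predictions = []
    max_increase = 0.0
    for previous, current in zip(weekly_cases, weekly_cases[1:]):
        max_increase = max(max_increase, current - previous)
        if max_increase > 0:
            predictions.append(math.exp(INTERCEPT + SLOPE * math.log(max_increase)))
        else:
            predictions.append(None)  # no rise observed yet, so no estimate
    return predictions


def prob_exceeds(predicted_total, threshold, resid_sd=RESID_SD):
    """Assumed calculation: P(season total > threshold) under log-scale normality."""
    z = (math.log(predicted_total) - math.log(threshold)) / resid_sd
    return NormalDist().cdf(z)


# Estimated weekly cases for weeks 40 to 44 of 1999-2000 (Annex 3).
cases_weeks_40_to_44 = [547, 821, 876, 1028, 1367]
for week, prediction in zip(range(41, 45), weekly_revised_predictions(cases_weeks_40_to_44)):
    print(week, round(prediction))

# Chance that 1999-2000 exceeds the 1998-99 total of 137 336, given the week 53 estimate.
print(round(prob_exceeds(169057, 137336), 2))
```

The prediction stays flat in weeks 42 and 43 because neither week's rise exceeds the earlier maximum of 274, matching the repeated values in Annex 3. The probability from this assumed calculation falls in the mid-80s per cent, in the same range as the 84% reported.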

A revised model incorporating the results of the 1999–2000 season,

log (expected total) = 7.526 + 0.468 × log (max. increase),

giving

Expected total = exp(7.526) × (max. increase)^0.468,

was then applied during the 2000/01 season, a winter that saw the lowest flu activity since 1972, with spotter rates that rarely exceeded the baseline threshold level of 50 consultations per 100 000 population (5). Even at this very low level of activity, the final cumulative total for consultations (54 033) was still within the predicted range (predicted total 46 556; 95% PI 7 089 to 305 775).

Discussion

Seasonal outbreaks of influenza are difficult to predict for a number of reasons. The continual antigenic changes between seasons, the introduction of new viral strains, the high proportions of sub-clinical infections and continuing controversy over factors which affect transmission all combine to frustrate attempts to model or define a ‘typical’ influenza outbreak. However, since even modest influenza outbreaks can exert additional pressures on health services, the benefits for planning and healthcare purposes of a model that is simple to apply and has some capacity to predict the course of an ongoing outbreak are self-evident.

The main drawbacks of the above model as a predictive tool are, firstly, the very wide prediction intervals that accompany the estimated eventual size of the outbreak and, secondly, that, like all linear regression models, it becomes less reliable at the extreme ends of the range of the source data on which it is based (3). Because prediction intervals directly reflect the scatter of the individual data about the fitted line, they are invariably much wider than the equivalent confidence interval for the fitted values (6). In theory it should also be possible to refine the model with each additional season, although the nature of prediction intervals means that any resulting reduction in their width is likely to be small. The increasing availability of rapid virological testing also makes it possible to identify quickly the underlying virus types that are contributing to an increase in illness presentation (e.g. A alone, B alone or A + B). The well-established differences in severity and population health impact between A and B strains (7) may mean that introducing interaction terms to the regression, according to the epidemic type as suggested by Dab et al., could improve the predictive capability of the model (8). A model that took account of virus type might also begin to address the likely time course of an ongoing outbreak, which is often as important a consideration for health service planning as the overall population attack rate.
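The contrast between prediction and confidence intervals noted above can be made explicit with the standard textbook expressions for simple linear regression. Here x is log (max. increase), y is log (total), s is the residual standard deviation, n is the number of seasons, and the symbols x_0, \bar{x} and S_{xx} are notation introduced for this illustration rather than taken from the article.

```latex
\begin{align*}
\text{Confidence interval for the fitted value at } x_0:\quad
  & \hat{y}_0 \pm t_{n-2}\, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\
\text{Prediction interval for a new season at } x_0:\quad
  & \hat{y}_0 \pm t_{n-2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \\
\text{where}\quad
  & S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2
\end{align*}
```

The extra 1 under the square root is the scatter of an individual season about the line. It does not shrink as n grows, which is why adding seasons narrows the prediction interval only slightly. The term in (x_0 − \bar{x})^2 grows towards the ends of the data range, which is the sense in which the model is less reliable at the extremes. The interval quoted in the Results, based on ±1.96 times the residual standard deviation, appears to correspond to keeping only that leading term.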

Although the limitations of the model prevent its adoption as a definitive predictive tool, its usefulness lies in its capacity to provide a dynamic, weekly revisable estimate of the likely severity of an ongoing flu outbreak. While the current model does not specifically address the timing of any peak, large increases in consulting rates are likely to be followed by higher workloads in secondary health services. Additionally, although consulting patterns are not by any means the only indicator of influenza activity, they are certainly the timeliest, and sentinel practice networks like that in Scotland are used widely throughout Europe (9). Variations of the presented model may therefore also be of interest to other countries that have a significant historical dataset.

Conclusion

Tillett and Spencer have previously highlighted the potential of cumulative totals of GP consultations, among other indicators, for describing the extent of influenza outbreaks in England and Wales (4). The model presented here demonstrates that it is possible to describe the relationship between cumulative total numbers of consultations and the maximum weekly increase for seasonal outbreaks of influenza using simple linear regression, allowing predictions for the eventual size of an outbreak to be revised as the winter season progresses. The wide prediction interval seen during the exceptionally mild influenza season of 2000–01, although in keeping with the diminishing applicability of regression models at the extremes of their range, is probably not a serious practical limitation, in that the main use of the model would be to flag up potentially large epidemics as early as possible. The increased availability of rapid virological testing may make it possible to further refine models such as that presented here, on the basis of the type(s) of influenza in circulation in any one season.

Annex 1. Winter season

| Week no. | 94-95 | 95-96 | 96-97 | 97-98 | 98-99 | *99-00 | *00-01 |
|---|---|---|---|---|---|---|---|
| 40 | 1863.68 | 1709.568 | 1336.931 | 500 | 1204.164 | 823.36 | 1389.42 |
| 41 | 4290.56 | 3850.752 | 2791.19 | 1139.648 | 2037.816 | 1698.18 | 2315.7 |
| 42 | 6656 | 6475.264 | 4548.035 | 3631.856 | 4132.238 | 2727.38 | 4219.72 |
| 43 | 9057.28 | 10698.24 | 6777.797 | 6036.067 | 5603.994 | 4065.34 | 5660.6 |
| 44 | 11822.08 | 14717.44 | 8470.831 | 8044.551 | 7369.072 | 5454.76 | 7307.32 |
| 45 | 14510.08 | 20331.01 | 10296.12 | 10598.51 | 10214.81 | 7152.94 | 9005.5 |
| 46 | 17146.88 | 31774.21 | 12739.44 | 14304.15 | 12499.63 | 8645.28 | 10343.46 |
| 47 | 20413.44 | 46856.19 | 15036.1 | 17013 | 15257.89 | 10600.76 | 11629.96 |
| 48 | 23470.08 | 66041.34 | 18284.25 | 20868.9 | 18726.29 | 13791.28 | 12659.16 |
| 49 | 26101.76 | 86018.56 | 23800.25 | 24574.53 | 21674.95 | 16981.8 | 14048.58 |
| 50 | 29178.88 | 105393.2 | 30764.85 | 28399.04 | 25771.17 | 23157 | 15540.92 |
| 51 | 32460.8 | 124280.8 | 40738.82 | 31496.42 | 29270.45 | 38955.22 | 17393.48 |
| 52 | 34949.12 | 138060.3 | 59248.99 | 34459.48 | 32563.89 | 69934.14 | 19709.18 |
| 1 | 37698.56 | 151779.8 | 93457.54 | 37703.01 | 40797.49 | 113109.1 | 22590.94 |
| 2 | 41216 | 162918.4 | 132469.9 | 42823.79 | 56698.63 | 142389.8 | 25781.46 |
| 3 | 44661.76 | 169642 | 162121.6 | 46930.3 | 72239.55 | 155666.5 | 28354.46 |
| 4 | 48030.72 | 174107.1 | 179232.1 | 50527.35 | 87008.57 | 162407.8 | 31030.38 |
| 5 | 51456 | 177203.2 | 192917.9 | 53084.92 | 97763.71 | 165649.7 | 33140.24 |
| 6 | 55183.36 | 179746.8 | 203368.4 | 57175.99 | 106203.1 | 168325.7 | 36227.84 |
| 7 | 59361.28 | 182132.7 | 210563 | 61014.9 | 111915.2 | 170075.3 | 38800.84 |
| 8 | 64184.32 | 183786 | 215819.6 | 67087.18 | 116392.2 | 171207.4 | 41270.92 |
| 9 | 68198.4 | 185470 | 220106.8 | 73097.71 | 119325.4 | 172185.2 | 43072.02 |
| 10 | 72576 | 186652.7 | 222397.3 | 78444.4 | 122104.3 | 172802.7 | 45542.1 |
| 11 | 77317.12 | 187955.2 | 224594.6 | 83600.7 | 124780.2 | 173265.8 | 47857.8 |
| 12 | 82150.4 | 189530.6 | 226985.9 | 89348.78 | 127713.4 | 173523.1 | 49658.9 |
| 13 | 86579.2 | 191117.3 | 228600.2 | 94067.66 | 129514.5 | 173986.3 | 50996.86 |
| 14 | 90341.38 | 192132.1 | 229484.3 | 98179.31 | 130955.4 | 174397.9 | 51974.6 |
| 15 | 92318.21 | 193004 | 230396.7 | 101570.5 | 132087.5 | 174706.7 | 52900.88 |
| 16 | 93951.49 | 193682.9 | 231032.8 | 103634.1 | 132910.9 | 175118.4 | 53312.56 |
| 17 | 95152.64 | 194412 | 231993.5 | 105491.8 | 133837.2 | 175324.2 | 53569.86 |
| 18 | 96152.58 | 195476.5 | 232564.7 | 106989.3 | 135483.9 | 175427.1 | 53827.16 |
| 19 | 96593.41 | 195809.8 | 233001.1 | 107622.2 | 136513.1 | 175633 | 53930.08 |
| 20 | 97392.13 | 195993.6 | 233397.9 | 108250 | 137336.4 | 175787.4 | 54033 |

*99/00 and 00/01 seasons (cumulative consultations from spotter data) also shown.

Annex 2. Data and table: Regression line and constituent values used for model

| Year | Total est. cases | Max. increase | Lower limit PI | Predicted value | Upper limit PI |
|---|---|---|---|---|---|
| 72 | 199731 | 28313 | 104826.0 | 220514.1 | 336202.2 |
| 73 | 231526 | 45824 | 171721.9 | 290533.8 | 409345.7 |
| 74 | 213606 | 25190 | 92354.9 | 208026.4 | 323698.0 |
| 75 | 466125 | 64972 | 239432.7 | 367099.2 | 494765.7 |
| 76 | 193434 | 14438 | 48155.9 | 165033.4 | 281910.8 |
| 77 | 299162 | 48742 | 182386.6 | 302201.7 | 422016.8 |
| 78 | 244378 | 19610 | 69659.7 | 185714.2 | 301768.7 |
| 79 | 222925 | 20941 | 75121.1 | 191036.3 | 306951.6 |
| 80 | 197274 | 21299 | 76584.9 | 192467.8 | 308350.8 |
| 81 | 396083 | 33945 | 126897.8 | 243034.3 | 359170.7 |
| 82 | 312576 | 28825 | 106854.8 | 222561.4 | 338268.0 |
| 83 | 202035 | 37376 | 140082.2 | 256753.5 | 373424.8 |
| 84 | 162560 | 15514 | 52666.3 | 169335.9 | 286005.4 |
| 85 | 247091 | 24371 | 89056.9 | 204751.6 | 320446.3 |
| 86 | 182989 | 18637 | 65648.4 | 181823.5 | 297998.7 |
| 87 | 134349 | 7424 | 18292.1 | 136987.1 | 255682.1 |
| 88 | 165120 | 18022 | 63104.8 | 179364.4 | 295624.0 |
| 89 | 242125 | 60621 | 224497.0 | 349701.2 | 474905.5 |
| 90 | 103475 | 7015 | 16526.4 | 135351.7 | 254177.0 |
| 91 | 169011 | 20429 | 73023.8 | 188989.1 | 304954.3 |
| 92 | 151589 | 21569 | 77687.5 | 193547.5 | 309407.4 |
| 93 | 243968 | 45210 | 169460.9 | 288078.6 | 406696.4 |
| 94 | 97392 | 4833 | 7062.3 | 126626.7 | 246191.1 |
| 95 | 197433 | 19978 | 71172.7 | 187185.7 | 303198.7 |
| 96 | 232208 | 38815 | 145553.7 | 262507.5 | 379461.4 |
| 97 | 109114 | 6073 | 12449.7 | 131585.0 | 250720.3 |
| 98 | 134420 | 15902 | 54288.1 | 170887.3 | 287486.6 |

Annex 3. Season 1999 / 2000 – Predicted total consultations based on weekly changes

| Week no. | Total cases (no.) | Total denom. | Rate per 100,000 | Total cases (rate × 52) | Increase over prev. week | *Pred. total for season | Lower CI | Upper CI | Cumulative to date |
|---|---|---|---|---|---|---|---|---|---|
| 40 | 48 | 456278 | 10.5199 | 547 | | | | | 547 |
| 41 | 68 | 430612 | 15.79148 | 821 | 274 | 25531 | 17258 | 37769 | 1368 |
| 42 | 72 | 427399 | 16.84609 | 876 | 55 | 25531 | 17258 | 37769 | 2244 |
| 43 | 86 | 434944 | 19.77266 | 1028 | 152 | 25531 | 17258 | 37769 | 3272 |
| 44 | 118 | 448722 | 26.29691 | 1367 | 339 | 28213 | 19071 | 41737 | 4639 |
| 45 | 117 | 441591 | 26.49511 | 1378 | 10 | 28213 | 19071 | 41737 | 6017 |
| 46 | 144 | 440840 | 32.66491 | 1699 | 321 | 28213 | 19071 | 41737 | 7716 |
| 47 | 127 | 437530 | 29.02658 | 1509 | -189 | 28213 | 19071 | 41737 | 9225 |
| 48 | 164 | 430929 | 38.05731 | 1979 | 470 | 32888 | 22232 | 48654 | 11204 |
| 49 | 263 | 427391 | 61.53616 | 3200 | 1221 | 51478 | 34798 | 76155 | 14404 |
| 50 | 272 | 436715 | 62.28318 | 3239 | 39 | 51478 | 34798 | 76155 | 17643 |
| 51 | 536 | 447541 | 119.7656 | 6228 | 2989 | 78360 | 52969 | 115922 | 23871 |
| 52 | 1394 | 454703 | 306.5737 | 15942 | 9714 | 136243 | 92096 | 201552 | 39813 |
| 53 | 2291 | 380284 | 602.4445 | 31327 | 15385 | 169057 | 114277 | 250096 | 71140 |
| 1 | 3478 | 410038 | 848.2141 | 44107 | 12780 | 169057 | 114277 | 250096 | 115247 |
| 2 | 2501 | 439569 | 568.9664 | 29586 | -14521 | 169057 | 114277 | 250096 | 144833 |
| 3 | 1134 | 440012 | 257.7202 | 13401 | -16185 | 169057 | 114277 | 250096 | 158234 |
| 4 | 570 | 435466 | 130.8943 | 6807 | -6595 | 169057 | 114277 | 250096 | 165041 |
| 5 | 276 | 435990 | 63.3042 | 3292 | -3515 | 169057 | 114277 | 250096 | 168333 |
| 6 | 228 | 440821 | 51.72167 | 2690 | -602 | 169057 | 114277 | 250096 | 171023 |
| 7 | 148 | 440627 | 33.5885 | 1747 | -943 | 169057 | 114277 | 250096 | 172770 |
| 8 | 96 | 432945 | 22.17372 | 1153 | -594 | 169057 | 114277 | 250096 | 173923 |
| 9 | 79 | 424330 | 18.61759 | 968 | -185 | 169057 | 114277 | 250096 | 174891 |
| 10 | 54 | 435489 | 12.39985 | 645 | -323 | 169057 | 114277 | 250096 | 175536 |
| 11 | 40 | 449283 | 8.903074 | 463 | -182 | 169057 | 114277 | 250096 | 175999 |
| 12 | 23 | 427132 | 5.384752 | 280 | -183 | 169057 | 114277 | 250096 | 176279 |
| 13 | 39 | 420583 | 9.272843 | 482 | 202 | 169057 | 114277 | 250096 | 176761 |
| 14 | 33 | 436850 | 7.55408 | 393 | -89 | 169057 | 114277 | 250096 | 177154 |
| 15 | 25 | 430433 | 5.808105 | 302 | -91 | 169057 | 114277 | 250096 | 177456 |
| 16 | 36 | 429079 | 8.390063 | 436 | 134 | 169057 | 114277 | 250096 | 177892 |
| 17 | 15 | 346809 | 4.325147 | 225 | -211 | 169057 | 114277 | 250096 | 178117 |
| 18 | 10 | 411464 | 2.430346 | 126 | -99 | 169057 | 114277 | 250096 | 178243 |
| 19 | 12 | 334341 | 3.58915 | 187 | 60 | 169057 | 114277 | 250096 | 178430 |
| 20 | 10 | 336056 | 2.975695 | 155 | -32 | 169057 | 114277 | 250096 | 178585 |

*Predicted total based on the maximum increase to date.


References

1. Cliff AD. Statistical modelling of measles and influenza outbreaks. Statistical Methods in Medical Research 1993; (2): 43-73.

2. Christie P, Mooney J. Surveillance Report on Flu Spotters data 1999-2000. SCIEH Weekly Report 2000; 34(36): 218-219.

3. Kirkwood BR. Correlation and Linear Regression. Chapter 9 in Essentials of Medical Statistics. Blackwell Science Ltd. Oxford 1998; p57-64.

4. Tillett HE, Spencer IL. Influenza surveillance in England and Wales using routine statistics. Development of 'cusum' graphs to compare 12 previous winters and to monitor the 1980/81 winter. J Hygiene 1982 Feb; 88(1): 83-94.

5. Christie P, Mooney J, Smith A. Surveillance Report on Flu Spotters data and SERVIS scheme 2000-2001. SCIEH Weekly Report 2001; 35(24): 154.

6. Altman D. Relationship between two continuous variables. Chapter 11 in Practical Statistics for Medical Research. Chapman & Hall. London 1995; p277-234.

7. Monto AS. Individual and community impact of influenza. Pharmacoeconomics 1999; 16 Suppl 1: 1-6.

8. Dab W, Quenel P, Cohen JM, Hannon C. A new influenza surveillance system in France: the Ile-de-France "GROG". 2. Validity of indicators (1984-89). Eur J Epidemiol 1991; 7(6): 579-87.

9. Zambon M. Sentinel surveillance of influenza in Europe, 1997/1998. Eurosurveillance 1998; 3: 29-31.

 


