Predictive performance of telenursing complaints in influenza surveillance: a prospective cohort study in Sweden

T Timpka (toomas.timpka@liu.se)1,2,3, A Spreco2,3, O Eriksson3, Ö Dahlström4, E A Gursky5, M Strömgren6, E Holm6, J Ekberg1,2, J Hinkula7, J M Nyce8, H Eriksson3
1. Department of Public Health, Östergötland County Council, Linköping, Sweden
2. Department of Medical and Health Sciences, Linköping University, Linköping, Sweden
3. Department of Computer and Information Science, Linköping University, Linköping, Sweden
4. Linnaeus Centre HEAD, Department of Behavioural Sciences, Linköping University, Linköping, Sweden
5. National Strategies Support Directorate, ANSER/Analytic Services Inc, Arlington, Virginia, United States of America
6. Department of Social and Economic Geography, Umeå University, Umeå, Sweden
7. Department of Clinical and Experimental Medicine, Linköping University, Linköping, Sweden
8. Department of Anthropology, Ball State University, Muncie, Indiana, United States of America


Background
Data source alternatives to mandatory reporting by microbiological laboratories and sentinel physician practices have been sought to improve the timely detection of influenza outbreaks [1,2]. Telenursing refers to call centres staffed by registered nurses who perform counselling and patient triage as a means of augmenting self-care support and regulating patient access to medical services. This strategy is rapidly expanding in many countries, with prominent examples in Sweden, the United Kingdom (UK) and Canada [3], and telenursing call data have been regarded as a promising source for syndromic surveillance [4-6]. However, a study published in 2009 reported low validity of current telenursing data for monitoring and predicting influenza outbreaks [7]. Several possible reasons for this shortcoming were identified, such as a lack of specificity due to broad definitions of influenza-like illness (ILI) and the use of suboptimal evaluation methods. Cough and high fever with rapid onset have been regarded as the symptoms that best discriminate influenza infection [8], but it has recently been observed that, when aggregated at the population level, the incidence of influenza symptoms may differ between age groups, patient status (hospitalised or outpatient), and (sub)type of influenza virus [9]. In a patient cohort in the United States (US) with confirmed influenza A(H1N1)pdm09 virus infection, the most common symptoms were fever (94%) and cough (92%), followed by sore throat (66%), diarrhoea (25%), and vomiting (25%) [9]. In a parallel Singapore cohort of A(H1N1)pdm09 virus-infected patients, similar proportions with cough (88%), fever (79%), and sore throat (54%) were reported, but far fewer patients described vomiting (1.1%) and diarrhoea (0.7%) [10]. The latter study also reported differences in symptom patterns between patients presenting with seasonal influenza (H3N2, H1N1, and B) and those with pandemic influenza A(H1N1)pdm09.
This study examines the use of data from calls to telenursing services as population-level predictors of influenza season activity and its progression. It employs data from the Swedish telenursing service Healthcare Direct/1177 and an electronic health data repository covering an entire county population [11]. The electronic repository collects data from all calls made by the county residents to the nation-wide telenursing service, and data from all healthcare episodes provided in the county at primary and secondary levels. Specifically, the aim of the study is to examine the prospective performance of chief complaints documented during telenursing calls, hereafter referred to as telenursing chief complaints, in predicting influenza activity on a daily and a weekly basis during a three-year period.

Methods
This observational study uses an open cohort design based on the total population in a Swedish county. A detection algorithm was calibrated using retrospective data from two years (covering influenza winter seasons 2007/08 and 2008/09) and then prospectively evaluated during a three-year period (covering the pandemic 2009/10 and influenza winter seasons 2010/11 and 2011/12). The study was based on administrative public health databases established for the purpose of systematically and continuously developing the quality of service. In accordance with Swedish legislation (SFS 2008:355), personal identifiers were removed from the records. The study design was approved by the Regional Research Ethics Board in Linköping (2012/104-31).
The study was performed in Östergötland (population 427,000), located in south-eastern Sweden. The daily and weekly rates of clinical cases were used as measures of influenza activity. An account of the age-stratified influenza activity in the county has previously been reported [12]. Annual aggregated data on the sex, age, and residence of the population were collected from Statistics Sweden [13]. Data on Östergötland residents who had contacted the telenursing service or had been clinically diagnosed with influenza were retrieved from the electronic health data repository associated with the county-wide electronic health record systems at the County Council. Data from the clinical laboratories, however, were only collected for this study during the period from 1 January 2009 to 15 September 2010. Influenza cases were identified from the electronic health data repository by the International Classification of Diseases version 10 (ICD-10) codes for influenza (J10.0, J10.1, J10.8, J11.0, J11.1, J11.8) [14]. For individuals who had received an influenza diagnosis at both primary and secondary levels of care, the diagnosis code recorded at the first contact was used for the analyses. If the codes were recorded on the same day, only the secondary-level diagnosis code was used. ILI-related telenursing call cases were identified by the chief complaint codes associated with influenza symptoms (dyspnoea, fever (child, adult), cough (child, adult), sore throat, lethargy, syncope, dizziness, and headache (child, adult)) from the fixed-field terminology register.

Case data validation
The influenza case data defined by clinical diagnoses were validated against case data from the microbiological laboratories for the period 1 January 2009 to 15 September 2010. In these analyses, both data sets were separately adjusted for weekday effects on care resource utilisation. The correlations between the number of cases reported each day in the clinical and laboratory data were analysed with lags of 0 to 6 days. The results showed a strong correlation between the number of clinically diagnosed influenza cases per day and the corresponding number of cases verified daily by microbiological analyses during the validation period. The strongest correlation (r=0.625; p<0.001) was observed between the clinically diagnosed and the microbiologically verified cases with a two-day lag.
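As an illustration only, the lagged correlation analysis described above can be sketched in a few lines of Python. The function names and the daily counts are hypothetical, the direction of the lag (which series leads) is an assumption, and the weekday adjustment applied in the study is not reproduced here.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def lagged_correlations(leading, following, max_lag):
    """Correlate leading[t] with following[t + lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = leading[: len(leading) - lag]   # drop the last `lag` days
        y = following[lag:]                 # drop the first `lag` days
        result[lag] = pearson(x, y)
    return result

# Hypothetical daily counts: the second series repeats the first two days later.
clinical = [1, 3, 2, 6, 9, 14, 10, 7, 4, 2, 1, 1]
laboratory = [0, 0] + clinical[:-2]

by_lag = lagged_correlations(clinical, laboratory, max_lag=6)
best_lag = max(by_lag, key=by_lag.get)
print(best_lag, round(by_lag[best_lag], 3))
```

In this constructed example the correlation peaks at the lag that was built into the data; with real surveillance series the peak is broader and its significance would have to be tested separately.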

Retrospective algorithm calibration
A calibration procedure [15] was used to determine the telenursing chief complaint grouping that retrospectively demonstrated the best predictive accuracy with regard to the influenza case rates. Data from two influenza seasons (2007/08 and 2008/09) were collected and used to determine the influenza activity prediction (IAP) grouping having the strongest correlation with case rates and the best-performing threshold for alerts. Initial calibrations using correlations have been suggested to complement the application of a threshold or scan statistic in the analyses of surveillance performance [16]. Correlations between chief complaints documented during telenursing calls and influenza case rates were, therefore, first examined for all possible combinations of complaints, with the calls preceding physician diagnosis by a time lag starting from 0 days and extended until the correlations started to decay, but at least up to 14 days. The three groupings of chief complaints with the strongest correlation to the influenza case rate for each time lag were listed. The chief complaint grouping with the largest correlation strength was chosen as the IAP grouping to be used in the following analyses of predictive performance.
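The search over complaint groupings and time lags can be sketched as follows. This is a minimal illustration under stated assumptions, not the calibration procedure of reference [15]: the function names, the complaint labels and all counts are hypothetical, and a grouping is scored simply by the Pearson correlation between its summed daily calls and the case counts a given number of days later.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def rank_groupings(calls_by_complaint, cases, max_lag, top_n=3):
    """For each lag, list the top_n complaint groupings by correlation."""
    names = sorted(calls_by_complaint)
    ranking = {}
    for lag in range(max_lag + 1):
        scored = []
        for size in range(1, len(names) + 1):
            for grouping in combinations(names, size):
                # Daily call total over the complaints in this grouping.
                series = (calls_by_complaint[c] for c in grouping)
                total = [sum(day) for day in zip(*series)]
                # Calls on day t are paired with cases on day t + lag.
                r = pearson(total[: len(total) - lag], cases[lag:])
                scored.append((r, grouping))
        scored.sort(reverse=True)
        ranking[lag] = scored[:top_n]
    return ranking

# Hypothetical data: fever calls lead the case counts by two days.
cases = [0, 0, 1, 2, 4, 8, 12, 9, 5, 3, 1, 0, 0, 0]
calls = {
    "fever": cases[2:] + [0, 0],
    "cough": [5, 4, 6, 5, 4, 6, 5, 4, 6, 5, 4, 6, 5, 4],
    "sore throat": [3, 3, 2, 4, 3, 2, 4, 3, 3, 2, 4, 3, 2, 4],
}

ranking = rank_groupings(calls, cases, max_lag=4)
best_lag = max(ranking, key=lambda lag: ranking[lag][0][0])
print(best_lag, ranking[best_lag][0])
```

The number of groupings grows as 2^k - 1 for k complaints, so an exhaustive search of this kind is only practical for a short complaint list such as the one used here.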
In the second calibration step, a detection algorithm was used to examine the performance of the selected IAP grouping. A Shewhart-type algorithm [17], in which the signal decisions depend on the observed measure of activity in the current time period, was used. The baseline temporal trend of calls to Healthcare Direct due to complaints included in the IAP grouping was estimated in the retrospective data set using the formula b0 + b1t, where b0 is the intercept, b1 the slope, and t is time (day). The actual number of calls to Healthcare Direct was then identified for each day, and the estimated baseline value for that day was subtracted from the recorded number. If the difference was positive, the value was saved, and if it was negative, it was set to 0. This transformation yielded an adjusted set of telenursing data. To detect outbreaks on a daily basis, we calculated a moving average for the adjusted data set (the value for day 8 is the average number of calls for days 1-7, the value for day 9 is the average number of calls for days 2-8, etc.). The threshold levels for signalling an alert were determined using Receiver Operating Characteristic (ROC) curves. The area under the ROC curve (AUC), calculated from plots of the sensitivity and 1-specificity of the outbreak predictions, and the positive predictive value (PPV) of these predictions, on a daily and a weekly basis, were used as performance indicators [18]. The limit for the start and end time of influenza outbreaks was set to 1.8 cases/100,000 during a floating seven-day period [19].
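The transformation and alerting steps described above can be sketched as follows. The sketch rests on assumptions that the text does not settle: the baseline b0 + b1t is fitted here by ordinary least squares, an alert is raised when the moving average strictly exceeds the threshold, and all function names and call rates are hypothetical.

```python
def fit_linear_baseline(counts):
    """Least-squares fit of counts[t] = b0 + b1 * t for t = 0, 1, 2, ..."""
    n = len(counts)
    t_mean = (n - 1) / 2
    y_mean = sum(counts) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(counts))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    b1 = sxy / sxx
    b0 = y_mean - b1 * t_mean
    return b0, b1

def excess_over_baseline(counts, b0, b1, start=0):
    """Subtract the baseline for each day and set negative differences to 0."""
    return [max(0.0, y - (b0 + b1 * (start + i))) for i, y in enumerate(counts)]

def trailing_average(values, window=7):
    """Value at index i is the mean of the `window` preceding days (day i excluded)."""
    head = [None] * window  # no full window available yet
    tail = [sum(values[i - window:i]) / window for i in range(window, len(values))]
    return head + tail

def alerts(averages, threshold):
    """True on days where the moving average exceeds the threshold."""
    return [a is not None and a > threshold for a in averages]

# Hypothetical calibration series (calls/day/100,000) with a linear trend.
calibration = [10 + 0.5 * t for t in range(20)]
b0, b1 = fit_linear_baseline(calibration)

# Hypothetical new data continuing at t = 20, with 2 extra calls from day 8 on.
extra = [0] * 7 + [2] * 7
new_calls = [10 + 0.5 * (20 + i) + e for i, e in enumerate(extra)]

excess = excess_over_baseline(new_calls, b0, b1, start=20)
averages = trailing_average(excess, window=7)
alert_flags = alerts(averages, threshold=0.9)
print(alert_flags.index(True))
```

Because the moving average for a given day uses only the seven preceding days, the alert in this example is raised four days after the excess calls begin, which illustrates the smoothing delay inherent in the approach.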

Statistical data analysis
The IAP grouping and threshold were prospectively evaluated using data from three subsequent seasons (2009/10 to 2011/12). During this period, the telenursing data were adjusted to the baseline temporal trend with the same methods as used in the retrospective algorithm calibration. For the evaluation, correlations with influenza case rates were calculated and estimates of the AUC and PPV computed as performance indicators. The level of statistical significance was set to p<0.05. To denote the strength of correlations, limit values were applied as suggested by the Cohen Scale [20]. This scale defines small, medium and large effect sizes as 0.10, 0.30, and 0.50, respectively. The limits for interpreting the AUC (or c-statistic) were set to 0.90, 0.80, and 0.70, denoting very strong (outstanding), strong (excellent), and acceptable discriminatory performance, respectively [21]. The analyses were performed using SPSS version 19, R Statistical Software version 2.15.2, and Minitab Statistical Software version 16.1.1.
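For readers who wish to reproduce the performance indicators, the following sketch shows one way of computing sensitivity, specificity, PPV and AUC from daily alert scores, together with the interpretation limits stated above. It is an illustration under assumptions: the AUC is computed with the rank-based (Mann-Whitney) formulation, which is equivalent to the area under the empirical ROC curve, the limits are treated as inclusive lower bounds, the labels "below acceptable" and "negligible" are added here for completeness, and all data are hypothetical.

```python
def confusion_metrics(alert_flags, outbreak):
    """Sensitivity, specificity and PPV of binary alerts against outbreak days."""
    pairs = list(zip(alert_flags, outbreak))
    tp = sum(a and o for a, o in pairs)
    fp = sum(a and not o for a, o in pairs)
    fn = sum(o and not a for a, o in pairs)
    tn = sum(not a and not o for a, o in pairs)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }

def auc(scores, outbreak):
    """Probability that an outbreak day scores higher than a non-outbreak day."""
    pos = [s for s, o in zip(scores, outbreak) if o]
    neg = [s for s, o in zip(scores, outbreak) if not o]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def describe_auc(value):
    """Interpretation limits 0.90 / 0.80 / 0.70 (assumed inclusive)."""
    if value >= 0.90:
        return "very strong"
    if value >= 0.80:
        return "strong"
    if value >= 0.70:
        return "acceptable"
    return "below acceptable"  # label added for this sketch

def describe_correlation(r):
    """Cohen limits 0.10 / 0.30 / 0.50 (assumed inclusive)."""
    size = abs(r)
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "negligible"  # label added for this sketch

# Hypothetical moving-average scores and outbreak labels for eight days.
scores = [0.1, 0.2, 0.5, 1.2, 1.5, 0.4, 1.0, 0.2]
outbreak = [False, False, False, True, True, True, True, False]
flags = [s > 0.9 for s in scores]

metrics = confusion_metrics(flags, outbreak)
area = auc(scores, outbreak)
print(metrics, area, describe_auc(area))
```

The binary metrics depend on the chosen alert threshold, whereas the AUC summarises discrimination over all possible thresholds, which is why both kinds of indicator are reported side by side.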

Results
The highest incidence of influenza cases during the study period was recorded for the influenza winter season 2011/12 in Östergötland county (1.9 cases/day/100,000; 8.2 cases/day in the county) (Table 1). The average number of telenursing calls recorded during an influenza winter season or pandemic with a chief complaint in the ILI category increased from 20.0 calls/day/100,000 (84.4 calls/day) during the B and A H1 influenza winter season in 2007/08 to 27.9 calls/day/100,000 (120.4 calls/day) during the H3N2 influenza winter season in 2011/12. Correspondingly, the calls with a chief complaint in the ILI category during the intermittent periods increased from 15.4 calls/day/100,000 (65.3 calls/day) in May-November 2008 to 21.1 calls/day/100,000 (90.9 calls/day) in May-December 2011.

Retrospective calibration
The grouping of chief complaints with the largest correlation strength on a daily basis (r=0.66; p<0.001) and the longest lead time (14 days) to influenza case rates in the retrospective data was fever (child, adult) and syncope (Table 2). On a weekly basis, the strength of the correlation was larger (r=0.91; p<0.001), while the lead time remained at two weeks. The chief complaints cough (child, adult), lethargy, dizziness, and headache (child) were included in groupings showing large correlation strength (r>0.5) with influenza case rates. The chief complaints not included in any grouping reaching the level of statistical significance were sore throat and common cold. Based on these observations, fever (child, adult) and syncope were chosen as the IAP complaint grouping for use in alerts. The alerting threshold was set to a moving average of 0.9 calls/day/100,000 above the baseline level (Figure 1). The performance of alerts on a daily basis (14 days lead time) was very strong (AUC=0.94; PPV=0.92); the specificity was 0.94 and the sensitivity was 0.85. For alerts on a weekly basis, the threshold was set to 4.7 calls/week/100,000 above the baseline level. For the weekly alerts (two weeks lead time), the retrospective performance was also very strong (AUC=0.93; PPV=0.96); the specificity was 0.97 and the sensitivity 0.87.

Prospective evaluation
The strength of the correlation between telenursing call rates for the IAP grouping and influenza case rates was, on a daily basis, slightly smaller than observed in the retrospective analysis but still large (r=0.59; p<0.001). The correlation strength was smaller during the first part of the period (July 2009 to June 2010), including the 2009 influenza pandemic (r=0.56; p<0.001), than in the second part of the period (July 2010 to April 2012), including only winter influenza seasons (r=0.64; p<0.001) (Figure 2). Similarly, the weekly correlation strength was smaller than observed in the retrospective data (r=0.80; p<0.001). Here it was also smaller during the first part of the evaluation period (r=0.76; p<0.001) than in the later part including only influenza winter seasons (r=0.86; p<0.001).
The AUC for the 14-day predictions on a daily basis was 0.87 (PPV=0.75) for the entire prospective evaluation period; the specificity was 0.88 and the sensitivity was 0.67 (Figure 3). The performance was acceptable for the part of the evaluation period including the 2009 influenza pandemic (AUC=0.84; PPV=0.58), while it was strong (AUC=0.89; PPV=0.93) for the remaining period including only influenza winter seasons. On a weekly basis, the performance was strong (AUC=0.81; PPV=0.90) for the entire prospective evaluation period; the specificity was 0.94 and the sensitivity was 0.68. Also on a weekly basis, the performance of predictions was acceptable for the pandemic outbreak (AUC=0.78; PPV=0.79) and strong for the influenza winter seasons (AUC=0.83; PPV=1.00).

Discussion
This is the first study of the predictive performance of telenursing data in influenza surveillance based on recommended standard statistical outcome measures for evaluations of methods for forecasting infectious disease activity [22]. The complaint grouping found in retrospective analyses of two consecutive influenza winter seasons to have the longest lead time and strongest correlation to variations in influenza case rates was fever (child, adult) and syncope. Influenza-like illness (ILI) complaints included in the analysis were dyspnoea, fever (child), fever (adult), cough (child), cough (adult), sore throat, dizziness, lethargy, syncope, headache (child), and headache (adult). P<0.001 for all correlations.
The prospective correlation with weekly influenza case rates over three consecutive influenza seasons was found to have slightly greater strength (r=0.80; p<0.001) than the retrospective median correlation (r=0.74; range 0.34-0.89) reported from a state-level US study [7]. The latter study did not include optimisation with regard to alternative chief complaint groupings or prospective evaluation, but it reported that the correlation between influenza case rates and viral isolate data was strong. In our prospective evaluation, the performance of daily telenursing complaint data in 14-day predictions of influenza case rates was found to be strong (AUC=0.87; PPV=0.75). The performance was poorer during the first part of the prospective evaluation period, including the 2009 influenza pandemic (AUC=0.84; PPV=0.58), than during the remaining period, including only influenza winter seasons (AUC=0.89; PPV=0.93). The poorer performance can be explained both by the fact that the symptom patterns during pandemic influenza outbreaks differ from the corresponding patterns of seasonal influenza and by differences in healthcare utilisation and health-seeking behaviour between pandemic outbreaks and winter seasons [23,24]. This implies that predictions based on telenursing data from influenza winter seasons can be assumed to be less accurate when applied during pandemics. Our results also confirm exploratory findings reported from other settings. In a study performed within the NHS Direct telenursing service in the United Kingdom, alerting thresholds defined as 9% fever complaints in the age group 5-14 years and 1.2% 'cold/flu' complaints of all complaints were derived using Poisson regression modelling [25]. In a pragmatic prospective evaluation, the thresholds were found to provide up to 14 days advance warning of seasonal influenza activity. Similarly, a retrospective study from Canada using data on total call rates from the Telehealth Ontario telenursing service and case rate data on respiratory illnesses showed strong correlations and indicated that, if threshold levels had been set for the start of outbreaks, it would have been possible to provide up to 15 days advance warning of emergency department visits [26]. No prospective evaluation was reported from the Canadian telenursing setting.
In previous studies involving telenursing service users and clinical outpatients, fever has been found to be an early correlate of influenza case rates [27,28]. Requiring fever as part of the case definition has been shown to increase the specificity of influenza diagnosis among clinical ILI cases [29,30]. One explanation of the predictive performance of fever in influenza surveillance could be that the symptom mediates timely outbreak detection, particularly when the predominant circulating influenza strain does not initially cause significant levels of illness among those infected, while different complaints and complications necessitate medical care only later. Cough was not included in the IAP grouping in this study, although this symptom has commonly been reported from studies of the clinical presentation of influenza [8-10]. Cough is among the most common reasons for seeking medical care, and most episodes result from a self-limited acute viral upper respiratory tract infection [31]. However, particularly among older adults, the symptom can also be caused by a number of other disorders, such as gastro-oesophageal reflux disease, upper airway cough syndrome, and asthma [32]. In our study, cough was included in several groupings of chief complaints showing large correlations with influenza case rates. Nonetheless, unlike fever, cough was in some seasons mainly reported from children and in other seasons from adults. Variations were also found in the age distribution of clinical case rates as indicated by the relative illness ratio (unpublished data). It is thus reasonable to assume that the predictive performance of telenursing chief complaint groupings in influenza surveillance, as these groupings provide a selective representation of those infected with symptoms, is both population- and season-dependent. Therefore, fever appears to be a common denominator among the chief complaints with regard to predictive performance. More research on the clinical presentation of influenza as well as the grouping of telenursing chief complaints for prospective use in influenza surveillance is warranted.
This study has several limitations that should be considered when interpreting the results. First, influenza cases were defined by clinical diagnosis, and microbiological validation was restricted to a limited period of the study. However, the strength of the correlation between the microbiological and clinical diagnosis rates was large during the validation period, and similar findings have also been reported from other settings [30,33]. Second, the telenursing data were based on chief complaint codes defined for Sweden. Some complaints, such as fever and cough, were coded as age-specific syndromes, while other complaints had an age-neutral coding. Internationally standardised telenursing complaint codes would facilitate valid and reliable recording and comparisons between systems. The World Health Organization framework for influenza preparedness [34] could provide a forum for implementing such a process. Moreover, the epidemiological context for interpreting telenursing data has not been established. The majority of calls to telenursing systems are about infections, such as colds, influenza or diarrhoea [25,35]. A study of the Telehealth Ontario telenursing service in Canada showed that the call volume was weighted towards the 0-4 years age group (49%), while the outpatient visits during the same period were mainly from those 18-64 years old (44%) [26]. An early Swedish study reported that about every second call to the telenursing service is made by a third party on behalf of the ill person, mostly by a spouse or by parents of preschool-aged children, and that another large group of callers is young adults living independently [36]. According to a recent Canadian study, the overrepresentation of younger age groups among telenursing callers can be explained by both epidemiological and social factors, that is, the incidence of acute respiratory infections is high among young people, and first-time parents without previous parenting experience make more calls for their children [37]. In order to further develop the performance of telenursing data in infectious disease surveillance, the biases associated with using these data in epidemiological analyses have to be better understood. Schemes such as the Behavioral Risk Factor Surveillance System (BRFSS) provided by the Centers for Disease Control and Prevention in the US are needed for longitudinal collection and analysis of standardised data on health behaviours during and between seasons [38].

Conclusions
In this first prospective study based on standardised outcome measures, the telenursing complaints fever and syncope were found to be strongly correlated with influenza case rates, and the complaint grouping showed strong performance in predicting winter influenza seasons. The method performed more poorly during the 2009 pandemic outbreak, when health behaviours did not follow anticipated patterns. This paper has presented data from Sweden, but the results have international relevance, as telenursing services are rapidly expanding worldwide [3]. We recommend the use of telenursing data in surveillance of seasonal influenza. The relationship between the utilisation of the service in population subgroups during winter influenza

Figure 1. Retrospective data used for algorithm calibration from Östergötland County, Sweden, January 2008-June 2009

Figure 2. Daily numbers of influenza cases and telenursing calls for fever and syncope above the baseline temporal trend for calls to the telenursing service in Östergötland County, Sweden, during the prospective evaluation July 2009-April 2012

Figure 3. Receiver operating characteristic (ROC) curves based on retrospective and prospective data for prediction of influenza case rates from telenursing calls for fever and syncope, Östergötland County, Sweden, January 2008-April 2012

Table 1
Numbers of daily influenza cases and telenursing influenza-like illness calls per 100,000, Östergötland County, Sweden, winter influenza seasons including the 2009 pandemic, and intermittent periods 2007-2012
ILI: influenza-like illness.

Table 2
Best performing telenursing complaint groupings in retrospective analysis displayed by lead time to physicians' diagnosis of influenza, Östergötland, Sweden, winter influenza seasons including the 2009 pandemic and intermittent periods 2007-2012