Estimation of COVID-19 vaccine effectiveness against hospitalisation in individuals aged ≥ 65 years using electronic health registries; a pilot study in four EU/EEA countries, October 2021 to March 2022

Using a common protocol and data from electronic health registries in Denmark, Navarre (Spain), Norway and Portugal, we estimated vaccine effectiveness (VE) against hospitalisation due to COVID-19 in individuals aged ≥ 65 years without previous documented infection, between October 2021 and March 2022. VE was higher in 65–79-year-olds than in ≥ 80-year-olds, and in those who received a booster than in those who were primary vaccinated. VE remained high (ca 80%) between ≥ 12 and < 24 weeks after the first booster administration, and after Omicron became dominant.

Supplement S1. Summary of the study protocol

Study period and setting
The study covers 8-week rolling study periods for VE monitoring over time. Four study sites participate in the study: Denmark, Navarre (Spain), Norway and Portugal, all with population-based registries covering the entire population in their territories. This is a data-linkage study that involves different databases in each of the participating sites, all of which can be deterministically linked using individual identifiers.
In Portugal, information sources used for this study cover approximately 7.5 million inhabitants, 2.4 million of them over 65 years of age. Information on hospitalization is extracted from the BI-SINAVE and BIMH databases. BIMH episodes are coded at discharge, so there is some delay in coding, and only public hospitals are covered. Date of laboratory confirmation of SARS-CoV-2 is obtained from BI-SINAVE, and date and cause of hospital admission are obtained from BIMH. Information on vaccination is extracted from VACINAS, a nationwide EHR containing the brand and date of uptake of each dose of COVID-19 vaccine and of other vaccines administered in Portugal. Information on covariates is obtained from the NHSU database, which is used as the central source system; all users included in the study therefore come from this source. Data on sex, date of birth, address and date of death are obtained from NHSU. The number of chronic conditions is extracted from the Primary Care Information System (SIM@SNS) database, which allows comorbidities to be identified. Anaemia, dementia, diabetes, cardiac disease, neuromuscular disease, rheumatologic disease, obesity, tuberculosis, stroke, pulmonary disease except asthma, liver disease and hypertension belong to the group without immunosuppression, while HIV, renal disease and cancer belong to the group with immunosuppression. Data on the number of comorbidities are available, but due to data protection issues data on each individual comorbidity are not accessible. The European Deprivation Index for Portugal at municipality level (Ribeiro AI et al. 2018) is obtained from the 2011 Census and will be updated when the 2021 Census becomes available.
In Denmark, information sources used for this study cover approximately 5 million inhabitants, around 1.18 million of them over 65 years of age. Information on hospitalizations will be extracted from the Danish National Patient Register (DNPR; Landspatientregisteret, LPR) and cross-matched with the Danish Microbiology Database (MiBA). In the LPR, the cause of hospitalization can be coded as COVID-19 related or not, based on COVID-19-specific ICD-10 codes. Vaccination data will be extracted from the Danish Vaccination Registry (DDV), where information on uptake of other vaccines is also available. The LPR also has information on comorbidities to be used for adjustment, and the Danish Civil Registration System (CPR) provides demographic information.
In Navarre, information sources used for this study cover approximately 650,000 inhabitants, around 123,000 of them over 65 years of age. Information on hospitalization is extracted from the hospital admissions database, where a public health medical doctor reviews all hospitalized patients with a positive COVID-19 result to determine whether admission was due to COVID-19. A laboratory test results database, which includes the microbiology database, is also available. Information on vaccination is available in the Vaccination Registry, and the administrative record database contains the sociodemographic data. Primary healthcare records allow comorbidities to be extracted for adjusting the estimates.
In Norway, information sources used for this study cover approximately 4.3 million inhabitants, around 1 million of them over 65 years of age. Hospitalization data are extracted from the Norwegian Intensive Care and Pandemic Registry, where hospitalization with COVID-19 as main cause is available. Information on vaccination is available in SYSVAK, the National Immunisation Register. Sociodemographic information, including country of birth, is available in the National Population Register. Information on comorbidities is extracted from the Norwegian Patient Registry (NPR), which holds individual-level data from all public specialist healthcare services in Norway. Residents in nursing homes can be identified through the institution register of the Norwegian Labour and Welfare Administration.
Successful linkage between the vaccination database and the administrative population database was reported to be close to 100% in all four study sites.

Study design
This is a retrospective cohort study using data collected in electronic health record databases with individual deterministic linkage. The risk of occurrence of study outcomes will be compared between individuals with different vaccination status.

Study population
The study population includes individuals in the National Vaccination Plan and/or the reference population registries fulfilling the following criteria during the different study periods:
• Resident in the EU/EEA country performing the study (not excluded in Portugal)
• Not resident in nursing homes
• Belonging to the group who was universally vaccinated as recommended by age
• No previous infection: no previous positive SARS-CoV-2 test recorded at the start of the follow-up period (first day of follow-up for each individual)

Exposure: Vaccination status
The vaccination status is based on vaccine doses administered up to the date on which vaccination status is assessed (as a time-changing exposure), and individuals will be classified as follows:
• Non-vaccinated: has not received any vaccine dose.
The induction period for each dose will be considered as a separate exposure category. Any individual who has received at least one dose of vaccine but does not fulfil the definition of complete vaccination with the primary series of COVID-19 vaccines will be considered incompletely vaccinated. This category will be analysed, but will not be reported as an exposure category for this study.
Individuals who received the second dose less than 19 days after the first vaccine dose, or who received a subsequent vaccine dose (any vaccine dose after complete vaccination) that does not fulfil the definition criteria of a booster, will be considered information errors and dropped from the risk set. Persons with vaccine brands or vaccination schedules that are unknown or not included in the national vaccination programme will also be dropped.

Time since booster vaccination
Time since booster dose administration (plus induction period) was computed at each point in time by constructing a time-dependent variable between the date of last booster dose administration + induction period and the assessment date. Time since booster was categorized in three periods, although the number of individuals in the last category will be scarce in the first months of this study:
• From 7 or 14 days up to ≤ 84 days (i.e. ≤ 11 weeks)
• Days 85-168, both included (i.e. weeks 12-23)
• ≥ 169 days (i.e. ≥ 24 weeks)
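The three categories above can be sketched in Python as follows. This is a minimal illustration, not the study code: the function name, the `induction_days` parameter and the category labels are hypothetical, and the sketch assumes that days are counted from the booster administration date, so that the first category begins once the 7- or 14-day induction period has elapsed.

```python
from datetime import date
from typing import Optional


def time_since_booster_category(
    booster_date: date, assessment_date: date, induction_days: int = 14
) -> Optional[str]:
    """Return the time-since-booster category on the assessment date.

    Assumption: days are counted from the booster administration date.
    Returns None before the booster or during the induction period, which
    the protocol treats as a separate exposure category.
    """
    days = (assessment_date - booster_date).days
    if days < induction_days:
        return None
    if days <= 84:
        return "<=84 days"
    if days <= 168:
        return "85-168 days"
    return ">=169 days"
```

Because the variable is time-dependent, the function would be evaluated at each assessment date rather than once per individual.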

Outcome: Hospitalisation due to COVID-19
Hospitalisation due to COVID-19: laboratory-confirmed infection with admission to hospital from 24 hours before (48 hours in Denmark) up to 3 weeks after (2 weeks in Denmark) the positive test or symptom onset, in which admission or discharge criteria are compatible with severe acute respiratory infection (SARI), based on similar criteria as in SARI surveillance, ICD coding or similar.
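The timing part of this definition can be illustrated with a short Python sketch. The function and parameter names are hypothetical, the bounds are assumed to be inclusive (the protocol does not say), and the SARI compatibility criterion is not covered because it depends on site-specific coding.

```python
from datetime import datetime, timedelta


def admission_in_window(
    reference_time: datetime,
    admission_time: datetime,
    hours_before: int = 24,
    weeks_after: int = 3,
) -> bool:
    """Check the timing criterion of the outcome definition.

    `reference_time` is the positive test or symptom onset. The defaults
    follow the general definition; Denmark would use hours_before=48 and
    weeks_after=2. Inclusive bounds are an assumption of this sketch.
    """
    earliest = reference_time - timedelta(hours=hours_before)
    latest = reference_time + timedelta(weeks=weeks_after)
    return earliest <= admission_time <= latest
```

For example, an admission exactly 21 days after the positive test meets the default window but not the Danish one.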

Other variables
Age group: age will be calculated at the beginning of each study period using the date of birth, and categorised into 5-year bins to adjust models. For reporting results stratified by age group, the following groups will be used: 65-79 and 80+. Alternative age groups may be discussed as needed.
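A minimal Python sketch of the age derivation is given below. The function names are hypothetical, and the sketch assumes that 5-year bins are aligned on multiples of five (65-69, 70-74, and so on) with no open-ended top bin; the protocol does not specify the bin boundaries.

```python
from datetime import date


def age_on(date_of_birth: date, period_start: date) -> int:
    """Age in completed years at the beginning of the study period."""
    had_birthday = (period_start.month, period_start.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return period_start.year - date_of_birth.year - (0 if had_birthday else 1)


def five_year_bin(age: int) -> str:
    """Label of the 5-year bin used for model adjustment (assumed alignment)."""
    lower = (age // 5) * 5
    return f"{lower}-{lower + 4}"


def reporting_group(age: int) -> str:
    """Age group used for stratified reporting: 65-79 or 80+."""
    if age < 65:
        raise ValueError("the study population is restricted to ages 65 and over")
    return "65-79" if age <= 79 else "80+"
```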

Estimation of vaccine effectiveness
Groups to be compared: the analysis will be done in people classified as not having previous SARS-CoV-2 infection. Vaccine effectiveness will be estimated by comparing different exposure groups to fulfil all the study objectives. The relevant comparisons include:
It is recommended that contrasts that use the same reference group are performed using a single model with exposures defined by categories of a single variable. For instance, group comparisons numbered as 1 and 2 above could be done in a single model, with a variable that equals 0 for non-vaccinated, 1 for complete vaccination with primary series and 2 for first booster. Similarly, contrasts 3, 4 and 5 should be performed in a single model.

Subgroup analyses
Estimates, both crude and adjusted, will be obtained separately in two age groups, 65-79 and 80+ years of age, and disaggregated by time since the booster dose, as previously defined.

Crude vaccine effectiveness
Each individual will enter the study in the different exposure groups on the date they are first classified into that group. This will be the date of the beginning of the study, except for individuals who change exposure groups during follow-up, who will be censored without event in the group that they leave and recorded as a delayed entry in the group into which they are newly classified. End of follow-up will be established at the time of occurrence of any reason for censoring, and will be marked as event=1 if the reason for censoring is the event of interest, or event=0 otherwise. The start time, the end time and whether the follow-up ended in an event or not will be provided to the survival command.
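The entry, censoring and delayed-entry rules above amount to splitting each person's follow-up into (start, stop, event) records, one per exposure group. The sketch below illustrates this in Python under stated assumptions: the function name and input layout are hypothetical, time is expressed in days since the start of the study period, and an event occurring on the day of an exposure change is attributed to the group being left.

```python
from typing import List, Optional, Tuple

# (exposure group, start day, stop day, event indicator)
Interval = Tuple[str, int, int, int]


def split_follow_up(
    changes: List[Tuple[int, str]],
    end_day: int,
    event_day: Optional[int] = None,
) -> List[Interval]:
    """Split one person's follow-up into start-stop records.

    `changes` lists (day, exposure group) in increasing order of day; the
    first entry is the person's entry into the study. Follow-up stops at
    `event_day` if it occurs before `end_day`, otherwise at `end_day`.
    """
    stop_all = end_day if event_day is None else min(event_day, end_day)
    intervals: List[Interval] = []
    for i, (start, group) in enumerate(changes):
        if start >= stop_all:
            break  # exposure change after the end of follow-up
        next_change = changes[i + 1][0] if i + 1 < len(changes) else stop_all
        stop = min(next_change, stop_all)
        event = 1 if stop == event_day else 0  # only the last record can end in an event
        intervals.append((group, start, stop, event))
    return intervals
```

A person who moves from primary vaccination to booster on day 20 and is hospitalised on day 40 yields two records: one censored at day 20 without event, and one entering late at day 20 and ending with event=1 at day 40.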
Vaccine effectiveness will be estimated using the hazard ratio (HR) of the defined outcome(s) in individuals with different exposure categories, as defined above, within the study population. Cox survival models for the estimation of HRs will be fitted with calendar time as the underlying time scale, thus assigning time 0 to the first day of the observation period.
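The protocol text does not spell out how the HR is converted into VE. Assuming the conventional definition VE = (1 − HR) × 100%, the conversion can be sketched as follows; the function name is hypothetical.

```python
from typing import Tuple


def ve_from_hr(hr: float, hr_low: float, hr_high: float) -> Tuple[float, float, float]:
    """Convert an HR and its confidence limits to VE (%) and its limits.

    Assumes VE = (1 - HR) * 100. The limits swap: the upper HR limit
    gives the lower VE limit, and vice versa.
    """
    ve = (1.0 - hr) * 100.0
    ve_low = (1.0 - hr_high) * 100.0
    ve_high = (1.0 - hr_low) * 100.0
    return ve, ve_low, ve_high
```

Under this definition, an HR of 0.25 with confidence limits 0.125 and 0.5 corresponds to a VE of 75% with limits 50% and 87.5%.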

Adjusted vaccine effectiveness
The regression to estimate the HR will be adjusted for fixed or time-changing confounders, as appropriate, and as previously defined. First, a partially adjusted HR will be estimated, adjusting for age group (5-year bins), sex and region in the country, if appropriate. Second, a fully adjusted HR or RR estimate will be produced, adjusting for variables related to socioeconomic condition, comorbidities and health-seeking behaviour, as relevant at each study site.

Methods for pooling estimates
The crude effect, the basic adjusted effect (age, sex, region) and the fully adjusted effect (adding the rest of the available covariates) will be compared to assess the degree of confounding by different factors. The fully adjusted estimate was pooled across sites to obtain a single estimate.
Country-specific HRs and standard errors for the effect of COVID-19 vaccination obtained from the study sites were combined using meta-analysis methods, with both fixed-effects and random-effects models.
In the fixed-effects approach, a weighted average of the sites' estimates was computed, together with its 95% confidence interval. This has the advantage of not being influenced by the low number of study sites included in the pilot phase.
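A minimal sketch of the fixed-effects weighted average is shown below. It assumes that pooling is done on the log-HR scale with inverse-variance weights and that the reported standard errors refer to log(HR); the protocol does not state these details, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def pool_fixed(log_hrs: Sequence[float], ses: Sequence[float]) -> Tuple[float, float, float]:
    """Fixed-effects pooled HR with its 95% confidence limits.

    Assumption: inputs are site-specific log(HR) values and their standard
    errors, and each site is weighted by the inverse of its variance.
    """
    weights = [1.0 / se**2 for se in ses]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_hrs)) / total
    pooled_se = math.sqrt(1.0 / total)
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    return (
        math.exp(pooled),
        math.exp(pooled - z * pooled_se),
        math.exp(pooled + z * pooled_se),
    )
```

With these weights, sites with smaller standard errors (typically the larger countries) dominate the pooled estimate.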
The approach incorporating random effects was more plausible, since both measured and unmeasured country-specific factors are expected to influence vaccine effectiveness. However, the low number of sites included in this pilot makes this approach less efficient. Moreover, it provides the average effect between studies with relatively low influence of study size (compared with the influence of study size on pooled results in the fixed-effects model), and therefore reflects to a lesser degree the size of each reporting country. Under the random-effects model, the country-specific exposure-disease effects (HRs) were weighted by the inverse of their marginal variances (generic inverse variance method). The marginal variance is the sum of the individual study-specific variance and the variance of the random study effects (τ²). This gives the pooled HR and its standard error. We calculated the confidence interval around the pooled effect (the range of values that contains the true average HR with 95% confidence). τ² and I² were used to describe between-study heterogeneity, along with the p-value of the heterogeneity test. Potential factors or specific pilot site characteristics that could be the source of qualitative heterogeneity were described and discussed.
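The random-effects weighting described above can be sketched as follows. The protocol does not name the estimator of τ²; this sketch assumes the DerSimonian-Laird moment estimator, works on the log-HR scale, and uses hypothetical function and key names. The p-value of the heterogeneity test is not computed here.

```python
import math
from typing import Dict, Sequence


def pool_random_dl(log_hrs: Sequence[float], ses: Sequence[float]) -> Dict[str, float]:
    """Random-effects pooled HR using inverse marginal-variance weights.

    Assumption: tau^2 is estimated with the DerSimonian-Laird method.
    """
    k = len(log_hrs)
    w = [1.0 / se**2 for se in ses]
    fixed = sum(wi * yi for wi, yi in zip(w, log_hrs)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effects mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_hrs))
    c = sum(w) - sum(wi**2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    # Marginal variance = study-specific variance + tau^2
    w_star = [1.0 / (se**2 + tau2) for se in ses]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_hrs)) / sum(w_star)
    pooled_se = math.sqrt(1.0 / sum(w_star))
    z = 1.959964  # 97.5th percentile of the standard normal distribution
    return {
        "hr": math.exp(pooled),
        "ci_low": math.exp(pooled - z * pooled_se),
        "ci_high": math.exp(pooled + z * pooled_se),
        "tau2": tau2,
        "i2": i2,
    }
```

When τ² is large relative to the study-specific variances, the weights become nearly equal across sites, which is why study size has less influence here than in the fixed-effects model.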
The country-specific HRs and their confidence intervals, along with the pooled HR, were presented graphically in a forest plot. Pooled estimates were obtained overall for each of the exposure categories and subgroups outlined in this protocol. For each pooled estimate, only sites contributing to that estimate were used.