How to estimate vaccine impact?

Researchers establish a mathematical model to estimate the impact of vaccines on populations.

July 07, 2023

Researchers at the Max Planck Institute for Infection Biology have established a new method to estimate the impact of vaccination programs. Using statistical models, the team led by research group leader Matthieu Domenech de Cellès created counterfactual outcomes (what would have happened without the introduction of a vaccine?), compared them to the actual situation after vaccine rollout, and estimated the impact of pneumococcal conjugate vaccine (PCV) childhood immunization. The team tested different modelling methods on simulated pneumonia hospitalization data and found that the synthetic control approach, implemented using a machine learning algorithm called LASSO regression, had advantages over other approaches. This method could be applied to other vaccines and diseases, facilitating the evaluation of large-scale vaccination campaigns. The results have now been published in the American Journal of Epidemiology.

It can be hard to untangle the effects of a vaccine from all the other factors that may influence disease incidence in a population. Public health measures, like mask mandates during the COVID-19 pandemic, or even climatic changes, such as milder winters, can influence the incidence of infectious diseases. All these factors call for carefully designed methods of estimating the vaccine impact, that is, the reduction in infectious disease incidence that the vaccine brings about in the whole population. To do this, researchers need a so-called counterfactual outcome to compare with the actual situation after vaccine rollout: What would have happened without the introduction of a vaccine?

In medical research, this question can be answered with control groups, usually patients who do not receive the treatment in question. But for large-scale vaccination programs covering whole nations, one cannot refuse to vaccinate a certain group and leave its members potentially unprotected against an infectious disease.

This is where the research of Matthieu Domenech de Cellès, research group leader at the Max Planck Institute for Infection Biology, comes into play. To study vaccine impact without relying on “real” control groups, Domenech de Cellès’ team creates counterfactual outcomes by employing statistical models. There are several modelling approaches to create these “alternative realities” in which no vaccine has been introduced.

In their latest research project, the team compared modelling methods using data on a vaccine against pneumococcus, a leading cause of pneumonia. In addition to three established, commonly used methods, the researchers also tested a fourth model known as LASSO regression, which had not previously been used for constructing the synthetic control.

"One of the easiest and most commonly used methods to create a counterfactual outcome is called interrupted time series. In this model, you simply expect a known trend to continue, but all too often this doesn't reflect reality very well and therefore doesn't help to study the impact of vaccines," explains Anabelle Wong, PhD student in Domenech de Cellès’ research group and first author of the study. "Alternative models provide a more nuanced and realistic picture by creating a so-called synthetic control."

To establish this synthetic control, the researchers used hospitalization data that cover the time before and after the pneumococcal vaccine was introduced. The dataset contained hospitalization cases for various diseases, which made it possible to capture population-wide trends: if a health policy or a change in climate had an impact on these diseases, it would be reflected in the hospitalization rates.

Using the machine learning model LASSO regression, Domenech de Cellès’ team could identify the diseases whose hospitalization cases followed trends closest to those of pneumonia hospitalizations. Because these diseases are themselves not affected by the vaccine, their trends remain informative even after pneumococcal infections were greatly reduced by the vaccine, allowing the researchers to estimate the counterfactual pneumonia cases from the data.
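
The idea can be illustrated with a minimal sketch. It is not the team's implementation: the coordinate-descent LASSO solver, the penalty value `alpha`, and all toy numbers (two hypothetical control diseases, a handful of time points) are assumptions made purely for illustration. The model is fitted on pre-vaccine data only, then used to predict what pneumonia hospitalizations would have looked like after rollout.

```python
def soft_threshold(value, threshold):
    """Shrink a value toward zero; the step that lets LASSO drop predictors."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_fit(X, y, alpha, n_iter=100):
    """Coordinate descent minimising (1/2n) * RSS + alpha * sum(|coef|)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Centre predictors and outcome so the intercept can be recovered afterwards.
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    residual = [v - y_mean for v in y]
    scale = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    coef = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if scale[j] == 0.0:
                continue
            # Correlation of predictor j with the residual that excludes j itself.
            rho = sum(Xc[i][j] * (residual[i] + Xc[i][j] * coef[j])
                      for i in range(n)) / n
            new_coef = soft_threshold(rho, alpha) / scale[j]
            delta = new_coef - coef[j]
            for i in range(n):
                residual[i] -= Xc[i][j] * delta
            coef[j] = new_coef
    intercept = y_mean - sum(c * m for c, m in zip(coef, col_means))
    return intercept, coef


def predict(X, intercept, coef):
    return [intercept + sum(c * x for c, x in zip(coef, row)) for row in X]


# Hypothetical toy data: each row holds hospitalizations for two control
# diseases; disease A tracks pneumonia, disease B does not.
controls_pre = [[50, 34], [54, 30], [58, 26], [54, 30],
                [50, 34], [46, 30], [42, 26], [46, 30]]
pneumonia_pre = [110, 118, 126, 118, 110, 102, 94, 102]
controls_post = [[50, 34], [54, 30], [58, 26], [54, 30]]
pneumonia_post = [88, 94, 101, 94]  # observed after vaccine rollout

# Fit on the pre-vaccine period only, then predict the counterfactual.
intercept, coef = lasso_fit(controls_pre, pneumonia_pre, alpha=0.1)
counterfactual = predict(controls_post, intercept, coef)
rate_ratio = sum(pneumonia_post) / sum(counterfactual)
print("coefficients:", coef)
print("estimated rate ratio:", round(rate_ratio, 3))
```

In this toy setting the penalty sets the weight of the unrelated control disease to zero, and a rate ratio below 1 indicates fewer hospitalizations than the synthetic control predicts.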

To confirm the validity of their estimate, the researchers used real-life pneumonia hospitalization data to simulate completely new data sets, creating scenarios in which the vaccine impact is predetermined. With these scenarios, the researchers could verify their model and compare its accuracy and precision against other approaches. While some other models based on synthetic controls also performed fairly well, the interrupted time series method at times produced inaccurate or biased results.
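
The logic of such a simulation check can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's procedure: the series are generated from a flat, hypothetical baseline with Gaussian noise rather than from real hospitalization data, and the method under test is a basic linear-trend interrupted time series. Because the true rate ratio is built into the simulated data, the bias and spread of any estimator can be measured directly.

```python
import random
import statistics


def simulate_series(baseline, true_rate_ratio, n_pre, n_post, noise_sd, rng):
    """Simulate pre- and post-vaccine counts with a known, built-in impact."""
    pre = [baseline + rng.gauss(0, noise_sd) for _ in range(n_pre)]
    post = [baseline * true_rate_ratio + rng.gauss(0, noise_sd)
            for _ in range(n_post)]
    return pre, post


def its_estimate(pre, post):
    """Interrupted time series: extend the pre-vaccine linear trend."""
    n = len(pre)
    t_mean = (n - 1) / 2
    y_mean = statistics.mean(pre)
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(pre))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    counterfactual = [intercept + slope * t for t in range(n, n + len(post))]
    return sum(post) / sum(counterfactual)


def evaluate(estimator, true_rate_ratio, n_sims=500, seed=1):
    """Run the estimator on many simulated data sets and summarise its errors."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        # Hypothetical settings: baseline of 100 cases, 24 time points per period.
        pre, post = simulate_series(100.0, true_rate_ratio, 24, 24, 5.0, rng)
        estimates.append(estimator(pre, post))
    bias = statistics.mean(estimates) - true_rate_ratio  # accuracy
    spread = statistics.stdev(estimates)                 # precision
    return bias, spread


bias, spread = evaluate(its_estimate, true_rate_ratio=0.8)
print(f"bias = {bias:+.4f}, spread = {spread:.4f}")
```

Any other estimator with the same `(pre, post)` signature could be passed to `evaluate` in place of `its_estimate`, which is what makes a side-by-side comparison of methods possible.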

According to Wong, this is particularly important because interrupted time series analysis is still widely applied in epidemiology, often due to its ease of use. "However, our paper showcases the power of LASSO regression in comparison and highlights the importance of considering more than just intuition when selecting a method," Wong emphasizes.

Although the researchers tested their model on pneumococcal vaccine data, it can be adapted to other vaccines and diseases. The LASSO approach proposed by Domenech de Cellès’ team is easy to apply, so public health officials could use it to improve the evaluation of their vaccination programs. With the LASSO approach, expensive long-term, large-scale vaccination programs could be steered more effectively in the future.