How to Better Evaluate the Spread of SARS-CoV-2
10 April 2020
Coordination: Josselin Garnier[1]
Summary
A virus-screening test campaign is proposed, using random, unbiased
samples representative of the general population, to significantly improve our
understanding of the present and future epidemic situation through better-calibrated
mathematical infectious disease models.
Assessment
Various mathematical models have been proposed to predict the
progression of the Covid-19 epidemic at the national scale. Most of these
models consider the temporal evolution of the epidemic in terms of a population
divided into compartments associated with different possible disease statuses:
susceptible, infected, or recovered. These models can take into account
stratification, for example, by age or by region. The laws defining the evolution
of the distribution are written in the form of coupled differential equations,
which represent the mechanisms and phenomena occurring, and which are deduced
from epidemiological data. These models are generally calibrated, meaning that
the free parameters (for example, the infection rates) are adjusted so as to
reproduce the available data (detected cases and deaths). The models are then
used to make predictions, for example, the date of the infection peak, or the
impact of policy measures such as confinement or physical distancing (see,
among other publications, those of Imperial College[2] or of the Université de
Bordeaux[3]).
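To make the setting concrete, here is a minimal sketch of such a compartmental model: a single national SIR compartment per status, integrated as coupled differential equations. The parameter values (beta, gamma) are illustrative placeholders, not values taken from the cited studies; calibration would adjust them to reproduce detected cases and deaths.

```python
import numpy as np
from scipy.integrate import solve_ivp

def sir_rhs(t, y, beta, gamma):
    """Coupled ODEs for the susceptible/infected/recovered fractions."""
    S, I, R = y
    return [-beta * S * I,                # susceptibles become infected
            beta * S * I - gamma * I,     # infected recover at rate gamma
            gamma * I]                    # recovered (assumed immune)

# Illustrative free parameters; calibration would adjust these to data.
beta, gamma = 0.3, 0.1                    # basic reproduction number R0 = 3
y0 = [0.999, 0.001, 0.0]                  # initial fractions S, I, R
sol = solve_ivp(sir_rhs, (0, 300), y0, args=(beta, gamma), dense_output=True)

t = np.linspace(0, 300, 301)
S, I, R = sol.sol(t)
print(f"predicted infection peak: day {t[np.argmax(I)]:.0f}, "
      f"peak prevalence {I.max():.1%}")
```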
Statistical studies reveal, however, that the predictions of such models
can be very unreliable. Indeed, different sets of free parameters can often be
found that are all compatible with the available data but lead to very
different predictions. Even with very simple models (for example, one
national compartment for each category), the uncertainties are very large,
implying that epidemiological predictions should be used with extreme caution. However,
uncertainty quantification procedures (known and used for quantifying the
quality and robustness of large numerical simulation codes) make it possible to highlight
these prediction uncertainties. They also make it possible to identify, through
sensitivity analysis, the critical model parameters. These are the parameters
to which the models are sensitive, and which cannot be extracted from the
available data. For example, the proportion of asymptomatic carriers is not
measured even though it is a critical parameter.[4]
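The following sketch makes this identifiability problem concrete (the numbers and the observation are purely hypothetical, not the computation behind footnote [4]): for several prior values of the undetected proportion q, the infection rate is recalibrated so that the model reproduces the same detected-prevalence observation, and the resulting predictions of the true peak differ markedly.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import brentq

GAMMA = 0.1                     # fixed recovery rate (illustrative)
Y0 = [0.9999, 0.0001, 0.0]      # initial fractions S, I, R
OBSERVED = 0.001                # hypothetical detected prevalence on day 30

def infected(t_obs, beta):
    """Infected fraction I(t) of a simple SIR model."""
    sol = solve_ivp(lambda t, y: [-beta * y[0] * y[1],
                                  beta * y[0] * y[1] - GAMMA * y[1],
                                  GAMMA * y[1]],
                    (0, 200), Y0, dense_output=True)
    return sol.sol(t_obs)[1]

for q in (0.2, 0.5, 0.8):                  # prior values of undetected share
    target = OBSERVED / (1 - q)            # true prevalence implied by q
    # Recalibrate beta so the model matches the same observed data point.
    beta = brentq(lambda b: infected(30.0, b) - target, 0.11, 1.0)
    peak = infected(np.linspace(0, 200, 201), beta).max()
    print(f"q={q:.1f}: beta={beta:.3f}, predicted true peak={peak:.1%}")
```

All three parameter sets reproduce the same observation, yet the predicted peak prevalence varies substantially; this is precisely the spread that uncertainty quantification makes visible.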
Objectives
In order to build mathematical models capable of more robust
predictions, it is necessary to obtain information about the critical model
input parameters. Using sensitivity analysis, many parameters can be set to
their most likely values, allowing us to focus on the most important unknowns.
Some of these, such as the proportion of asymptomatic carriers, can best be
estimated by means of a specific test campaign carried out on a random and
unbiased sample of the population.[5] These data would complement those already
available (detected cases and deaths), allowing us to improve the robustness of
the models and to strengthen their predictive capability. Uncertainty could be
significantly reduced by better estimating the proportion of asymptomatic
carriers, by integrating a spatial dimension, and by determining the immunity
already acquired.
Feasibility
Such test campaigns can be organized immediately. They are being carried
out in three departments of the Ile-de-France region. It is not necessary to
test the whole population (which would be ideal but is impractical at present),
nor to make full individual diagnoses of a small segment of the population.
Instead, our strategy is to obtain statistical
information on the current state of susceptibility of the population. Two types of tests –
one detecting the viral load, the other detecting antibodies – provide two
different pieces of information, which can be integrated into the models to
refine the predictions. Tests carried out at different times can also be
integrated, because together they trace the temporal evolution of the
epidemic.
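As a sketch of this integration (assumed notation, continuing the SIR variables above; the survey counts are invented for illustration): viral-load tests inform the currently infected fraction I(t), antibody tests inform the ever-infected fraction I(t) + R(t), and surveys at several dates can enter a calibration through, for example, a binomial likelihood.

```python
import numpy as np
from scipy.stats import binom

def survey_log_likelihood(model_I, model_R, surveys):
    """Binomial log-likelihood of test-campaign counts given a trajectory.

    surveys: list of (day, kind, n_tested, n_positive), kind in {"pcr", "sero"};
    model_I/model_R map a day to the model's infected/recovered fractions.
    """
    ll = 0.0
    for day, kind, n, k in surveys:
        p = model_I(day) if kind == "pcr" else model_I(day) + model_R(day)
        ll += binom.logpmf(k, n, p)
    return ll

# Hypothetical campaigns at two dates, 2500 samples each (cf. footnote [5]):
surveys = [(30, "pcr", 2500, 28), (30, "sero", 2500, 60),
           (60, "pcr", 2500, 110), (60, "sero", 2500, 400)]

# Toy trajectory (exponential early growth), for demonstration only:
model_I = lambda day: 0.0001 * np.exp(0.07 * day)
model_R = lambda day: 0.0002 * np.exp(0.07 * day)
print(survey_log_likelihood(model_I, model_R, surveys))
```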
Reliability
The reliability of tests is characterized by their sensitivity and their specificity: sensitivity
determines the rate of false negatives, and specificity the rate of false
positives. If these quantities
are well characterized, they can be taken into account to improve statistical information
on the state of the whole population. The impact of imperfect reliability is
different for statistical sampling than for individual diagnoses, because
sensitivity and specificity can be integrated directly into the statistical
processing and use of the data. In addition, if sensitivity and
specificity are controlled, a technique known as group testing makes it possible to
test pooled samples, reducing the number of tests performed compared to
the number of samples taken.
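One standard way to perform this correction (not spelled out in the text, but classical) is the Rogan-Gladen estimator, which inverts the relation between raw positivity and true prevalence; the numbers below are illustrative.

```python
def corrected_prevalence(raw_positivity, sensitivity, specificity):
    """Invert p_obs = sens * pi + (1 - spec) * (1 - pi) for the prevalence pi."""
    pi = (raw_positivity + specificity - 1) / (sensitivity + specificity - 1)
    return min(max(pi, 0.0), 1.0)    # clip to the valid range [0, 1]

# Example: 3% raw positivity with a 90%-sensitive, 98%-specific test
# corresponds to an estimated true prevalence of about 1.1%.
print(corrected_prevalence(0.03, sensitivity=0.90, specificity=0.98))
```

This is why high specificity matters so much at low prevalence: with a 98%-specific test, up to 2% of raw positives can be false, which is of the same order as the signal itself.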
Implementation
The implementation of such a test campaign requires:
1) Performing viral-load tests and, if possible, simultaneous
serological tests; tests with very high specificity should be favored. It
is important to have sufficient information on the patient cohorts used to
assess the specificity and sensitivity properties.
2) Setting up specific procedures for collecting
consent forms, administering a medical questionnaire, and taking samples. Ideally,
random samples with anonymized results should be taken, but with spatial (at
the scale of the municipality) and temporal (at the scale of the sample date)
tags to be taken into account sequentially in the models.
This proposal is made by a group of mathematicians from the Saclay
plateau (ENS Paris Saclay, Inria Saclay and Polytechnique).
[1] Ecole polytechnique, Centre Cournot, LabEx
Hadamard
[2] Imperial College COVID-19 Response Team, Estimating
the number of infections and the impact of non-pharmaceutical interventions on
COVID-19 in 11 European countries, 30 March 2020.
[3] Magal P., Webb G.,
Predicting the number of reported and unreported cases for the COVID-19
epidemic in South Korea, Italy, France and Germany, 2020, https://doi.org/10.1101/2020.03.21.20040154
[4] A Bayesian estimation of the parameters of the models
proposed in the literature shows that the a posteriori distribution of
this parameter, given the current data, remains almost identical to its a
priori distribution. In other words, it cannot be estimated, although the
predictions (of the intensity of the peak, for example) strongly depend on it.
Other types of data are therefore needed to estimate it.
[5] The basic idea is to draw representative samples from
the general population. To illustrate, in the simplest framework, if we expect
a proportion of positive test results of the order of p in a stratum, then a sample
of size N = 1/(p a²) is required to estimate p with relative precision a (for
example, if p = 1% and a = 20%, then N = 2500). We can also draw fewer
individuals in certain strata or sub-strata where more positive tests are
expected (the required N decreases as p increases), in order to decrease the
variance of the estimates; this is a classic method in stratified sampling.
Finally, we can test mixtures of samples, which increases the number of
samples covered while keeping the number of tests performed significantly
smaller when p is low (this is also fairly standard, known as group testing).
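A worked sketch of these two footnote formulas (the pooling scheme below is Dorfman's classical two-stage procedure, chosen here as one standard example since the footnote does not name a specific scheme):

```python
def sample_size(p, a):
    """Sample size N = 1/(p*a**2) for relative precision a on a proportion ~ p."""
    return 1.0 / (p * a ** 2)

def dorfman_tests_per_person(p, k):
    """Expected tests per individual with pools of size k, assuming perfect tests:
    one pooled test per k people, plus k retests when the pool is positive."""
    return 1.0 / k + 1.0 - (1.0 - p) ** k

print(sample_size(0.01, 0.20))             # -> 2500.0, as in the example above
print(dorfman_tests_per_person(0.01, 10))  # -> ~0.20 tests per person sampled
```

With p = 1% and pools of 10, the 2500 samples of the example above would thus require only about 490 tests.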