Thursday, 9 April 2020

Project to test for the SARS-CoV-2 virus on a representative sample of the population


29 March 2020

Project coordinated by Josselin Garnier, Ecole polytechnique,
Centre Cournot, LabEx Hadamard

Summary
A virus-screening test campaign is proposed, using random, unbiased samples representative of the general population, to significantly improve our understanding of the present and future epidemic situation through better-calibrated mathematical infectious disease models.

Assessment
Various mathematical models have been proposed to predict the progression of the Covid-19 epidemic at the national scale. Most of these models describe the temporal evolution of the epidemic in terms of a population divided into compartments associated with the possible disease statuses: susceptible, infected, or recovered. These models can take stratification into account, for example by age or by region. The laws governing the evolution of this distribution are written as coupled differential equations, which represent the underlying mechanisms and are deduced from epidemiological data. The models are generally calibrated, meaning that the free parameters (for example, the infection rates) are adjusted so as to reproduce the available data (detected cases and deaths). They are then used to make predictions, for example of the date of the infection peak or of the impact of policy measures (such as confinement or physical distancing rules).[i],[ii]
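As an illustration of what such a compartmental model looks like, here is a minimal sketch of the simplest case, a single national susceptible/infected/recovered (SIR) model integrated with a forward Euler scheme. It is not one of the models cited above: the function name `simulate_sir`, the parameter values, and the population size are all assumptions chosen for illustration only.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate the coupled equations
        dS/dt = -beta * S * I / N
        dI/dt =  beta * S * I / N - gamma * I
        dR/dt =  gamma * I
    with forward Euler steps of size dt (in days).
    Returns one (S, I, R) tuple per day, starting with day 0."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    steps_per_day = round(1 / dt)
    trajectory = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        trajectory.append((s, i, r))
    return trajectory


# Hypothetical values: infection rate beta, recovery rate gamma (per day),
# and a population of 67 million with 1,000 initial infections.
trajectory = simulate_sir(beta=0.3, gamma=0.1,
                          s0=66_999_000, i0=1_000, r0=0, days=300)

# Day on which the number of infected individuals is largest.
peak_day = max(range(len(trajectory)), key=lambda d: trajectory[d][1])
print("Infection peak on day", peak_day)
```

Calibration, in this setting, would mean adjusting `beta` and `gamma` (the free parameters) until the simulated curves reproduce the detected cases and deaths.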

Statistical studies reveal, however, that the predictions of such models can be very unreliable. Different sets of free parameters are often all compatible with the available data, yet lead to very different predictions. Even with very simple models (for example, one national compartment for each category), the uncertainties are very large, implying that epidemiological predictions should be used with extreme caution. Uncertainty quantification procedures nevertheless make it possible to identify, through sensitivity analysis, the principal parameters that determine model outputs. At present, some of these parameters cannot be reliably extracted from the available data. A prime example is the fraction of carriers that are asymptomatic, a critical parameter that is very poorly known.[iii]

Objectives
In order to build mathematical models capable of more robust predictions, it is necessary to reduce the uncertainty in the critical model input parameters. Using sensitivity analysis, many parameters can be set to their most likely values, allowing us to focus on the most important unknowns. Some of them, such as the proportion of asymptomatic carriers, can best be estimated by means of a specific test campaign, carried out on a random and unbiased sample of the population. These data would complement those already available (detected cases and deaths), allowing us to improve the robustness of the models and to strengthen their predictive capability. Uncertainty could be significantly reduced by better estimating the proportion of asymptomatic carriers, by integrating a spatial dimension, and by determining the immunity already acquired.

Feasibility
Such test campaigns can be organized immediately. It is not necessary to test the whole population (which would be ideal but is impractical at present), nor to make full individual diagnoses of a small segment of the population. Instead, our strategy is to obtain statistical information on the current state of susceptibility of the population. France plans to increase its testing capacity in the coming weeks; we therefore propose devoting a few thousand of these tests to a statistically designed campaign on a representative sample of the general population.
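To give a sense of the precision a few thousand tests can deliver, the sketch below computes a confidence interval for the proportion of positives in a simple random sample, using the normal (Wald) approximation. The function name `prevalence_interval` and the numbers in the example (3,000 tests, 150 positives) are hypothetical, and the calculation assumes a perfect test and simple random sampling, which a real survey design would refine.

```python
import math
from statistics import NormalDist


def prevalence_interval(positives, n_tests, confidence=0.95):
    """Normal-approximation confidence interval for a proportion
    estimated from a simple random sample (assumes a perfect test)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = positives / n_tests
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n_tests)
    return max(0.0, p_hat - half_width), min(1.0, p_hat + half_width)


# Hypothetical outcome: 150 positives among 3,000 randomly sampled people.
low, high = prevalence_interval(positives=150, n_tests=3000)
print(f"Estimated prevalence between {low:.3%} and {high:.3%}")
```

Under these assumptions the interval is less than one percentage point on either side of the observed 5%, which is the kind of statistical information the campaign is meant to provide.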

Reliability
The reliability of a test is defined by its sensitivity and its specificity. If these are well known, they can be used to improve the statistical information obtained on the state of the whole population. Test reliability matters differently for statistical sampling than for individual diagnosis, because sensitivity and specificity can be integrated into the statistical processing and use of the data. In addition, provided sensitivity and specificity are under control, a technique known as group testing makes it possible to pool samples, so that fewer tests need to be performed than there are samples.
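The two ideas in this paragraph can be sketched as follows. The first function corrects an observed positive rate for known sensitivity and specificity (the classical Rogan-Gladen correction). The second gives the expected number of tests per sample under a simple two-stage pooling scheme (Dorfman pooling: test the pool, then retest each member of a positive pool). Neither is claimed to be the method the authors would use; the function names and numerical values are hypothetical, and the pooling formula assumes that pooling does not degrade sensitivity.

```python
def corrected_prevalence(observed_rate, sensitivity, specificity):
    """Rogan-Gladen correction: recover the true prevalence from the
    observed positive rate of an imperfect test."""
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("test is no better than chance")
    estimate = (observed_rate + specificity - 1) / denominator
    return min(1.0, max(0.0, estimate))  # clip to a valid proportion


def dorfman_tests_per_sample(prevalence, pool_size,
                             sensitivity=1.0, specificity=1.0):
    """Expected number of tests per sample with two-stage pooling:
    one test per pool, plus one test per member of each positive pool.
    Assumption: sensitivity is unaffected by dilution in the pool."""
    p_pool_clean = (1 - prevalence) ** pool_size
    p_pool_positive = (sensitivity * (1 - p_pool_clean)
                       + (1 - specificity) * p_pool_clean)
    return 1 / pool_size + p_pool_positive


# Hypothetical test: 70% sensitivity, 99% specificity, 6% observed positives.
true_rate = corrected_prevalence(0.06, sensitivity=0.70, specificity=0.99)
print(f"Corrected prevalence: {true_rate:.3f}")  # about 0.072

# Hypothetical pooling: 2% prevalence, pools of 8, perfect test.
cost = dorfman_tests_per_sample(prevalence=0.02, pool_size=8)
print(f"Expected tests per sample: {cost:.2f}")  # about 0.27
```

In the second example, pooling would require roughly a quarter as many tests as samples, which illustrates why group testing is attractive when prevalence is low.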

Two types of tests exist, detecting either viral load or antibodies, producing different information. These two types of information can be integrated into the same model to refine the predictions.

Implementation
The implementation of such a test campaign requires:

  • Performing viral-load tests and, if possible, simultaneous serological tests; tests with very high specificity should be favored. It is important to have sufficient information on the patient cohorts used to assess the specificity and sensitivity of each test.
  • Recruiting researchers capable of constructing the random cohorts from which the samples are drawn (depending on the number and reliability of the tests available) and of processing the data.
  • Setting up specific procedures for collecting consent forms and taking samples. Ideally, random samples with anonymized results should be taken, but with spatial (at the scale of the municipality) and temporal (at the scale of the sample date) tags to be taken into account sequentially in the models.

This proposal is made by a group of mathematicians from ENS Paris, INRIA and Polytechnique.



[i] Magal P., Webb G., Predicting the number of reported and unreported cases for the COVID-19 epidemic in South Korea, Italy, France and Germany, 2020, https://www.medrxiv.org/content/10.1101/2020.03.21.20040154v1.full.pdf+html
[ii] Imperial College COVID-19 Response Team, The Global Impact of COVID-19 and Strategies for Mitigation and Suppression, https://www.imperial.ac.uk/media/imperial-college/medicine/sph/ide/gida-fellowships/Imperial-College-COVID19-Global-Impact-26-03-2020.pdf
[iii] A Bayesian estimate of the parameters of the models proposed in the literature shows that the posterior distribution of this parameter, given the current data, is always nearly equal to its prior distribution. In other words, it cannot be estimated from these data, although the predictions (of the intensity of the peak, for example) depend strongly on it. Other types of data are therefore needed to estimate it.



