Josselin Garnier (Ecole polytechnique)
March 27, 2020
We consider the model proposed in P. Magal and G. Webb [2020], Predicting the number of reported and unreported cases for the COVID-19 epidemic in South Korea, Italy, France and Germany, to quantify the uncertainty of the predictions. We apply a general methodology that propagates the uncertainties of the a posteriori distribution of the most important parameters of the model in a Bayesian framework. The methodology could be extended to other models.
1 Model
The model proposed in [1] has the form
S'(t) = −τ(t)S(t)[I(t) + U(t)], (1)
I'(t) = τ(t)S(t)[I(t) + U(t)] − νI(t), (2)
R'(t) = v1I(t) − ηR(t), (3)
U'(t) = v2I(t) − ηU(t), (4)
where S(t) is the number of individuals susceptible to infection at time t, I(t) the number of asymptomatic infectious individuals at time t, R(t) the number of reported symptomatic infectious individuals at time t, and U(t) the number of unreported symptomatic infectious individuals at time t.
A fraction f of asymptomatic infectious individuals become reported symptomatic infectious individuals, and the remaining fraction 1 − f become unreported symptomatic infectious individuals. Asymptomatic infectious individuals thus become reported symptomatic at rate v1 = fν and unreported symptomatic at rate v2 = (1 − f)ν.
Reported symptomatic individuals are infectious for an average period of 1/η days, as are unreported symptomatic individuals. The daily number of reported cases from the model can be obtained by computing the solution of the following equation:
DR'(t) = v1I(t) − DR(t). (5)
The transmission rate τ(t) at time t is parameterized as
τ(t) = τ0 exp[−µ(t − N)+]. (6)
[This means: τ is constant and equal to τ0 until time N; it then decays exponentially with rate µ. Here (x)+ = max(x, 0) denotes the positive part.]
We will also consider the parameterized model
τ(t) = τ0 [1 − µ(t − N)+]+. (7)
[This means: τ is constant and equal to τ0 until time N; it then decays linearly with rate µ, until it reaches 0 at time N + 1/µ, and finally it stays at 0.]
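As an illustration of equations (1)-(5) with the two transmission-rate forms (6) and (7), here is a minimal Python sketch. It is not the code used for the results of this note: the integrator (a hand-written classical Runge-Kutta scheme), the function names, and the values of τ0, µ, N and of the initial state are hypothetical choices made only so that the example runs. The values ν = 1/7, η = 1/7 and f = 0.1 are the ones fixed in Section 2.

```python
import math

def tau_exponential(t, tau0, mu, N):
    # Form (6): tau0 up to time N, then exponential decay at rate mu.
    return tau0 * math.exp(-mu * max(t - N, 0.0))

def tau_linear(t, tau0, mu, N):
    # Form (7): tau0 up to time N, linear decay, then 0 from N + 1/mu onwards.
    return tau0 * max(1.0 - mu * max(t - N, 0.0), 0.0)

def rhs(t, y, tau, nu, eta, f):
    # Right-hand sides of (1)-(5); y = (S, I, R, U, DR).
    S, I, R, U, DR = y
    new_infections = tau(t) * S * (I + U)
    return (
        -new_infections,
        new_infections - nu * I,
        f * nu * I - eta * R,          # v1 = f * nu
        (1.0 - f) * nu * I - eta * U,  # v2 = (1 - f) * nu
        f * nu * I - DR,
    )

def rk4_step(t, y, dt, *args):
    # One classical fourth-order Runge-Kutta step.
    k1 = rhs(t, y, *args)
    k2 = rhs(t + dt / 2, [a + dt / 2 * k for a, k in zip(y, k1)], *args)
    k3 = rhs(t + dt / 2, [a + dt / 2 * k for a, k in zip(y, k2)], *args)
    k4 = rhs(t + dt, [a + dt * k for a, k in zip(y, k3)], *args)
    return [a + dt / 6 * (p + 2 * q + 2 * r + s)
            for a, p, q, r, s in zip(y, k1, k2, k3, k4)]

def simulate(y0, tau, days, steps_per_day=20, nu=1/7, eta=1/7, f=0.1):
    # Returns the state (S, I, R, U, DR) at t = 0, 1, ..., days.
    dt = 1.0 / steps_per_day
    y, daily = list(y0), [list(y0)]
    for day in range(days):
        for k in range(steps_per_day):
            y = rk4_step(day + k * dt, y, dt, tau, nu, eta, f)
        daily.append(y)
    return daily

# Hypothetical values for illustration only (not the calibrated ones).
tau0, mu, N = 5e-9, 0.05, 20.0
y0 = (6.7e7, 10.0, 0.0, 0.0, 0.0)  # (S, I, R, U, DR) at t = 0
for name, form in (("exponential", tau_exponential), ("linear", tau_linear)):
    traj = simulate(y0, lambda t: form(t, tau0, mu, N), days=120)
    peak_I = max(state[1] for state in traj)
    print(f"{name}: max I(t) = {peak_I:.3g}")
```

Since 1 − x ≤ exp(−x), the linear form (7) is never above the exponential form (6) for the same (τ0, µ, N), so the sketch should report a smaller maximal value of I for the linear form. A library solver such as `scipy.integrate.solve_ivp` could replace the hand-written integrator.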
2 Strategy
We fix some of the parameters as prescribed by [1]: ν = 1/7, η = 1/7, f = 0.1, and we consider the exponential form (6) for the transmission rate. We will study later the impact of f and the form of the transmission rate.
1) We calibrate the parameters of the model by least-squares on the daily number of reported cases (with Poisson noise). See Figure 1 for the fit between the observed data and the model data with the estimated parameters.
2) We can predict (I(t), R(t), U(t)) with these estimated parameters. See Figure 2. So far, this looks like [1], with a few more days in the data set.
3) We find that µ and N are the important parameters. We fix all other parameters to their estimated values, and we perform a Bayesian estimation (with uniform prior) of the pair (µ, N). See Figure 3.
4) We propagate the uncertainty of the a posteriori distribution of (µ, N) onto the predictions of (I(t), R(t), U(t)). See Figures 4-5.
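Steps 3 and 4 can be sketched as follows in Python. This is only an illustration under stated assumptions, not the implementation behind the figures: the likelihood is assumed to be Poisson on the daily reported cases, the posterior is evaluated on a rectangular grid (so the uniform prior makes it proportional to the likelihood), the model is integrated with a simple explicit Euler scheme, and the "observations" are synthetic values generated by the model itself. All numerical values except ν, η and f, as well as the function names, are hypothetical.

```python
import math

def simulate(mu, N, days, tau0=5e-9, nu=1/7, eta=1/7, f=0.1,
             y0=(6.7e7, 100.0, 0.0, 0.0, 0.0), steps_per_day=20):
    # Explicit Euler integration of (1)-(5) with the exponential form (6).
    # Returns daily lists of I(t) and DR(t) for t = 0, 1, ..., days.
    dt = 1.0 / steps_per_day
    S, I, R, U, DR = y0
    I_daily, DR_daily = [I], [DR]
    for step in range(days * steps_per_day):
        tau = tau0 * math.exp(-mu * max(step * dt - N, 0.0))
        new_inf = tau * S * (I + U)
        S, I, R, U, DR = (S - dt * new_inf,
                          I + dt * (new_inf - nu * I),
                          R + dt * (f * nu * I - eta * R),
                          U + dt * ((1 - f) * nu * I - eta * U),
                          DR + dt * (f * nu * I - DR))
        if (step + 1) % steps_per_day == 0:
            I_daily.append(I)
            DR_daily.append(DR)
    return I_daily, DR_daily

def poisson_loglik(observed, expected):
    # Poisson log-likelihood, up to a constant that does not depend on the model.
    return sum(d * math.log(m) - m for d, m in zip(observed, expected))

def weighted_quantile(values, weights, q):
    # Smallest value whose cumulative weight reaches q.
    total = 0.0
    for v, w in sorted(zip(values, weights)):
        total += w
        if total >= q:
            return v
    return max(values)

# Synthetic "observations" generated by the model itself (not real data).
n_obs, horizon = 30, 60
observed = [round(x) for x in simulate(0.05, 20, n_obs)[1]]

# Step 3: uniform prior on a grid, so posterior = normalized likelihood.
mu_grid = [0.01 * k for k in range(11)]  # 0.00, 0.01, ..., 0.10
N_grid = list(range(15, 26))             # 15, 16, ..., 25
grid, loglik, I_at_horizon = [], [], []
for mu in mu_grid:
    for N in N_grid:
        I_daily, DR_daily = simulate(mu, N, horizon)
        grid.append((mu, N))
        # Day 0 is skipped because DR(0) = 0 in this sketch.
        loglik.append(poisson_loglik(observed[1:], DR_daily[1:n_obs + 1]))
        I_at_horizon.append(I_daily[horizon])

shift = max(loglik)  # subtract the maximum before exponentiating
weights = [math.exp(l - shift) for l in loglik]
posterior = [w / sum(weights) for w in weights]
map_mu, map_N = grid[posterior.index(max(posterior))]

# Step 4: propagate the posterior of (mu, N) onto the prediction I(horizon).
q10, q50, q90 = [weighted_quantile(I_at_horizon, posterior, q)
                 for q in (0.1, 0.5, 0.9)]
print(f"MAP: mu = {map_mu:.2f}, N = {map_N}")
print(f"I({horizon}): 10% = {q10:.3g}, median = {q50:.3g}, 90% = {q90:.3g}")
```

The same weighted quantiles can be computed at every time t to obtain median and 10%/90% curves. A grid is practical here only because the posterior is two-dimensional; with more uncertain parameters a sampling method would be needed.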
3 Discussion
- The a posteriori distributions of µ and N are anticorrelated, while the overall result on the maximal value of I is very sensitive to µ. This may explain why the result in [1] looks so bad for Germany, because the estimated µ is two times smaller than the one for France, while, in fact, the estimation is not robust.
Figure 1: Observed daily number of reported cases in France from February 25 to March 26 (dots) and predicted daily number of reported cases with the calibrated model (blue solid line).
- There is a lot of uncertainty in the predictions! Of course, this is not surprising for such models (with exponential growth with estimated growth rates).
- The parameter f (the ratio of the number of reported symptomatic infectious cases over the total number of symptomatic infectious cases) has a rather large influence (the results on I are essentially inversely proportional to it). Unfortunately, it cannot be calibrated from the data. If it is included in the Bayesian analysis, its a posteriori distribution coincides with its prior distribution. If we consider the two models with f = 0.1 and f = 0.4, the Bayes factor is close to 1.1, in favor of the first one, which is not significant. In Figures 6-10, we give the results obtained with f = 0.4. We need other types of data to estimate this parameter (for instance, a survey from a representative sample of the general population).
- The time-dependent form of the transmission rate is important. If we impose the linear form as in (7), then the results are very different from the ones obtained with the exponential form (6). In Figures 11-15, we give the results obtained with f = 0.1 and the linear form (7). The Bayes factor is, however, 2.2, in favor of the exponential model: the data set favors the exponential rather than the linear model for the transmission rate.
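For reference, a Bayes factor between two models is the ratio of their marginal likelihoods (evidences). The following Python sketch shows one way to compute it when each model's likelihood has been evaluated on a grid of parameter values with equal prior weight, so that the evidence is approximated by the average likelihood over the grid. The function names and the log-likelihood values are hypothetical and serve only as an illustration; they are not the values behind the Bayes factors quoted above.

```python
import math

def log_evidence(loglik_grid):
    # Log of the average likelihood over grid points of equal prior weight,
    # computed with the log-sum-exp trick to avoid underflow.
    shift = max(loglik_grid)
    mean = sum(math.exp(l - shift) for l in loglik_grid) / len(loglik_grid)
    return shift + math.log(mean)

def bayes_factor(loglik_a, loglik_b):
    # Evidence of model A divided by evidence of model B.
    return math.exp(log_evidence(loglik_a) - log_evidence(loglik_b))

# Hypothetical log-likelihood values on a small grid, for illustration only.
loglik_exponential = [-1203.1, -1201.4, -1200.2, -1202.7]
loglik_linear = [-1204.0, -1202.1, -1201.3, -1203.5]
print(f"Bayes factor: {bayes_factor(loglik_exponential, loglik_linear):.2f}")
```

Averaging over the grid, rather than summing, keeps the comparison fair when the two grids have different numbers of points, because each uniform prior is normalized over its own grid.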
Figure 2: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t).
Figure 4: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t). The solid lines are the median values; the dashed lines are the mean values. The median values are close to the maximum a posteriori values.
Figure 5: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t). The solid lines are the median values; the dashed lines are the 10% and 90% quantiles.
Figure 6: Observed daily number of reported cases in France from February 25 to March 26 (dots) and predicted daily number of reported cases with the calibrated model (blue solid line). Here f = 0.4.
Figure 7: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t). Here f = 0.4.
Figure 9: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t). The solid lines are the median values; the dashed lines are the mean values. The median values are close to the maximum a posteriori values. Here f = 0.4.
Figure 10: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t). The solid lines are the median values; the dashed lines are the 10% and 90% quantiles. Here f = 0.4.
Figure 11: Observed daily number of reported cases in France from February 25 to March 26 (dots) and predicted daily number of reported cases with the calibrated model (blue solid line). Here f = 0.1 and τ(t) has the linear form (7).
Figure 12: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t). Here f = 0.1 and τ(t) has the linear form (7).
Figure 13: A posteriori distribution of (µ, N). The dot is the maximum a posteriori. Here f = 0.1 and τ(t) has the linear form (7).
Figure 14: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t). The solid lines are the median values; the dashed lines are the mean values. The median values are close to the maximum a posteriori values. Here f = 0.1 and τ(t) has the linear form (7).
Figure 15: Predictions of the calibrated model: the blue line is the number of asymptomatic infectious individuals I(t), the red line the number of reported symptomatic infectious individuals R(t), the green line the number of unreported symptomatic infectious individuals U(t). The solid lines are the median values; the dashed lines are the 10% and 90% quantiles. Here f = 0.1 and τ(t) has the linear form (7).
References
[1] P. Magal and G. Webb, Predicting the number of reported and unreported cases for the COVID-19 epidemic in South Korea, Italy, France and Germany, https://www.medrxiv.org/content/10.1101/2020.03.21.20040154v1