
Bayesian inference in epidemics: linear noise analysis


    This paper offers a qualitative insight into the convergence of Bayesian parameter inference in a setup which mimics the modeling of the spread of a disease with associated disease measurements. Specifically, we are interested in the Bayesian model's convergence with increasing amounts of data under measurement limitations. Depending on how weakly informative the disease measurements are, we offer a kind of 'best case' as well as a 'worst case' analysis where, in the former case, we assume that the prevalence is directly accessible, while in the latter that only a binary signal corresponding to a prevalence detection threshold is available. Both cases are studied under an assumed so-called linear noise approximation of the true dynamics. Numerical experiments test the sharpness of our results when confronted with more realistic situations for which analytical results are unavailable.

    Citation: Samuel Bronstein, Stefan Engblom, Robin Marin. Bayesian inference in epidemics: linear noise analysis[J]. Mathematical Biosciences and Engineering, 2023, 20(2): 4128-4152. doi: 10.3934/mbe.2023193




    Computational models in epidemics are commonly relied upon to estimate the disease spread at fairly large spatial- and temporal scales, often referred to as scenario generation. With the increasing volumes and improved resolution of data from, e.g., mobile apps and disease testing time series from hospitals and nursing homes, predictive data-driven models formed from first principles are within reach. The accuracy of such models is ultimately limited by the specifics of the available disease surveillance data. In this paper we attempt to gain a qualitative understanding of how Bayesian inference of epidemiological parameters may be expected to perform and what the limiting factors are.

    Epidemiological models are typically formed by postulating laws for the flow of individuals between different compartments in a large population and have been studied in this form for a long time. When connected with data in the form of observations, the associated inference problem is also a fairly mature field, see, e.g., [1,2,3]. With the increasing qualities and quantities of data, the various incarnations of data-driven modeling allow for substantially higher modeling resolution compared to traditional macroscopic approaches [4,5,6,7,8]. For example, individual-level contact tracing has been used to study disease spread models at various population sizes [9,10,11,12,13]. Data-driven models have allowed epidemic and endemic conditions to be investigated at a level of detail not previously possible [14,15,16].

    The identification of the epidemiological parameters from data falls under the scope of problems formally studied in System Identification [17]. However, a 'system' viewpoint of epidemiological modeling is not yet standard, and identification of parameters is rather more often approached through calibration of residuals [5,16,18], sometimes also blending in aspects of Bayesian arguments. Fully Bayesian approaches are rarer, albeit with some exceptions [8,19], typically due to the technical difficulties with formulating suitable (pseudo-)likelihoods and the slow convergence associated with the conditioning of the problem. Although Bayesian inference is notably well-posed thanks to its use of prior distributions, the posterior distribution itself is often a computationally ill-conditioned object whenever strong parameter correlations are present, e.g., resulting from nearly singular maps from parameters to observables. These conditions, together with the societal importance of this modeling domain, make it relevant to reason around the limits of fully Bayesian techniques.

    Fundamental questions concerning Bayesian convergence in general settings, including infinite-dimensional ones, have been treated [20,21], and have also been revisited with specific tools and applications in mind, e.g., Gaussian processes and other machine learning algorithms [22,23]. "Brittleness" [24,25], i.e., high sensitivity to small perturbations, has been proposed as a problematic phenomenon under certain conditions. For applications, this points to the importance of striking a balance between the granularity of the model and the information content and level of detail of the available data.

    With the specific aim of reaching qualitative conclusions for Bayesian modeling in epidemiology, we will connect some usual modeling approaches in infectious disease spread with a basic linear stochastic differential equation. We analyze the pre-asymptotic Bayesian posterior in the high-quality data regime as well as the estimate's convergence under weakly informative disease measurements. This is motivated by current trends in disease spread monitoring through for example sewage water analysis and symptoms data collection using smartphones [26,27].

    The rest of the paper is organized as follows. In §2 we suggest the Ornstein-Uhlenbeck process as a meta-model of epidemiological models, covering various Susceptible-Infected-Recovered (SIR)-type models locally in time, during endemic as well as under epidemic conditions. In §3 we briefly analyze the posterior convergence when accurate full state measurements are available. The main results are found in §4 where we more fully develop an analysis for the case of poorly informative data. We offer some examples of relevance in §5 and a summarizing discussion is found in §6.

    We connect in this section the Ornstein-Uhlenbeck process with some basic epidemiological models via linearization and quasi steady-state arguments. We also briefly discuss, by means of a backward analysis, how continuous and discrete time in this setting can be connected in probability law under a certain parameter map.

    Epidemiological modeling typically involves ordinary differential equations (ODEs), e.g., the SIR-model which is often used to model the annual flu [1,28],

    \left.\begin{aligned} S'(t) &= -\beta\Sigma^{-1}S(t)I(t) \\ I'(t) &= \beta\Sigma^{-1}S(t)I(t)-\gamma I(t) \\ R'(t) &= \gamma I(t) \end{aligned}\right\}  (2.1)

    in terms of Susceptible, Infected, and Recovered individuals, respectively, and for a total population size Σ ≡ S+I+R. The model parameters β and γ define the transmission and recovery rates, and the basic reproduction number is then given by R0 = β/γ, that is, the expected number of new infections resulting from a single index case [1].

    Simplifying the SIR-model by removing the R-compartment one obtains the SIS-model, where the recovered state has been removed and hence effectively identified with the susceptible state,

    \left.\begin{aligned} S'(t) &= -\beta\Sigma^{-1}S(t)I(t)+\gamma I(t) \\ I'(t) &= \beta\Sigma^{-1}S(t)I(t)-\gamma I(t) \end{aligned}\right\}  (2.2)

    and with the same R0 as the SIR-model. It is sometimes useful to add an environmental compartment expressing the infectious pressure φ, i.e., the amount of infectious substance per unit of space. This defines the SISE-model [29], where the E stands for the Environmental compartment,

    \left.\begin{aligned} S'(t) &= -\beta S(t)\varphi(t)+\gamma I(t) \\ I'(t) &= \beta S(t)\varphi(t)-\gamma I(t) \\ \varphi'(t) &= \Sigma^{-1}I(t)-\rho\varphi(t) \end{aligned}\right\}  (2.3)

    where the indicated governing equation for φ is just a basic example. The SISE-model is convenient to adapt to spread over a network and to detection situations involving sampling the environment [19]. Due to the indirect transmission the basic reproduction number now scales as a square root, with R0 = √(β/(γρ)).

    The use of ODEs can be justified for epidemiological models in sufficiently large populations. In smaller populations, e.g., networks of small communities, stochastic variants are necessary to properly capture the underlying dynamics of the spread [3,30]. For example, a stochastic differential equation (SDE)-version of (2.2) was analyzed in [31] which essentially replaces β dt by a Brownian diffused version β dt + η dB(t). This has the effect of lowering the basic reproduction number to R0 = (β − η²/2)/γ. A first-principle stochastic approach is to rather express the dynamics as a continuous-time discrete state Markov chain (CTMC). The SIS-model above is then defined via discrete transitions with exponentially distributed waiting times,

    \left.\begin{aligned} S+I &\xrightarrow{\ \beta\Sigma^{-1}\ } 2I \\ I &\xrightarrow{\ \gamma\ } S \end{aligned}\right\}  (2.4)

    meaning, e.g., that one infectious individual may infect one susceptible individual, and such that β and γ are now understood as rate parameters in the driving Poissonian processes. One can show that now R0 = (1 − Σ^{-1}/2) × β/γ, such that consistency with the previous SDE formulation would require that η² = Σ^{-1}β, which follows from the typical Poissonian population-dependent noise scaling η ∝ Σ^{-1/2}.

    Let P(t) be some given measure of the intensity of the disease, such as the absolute or relative disease prevalences, I(t) or Σ^{-1}I(t), respectively. Another alternative could be the infectious pressure φ(t), which also informs on the current disease intensity. Gathering data from the disease spread means collecting information about P(t), typically in the form of a time-series, (F(P(t_i)))_i, say, for some measurement operator F. Consider endemic conditions first, that is, P is considered stationary. As an ansatz, suppose

    P(t_i) \sim P, \qquad \mathbb{P}[P=p] =: \rho(p) \propto \exp\left(-2\sigma^{-2}E(p)\right),  (2.5)

    for some epidemiological potential E. Under endemic conditions we can expect this to be a single well potential, say,

    E(p) \propto (p-\alpha)^{2} + \text{constant},  (2.6)

    or at least can locally be approximated by this form. This is consistent with the Ornstein-Uhlenbeck (OU-) process [32],

    X(0) = X_0,  (2.7)
    dX(t) = (k-\mu X(t))\,dt + \sigma\,dW(t),  (2.8)

    where X(t) ∈ R and where the parameters of the model are θ = [k, μ, σ] ∈ R³₊ and, in (2.6), α = k/μ.

    As a concrete example, we may approximate the continuous-time Markov chain (2.4) by linearizing the rates around the non-trivial stationary state I_∞ = Σ(1 − γ/β) of (2.2). Similarly, the noise term σ can be determined by inspecting the total variance of the Poissonian rates in (2.4) around this equilibrium. This yields a linear noise OU-approximation to the state variable I of the SIS-model (2.4) with parameters

    \left.\begin{aligned} k &= \Sigma(\beta-\gamma)^{2}/\beta = \gamma\Sigma(R_0-1)^{2}/R_0 \\ \mu &= \beta-\gamma = \gamma(R_0-1) \\ \sigma^{2} &= 2\Sigma(\beta-\gamma)\gamma/\beta = 2\gamma\Sigma(R_0-1)/R_0 \end{aligned}\right\}  (2.9)

    where we use the approximation R0 = (1 − Σ^{-1}/2) × β/γ ≈ β/γ. Hence, with this specific interpretation of the OU-process,

    R_0 = \left(1-\Sigma^{-1}k/\mu\right)^{-1} = \left(1-\Sigma^{-1}\alpha\right)^{-1}.  (2.10)

    The quality of this approximation is exemplified in Figure 1; the effect of linearizing around the stationary state can be seen as a slightly too fast transient compared to the Markov chain.

    Figure 1.  The approximation (2.9) exemplified for R0 ∈ {1.1, 1.5, 2} (bottom to top), with Σ = 1000 and γ = 1. For comparison, the ODE-, the OU-, and the continuous-time Markov chain (CTMC) interpretations are shown.
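    For readers who wish to reproduce a comparison of this kind, the following sketch (independent of the BISDE software referenced in §5) simulates the SIS Markov chain (2.4) with the Gillespie algorithm next to an Euler-Maruyama discretization of the OU-approximation with parameters (2.9). All numerical values are illustrative and mirror Figure 1.

```python
import numpy as np

rng = np.random.default_rng(1)

def gillespie_sis(beta, gamma, Sigma, I0, T):
    """Exact simulation of the SIS Markov chain (2.4)."""
    t, I = 0.0, I0
    ts, Is = [t], [I]
    while t < T and I > 0:
        w_inf = beta / Sigma * (Sigma - I) * I    # S + I -> 2I
        w_rec = gamma * I                         # I -> S
        w_tot = w_inf + w_rec
        t += rng.exponential(1.0 / w_tot)
        I += 1 if rng.random() < w_inf / w_tot else -1
        ts.append(t); Is.append(I)
    return np.array(ts), np.array(Is)

def ou_sis(beta, gamma, Sigma, I0, T, dt=0.01):
    """Euler-Maruyama simulation of the linear noise OU-approximation (2.9)."""
    R0 = beta / gamma
    k = gamma * Sigma * (R0 - 1)**2 / R0
    mu = gamma * (R0 - 1)
    sigma = np.sqrt(2 * gamma * Sigma * (R0 - 1) / R0)
    n = int(T / dt)
    X = np.empty(n + 1)
    X[0] = I0
    for i in range(n):
        X[i + 1] = X[i] + (k - mu * X[i]) * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    return np.linspace(0.0, T, n + 1), X

Sigma, gamma, R0 = 1000, 1.0, 1.5
ts, Is = gillespie_sis(R0 * gamma, gamma, Sigma, I0=10, T=20.0)
tu, Xu = ou_sis(R0 * gamma, gamma, Sigma, I0=10, T=20.0)
print("CTMC endpoint:", Is[-1], " OU endpoint:", round(Xu[-1], 1),
      " ODE equilibrium:", round(Sigma * (1 - 1 / R0), 1))
```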

    During epidemic conditions we are rather sampling a transient of the process and, moreover, there are also typically many kinds of feedback involved, e.g., from minor adjustments of individual level behavior, to major societal changes and governmental intervention strategies. Although this means that there is now a greater challenge in formulating a data-centric meta-model of the situation we may still consider a window of time for which the disease spread parameters are approximately constant and such that (2.7)–(2.8) remain a relevant model of the situation. A difference is then that data is gathered out of equilibrium (hence away from α) and presumably also under more noisy conditions with larger values of σ. More general epidemiological potentials could be considered through the SDE in gradient formulation,

    dP(t) = -\nabla E(P_t)\,dt + \sigma\,dW(t),  (2.11)

    for which the stationary measure is still given by (2.5). For general SDE models, a change of variables allows us to locally consider an SDE with constant noise [33] and, in turn, any time-homogeneous SDE with constant noise,

    dX(t) = f(X_t)\,dt + \sigma\,dW(t),  (2.12)

    is of course readily linearized into an OU-process.

    The OU-process (2.7)–(2.8) is a Gaussian continuous process with mean and covariance given by

    \mathbb{E}[X_t] = \frac{k}{\mu}\left(1-e^{-\mu t}\right) + e^{-\mu t}X_0,  (2.13)
    \mathrm{Cov}[X_t,X_s] = \frac{\sigma^{2}}{2\mu}\left(e^{-\mu|t-s|}-e^{-\mu(t+s)}\right),  (2.14)

    such that the stationary distribution is X_∞ ∼ N(k/μ, σ²/(2μ)).

    Numerical simulation procedures are typically based on Euler-type discretization methods. Assume for convenience a fixed numerical time step Δt and compute

    X_{n+1} = X_n + (k-\mu X_n)\Delta t + \sigma\Delta W_n,  (2.15)

    with ΔW_n being i.i.d. normally distributed numbers of zero mean and variance Δt. The numerical trajectory (X_n) is then an approximation to the OU-process (X(t_n)) at discrete times t_n = nΔt for n = 0, 1, …, and is also a Gaussian vector with mean and covariance

    \mathbb{E}[X_n] = \frac{k}{\mu} + (1-\mu\Delta t)^{n}\left(X_0-\frac{k}{\mu}\right),  (2.16)
    \mathrm{Cov}(X_n,X_{n+p}) = \frac{\sigma^{2}\Delta t}{1-(1-\mu\Delta t)^{2}}\left(1-(1-\mu\Delta t)^{2n}\right)(1-\mu\Delta t)^{p}.  (2.17)

    Actually, (2.15) forms an AR(1)-sequence [34]. Comparing (2.13)–(2.14) with (2.16)–(2.17) we readily find the following useful result.

    Proposition 2.1. (Exact OU-samples and backward analysis). The discrete process (X_n) given by the explicit Euler method (2.15) follows the law of the discrete samples of an exact OU-process with perturbed parameters (k_{Δt}, μ_{Δt}, σ_{Δt}), where

    \left.\begin{aligned} 1-\mu\Delta t &= \exp(-\Delta t\,\mu_{\Delta t}) \\ \frac{k}{\mu} &= \frac{k_{\Delta t}}{\mu_{\Delta t}} \\ \frac{\sigma^{2}}{2\mu-\mu^{2}\Delta t} &= \frac{\sigma_{\Delta t}^{2}}{2\mu_{\Delta t}} \end{aligned}\right\}  (2.18)

    This system of equations can be solved explicitly provided Δt < μ^{-1},

    \left.\begin{aligned} k_{\Delta t} &= -k\,\frac{\log(1-\mu\Delta t)}{\mu\Delta t} \\ \mu_{\Delta t} &= -\frac{\log(1-\mu\Delta t)}{\Delta t} \\ \sigma_{\Delta t}^{2} &= -\frac{2\sigma^{2}\log(1-\mu\Delta t)}{2\mu\Delta t-\mu^{2}\Delta t^{2}} \end{aligned}\right\}  (2.19)

    These perturbed parameters are all first order perturbations in Δt of the exact parameters.

    The inverse of (2.19) is that

    \left.\begin{aligned} k &= \mu\,\frac{k_{\Delta t}}{\mu_{\Delta t}} \\ \mu &= \frac{1-\exp(-\Delta t\,\mu_{\Delta t})}{\Delta t} \\ \sigma^{2} &= \sigma_{\Delta t}^{2}\,\frac{(2-\Delta t\,\mu)\,\mu}{2\mu_{\Delta t}} \end{aligned}\right\}  (2.20)

    It follows that, for given parameters [k_{Δt}, μ_{Δt}, σ_{Δt}], a forward Euler simulation using the new set of parameters [k, μ, σ] defined by (2.20) will produce a sample trajectory obeying the law of the OU-process with the given parameters exactly.
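    As a quick numerical sanity check of Proposition 2.1, the following sketch applies the inverse map (2.20) and verifies that the resulting Euler scheme has exactly the AR(1) coefficient, intercept and noise variance of the exact OU-samples with the target parameters; the parameter values are arbitrary.

```python
import numpy as np

# Target (exact) OU parameters and step size; hypothetical values.
k_t, mu_t, sig_t, dt = 2.0, 0.8, 0.5, 0.1

# Inverse map (2.20): Euler parameters reproducing the exact law.
mu = (1.0 - np.exp(-dt * mu_t)) / dt
k = mu * k_t / mu_t
sig = np.sqrt(sig_t**2 * (2.0 - dt * mu) * mu / (2.0 * mu_t))

# AR(1) representation of the Euler scheme (2.15) ...
euler = (1.0 - mu * dt, k * dt, sig**2 * dt)
# ... and of exact OU samples at spacing dt with the target parameters.
exact = (np.exp(-mu_t * dt),
         k_t / mu_t * (1.0 - np.exp(-mu_t * dt)),
         sig_t**2 / (2.0 * mu_t) * (1.0 - np.exp(-2.0 * mu_t * dt)))

print("Euler AR(1) (coef, intercept, noise var):", np.round(euler, 10))
print("Exact AR(1) (coef, intercept, noise var):", np.round(exact, 10))
```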

    We first briefly consider in this section the behavior of the Bayesian posterior as a function of the prior and of data in the sense of full state measurements. Hence we suppose that sampled process data D_N = (d_i)_{i=0}^{N} is available at some fixed time step h, that is, d_i = X(t_i) with t_i = ih and X(·) an OU-process with X_0 for simplicity considered a known sure number.

    It will be convenient to consider the following reparametrization of the OU-process

    \left.\begin{aligned} \alpha &= \frac{k}{\mu} \in \mathbb{R} \\ \beta &= e^{-\mu h} \in (0,1) \\ \gamma &= \frac{2\mu}{\sigma^{2}}\,\frac{1}{1-e^{-2\mu h}} \in \mathbb{R}_{+} \end{aligned}\right\}  (3.1)

    and we put u ≜ [α, β, γ] for brevity, also defining u0 ≜ [α0, β0, γ0], i.e., the true parameters generating the data.

    Under this reparametrization we readily find from Proposition 2.1 that the exact discrete process can be written in the AR(1)-sequence form

    X_{i+1} = \beta X_i + \alpha(1-\beta) + \gamma^{-1/2}\xi_i,  (3.2)

    where the ξ_i ∼ N(0,1) are independent. Writing δ := α(1−β), explicit least squares estimators for the parameters can be found by solving

    \min_{\delta,\beta}\left\|A[\delta,\beta]^{T}-b\right\|_{2}^{2} =: \min_{\delta,\beta}\left\|\begin{bmatrix}\mathbf{1} & [d_i]_{i=0}^{N-1}\end{bmatrix}[\delta,\beta]^{T}-[d_{i+1}]_{i=0}^{N-1}\right\|_{2}^{2},  (3.3)

    where the brackets over the data form column vectors. The residual of this solution implies the corresponding estimator for γ,

    \hat{\gamma}^{-1} = (N-2)^{-1}\left\|A[\hat{\delta},\hat{\beta}]^{T}-b\right\|_{2}^{2}.  (3.4)

    By the Gaussian character of the OU-process these estimators coincide with the maximum likelihood estimators.
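    A direct implementation of (3.3)–(3.4) requires only an ordinary least squares solve. The sketch below, with illustrative parameter values, recovers (α, β, γ) from a synthetic realization of (3.2).

```python
import numpy as np

rng = np.random.default_rng(2)

# True parameters in the (alpha, beta, gamma) parametrization (3.1); hypothetical values.
alpha0, beta0, gamma0, N = 3.0, 0.9, 4.0, 5000

# Simulate the exact AR(1) recursion (3.2).
d = np.empty(N + 1); d[0] = alpha0
for i in range(N):
    d[i + 1] = beta0 * d[i] + alpha0 * (1 - beta0) + gamma0**-0.5 * rng.standard_normal()

# Least squares (3.3): regress d_{i+1} on [1, d_i] to obtain (delta, beta).
A = np.column_stack([np.ones(N), d[:-1]])
b = d[1:]
(delta_hat, beta_hat), *_ = np.linalg.lstsq(A, b, rcond=None)
alpha_hat = delta_hat / (1 - beta_hat)
gamma_hat = (N - 2) / np.sum((A @ [delta_hat, beta_hat] - b)**2)   # (3.4)

print(f"alpha: {alpha_hat:.3f} (true {alpha0}), beta: {beta_hat:.3f} (true {beta0}), "
      f"gamma: {gamma_hat:.3f} (true {gamma0})")
```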

    The Bayesian convergence to the true parameters as N → ∞ can be characterized by either the Bernstein–von Mises theorem (BvM), for h fixed, or by contrast function convergence for a fixed time window t ∈ [0, T]. BvM states that the Bayesian posterior converges to a normal distribution centered at the maximum likelihood estimate with the inverse Fisher information matrix (FIM) as covariance [35,36,37]. Similarly, the contrast function approach considers the convergence of the approximation of discretized observations, approaching the same normal distribution for N large enough [38,39,40]. In the present case the FIM can be determined explicitly using the log-likelihood log L(X_{i+1}|X_i; u) induced by (3.2) and the definition

    \Sigma \triangleq -\mathbb{E}\left[\frac{\partial^{2}}{\partial u^{2}}\log L\right](u_0).  (3.5)

    The result is that

    \Sigma = \mathrm{Diag}\left((1-\beta_0)^{2}\gamma_0,\ \frac{1}{1-\beta_0^{2}},\ \frac{1}{2\gamma_0^{2}}\right).  (3.6)

    Proposition 3.1. (BvM Theorem for (3.2)). Provided that the prior has a continuous and positive density in an open neighborhood of (α, β, γ), as N → ∞ we have that the ML-estimators (3.3)–(3.4) converge to the true values as

    \sqrt{N}\,(\hat{u}-u_0) \xrightarrow{\ d\ } \mathcal{N}(0,\Sigma^{-1}),  (3.7)

    (convergence in the sense of distribution) with Σ defined by (3.6). Alternatively, (3.7) dictates the asymptotic convergence of the mean of the Bayesian posterior.

    To summarize, the asymptotic variances of all parameters are independent of α0. The noise term γ0 mainly affects the convergence of α0 (and γ0 in an absolute sense). Finally, an increase of β0 implies a faster convergence towards β0 at the cost of a slower convergence towards α0. Under the interpretation of an SIS-model (2.9)–(2.10), the parameter α0 is in one-to-one correspondence with the basic reproduction number R0, hence its central interest here.
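    The prediction (3.6)–(3.7) is easy to probe numerically: repeating the least squares fit over independent synthetic series, the N-scaled spread of the estimates should approach the diagonal of Σ^{-1}. A small Monte Carlo sketch with illustrative values of (α0, β0, γ0) and modest M and N:

```python
import numpy as np

rng = np.random.default_rng(3)
alpha0, beta0, gamma0, N, M = 3.0, 0.8, 4.0, 4000, 300

def fit(d):
    """Least-squares estimators (3.3)-(3.4) of (alpha, beta, gamma)."""
    A = np.column_stack([np.ones(len(d) - 1), d[:-1]])
    b = d[1:]
    (delta, beta), *_ = np.linalg.lstsq(A, b, rcond=None)
    gamma = (len(b) - 2) / np.sum((A @ [delta, beta] - b)**2)
    return delta / (1 - beta), beta, gamma

est = np.empty((M, 3))
for m in range(M):
    d = np.empty(N + 1); d[0] = alpha0
    eps = rng.standard_normal(N)
    for i in range(N):
        d[i + 1] = beta0 * d[i] + alpha0 * (1 - beta0) + gamma0**-0.5 * eps[i]
    est[m] = fit(d)

print("N * empirical variances :", np.round(N * est.var(axis=0), 3))
print("Predicted diag(Sigma^-1):",
      np.round([1 / ((1 - beta0)**2 * gamma0), 1 - beta0**2, 2 * gamma0**2], 3))
```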

    The asymptotic nature of both the BvM Theorem and the contrast function convergence is a poor match in epidemiological situations where data is often scarce and poorly informative at the time scale over which the parameters can be considered static. This motivates our interest in also the pre-asymptotic regime of the Bayesian posterior. We therefore consider prior densities of the specific form

    \pi(u) \propto \gamma^{r}\exp\left(-\frac{\gamma}{2}P(\alpha,\beta)\right),  (3.8)

    where, for integrability, P is to be a polynomial of degree 2 in α and nonnegative for β ∈ (0,1). This choice has the convenient property that the posterior measure after N observations is

    \Pi^{(N)}(u) \propto \gamma^{N/2+r}\exp\left(-\frac{\gamma}{2}(Q_N+P)(\alpha,\beta)\right),  (3.9)

    where

    Q_N(\alpha,\beta) \triangleq \sum_{i=0}^{N-1}\left(d_{i+1}-\beta d_i-\alpha(1-\beta)\right)^{2},  (3.10)

    and where d_i = X(t_i) is the observation at time t_i.

    Remark. To include also the case of flat priors, while avoiding technicalities for nonintegrable densities, we note that, since the value of a constant prior has no influence on the posterior, we can still determine a posterior after a single initial observation, and then use this posterior as a prior for the rest of the observations.
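    The conjugate structure (3.8)–(3.10) is convenient in practice since γ can be integrated out analytically: with a flat P (cf. the remark above on flat priors), the (α, β)-marginal of (3.9) is proportional to Q_N(α,β)^{-(N/2+r+1)}. The sketch below evaluates this marginal on a grid for a synthetic data set; grid ranges and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic data from the AR(1) recursion (3.2); hypothetical true values.
alpha0, beta0, gamma0, N, r = 2.0, 0.7, 5.0, 400, 0.0
d = np.empty(N + 1); d[0] = alpha0
for i in range(N):
    d[i + 1] = beta0 * d[i] + alpha0 * (1 - beta0) + gamma0**-0.5 * rng.standard_normal()

def Q_N(alpha, beta):
    """Sum of squares (3.10)."""
    return np.sum((d[1:] - beta * d[:-1] - alpha * (1 - beta))**2)

# With a flat P, integrating gamma out of (3.9) gives the (alpha, beta)-marginal
# up to a constant: Q_N(alpha, beta)^(-(N/2 + r + 1)).
alphas = np.linspace(1.5, 2.5, 101)
betas = np.linspace(0.5, 0.9, 101)
logpost = np.array([[-(N / 2 + r + 1) * np.log(Q_N(a, b)) for a in alphas] for b in betas])
i, j = np.unravel_index(np.argmax(logpost), logpost.shape)
print("posterior mode near alpha =", alphas[j], ", beta =", betas[i],
      " (truth:", alpha0, ",", beta0, ")")
```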

    The following result examines the convergence of the Bayesian posterior.

    Theorem 3.2. (Convergence of log-likelihood). Consider the polynomial qN:

    q_N(\alpha,\beta) \triangleq N^{-1}Q_N(\alpha,\beta) = N^{-1}\sum_{i=0}^{N-1}\left(d_{i+1}-\beta d_i-\alpha(1-\beta)\right)^{2}.  (3.11)

    Then as N → ∞ we have the almost sure uniform convergence on every compact set of q_N towards the function f,

    f(\alpha,\beta) = (1-\beta)^{2}\left[\frac{1}{\gamma_0(1-\beta_0^{2})}+(\alpha_0-\alpha)^{2}\right]+\frac{2\beta}{\gamma_0(1+\beta_0)}.  (3.12)

    Proof. For a general point (α, β), we rely on the ergodic theory of Markov chains [41, p. 472] to get that N^{-1}\sum_i g(d_i, d_{i+1}) converges almost surely towards E[g(d_i, d_{i+1})] for g integrable against the stationary measure. Using (2.13)–(2.14) this implies the limits:

    N^{-1}\sum_{i=0}^{N-1}\left(d_{i+1}-\beta d_i\right) \longrightarrow \alpha_0(1-\beta),  (3.13)
    N^{-1}\sum_{i=0}^{N-1}\left(d_{i+1}-\beta d_i\right)^{2} \longrightarrow \left(\alpha_0(1-\beta)\right)^{2}+\frac{1+\beta^{2}}{\gamma_0(1-\beta_0^{2})}-\frac{2\beta\beta_0}{\gamma_0(1-\beta_0^{2})}.  (3.14)

    Combined, we obtain the claimed limit in a pointwise sense. As each (qN) is a polynomial in (α,β) of bounded degree, pointwise convergence is equivalent to convergence of the coefficients, and hence implies the convergence on all compact sets.

    An obvious extension is to consider measurements polluted by noise, say, d_i = X_i + η_i, for i.i.d. η_i ∼ N(0, η) and some variance η. The posterior so obtained is readily computed via a recursive Kalman filter but does not have a simple analytic form. To first order in η, however, (3.8)–(3.9) still hold provided γ is replaced with [1/γ + η(1+β²)]^{-1} and Q_N in (3.10) is replaced with

    Q_N \triangleq \sum_{i=0}^{N-1}\left[d_{i+1}-\beta d_i-\alpha(1-\beta)+\eta\beta\gamma\left(d_i-\beta d_{i-1}-\alpha(1-\beta)\right)\right]^{2},  (3.15)

    where the right ηβγ-term is skipped when i=0. Intuitively, the first order effect of noise in data is to broaden the posterior with the variance of this noise. We next proceed to investigate more severe truncations of the measurements from the epidemiological process.

    In the previous section we considered full process data to be available without any extrinsic noise. This models the best possible Bayesian setup but is also clearly unrealistic in epidemics. As a model of a more challenging situation we consider in this section the recorded data to be some (possibly stochastic) function of the OU-process X(·), which itself is considered a latent variable. We are initially concerned with binary data of the form D_N = (d_i)_{i=0}^{N} where d_i = Y_i = 1_{X(t_i) ≥ c}, again over a uniform grid in time t_i = ih, and for a known filter cut-off value c (see also the related setup in [42]). That is, the epidemiological interpretation is that the data is considered to be time-discrete information about whether or not the prevalence X(·) of the population is above or below a certain known threshold c. With the prevalence process hidden one is forced to estimate it simultaneously with any parameter estimates. Using the AR(1)-form (3.2) we have that the stationary measure for the (p+1) steps (X_i, …, X_{i+p}) is Gaussian N(α, Σ) with Σ given by (for p ≥ 0)

    \Sigma = \Sigma(\beta,\gamma) = \frac{\gamma^{-1}}{1-\beta^{2}}\begin{bmatrix} 1 & \beta & \beta^{2} & \cdots & \beta^{p} \\ \beta & 1 & \beta & \cdots & \beta^{p-1} \\ \beta^{2} & \beta & 1 & \cdots & \beta^{p-2} \\ \vdots & & & \ddots & \vdots \\ \beta^{p} & \beta^{p-1} & \cdots & \beta & 1 \end{bmatrix}.  (4.1)

    In order to be able to conduct an analysis where information is obtained only from the observable Y(), we take (4.1) as a motivation for the following:

    Assumption 4.1 (p-step Markovian stationary assumption). We say that we work under the stationary assumption whenever we assume that the law of the observed process Y(·) can be directly inferred from the stationary p-step law (4.1).

    We stress that it is known that this type of clipped Gaussian process is not p-step Markov for any p [43]. Assumption 4.1 rather serves as an approximation where we model Y(·) as if it were p-step Markov with law deduced from (4.1). The pseudo-likelihood for the filtered variable Y then becomes

    L_N(u) = \prod_{i=0}^{N-p}\varphi_u(Y_i,\ldots,Y_{i+p}),  (4.2)

    where φ_u(e) denotes the probability for a Gaussian stationary filtered process with parameters u to equal e ∈ {0,1}^{p+1}. We also define the pseudo-potential

    \Phi_N^{p}(u) \triangleq -\log L_N(u) = -\sum_{i=0}^{N-p}\log\varphi_u(Y_i,\ldots,Y_{i+p}).  (4.3)
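    For p = 1 the probabilities φ_u(e) in (4.2) reduce to rectangle probabilities of a bivariate Gaussian with covariance (4.1), which standard numerical libraries evaluate directly. The sketch below (with hypothetical parameter values and Y_i = 1_{X_i ≥ c}) computes the pseudo-likelihood in this way; it is an illustration of the construction, not the authors' implementation.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def pattern_probs(alpha, beta, gamma, c):
    """Stationary probabilities of (Y_i, Y_{i+1}) with Y = 1_{X >= c}, p = 1, cf. (4.1)."""
    var = 1.0 / (gamma * (1.0 - beta**2))            # stationary variance
    cov = np.array([[var, beta * var], [beta * var, var]])
    p00 = multivariate_normal.cdf([c, c], mean=[alpha, alpha], cov=cov)
    p0 = norm.cdf(c, loc=alpha, scale=np.sqrt(var))  # P(X_i < c)
    return {(0, 0): p00, (0, 1): p0 - p00, (1, 0): p0 - p00, (1, 1): 1 - 2 * p0 + p00}

def log_pseudo_likelihood(Y, alpha, beta, gamma, c):
    """log L_N(u) of (4.2) under Assumption 4.1, for p = 1."""
    probs = pattern_probs(alpha, beta, gamma, c)
    return sum(np.log(probs[(Y[i], Y[i + 1])]) for i in range(len(Y) - 1))

# Hypothetical data and parameters, for illustration only.
rng = np.random.default_rng(5)
alpha0, beta0, gamma0, c, N = 0.0, 0.8, 1.0, 0.5, 2000
X = np.empty(N + 1); X[0] = alpha0
for i in range(N):
    X[i + 1] = beta0 * X[i] + alpha0 * (1 - beta0) + gamma0**-0.5 * rng.standard_normal()
Y = (X >= c).astype(int)
print("log pseudo-likelihood at the truth:",
      round(log_pseudo_likelihood(Y, alpha0, beta0, gamma0, c), 1))
```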

    We show below in §4.1 that the stationary assumption allows for converging posterior estimates for the parameter β, but leaves any prior density unchanged over a certain curve in (α,γ). In §4.2 we briefly consider the filter cutoff value c (the sensitivity) to be uncertain and straightforwardly show that any prior density on c is unaffected by data. We next replace the sharp deterministic filter by a stochastic filter implementing a sigmoidal response function and sharpen our results in this more general setting in §4.3. Finally, in §4.4 we consider slightly more informative measurements consisting of a finite discrete response and we show that this resolves the singularity issues associated with purely binary measurements.

    We first show that under the p-fold stationary assumption one can estimate the correlation term β rather well, but increasing the gap between the filter threshold c and the mean α has an effect on the likelihood which is indistinguishable in law from increasing the noise γ.

    We start with a technical lemma.

    Lemma 4.1 (Equality in law). For u in any admissible set of parameters, let p_u be the law of (Y_0, …, Y_p) under Assumption 4.1 corresponding to N(α_u, Σ_u). Then p_u = p_w if and only if β_u = β_w and √γ_u (c − α_u) = √γ_w (c − α_w).

    Proof. (⇒) Suppose p_u = p_w. Consider (X_0^u, …, X_p^u) a Gaussian vector with law N(α_u, Σ_u). As P(X_0^u ≤ c) = Σ_{e ∈ {0,1}^p} φ(0, e), we deduce that P(X_0^u ≤ c) = P(X_0^w ≤ c). This can be written as an equality between standard cumulative distribution functions of Gaussians, since √(γ_u(1 − β_u²)) (X_0^u − α_u) and √(γ_w(1 − β_w²)) (X_0^w − α_w) are standard Gaussians. This translates into:

    \sqrt{\gamma_u(1-\beta_u^{2})}\,(c-\alpha_u) = \sqrt{\gamma_w(1-\beta_w^{2})}\,(c-\alpha_w).  (4.4)

    We also have P(X_0^u, X_1^u ≤ c) = P(X_0^w, X_1^w ≤ c), and this implies, with A = √(γ_u(1 − β_u²)) (c − α_u),

    \frac{1}{\sqrt{1-\beta_u^{2}}}\int_{(-\infty,A]^{2}}\exp\left(-\frac{x^{2}+y^{2}-2\beta_u xy}{2(1-\beta_u^{2})}\right)dx\,dy = \frac{1}{\sqrt{1-\beta_w^{2}}}\int_{(-\infty,A]^{2}}\exp\left(-\frac{x^{2}+y^{2}-2\beta_w xy}{2(1-\beta_w^{2})}\right)dx\,dy.  (4.5)

    This last expression shows that the map β_u ↦ P(X_0^u, X_1^u ≤ c) is locally analytic. As a corollary of Slepian's lemma [44], this map is also increasing, and hence strictly increasing. It follows that the last equality implies β_u = β_w.

    (⇐) The law p_u is uniquely determined by its 2^{p+1} values. All these values are of the type

    I(Q;u) = \int_{Q(A)}(2\pi)^{-(p+1)/2}\left(\det M(\beta_u)\right)^{1/2}\exp\left(-\tfrac{1}{2}\,x\,M(\beta_u)\,x^{T}\right)dx,  (4.6)

    where Q(A) is a product of p+1 intervals, each being either (−∞, A] or [A, +∞), and M(β_u) denotes the precision matrix of the standardized stationary vector, i.e., the inverse of the correlation matrix with entries β_u^{|i−j|}. And so, if β_u = β_w and √γ_u (c − α_u) = √γ_w (c − α_w), the laws p_u and p_w are indeed equal.

    Theorem 4.2 (Non-identifiability). Assume p ≥ 1. For any integrable prior π, non-zero on E, and for f bounded, as N → ∞,

    \int f\,L_N\,d\pi \longrightarrow \int_E f\,d\tilde{\pi},  (4.7)

    where E ≜ {u; √γ(c − α) = √γ_0(c − α_0), β = β_0} and π̃ is the prior restricted to the set E.

    Proof. At first, thanks to Lemma 4.1, the law φ(Y_i, …, Y_{i+p}) is not characterized by u_0 = [α_0, β_0, γ_0], but rather by [√γ_0 (c − α_0), β_0]. As the Y_i's are binary, the quantity φ(Y_i, …, Y_{i+p}) can only take 2^{p+1} values, which we denote by φ_e for e ∈ {0,1}^{p+1}. This means that the log-likelihood under Assumption 4.1 can be written as a finite sum:

    \log L_N = \sum_{e\in\{0,1\}^{p+1}} N(e)\log\varphi_e,  (4.8)

    where

    N(e) = \#\left\{i\in\{0,\ldots,N-p\};\ (Y_i,\ldots,Y_{i+p})=e\right\}.  (4.9)

    Now, as the discrete OU-process is an AR(1)-sequence [34], we may infer the convergence of N(e)/N towards a stationary value:

    \frac{\log L_N}{N} \longrightarrow \sum_{e\in\{0,1\}^{p+1}}\varphi_e^{u_0}\log\varphi_e^{u},  (4.10)

    where φ_e^u denotes the probability of the pattern e under a stationary filtered process with parameter u. We note that

    \lim_{N\to\infty}\frac{\log L_N}{N} = \sum_{e\in\{0,1\}^{p+1}}\varphi_e^{u_0}\log\frac{\varphi_e^{u}}{\varphi_e^{u_0}} + \sum_{e\in\{0,1\}^{p+1}}\varphi_e^{u_0}\log\varphi_e^{u_0}.  (4.11)

    Up to a constant this limit is the negative of the Kullback-Leibler divergence between the laws φ^u and φ^{u_0}, which vanishes if and only if the two distributions are equal. We already know that the laws φ^u and φ^{u_0} are equal if and only if β = β_0 and √γ(c − α) = √γ_0(c − α_0), i.e., if u ∈ E. Consider the associated relation:

    u=(\alpha,\beta,\gamma) \sim u'=(\alpha',\beta',\gamma') \ \text{ iff }\ \beta=\beta' \text{ and } \sqrt{\gamma}\,(c-\alpha)=\sqrt{\gamma'}\,(c-\alpha').  (4.12)

    It is straightforward to show that this relation is reflexive, symmetric, and transitive, thus forming an equivalence relation. Thanks to Lemma 4.1, we know that the map u ↦ L_N(u) factorizes through the equivalence relation to a map w ↦ L̃_N(w) where w ranges over the different equivalence classes of u.

    Lemma 4.1 also allows us to state that the map L̃_N is injective. Consider w_0, the equivalence class of u_0. Then for any neighborhood W of w_0, there is a δ > 0 such that for N large enough, w ∉ W implies

    \frac{\log\tilde{L}_N(w_0)}{N} \ge \frac{\log\tilde{L}_N(w)}{N} + \delta,  (4.13)

    which means that the posterior mass of W converges towards 1. Since the factorized (normalized) map L̃_N ⇀ δ_{w_0}, the theorem follows.

    The same argument implies that, if we consider the case p=0, the posterior will be even more degenerate. For p=0 one gets the convergence

    \int f\,L_N\,d\pi \longrightarrow \int_E f\,d\tilde{\pi},  (4.14)

    where now ˜π is the prior restricted to E:

    E \triangleq \left\{u;\ \sqrt{\gamma_u(1-\beta_u^{2})}\,(c-\alpha_u) = \sqrt{\gamma_0(1-\beta_0^{2})}\,(c-\alpha_0)\right\}.  (4.15)

    This case is relevant whenever we consider large gaps of time in between the measurements such that they can be considered practically independent.

    At this point, let us recall that the quantities L_N(u) and Φ_N^p(u) are defined in Eqs (4.2) and (4.3).

    Theorem 4.3 (Rate of convergence of pseudo-potential). Under the p-Markovian stationary Assumption 4.1 there exists a mapping f_p and a constant σ_p > 0 such that, as N → ∞, we have the convergence in law:

    \sqrt{N}\left(\frac{\Phi_N^{p}(u)}{N}-f_p(u)\right) \longrightarrow \mathcal{N}(0,\sigma_p).  (4.16)

    Proof. This is a direct consequence of the Central Limit Theorem, with f_p(u) being the limiting mean value of Φ_N^p(u)/N and σ_p a nonnegative constant.

    A relevant variation of the theme is to consider the parameter c (the "test sensitivity") an uncertain parameter. Mathematically, this means considering the likelihood LN a function of (α,β,γ,c), and priors and posteriors depending also on c. However, the following result shows that this setup will not produce any more information about the parameters.

    Theorem 4.4 (Translation of filter uncertainty). Consider a prior of the form μ(α)π(c−α, β, γ) and the pseudo-likelihood in §4.1. Then the resulting posterior measure will be of the form μ(α)Π(c−α, β, γ), where, for fixed c, Π(c, ·, ·) is the pseudo-posterior from the prior π(c, ·, ·).

    Proof. This result is immediate once one realizes that the pseudo-likelihood has the property that, for any t ∈ R,

    L_N(\alpha,\beta,\gamma,c) = L_N(\alpha+t,\beta,\gamma,c+t).  (4.17)

    In other words, the only information we can infer on the parameters (c, α) is the gap c − α. Any uncertainty of c can be understood as an uncertainty on α, and vice versa.

    Rather than a sharp cut-off value c, most environmental sampling methods obey some kind of sensitivity response, e.g., of sigmoid character, with a quick rise in the detection probability as one progresses through some threshold region. Examples here could include sampling and subsequent analysis of sewage water or animal droppings, but this would also be a relevant model in the case of statistical regression estimates using data obtained via self-reporting smartphone applications.

    A general ansatz to capturing this situation is to consider

    Y_i \sim \mathcal{B}(s(X_i)),  (4.18)

    where s is a map from R to [0,1] and B denotes the Bernoulli law. Typically, the map s is sigmoidal with a gradient around the threshold c which depends on the sensitivity and specificity of the test. We naturally ask that the efficiency of the filter does not depend on its previous use, i.e., that (Yi|Xi)i is an independent family. To construct such an object, one may consider an i.i.d. family of uniformly distributed variables (ξi) on (0,1), independent from (Xi), and set

    Y_i = 1_{\xi_i \le s(X_i)}.  (4.19)

    This object clearly fulfills all properties mentioned and is general enough to capture also quite specific situations.
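    Generating data through the stochastic filter (4.18)–(4.19) is straightforward; the sketch below uses a logistic response s centered at a threshold c (the particular choice of s, its slope and all parameter values are hypothetical) and contrasts the result with the sharp cut-off of §4.1.

```python
import numpy as np

rng = np.random.default_rng(6)

def sigmoid_filter(X, c, slope):
    """Binary detections Y_i = 1_{xi_i <= s(X_i)} as in (4.19), with a logistic s."""
    s = 1.0 / (1.0 + np.exp(-slope * (X - c)))   # detection probability around threshold c
    xi = rng.uniform(size=X.shape)               # independent uniforms, independent of X
    return (xi <= s).astype(int)

# Latent AR(1)/OU samples (3.2) with hypothetical parameters.
alpha0, beta0, gamma0, N = 0.0, 0.9, 1.0, 1000
X = np.empty(N + 1); X[0] = alpha0
for i in range(N):
    X[i + 1] = beta0 * X[i] + alpha0 * (1 - beta0) + gamma0**-0.5 * rng.standard_normal()

Y_soft = sigmoid_filter(X, c=0.5, slope=5.0)     # sigmoidal response
Y_hard = (X >= 0.5).astype(int)                  # sharp cut-off, for comparison
print("detection rates (soft, hard):", Y_soft.mean().round(3), Y_hard.mean().round(3))
```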

    We will now establish our result in this more general setting. Consider a bounded and continuous map f: R^{p+1} → R. We observe the OU-process X(t) through the map f, so the observations are Y_i = f(X_i, …, X_{i+p}) where as before X_i = X(t_i). To define a pseudo-likelihood, we reason as if the observed data can be regarded as stationary. We thus denote by f_u the law of a stationary OU-process with parameters u filtered via the measurement map f, and we remind ourselves that the latent process X(t) is not necessarily stationary.

    We consider a likelihood of the kind

    L_N \propto \exp\left(\sum_{i=0}^{N-p}\log f_u(Y_i)\right),  (4.20)

    normalized to mass one, and we denote the potential by

    g_N(u) := N^{-1}\sum_{i=0}^{N-p}\log f_u(Y_i).  (4.21)

    This setup simply corresponds to observing the data portion (X_i, …, X_{i+p}) via the filter f and, assuming stationarity, building a pseudo-likelihood. Unfortunately this does not directly include the intended case of a sigmoid filter, as this would rather involve measurements h(ξ_i, X_i, …, ξ_{i+p}, X_{i+p}) with (ξ_i) an i.i.d. sequence independent from (X_i) and uniform on (0,1). However, once the proposed case is examined, one can get to the latter case by exchanging the order of integration using the Fubini property, i.e., by studying the behavior of h(t_0, X_i, …, t_p, X_{i+p}) and then integrating over (t_0, …, t_p). For example, consider the case p = 1. Then the intended map h would be

    h(t_0,x_0,t_1,x_1) = \left(1_{t_0\le s(x_0)},\,1_{t_1\le s(x_1)}\right).  (4.22)

    For a fixed u, fu can take at most 4 values, e.g.,

    f_u(1,1) = \mathbb{E}\left[s(Z_0)s(Z_1)\right],  (4.23)

    where (Z0,Z1) is a stationary OU-process of parameter u, and similarly for fu(0,1), fu(1,0), and fu(0,0). As this technique would render the proof lengthy, we decided to put it aside.

    In order to state our result, we need to specify some minimal set of regularity conditions. Convergence of the potential is required as well as definiteness in the Kullback-Leibler divergence (DKL). Additionally, we shall also require a separation condition in the large data limit.

    Assumption 4.2 (Regularity). We assume the following specifics:

    1) The potential g_N converges uniformly on compact sets of parameters u.

    2) D_KL(f_{u_0}, f_u) = 0 implies u = u_0.

    3) There is a δ > 0 and a compact neighborhood K of u_0 such that, for N large enough and for u ∉ K, g_N(u_0) > g_N(u) + δ.

    Theorem 4.5 (Weak convergence). Under Assumption 4.2, we have the weak convergence

    L_N \rightharpoonup \delta_{u_0}.  (4.24)

    This theorem can be adapted to cover other situations:

    ● If D_KL(f_{u_0}, f_u) = 0 does not imply u = u_0, one can try to factorize the map through an equivalence relation (as in the proof of Theorem 4.2) to get the convergence towards the prior restricted to the set {u; D_KL(f_{u_0}, f_u) = 0}.

    ● If we only want to consider parameters within a subset U of the set of parameters, one can always consider the assumptions restricted to the set U, and then one would have the convergence only for priors with support in U.

    Proof. We divide the proof in two steps as follows.

    Step 1. Let us call g the limit of (g_N), which by assumption is not random. The same argument as in the proof of Theorem 3.2, the ergodicity of the process (X_t), allows us to state that the limit function must be u ↦ E[log f_u] (up to a constant, the limit is actually −D_KL(f_{u_0}, f_u)). As adding a constant is equivalent to multiplying all posteriors by a positive constant, we may assume the limit function to be equal to u ↦ −D_KL(f_{u_0}, f_u).

    Step 2. As L_N is proportional to exp(N g_N), the assumptions allow us to deduce that the mass of any neighborhood of u_0 converges towards 1. This implies the weak convergence of the posteriors towards the Dirac measure δ_{u_0}.

    As a final variation on the theme we now show how Theorem 4.5 can be applied in such a way as to overcome the issues with the non-identifiability of Theorem 4.2. The idea is that the test with one filter c evidently at best gives us a curve containing the parameter u_0, namely the one satisfying β = β_0 and √γ(c − α) = √γ_0(c − α_0). Suppose instead that we have two kinds of tests, one with cut-off c_1 and one with cut-off c_2. Since the intersection of the two curves implied by the respective filters c_1 and c_2 is exactly the point u_0, it is natural to assume that two filters are enough to get the convergence of the posterior towards δ_{u_0}. We show that this is indeed the case.

    Theorem 4.6 (Trinary filter). Let c1<c2 and consider the filtered values Yi to be

    Y_i = 1_{X_i>c_1} + 1_{X_i>c_2},  (4.25)

    and the associated pseudo-likelihood

    L_N = \prod_{i=0}^{N-p}\varphi(Y_i,\ldots,Y_{i+p}),  (4.26)

    where φ is obtained from the multivariate cumulative distribution function of a Gaussian N(α, Σ) with Σ as in (4.1). Then the sequence of pseudo-likelihoods converges weakly:

    L_N \rightharpoonup \delta_{u_0}.  (4.27)

    Proof. It is sufficient to check that the sequence (Y_i) verifies Assumption 4.2. Considering g_N := N^{-1} log L_N, one has

    g_N = \sum_{e\in\{0,1,2\}^{p+1}}\frac{N_e}{N}\log p_u(e),  (4.28)

    where N_e is the number of times (Y_i, …, Y_{i+p}) matches e, and p_u(e) is the probability of the pattern e obtained from the cumulative distribution function of the underlying Gaussian process N(α, Σ). From this we get directly conditions 1) and 3) of Assumption 4.2. For condition 2), if the Kullback-Leibler divergence is zero, then we must have p_u(e) = p_{u_0}(e) for all e. This implies the equalities

    \left.\begin{aligned} \beta &= \beta_0 \\ \sqrt{\gamma}\,(c_1-\alpha) &= \sqrt{\gamma_0}\,(c_1-\alpha_0) \\ \sqrt{\gamma}\,(c_2-\alpha) &= \sqrt{\gamma_0}\,(c_2-\alpha_0) \end{aligned}\right\}  (4.29)

    As c_1 ≠ c_2 we have u = u_0, and so the assumptions are fulfilled.
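    The identifiability mechanism behind (4.29) is elementary: given β = β_0, subtracting the two relations determines √γ from the gap c_2 − c_1, after which α follows, whereas a single cut-off only fixes the product √γ(c_1 − α). A minimal numerical illustration with hypothetical values:

```python
import numpy as np

# True parameters (hypothetical) and the two known cut-offs.
alpha0, gamma0 = 2.0, 4.0
c1, c2 = 2.5, 3.5

# The quantities identified in the limit by the trinary filter, cf. (4.29).
a1 = np.sqrt(gamma0) * (c1 - alpha0)
a2 = np.sqrt(gamma0) * (c2 - alpha0)

# Two cut-offs pin down (alpha, gamma): subtract the two relations ...
gamma_rec = ((a2 - a1) / (c2 - c1))**2
alpha_rec = c1 - a1 / np.sqrt(gamma_rec)
print("recovered (alpha, gamma):", alpha_rec, gamma_rec)   # equals (alpha0, gamma0)

# With a single cut-off only sqrt(gamma)*(c1 - alpha) is fixed: e.g. the pair
# (alpha, gamma) = (1.5, 1.0) reproduces the same a1 and is indistinguishable.
print("same a1 from (1.5, 1.0):", np.isclose(np.sqrt(1.0) * (c1 - 1.5), a1))
```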

    We devote this section to some illustrations of selected results from §§3 and 4. We shall do this in the intended epidemiological setting and thus no longer assume the OU-process, but rather the SIS- and SISE-models from §2 in the form of continuous-time Markov chains over a discrete state-space. In §5.1 we investigate the precision of the predicted posterior uncertainty under full state measurements, while in §5.2 we offer a demonstration of the singular behavior under filtered measurements. Finally, in §5.3 we highlight the use of synthetic data when approaching more realistic problems defined over a network.

    The software for the numerical experiments is available for download via the corresponding author's web-page*.

    *Refer to the BISDE-code at https://user.it.uu.se/~stefane/freeware.html

    We first consider the Bayesian uncertainty under accurate measurements and take the continuous-time Markov chain version of the SIS-model (2.4) as an example. Using the SIS OU approximate interpretation (2.9) we have from (3.1) the relations

    \left.\begin{aligned} \alpha_{OU} &= \Sigma\left(1-R_0^{-1}\right) \\ \beta_{OU} &= \exp\left(-\gamma h(R_0-1)\right) \\ \gamma_{OU} &= \Sigma^{-1}R_0\left[1-\beta_{OU}^{2}\right]^{-1} \end{aligned}\right\} \qquad \left.\begin{aligned} R_0 &= 1+\alpha_{OU}\left(1-\beta_{OU}^{2}\right)\gamma_{OU} \\ \Sigma &= \alpha_{OU}+\left[\left(1-\beta_{OU}^{2}\right)\gamma_{OU}\right]^{-1} \\ \gamma &= -\log\beta_{OU}\left[h(R_0-1)\right]^{-1} \end{aligned}\right\}  (5.1)

    where {R0,Σ,γ} are the SIS-model parameters.

    We generate synthetic data from the Markov chain as illustrated in Figure 1 for Σ = 1000 and with ranges of values γ ∈ [0.1, 1] and R0 ∈ (1, 3.5]. We let I(0) = 0.01 × Σ and sample exact values of I(t_i) for t_i = ih, i = 1, …, N, with N = 100 and h ≡ 1. This corresponds to 100 perfect samples in a closed population at a rate equivalent to between one and one tenth of the disease period unit (= 1/γ).

    We evaluate the posterior over a grid of values in the (R0,γ)-plane by simply normalizing the likelihood for the Markov chain given the synthetic data discussed previously. The likelihood of the Markov chain is formally obtained by solving the associated master equation which, however, is inconvenient to do except for small populations Σ. A more general approach is via a local linear Gaussian approximation and a Kalman filter. Put I0=I(0) and define

    I_{k+1} = \left(1+\Delta t\,\beta\Sigma^{-1}(\Sigma-I_k)-\Delta t\,\gamma\right)I_k + w_k, \qquad w_k \sim \mathcal{N}\left(0,\left[\Delta t\,\beta\Sigma^{-1}(\Sigma-I_k)+\Delta t\,\gamma\right]I_k\right),  (5.2)

    that is, this is the forward Euler discretization of the Langevin equations approximating the Markov chain. The Kalman filter associated with (5.2) computes a likelihood for each data point, albeit for a perturbed model. The relative error in the Langevin approximation generally scales with the inverse of the population size Σ, and can be expected to be rather small in the present context (see [45,Ch. 11.3]). Further, Proposition 2.1 suggests analyzing the Euler discretization via backward analysis as a parameter perturbation, but unfortunately this is not generalizable to non-additive noise [46]. For a resolved discretization, however, γΔt1, weak first order convergence can be expected under broad conditions. We use the constant Kalman resolution Δt=h/4 and next focus on the estimation error.

    We have already evaluated the asymptotic covariance matrix under accurate data in (3.7). Using the linear uncertainty transformation QJQJT, where J is the Jacobian of the parameter map (5.1), and where Q is the (diagonal) covariance matrix in (3.7), we can estimate the posterior variance

    \mathrm{Var}(R_0) \approx \frac{R_0^{3}}{\Sigma N}\times\frac{1+\beta_{OU}}{1-\beta_{OU}}.  (5.3)

    A similar formula can be worked out for the variance of γ as well, although it is a bit more involved. For small enough h, the denominator 1 − β_OU ≈ γh(R0 − 1) in (5.3), and so the relative uncertainty in any consistent estimator of R0 can be expected to depend weakly on R0 itself. This effect is seen for R0 > 1 in Figure 2 (top), where it can also be seen that the approximation (5.3) derived from the OU-approximation is somewhat optimistic. Similarly, we find that the relative uncertainty of R0 goes down with increasing values of γ, or, which by (5.1) is the same thing, with decreasing correlation β_OU (cf. Figure 2, bottom).

    Figure 2.  Marginal posterior uncertainty (± 2 SD) for the SIS-model and a range of parameters. Top: with γ ≡ 1 fixed, bottom: with R0 ≡ 1.5 fixed. Crosses: MMSE-estimators R̂0 (i.e., posterior means), dotted: estimated uncertainty according to (5.3), red: posterior width (± 2 SD).

    The SIS-model investigated previously was dependent on two parameters only and hence the singularity detected in §4 is not likely to be limiting any convergence. While various model modifications naturally lead to additional independent parameters, e.g., an extra transition S → I modeling external infectious events, the most immediate modification is to simply consider the population size Σ uncertain. For instance, this is a possible setup for inference relying on sewage water analysis, where the data is binary according to whether the infectious substance is above or below some known threshold value c, but the sewage uptake area is populated with an unknown number of individuals.

    We remark that this is a considerably challenging task, and although we are able to demonstrate the sharpness of our negative results from §4 in this setting, the fact that this problem can at all be approached is quite remarkable.

    As ground truth we use the same parameters as in the previous section, but with γ = 1/10 (time^{-1}) and R0 = 1.5 fixed, and we need to sample more, N = 1000 points equispaced with h = 1. The data is then taken to be the filtered sequence Y_i = 1_{X_i ≥ c} with c = 0.9 × I_∞ = 0.9 × Σ(1 − R0^{-1}).

    The measurement map is strongly nonlinear and so the Kalman filter needs to be extended in some way. We took an immediate approach by simply discretizing the state variable I into M=100 Gaussian particles, distributed according to the percentiles of the stationary measure for a given proposed set of parameters. Each particle is evolved in time [0,h] using steps of size Δt according to the Kalman filter (5.2), after which the prior distribution is formed by aggregating the probability mass in the vicinity of each particle. This yields the likelihood for a single data point after which the posterior distribution for the state is obtained by setting selected particles' mass to zero (according to the data point) and rescaling appropriately.

    From Theorem 4.2 we have the singular curve √γ_OU (c − α_OU) = constant, which gets transferred via the map (5.1) into a surface in (R0, Σ, γ)-space. After arbitrarily fixing γ we thus obtain a curve in the (R0, Σ)-plane. Since the SIS-to-OU map (2.9) is an approximation and, moreover, the likelihood is approximated via a Kalman filter, this analytical curve can be expected to be a perturbation of the observed numerical posterior level curves. As shown in Figure 3 the match is quite remarkable.

    Figure 3.  Left: (log-)posterior for the SIS-model under filtered data and conditioned on the true value γ = 1/10 (time^{-1}). Also indicated is the singular curve as predicted by theory passing through the true parameter generating the data (circle). Right: marginal distribution for R0 together with a normal fit (dashed).

    Epidemic models on networks can give rise to phenomena not observed in single-node systems, e.g., the rescue effect [1], where the infection is "rescued" from extinction through the network structure. Here we consider both the SIS- and the SISE-model, respectively, cf. (2.2)–(2.3). We assume these models at each node in a network implicitly defined by pre-recorded movements of individuals between the nodes. As a concrete example this would be an appropriate model for estimating disease parameters in a monitoring program for bovine animals using cheap, but low-informative, tests collected on a weekly basis for a subsample of sentinel nodes.

    The nodal model is replicated across 1600 nodes, populated with 196,168 individuals, and the nodes are connected using 466,692 prescribed movements of individuals over four years, see Figure 4. The system is not well stirred on the aggregate level, but events occur frequently; the average number of events per sampled node and day is 0.20 [0.19, 0.21] (50% credible interval, CrI). This particular network was constructed by anonymizing a set of recorded cattle movements and can be accessed through the publicly available R-package SimInf [19,47]. We extract model measurements from the same 100 randomly pre-selected sentinel nodes every 7th day for a total of 4 years. Each measurement is the outcome of a binary test: whether or not the prevalence (P = Σ^{-1}I) in the node is above a threshold value (c = 30% or 4% for the SIS- or the SISE-model, respectively); all nodes are seeded at 10% or 2% prevalence initially. In Figure 5, we illustrate the sampling output in time for the SISE-model.

    Figure 4.  Illustration of the transport network; the red points are the sentinel nodes, and the grey points are the latent ones. Red/black lines are transport events into a sentinel- or latent node, respectively.
    Figure 5.  The population-weighted average prevalence (red) is unobservable, but the pseudo prevalence (blue) is obtained from weighting together multiple binary measurements. The least squares OU-fit for the pseudo prevalence is used as summary statistics (a few samples in grey are shown).

    A challenging aspect of many data-driven inference problems, e.g., those including network dynamics, is that the likelihood function is intractable and must be estimated through repeated simulations. Bayesian inference in this setting is termed Likelihood-free inference (LFI), or Approximate Bayesian Computations (ABC); see [48,49] for reviews. In this example, we consider the Sequential Monte Carlo (SMC) adaptation of ABC (SMC-ABC) implemented in SimInf and described in [50].

    Our SMC-ABC implementation determines proposal rejections per generation n using the normalized Euclidean kernel, accepting a proposal whenever Σ_i ((x_i − y_i)/x_i)² < ε_n, for statistics of the simulation proposal y and observation x and with a series of decreasing tolerances ε_n. The statistics are computed as follows. Each simulation generates a time series of pseudo prevalences, i.e., a population-weighted sum of positive samples. We interpret the time series as an OU-process and select the least squares estimates (3.3)–(3.4) as indirect summary statistics [51]. A word in favor of this particular choice of statistics for other ABC implementations, e.g., synthetic likelihoods [52], is that least squares estimates are asymptotically normally distributed under broad assumptions [53]. Notably, we get away with using statistics whose dimension equals, or exceeds by one, that of the parameter set, suggesting that this characterization is indeed very fitting.
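    The following sketch illustrates, outside of SimInf, the two ingredients just described: the least-squares OU fit (3.3)–(3.4) as indirect summary statistics and the normalized Euclidean acceptance rule with a decreasing tolerance schedule. The AR(1) series stand in for simulated pseudo-prevalence curves and all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)

def ar1_series(alpha, beta, gamma, N):
    """Stand-in for a simulated pseudo-prevalence time series, AR(1)/OU form (3.2)."""
    x = np.empty(N + 1); x[0] = alpha
    for i in range(N):
        x[i + 1] = beta * x[i] + alpha * (1 - beta) + gamma**-0.5 * rng.standard_normal()
    return x

def ou_summary(ts):
    """Indirect summary statistics: the least-squares OU fit (3.3)-(3.4)."""
    A = np.column_stack([np.ones(len(ts) - 1), ts[:-1]])
    b = ts[1:]
    (delta, beta), *_ = np.linalg.lstsq(A, b, rcond=None)
    gamma = (len(b) - 2) / np.sum((A @ [delta, beta] - b)**2)
    return np.array([delta / (1 - beta), beta, gamma])

def abc_accept(x_stat, y_stat, eps):
    """Normalized Euclidean acceptance rule used per SMC generation."""
    return np.sum(((x_stat - y_stat) / x_stat)**2) < eps

# Hypothetical 'observed' and 'proposed' series; roughly 4 years of weekly samples.
obs = ar1_series(alpha=0.3, beta=0.9, gamma=400.0, N=208)
sim = ar1_series(alpha=0.32, beta=0.88, gamma=380.0, N=208)
eps_n = 100 * np.exp(-0.25 * np.arange(15))                 # decreasing tolerances
print("accepted at the final tolerance:",
      abc_accept(ou_summary(obs), ou_summary(sim), eps_n[-1]))
```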

    For the inference we use a single initial simulation with the true parameters (β, γ, R0)_SIS = (0.16, 0.1, 1.6) and (β, γ, ρ, R0)_SISE = (0.054, 0.1, 0.44, 1.108), respectively (the R0-values being implied by the remaining parameters), and we infer all parameters simultaneously. For priors, since we have no likelihood and thus cannot easily produce a strictly non-informative prior, we take uniform distributions over quite large intervals in parameter space: β, γ ∼ U(0,1) in both cases and ρ ∼ U(0.4, 0.5). We use the decreasing ABC tolerances ε_n = 100 exp(−0.25(n−1)), for n = 1, …, 15, and 1000 SMC particles.

    We found that ρ requires a tighter prior than the others for the computations to complete in a reasonable time: the SISE-model is considerably more challenging but is also more realistic, particularly so in our setting on a network where rather large prevalences are required for the SIS-model to not simply die out.

    The results for the R0-marginals are displayed in Figure 6, where we also investigate the concentration effect of data through the relative change in quartile coefficient of dispersion (QCD); a small concentration factor indicates an accurately identifiable parameter. Although the SISE-model is clearly more challenging, R0 is still well reconstructed for both models.

    Figure 6.  Top: posterior and prior distributions for R0 with the true values indicated. Bottom: recorded QCD concentration factors for all the parameters (see text for details).

    Binary data implies identifiability when given multiple observations at the same time, e.g., over a small collection of nodes in a network rather than on a single node, as indicated in a qualitative sense by Theorem 4.6. Additionally, R0 is identifiable with quite high accuracy even when the dependent parameters covary. The example is prototypical of using synthetic data to evaluate the feasibility of an intended setup. Since the posteriors are robust with respect to capturing the synthetic truth, we have good reasons to also have some faith in the design when approaching real data.

    Throughout this work we have employed the Ornstein-Uhlenbeck process as a meta-model of more involved epidemiological models. We indicated in §3 a convergence analysis of the Bayesian posterior under direct process observations. Since this is an unrealistic setup in most epidemiological applications, one can think of these results as best possible.

    We next took the opposite standpoint and considered data to be severely filtered such that, literally, each data point contributed only a single bit of information. For instance, this could be a model of pooled data obtained through environmental sampling and subsequent analysis. To obtain a closed framework we added a fairly general stationary p-step Markov assumption and worked out conditions on the data to obtain a non-singular inverse problem.

    With increasing compute power and improving possibilities for gathering data, fully Bayesian first-principle epidemiological models can be realized. As a minimum standard, we propose, such methods should be preceded by a proof of self-consistency: data generated from the model itself and chosen "nearby" the actual data should allow for accurate parameter identifiability. Under this basic standard, Bayesian epidemiological modeling with both short-term prediction and generation of forecasting scenarios can be included as an integrated part of the public health authorities' methodological arsenal.

    This work was financially supported by the ENS Paris (S. Bronstein) and in part by the Swedish Research Council Formas (S. Engblom) and the Swedish Innovation Agency Vinnova (S. Engblom, R. Marin). Several helpful comments and kind suggestions on earlier versions of the manuscript were offered by friends and colleagues. These helped to improve the presentation considerably.

    The authors declare there is no conflict of interest.



    [1] M. J. Keeling, P. Rohani, Modeling Infectious Diseases in Humans and Animals, Princeton University Press, 2011. https://doi.org/10.1086/591197
    [2] T. McKinley, A. R. Cook, R. Deardon, Inference in epidemic models without likelihoods, Int. J. Biostat., 5 (2009). https://doi.org/10.2202/1557-4679.1171 doi: 10.2202/1557-4679.1171
    [3] H. Andersson, T. Britton, Stochastic Epidemic Models and Their Statistical Analysis, Springer Science & Business Media, 151 (2012). https://doi.org/10.1007/978-1-4612-1158-7
    [4] S. Eubank, H. Guclu, V. A. Kumar, M. V. Marathe, A. Srinivasan, Z. Toroczkai, et al., Modelling disease outbreaks in realistic urban social networks, Nature, 429 (2004), 180–184. https://doi.org/10.1038/nature02541 doi: 10.1038/nature02541
    [5] N. M. Ferguson, D. A. Cummings, S. Cauchemez, C. Fraser, S. Riley, A. Meeyai, et al., Strategies for containing an emerging influenza pandemic in southeast Asia, Nature, 437 (2005), 209–214. https://doi.org/10.1038/nature04017 doi: 10.1038/nature04017
    [6] D. Balcan, V. Colizza, B. Gonçalves, H. Hu, J. J. Ramasco, A. Vespignani, Multiscale mobility networks and the spatial spreading of infectious diseases, Proc. Natl. Acad. Sci. USA, 106 (2009), 21484–21489. https://doi.org/10.1073/pnas.0906910106 doi: 10.1073/pnas.0906910106
    [7] S. Merler, M. Ajelli, A. Pugliese, N. M. Ferguson, Determinants of the spatiotemporal dynamics of the 2009 H1N1 pandemic in Europe: implications for real-time modelling, PLoS Comput. Biol., 7 (2011), e1002205. https://doi.org/10.1371/journal.pcbi.1002205 doi: 10.1371/journal.pcbi.1002205
© 2023 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0).