Knowing whether and how rapidly an emerging pathogen will spread through a population enables public health officials to make well-informed decisions to protect the public. Mathematical modeling gives them a means to predict pathogen spread, but modeling previously unknown pathogens, like severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is challenging.

Typically, mathematical modeling requires researchers to acquire at least one dataset with the relevant data points to develop the model and another, similar dataset to validate it. For emerging diseases like novel coronavirus disease 2019 (COVID-19), for which we did not have a readily available diagnostic kit to distinguish SARS-CoV-2–positive cases from negative cases, the validity and completeness of the data are often problematic.

**Mathematical modeling strategies**

Designing a model requires researchers to make assumptions about which parameters influence pathogen transmission the most, and thus which variables should be included in the model and inferred from data [1,2]. To make modeling easier, researchers may assume that all people are equally susceptible to the pathogen and that the population mixes uniformly, along with other simplifications such as an infinite population size (to avoid modeling births and deaths). However, many researchers favor more realistic approaches and value the addition of more parameters.
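As a minimal sketch of what these simplifying assumptions look like in practice, the Python below steps a basic susceptible-infectious-recovered (SIR) model forward in time. The SIR structure, the function name `simulate_sir`, the simple Euler time stepping, and the example parameter values are illustrative assumptions, not something taken from the cited studies. Everyone is equally susceptible, the population mixes uniformly, and there are no births or deaths, so the three compartments are tracked as fractions of a fixed population.

```python
def simulate_sir(r0, infectious_days, initial_infected=1e-4, days=200, dt=0.1):
    """Euler-stepped SIR model in population fractions (hypothetical helper).

    r0              -- basic reproductive number (assumed input)
    infectious_days -- mean duration of infectiousness in days (assumed input)
    """
    gamma = 1.0 / infectious_days   # recovery rate per day
    beta = r0 * gamma               # transmission rate per day under uniform mixing

    s, i, r = 1.0 - initial_infected, initial_infected, 0.0
    trajectory = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_infections = beta * s * i * dt   # every susceptible is equally at risk
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((step * dt, s, i, r))
    return trajectory


# Illustrative values only: R0 of 2.8 and a 5-day infectious period.
final_time, final_s, final_i, final_r = simulate_sir(2.8, 5.0)[-1]
print(f"fraction ever infected after {final_time:.0f} days: {final_r:.3f}")
```

Adding age structure or contact networks would mean splitting each compartment into groups and replacing the single `beta` with a matrix of group-to-group rates, which is where the extra parameters come from.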

Modelers may incorporate variables for population age structure and growth, different infection susceptibilities based on age (or another factor), and social networking patterns. However, the more parameters included, the more mathematically complex the model becomes. Complex models might be more realistic, but they are often not better. With larger numbers of variables, missing or inaccurate data points have more influence over model results, and more time is needed to compute outcomes. Also, some variables that seem important turn out to have little influence over model results.

Models developed for pathogens during their emergence are often inaccurate [3]. Typically, when incomplete datasets are used to develop models, multiple different models appear capable of fitting the existing data points but predict different outcomes.
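One way to see this is with two hypothetical parameter sets for a simple SIR model that share the same early exponential growth rate (transmission rate minus recovery rate) but have different basic reproductive numbers. The sketch below is an illustration under those assumptions; the function name, the Euler stepping, and all parameter values are invented for the example and are not drawn from reference [3].

```python
def infected_then_final_size(beta, gamma, i0=1e-6, days=400, dt=0.1, report_day=20):
    """Return (infected fraction on report_day, final fraction ever infected).

    Hypothetical helper: a fixed-population SIR model stepped with Euler's method.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    report_step = int(round(report_day / dt))
    infected_at_report = None
    for step in range(int(round(days / dt))):
        if step == report_step:
            infected_at_report = i       # what an early dataset would capture
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
    return infected_at_report, r


# Model A: R0 = 2 with a 5-day infectious period  -> growth rate 0.4 - 0.2 = 0.2 per day
# Model B: R0 = 4 with a 15-day infectious period -> growth rate 4/15 - 1/15 = 0.2 per day
early_a, final_a = infected_then_final_size(beta=0.4, gamma=0.2)
early_b, final_b = infected_then_final_size(beta=4.0 / 15.0, gamma=1.0 / 15.0)

print(f"day-20 infected fraction: A={early_a:.2e}, B={early_b:.2e}")
print(f"final fraction infected:  A={final_a:.3f}, B={final_b:.3f}")
```

Early in the outbreak the two curves are almost indistinguishable, so data from the first few weeks cannot tell the models apart, yet they disagree on how much of the population is eventually infected.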

**COVID-19 findings**

For the COVID-19 pandemic, mathematical modeling has been used to estimate several aspects of pathogen spread, such as the basic reproductive number (R₀, the number of secondary infections caused by one infection in a completely susceptible population) for SARS-CoV-2 (R₀ 2.8 to 4.0) [4] and the percentage of people with asymptomatic infections (~17.9%) [5]. Some modeling studies have shown that population control measures did decrease pathogen spread [6,7], and Kucharski et al. found that four independent SARS-CoV-2 introduction events into environments mimicking Wuhan, China would provide a >50% chance of virus establishment in that population [7]. As of March 29, 2020, no studies published in scientific journals have shown predictions of the extent of pathogen spread globally.
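To give a feel for how a statement about introduction events and establishment can be derived, the sketch below uses a branching process in which each case infects a negative-binomially distributed number of others. This is an assumed, simplified framing and not a reproduction of the analysis in [7]; the function names and the values of `r0` and the dispersion parameter `k` are hypothetical and chosen only for illustration.

```python
def extinction_probability(r0, k, tol=1e-12, max_iter=10_000):
    """Probability that a chain started by one case dies out on its own.

    Assumes negative binomial offspring with mean r0 and dispersion k, whose
    generating function is G(q) = (1 + r0 * (1 - q) / k) ** (-k). Iterating
    q <- G(q) from 0 converges to the smallest fixed point.
    """
    q = 0.0
    for _ in range(max_iter):
        q_next = (1.0 + r0 * (1.0 - q) / k) ** (-k)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q


def establishment_probability(r0, k, introductions):
    """Chance that at least one of several independent introductions takes hold."""
    return 1.0 - extinction_probability(r0, k) ** introductions


# Illustrative values only (not estimates from the cited studies).
for n in (1, 2, 4, 8):
    p = establishment_probability(r0=2.5, k=0.5, introductions=n)
    print(f"{n} introduction(s): establishment probability {p:.2f}")
```

Smaller values of `k` mean more of the transmission comes from a few individuals, which makes any single introduction more likely to fizzle out and raises the number of introductions needed before establishment becomes likely.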

**Modeling spread of pandemic**

In future efforts to model pandemic spread of SARS-CoV-2, I would suggest testing a model designed for another RNA virus (or a virus with similar mutation potential) that had an established, ongoing surveillance system (a relatively valid dataset) and caused a respiratory disease (similar transmission capability) in a population that was arguably 100% susceptible: maybe the H5N1 or H1N1 pandemic strains. One could perhaps take a model designed to predict pandemic spread of a somewhat similar pathogen and plug in another dataset. If the model predicts COVID-19 spread, perhaps we can use this model for the next respiratory disease pandemic.

**References**

1. Funk S, King AA. Choices and trade-offs in inference with infectious disease models. Epidemics. 2019 Dec 20;30:100383. doi: 10.1016/j.epidem.2019.100383. [Epub ahead of print]

2. Siettos CI, Russo L. Mathematical modeling of infectious disease dynamics. Virulence. 2013 May 15;4(4):295-306. doi: 10.4161/viru.24041. Epub 2013 Apr 3.

3. Pellis L, Cauchemez S, Ferguson NM, Fraser C. Systematic selection between age and household structure for models aimed at emerging epidemic predictions. Nat Commun. 2020 Feb 14;11(1):906. doi: 10.1038/s41467-019-14229-4.

4. Zhou T, Liu Q, Yang Z, Liao J, Yang K, Bai W, Lu X, Zhang W. Preliminary prediction of the basic reproduction number of the Wuhan novel coronavirus 2019-nCoV. J Evid Based Med. 2020 Feb;13(1):3-7. doi: 10.1111/jebm.12376.

5. Mizumoto K, Kagaya K, Zarebski A, Chowell G. Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Euro Surveill. 2020 Mar;25(10). doi: 10.2807/1560-7917.ES.2020.25.10.2000180.

6. Kraemer MUG, Yang CH, Gutierrez B, et al; Open COVID-19 Data Working Group. The effect of human mobility and control measures on the COVID-19 epidemic in China. Science. 2020 Mar 25. pii: eabb4218. doi: 10.1126/science.abb4218. [Epub ahead of print]

7. Kucharski AJ, Russell TW, Diamond C, et al; Centre for Mathematical Modelling of Infectious Diseases COVID-19 working group. Early dynamics of transmission and control of COVID-19: a mathematical modelling study. Lancet Infect Dis. 2020 Mar 11. pii: S1473-3099(20)30144-4. doi: 10.1016/S1473-3099(20)30144-4. [Epub ahead of print] Erratum in: Lancet Infect Dis. 2020 Mar 25.