January 21, 2021

Mathematical modeling of the emergence and spread of new pathogens: Insight for SARS-CoV-2 and other similar viruses

Knowing if and how rapidly an emerging pathogen will spread
through a population enables public health officials to make well-informed
decisions to protect the public. Mathematical modeling can give them the
means to predict pathogen spread, but modeling previously unknown pathogens,
like severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is

Typically, mathematical modeling requires researchers to acquire
at least one dataset with the relevant data points to develop the model, and
another similar dataset to validate the model. For emerging diseases like novel
coronavirus disease 2019 (COVID-19), for which we did not have a readily
available diagnostic kit to distinguish SARS-CoV-2–positive cases from negative
cases, the validity and completeness of the data are often problematic.
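
The develop-then-validate workflow described above can be illustrated with a small sketch. It assumes the simplest possible model, early exponential growth fitted by least squares on log counts, and it uses made-up case counts for two hypothetical outbreaks; the function names and all numbers are illustrative and do not come from any real surveillance dataset.

```python
import math


def fit_exponential(days, counts):
    """Least-squares fit of log(count) = log(initial) + rate * day."""
    logs = [math.log(c) for c in counts]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    rate = sxy / sxx
    initial = math.exp(mean_y - rate * mean_x)
    return initial, rate


def mean_abs_pct_error(days, counts, initial, rate):
    """Average relative gap between model predictions and observed counts."""
    errors = [abs(initial * math.exp(rate * d) - c) / c
              for d, c in zip(days, counts)]
    return sum(errors) / len(errors)


# Synthetic, made-up daily case counts (NOT real data).
development_days = [0, 1, 2, 3, 4, 5, 6]
development_counts = [12, 15, 17, 22, 26, 33, 40]
validation_days = [0, 1, 2, 3, 4, 5, 6]
validation_counts = [11, 14, 18, 21, 27, 31, 41]

# Develop the model on the first dataset only.
initial, rate = fit_exponential(development_days, development_counts)

# Score it on both datasets; the second score is the validation step.
fit_error = mean_abs_pct_error(development_days, development_counts, initial, rate)
validation_error = mean_abs_pct_error(validation_days, validation_counts, initial, rate)

print(f"fitted growth rate per day: {rate:.3f}")
print(f"error on development data: {fit_error:.1%}")
print(f"error on validation data:  {validation_error:.1%}")
```

If the validation error is much larger than the development error, the model has probably fitted noise or gaps in the first dataset rather than the underlying transmission pattern.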

Mathematical modeling

Designing a model requires researchers to make assumptions
about which parameters most influence pathogen transmission, and thus which
variables should be included in the model and inferred from data
[1,2]. To simplify the modeling, researchers may assume that all people are
equally susceptible to the pathogen, that the population mixes uniformly, and
that the population is infinite (which avoids modeling births and deaths).
However, many researchers favor more realistic approaches and value the
addition of more parameters.
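
To make these simplifying assumptions concrete, here is a minimal sketch of a susceptible-infected-recovered (SIR) model, one common form such a simple model can take. It assumes equal susceptibility, uniform mixing, and no births or deaths. The function name, the parameter values, and the forward Euler integration are my own illustrative choices, not taken from the cited studies.

```python
def simulate_sir(beta, gamma, s0, i0, days, dt=0.1):
    """Integrate the SIR equations with forward Euler steps.

    s, i, r are fractions of the population. beta is the transmission
    rate and gamma the recovery rate, both per day (illustrative values).
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        # Uniform mixing: new infections are proportional to s * i.
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((step * dt, s, i, r))
    return history


# Hypothetical parameters giving R0 = beta / gamma = 3.0.
history = simulate_sir(beta=0.6, gamma=0.2, s0=0.999, i0=0.001, days=200)

peak_time, _, peak_infected, _ = max(history, key=lambda row: row[2])
final_susceptible = history[-1][1]

print(f"peak infected fraction: {peak_infected:.3f} at day {peak_time:.1f}")
print(f"fraction never infected: {final_susceptible:.3f}")
```

Only two parameters drive this model, which is why it is easy to fit and fast to run. Adding age structure or contact networks would replace the single `beta * s * i` term with many group-specific terms.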

Modelers may incorporate into their model variables for
population age structure and growth, different infection susceptibilities based
on age (or another factor), and social networking patterns. However, the more
parameters included, the more mathematically complex the model becomes. Complex
models might be more realistic, but they are often not better. With larger
numbers of variables, missing or inaccurate data points have more influence
over model results, and longer time periods are needed to compute outcomes.
Also, sometimes variables seem important but have little influence over model results.

Models developed for pathogens during their emergence are
often inaccurate [3]. Typically, when incomplete datasets are used to develop
models, multiple different models appear capable of fitting the existing data
points but predict different outcomes.
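
A small sketch shows how this can happen. It compares two hypothetical growth curves, unlimited exponential growth and logistic growth that levels off at a ceiling, using made-up parameters. Over the first two weeks they are nearly indistinguishable, yet their long-range predictions differ by orders of magnitude.

```python
import math


def exponential_cases(t, initial=10.0, rate=0.2):
    """Unlimited exponential growth (illustrative parameters)."""
    return initial * math.exp(rate * t)


def logistic_cases(t, initial=10.0, rate=0.2, capacity=10_000.0):
    """Logistic growth with the same early rate but a finite ceiling."""
    return capacity / (1.0 + (capacity / initial - 1.0) * math.exp(-rate * t))


# Largest relative disagreement between the two curves in the first two weeks.
max_early_gap = max(
    abs(exponential_cases(t) - logistic_cases(t)) / logistic_cases(t)
    for t in range(0, 15)
)
print(f"largest relative gap over days 0-14: {max_early_gap:.1%}")

# Later predictions from the same two curves.
for day in (30, 60):
    print(f"day {day}: exponential {exponential_cases(day):,.0f}, "
          f"logistic {logistic_cases(day):,.0f}")
```

With only early data points, nothing in the data favors one curve over the other, so the choice between them rests on assumptions rather than evidence.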

COVID-19 findings

For the COVID-19 pandemic, mathematical modeling has been
used to estimate several aspects of pathogen spread, such as the basic
reproductive number (R0, the number of secondary infections caused by one
infection in a completely susceptible population) for SARS-CoV-2 (R0 of
2.8-4.0) [4] and the percentage of people with asymptomatic infections (~17.9%)
[5]. Some modeling studies have shown population control measures did decrease
pathogen spread [6,7], and Kucharski et al. found that four independent
SARS-CoV-2 introduction events into environments mimicking Wuhan, China would
provide >50% chance of virus establishment in that population [7]. As of
March 29, 2020, no studies published in scientific journals have shown
predictions on the extent of pathogen spread globally.
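
As a rough way to read the ">50% chance from four introductions" result, the sketch below works out what it would imply for a single introduction if each introduction succeeded or failed independently with the same probability. That independence assumption is mine, made only for this back-of-the-envelope arithmetic; it is not the stochastic transmission model used by Kucharski et al.

```python
def establishment_probability(p_single, introductions):
    """Chance that at least one of several independent introductions takes hold."""
    return 1.0 - (1.0 - p_single) ** introductions


def single_probability_needed(target, introductions):
    """Per-introduction probability that yields the target overall chance."""
    return 1.0 - (1.0 - target) ** (1.0 / introductions)


# Per-introduction probability at which four introductions give a 50% chance.
p_needed = single_probability_needed(0.5, 4)
print(f"per-introduction probability needed: {p_needed:.3f}")

# How the overall chance grows with more introductions at that probability.
for n in (1, 2, 4, 8):
    print(f"{n} introductions: {establishment_probability(p_needed, n):.2f}")
```

Under this simplification, each introduction needs only a modest chance of taking hold for a handful of introductions to make establishment more likely than not.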

Modeling spread of

In future efforts to model pandemic spread of SARS-CoV-2, I would
suggest testing a model designed against another RNA virus (or a virus with a similar
mutation potential) that had an established surveillance system ongoing
(relatively valid data set) and caused a respiratory disease (similar
transmission capability) in a population that was arguably 100% susceptible:
maybe the H5N1 or H1N1 pandemic strains. One could perhaps take a model
designed to predict pandemic spread of a somewhat similar pathogen and plug in another
dataset. If the model predicts COVID-19 spread, perhaps we can use this model
with the next respiratory disease pandemic.


1. Funk S, King AA. Choices and trade-offs in inference with
infectious disease models. Epidemics. 2019 Dec 20;30:100383. doi:
10.1016/j.epidem.2019.100383. [Epub ahead of print]

2. Siettos CI, Russo L. Mathematical modeling of infectious
disease dynamics. Virulence. 2013 May 15;4(4):295-306. doi: 10.4161/viru.24041.
Epub 2013 Apr 3.

3. Pellis L, Cauchemez S, Ferguson NM, Fraser C. Systematic
selection between age and household structure for models aimed at emerging
epidemic predictions. Nat Commun. 2020 Feb 14;11(1):906. doi:

4. Zhou T, Liu Q, Yang Z, Liao J, Yang K, Bai W, Lu X, Zhang
W. Preliminary prediction of the basic reproduction number of the Wuhan novel
coronavirus 2019-nCoV. J Evid Based Med. 2020 Feb;13(1):3-7. doi:

5. Mizumoto K, Kagaya K, Zarebski A, Chowell G. Estimating
the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on
board the Diamond Princess cruise ship, Yokohama, Japan, 2020. Euro Surveill.
2020 Mar;25(10). doi: 10.2807/1560-7917.ES.2020.25.10.2000180.

6. Kraemer MUG, Yang CH, Gutierrez B, et al; Open COVID-19 Data
Working Group. The effect of human mobility and control measures on the
COVID-19 epidemic in China. Science. 2020 Mar 25. pii: eabb4218. doi:
10.1126/science.abb4218. [Epub ahead of print]

7. Kucharski AJ, Russell TW, Diamond C, et al; Centre for
Mathematical Modelling of Infectious Diseases COVID-19 working group. Early dynamics
of transmission and control of COVID-19: a mathematical modelling study. Lancet
Infect Dis. 2020 Mar 11. pii: S1473-3099(20)30144-4. doi: 10.1016/S1473-3099(20)30144-4.
[Epub ahead of print] Erratum in: Lancet Infect Dis. 2020 Mar 25.
