Cardiff University | Prifysgol Caerdydd ORCA
Online Research @ Cardiff 

Reducing size bias in epidemic network modelling

Bansal, Neha, Kaouri, Aikaterini ORCID: https://orcid.org/0000-0002-9850-253X and Woolley, Thomas ORCID: https://orcid.org/0000-0001-6225-5365 2025. Reducing size bias in epidemic network modelling. Journal of Theoretical Biology.
Item availability restricted.

Abstract

Epidemiological models can inform policymaking on disease control strategies, and these models often rely on sampled contact networks. The Random Walk (RW) sampling algorithm, commonly used for network sampling, produces size-biased samples that over-represent highly connected individuals, leading to biased estimates of disease spread. The Metropolis-Hastings Random Walk (MHRW) addresses this by providing samples representative of the underlying network’s connectivity distribution. We compare MHRW and RW in reducing size bias across four network types: Erdős–Rényi (ER), Small-world (SW), Negative-binomial (NB), and Scale-free (SF). We simulate disease spread using a stochastic Susceptible-Infected-Recovered (SIR) framework. RW tends to overestimate infections (by 25% in ER, SW, NB) and secondary infections (by 25% in ER, SW and 80% in NB), and underestimate time-to-infection in NB networks. MHRW reduces the size bias, except on SF networks, where both algorithms provide non-representative samples and highly variable estimates. We find that RW is appropriate for fast-spreading, high-mortality epidemics in homogeneous or moderately random networks (ER, SW). In contrast, MHRW is better suited for slower and low-severity epidemics and can be effective in both homogeneous and heterogeneous networks (ER, SW, NB). However, MHRW is computationally expensive and less accurate when duplicate nodes are removed. We also analyse real-world data from cattle movement and human contact networks; MHRW generates disease spread estimates closer to the underlying network than RW. Our findings guide the selection of sampling algorithms based on network structure and epidemic characteristics, enhancing the reliability of disease modelling for policymaking.
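
The size bias described in the abstract can be illustrated with a minimal sketch. The paper's own implementation is not available on this page, so everything below is an assumption: the toy hub-and-ring graph, the function names, and the use of the textbook MHRW with a uniform target, in which a proposed move from node v to neighbour w is accepted with probability min(1, deg(v)/deg(w)). The sketch compares the mean degree of nodes visited by each walk with the true mean degree of the graph.

```python
import random


def hub_and_ring(n_leaves):
    """Toy graph (hypothetical): hub 0 joined to every leaf, leaves joined in a ring."""
    adj = {0: list(range(1, n_leaves + 1))}
    for i in range(1, n_leaves + 1):
        left = n_leaves if i == 1 else i - 1
        right = 1 if i == n_leaves else i + 1
        adj[i] = [0, left, right]
    return adj


def random_walk(adj, start, steps, rng):
    """Simple RW: always move to a uniformly chosen neighbour."""
    node, visited = start, []
    for _ in range(steps):
        node = rng.choice(adj[node])
        visited.append(node)
    return visited


def mh_random_walk(adj, start, steps, rng):
    """MHRW with a uniform target: accept with prob min(1, deg(v)/deg(w))."""
    node, visited = start, []
    for _ in range(steps):
        proposal = rng.choice(adj[node])
        accept = min(1.0, len(adj[node]) / len(adj[proposal]))
        if rng.random() < accept:
            node = proposal
        # A rejected proposal records the current node again.
        visited.append(node)
    return visited


def mean_degree(adj, nodes):
    """Average degree over a collection of nodes (repeats included)."""
    return sum(len(adj[n]) for n in nodes) / len(nodes)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
adj = hub_and_ring(20)
true_mean = mean_degree(adj, list(adj))
rw_mean = mean_degree(adj, random_walk(adj, 1, 50_000, rng))
mh_mean = mean_degree(adj, mh_random_walk(adj, 1, 50_000, rng))
print(f"true {true_mean:.2f}  RW {rw_mean:.2f}  MHRW {mh_mean:.2f}")
```

On this toy graph the true mean degree is 80/21 (about 3.81). A simple random walk visits nodes in proportion to their degree in the long run, so its mean sampled degree tends towards 580/80 = 7.25, while the uniform-target MHRW tends towards the true value. Repeated nodes are kept in the sample here; the abstract notes that MHRW becomes less accurate when duplicate nodes are removed.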

Item Type: Article
Status: In Press
Schools: Schools > Mathematics
Subjects: Q Science > QA Mathematics
Publisher: Elsevier
ISSN: 0022-5193
Funders: NERC, BBSRC, MRC, EPSRC
Date of First Compliant Deposit: 28 October 2025
Date of Acceptance: 23 October 2025
Last Modified: 28 Oct 2025 17:02
URI: https://orca.cardiff.ac.uk/id/eprint/181883
