Running millions of simulations on Google Cloud makes epidemiological models faster and more accurate
Using the Global Epidemic and Mobility Model (GLEAM), a metapopulation model that combines real-world data on populations and human mobility with elaborate stochastic models of disease transmission, the team ran simulations to describe possible scenarios for the outbreak as it spreads and as governments enact interventions. Running simulations with so many epidemiological and transportation variables requires huge amounts of compute power, data processing, and storage, so the team took advantage of the integrated, scalable tools built into Google Cloud: they set up their pipeline with Google Cloud Life Sciences, loaded each model into Google Cloud Storage, ran thousands of parallel simulations on preemptible Virtual Machines (VMs) in Google Compute Engine, and then analyzed the results in BigQuery. With help from Google Cloud research credits, they have simulated over nine million different scenarios to date and have analyzed over 5.5 PB of simulation data in BigQuery. They also assessed the relative risk of case importation within and from China, estimated the risk of sustained community transmission outside Mainland China, and published their results online with Google’s visualization tool, Data Studio.
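To make the idea of a stochastic metapopulation model concrete, here is a minimal sketch in Python. It is not GLEAM and does not reflect the team's code: the two-patch setup, the SIR compartments, the parameter values, and every function name (`binomial`, `step`, `run_scenario`, `importation_fraction`) are assumptions chosen for illustration. Each "patch" stands in for a city, infection and recovery are drawn at random each day, and a small fraction of infectious people travel between patches. Scaling the travel rates down mimics a travel restriction, and repeating the run with different seeds mimics, at toy scale, the many scenario runs described above.

```python
import random

def binomial(rng, n, p):
    """Draw from Binomial(n, p) with n Bernoulli trials (adequate for toy sizes)."""
    return sum(1 for _ in range(n) if rng.random() < p)

def step(state, travel, beta, gamma, rng):
    """Advance all patches by one day: infection and recovery, then travel."""
    after = []
    for s, i, r in state:
        n = s + i + r
        # Probability that a susceptible person is infected today.
        p_inf = 1.0 - (1.0 - min(1.0, beta / n)) ** i if n > 0 else 0.0
        new_inf = binomial(rng, s, p_inf)
        new_rec = binomial(rng, i, gamma)
        after.append([s - new_inf, i + new_inf - new_rec, r + new_rec])
    # Simplifying assumption: only infectious people travel.
    arrivals = [0] * len(state)
    for a, row in enumerate(travel):
        for b, rate in enumerate(row):
            if a != b:
                movers = binomial(rng, after[a][1], rate)
                after[a][1] -= movers
                arrivals[b] += movers
    # Add arrivals last so nobody travels twice in one day.
    for b, count in enumerate(arrivals):
        after[b][1] += count
    return after

def run_scenario(seed, travel_scale, days=30, beta=0.5, gamma=0.2):
    """Run one stochastic scenario; returns the final [S, I, R] per patch."""
    rng = random.Random(seed)
    state = [[495, 5, 0], [500, 0, 0]]           # outbreak starts in patch 0
    base_travel = [[0.0, 0.01], [0.01, 0.0]]     # hypothetical daily travel rates
    travel = [[rate * travel_scale for rate in row] for row in base_travel]
    for _ in range(days):
        state = step(state, travel, beta, gamma, rng)
    return state

def importation_fraction(n_runs, travel_scale):
    """Fraction of runs in which patch 1 ends with any infectious or recovered people."""
    hits = 0
    for seed in range(n_runs):
        final = run_scenario(seed, travel_scale)
        if final[1][1] + final[1][2] > 0:
            hits += 1
    return hits / n_runs
```

Comparing `importation_fraction(50, 1.0)` with `importation_fraction(50, 0.2)` gives a rough feel for how reducing travel changes the chance that the second patch is reached. Because every run depends only on its seed and parameters, runs are independent of one another, which is the property that lets workloads like this be spread across many machines.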
Dr. Chinazzi, Associate Research Scientist at MoBS, had prior experience with a similar Google Cloud pipeline from studying the spread of Zika virus in 2016. At that time he ran ten million simulations, sometimes as many as 250,000 a day, to analyze how variables like location, temperature, population, and travel patterns affected possible rates of infection through mosquitoes. This time his team drew on public datasets of case importations from China and private airline transportation data from the Official Aviation Guide (OAG) and the International Air Transport Association (IATA), as well as ground mobility and commuting data from statistical offices in more than thirty countries and 80,000 administrative regions.
Based on internationally reported cases, their models show that at the start of the travel ban from Wuhan, most Chinese cities had already received many infected travelers. The travel quarantine of Wuhan delayed the overall epidemic progression by only three to five days in Mainland China, but had a more marked effect at the international scale, where case importations were reduced by nearly 80% until mid-February. “Developing data-driven models for the spread of this infectious disease is critical,” says Dr. Chinazzi. “Our team is working around-the-clock to model the effects that different containment and mitigation strategies have in affecting the spread and severity of the COVID-19 outbreak.”