Caltech team develops new tool to improve climate simulations on Google Cloud

Using Google Compute Engine, a team of climate scientists is building a next-generation open-source Earth System Model that will integrate more earth and atmospheric data than ever before. By improving climate simulations, this framework will reduce uncertainty about extreme environmental events like floods, droughts, and hurricanes.

Recent advances in computational and data sciences are revolutionizing climate science. With parallel computing on ultra-fast graphics processing units (GPUs), it will soon be possible to perform calculations at exascale, one thousand times the petascale capacity of today’s supercomputers. This unprecedented computing power is pushing the frontiers of data-intensive fields like climate science, which rely on massive datasets to run global simulations that predict environmental change. But harnessing this new opportunity requires new tools to manage and process that data at scale.

Tapio Schneider, Theodore Y. Wu Professor of Environmental Science and Engineering at Caltech and Senior Research Scientist at JPL, says that modeling clouds has been a particular challenge for climate scientists. Low-lying clouds are too small to represent in global climate models, yet they are crucial for climate predictions because they regulate how much sunlight Earth absorbs. “Clouds are a dominant source of uncertainty in climate predictions,” he says, “so they matter enormously. But it’s difficult to simulate the small-scale turbulence sustaining them; they literally fall through the cracks of global models.” Clouds can, however, be simulated at high resolution over smaller areas. Capitalizing on this opportunity required a climate modeling tool that could integrate both lower-resolution global simulations and high-resolution, limited-area large-eddy simulations (LES).

In response, the Climate Modeling Alliance (CliMA), a team of researchers from Caltech, MIT, the Naval Postgraduate School, and NASA’s Jet Propulsion Laboratory (JPL), developed ClimateMachine, a new modeling approach to simulate weather and climate globally at coarser resolution and locally at high resolution in one unified framework. By integrating direct observations from a wide range of datasets alongside detailed simulations, the new model, when it is complete, is expected to provide more accurate predictions. Akshay Sridhar, Research Scientist at Caltech, says, “ClimateMachine is a step towards next-generation tools designed to reduce uncertainties in climate modeling. This framework, along with other packages developed by our group, will allow researchers to provide higher quality data to facilitate an improved understanding of atmospheric phenomena, and enable more robust decisions around future extreme weather and climate events.”

ClimateMachine is a step towards next-generation tools designed to reduce uncertainties in climate modeling.

Akshay Sridhar, Research Scientist, Environmental Science and Engineering at Caltech

By building ClimateMachine on Google Cloud, the CliMA team distributed the massive simulations across the GPUs on Google Compute Engine. In a preprint for the journal Geoscientific Model Development, they demonstrated that ClimateMachine, in its limited-area configuration, showed efficient strong scaling on CPUs and weak scaling on up to 16 GPUs, for both global simulations and local high-resolution simulations. Schneider explains that “strong scaling reduces time and energy proportionally. Weak scaling can enlarge the area covered using the same amount of computing time.” “The increased spatial extent possible with weak scaling allows us to capture larger scale features of atmospheric turbulence at relatively high resolutions,” Sridhar says. “Deploying this on Google Cloud was easier than we expected.”
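
For readers less familiar with the two scaling terms, the short Python sketch below shows how each is commonly quantified. It is only an illustration: the function names and every timing value are hypothetical and are not taken from the preprint or from ClimateMachine itself.

```python
def strong_scaling_efficiency(t_base, t_n, n_base, n):
    """Fixed total problem size: ideally runtime shrinks by the factor n / n_base."""
    speedup = t_base / t_n
    return speedup / (n / n_base)


def weak_scaling_efficiency(t_base, t_n):
    """Problem size grows with the worker count: ideally runtime stays constant."""
    return t_base / t_n


# Hypothetical wall-clock times in seconds (NOT measurements from the preprint).
# Keys are worker counts (CPU ranks or GPUs).
strong_times = {1: 800.0, 2: 410.0, 4: 215.0, 8: 115.0}  # same domain, more workers
weak_times = {1: 100.0, 4: 104.0, 16: 110.0}             # domain grows with workers

for n, t in strong_times.items():
    eff = strong_scaling_efficiency(strong_times[1], t, 1, n)
    print(f"strong scaling: {n:2d} workers, efficiency {eff:.2f}")

for n, t in weak_times.items():
    eff = weak_scaling_efficiency(weak_times[1], t)
    print(f"weak scaling:   {n:2d} workers, efficiency {eff:.2f}")
```

An efficiency near 1.0 in the strong-scaling case means that doubling the hardware roughly halves the runtime. In the weak-scaling case it means that a proportionally larger area can be simulated in about the same time.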

Accessibility and sustainability were important goals for this project. The team used the open-source Julia programming language to facilitate collaboration, access, and reproducibility across platforms and institutions. In 2020, Schneider joined a cross-disciplinary group of 33 Google Research Innovators, who receive extra support, training, and access to Google experts to encourage collaboration. Schneider says, “All code and data we generate are public, so they benefit all researchers. We want our tools to be accessible to everyone.” Sridhar adds, “We hope this is a positive step forward in designing toolkits anyone can deploy on demand.”

To get started with Google Cloud, apply for free credits towards your research.

All code and data we generate are public so they benefit all researchers. We want our tools to be accessible to everyone.

Tapio Schneider, Theodore Y. Wu Professor of Environmental Science and Engineering at Caltech and Senior Research Scientist at JPL
