Google Cloud Platform supports international effort to produce first-ever image of a black hole

An international team of researchers uses GCP to store and process five petabytes of data from the Event Horizon Telescope, accelerating new discoveries in astronomy

In April 2019, the first-ever image of a black hole was announced by an international team of researchers. This extraordinary breakthrough in astronomy imaged a supermassive black hole located 55 million light-years from Earth in the Messier 87 (M87) galaxy, with a mass 6.5 billion times that of our Sun. Even the most powerful telescopes can't see a black hole directly because black holes absorb all light, but they can detect the “event horizon,” the boundary around it. The resulting image shows the shadow of the black hole, 40 billion kilometers across, surrounded by a bright ring where the black hole's intense gravity bends light and heats infalling plasma. The unprecedented image offers the first direct visual evidence of a black hole and opens up new ways to study distant galaxies.

Google VMs accelerate data processing from weeks to days

Researchers combined data from eight high-altitude radio telescopes across the globe, using Very Long Baseline Interferometry (VLBI) at short wavelengths. This "virtual telescope" is called the Event Horizon Telescope (EHT). Each of the eight telescopes in the EHT produced enormous amounts of data—roughly 350 terabytes per day over a week of observations in April 2017. These petabytes of data were stored on high-performance helium-filled hard drives, weighing half a ton altogether, and physically transported to specialized supercomputers known as correlators. Chi-Kwan Chan, leader of the EHT Software and Data Compatibility Working Group and an assistant astronomer at the University of Arizona, led the effort to develop the infrastructure to process the data. The data had to be high resolution, account for the Earth's rotation at each site, and be precisely synchronized across the world's time zones.

To process all that data, Chan and the EHT team drew on National Science Foundation (NSF) funding to set up a pipeline on Google Cloud Platform (GCP). “We have very intermittent workflows,” Chan says. “Since this was the first time a VLBI experiment was being carried out at this scale, there were many new systematic errors in the data that we needed to understand. We needed large computation power to run the analysis pipeline, people power to understand the results, and an iterative process to go back and improve the pipeline. In this case, we went through nine versions of data releases before we reached the data set that was used to create the black hole image.”

Five petabytes of data were first reduced to five terabytes before being migrated to the cloud. Chan persuaded the team to choose GCP because “GCP was user-friendly and powerful.” The team used “three small 24/7 Google virtual machine instances (VMs)—one as a file server, one as a development/testing server, and one as a data access server,” he reports, adding, “we also have many 64-core to 96-core VMs that we only launch when we need them. Those powerful VMs sped up our data analysis timeline from weeks to days and provided fast access to our colleagues from different parts of the world. In addition, using many powerful VMs allowed us to study a large parameter space and understand the effect of the algorithm on the final image.”
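Launching high-core VMs only for the duration of an analysis run, as Chan describes, can be done with a couple of `gcloud` commands. The sketch below is illustrative only: the instance name, zone, image, and machine type are assumptions, not the EHT team's actual configuration.

```shell
# Hypothetical sketch: create a 96-core Compute Engine VM on demand
# for an analysis run. All names and settings here are illustrative.
gcloud compute instances create eht-analysis-1 \
    --zone=us-central1-a \
    --machine-type=n1-highcpu-96 \
    --image-family=debian-12 \
    --image-project=debian-cloud

# Delete the instance once the run finishes, so the high-core VM
# only incurs cost while it is actually needed.
gcloud compute instances delete eht-analysis-1 \
    --zone=us-central1-a --quiet
```

This create-on-demand, delete-when-done pattern matches the intermittent workflow Chan describes: the small 24/7 servers stay up, while the expensive compute is ephemeral.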


The sky is no longer the limit

“We have achieved something presumed to be impossible just a generation ago,” concludes Sheperd Doeleman, EHT Project Director and Senior Research Fellow at the Center for Astrophysics at Harvard University. “Breakthroughs in technology, connections between the world's best radio observatories, and innovative algorithms all came together to open an entirely new window on black holes and the event horizon.”

This breakthrough paves the way for more exciting discoveries. Chan is already building on Kubernetes to orchestrate the EHT's data analysis pipeline and using GPUs and TPUs to accelerate the analysis. “The large number of GPU options, and the availability of TPUs, on GCP allow us to do that,” Chan says. Looking ahead, he adds: “With the lessons learned from the EHT, we are also exploring high-performance computing (HPC) on the cloud so that we can perform numerical simulations on the same infrastructure. Hybrid HPC-Cloud infrastructures are needed to support future scientific collaborations like the EHT, and we are looking into using Google's HPC solutions to make this possible.”

