210 TB of video data processed in under eight hours
Because the processing could be done in small batches in any order, the team could also take advantage of Google Cloud’s preemptible VMs. These are offered at discounts of up to 80% in exchange for the possibility of interruption, making them a cost-effective choice for this custom, order-independent workload. The team was able to process 210 TB of video data in about eight hours, at an average cost lower than that of many on-premises HPC systems. Amy Apon, Professor and C. Tycho Howle Director of the School of Computing at Clemson, who supervised Posey’s project, explains: “We want to leverage the instant scalability of the commercial cloud to run workloads that would normally require a massive supercomputer. The ability to scale up is important, but scaling down makes it cost effective. We envision that this type of system can be utilized to aid in planning evacuation processes by allowing first responders or Departments of Transportation to observe and simulate traffic in real time during a major event.”
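To make the pattern concrete (this is an illustrative sketch, not Clemson’s actual tooling), the snippet below shows how a single preemptible worker VM could be requested with the google-cloud-compute Python client. The project ID, zone, machine type, and image are placeholder assumptions; the key point is the preemptible scheduling flag, which trades the possibility of interruption for the discount. Because each batch is independent and order does not matter, a preempted worker simply leaves its batch to be picked up later.

```python
# Sketch: request one preemptible Compute Engine VM as a batch worker.
# Project, zone, machine type, and image are illustrative placeholders,
# not the configuration the Clemson team actually used.
from google.cloud import compute_v1

PROJECT = "my-traffic-project"   # assumption: your GCP project ID
ZONE = "us-east1-b"              # assumption: any zone with spare capacity

def create_preemptible_worker(name: str) -> None:
    boot_disk = compute_v1.AttachedDisk(
        boot=True,
        auto_delete=True,  # disk is deleted with the VM, so scale-down is clean
        initialize_params=compute_v1.AttachedDiskInitializeParams(
            source_image="projects/debian-cloud/global/images/family/debian-12",
            disk_size_gb=50,
        ),
    )
    instance = compute_v1.Instance(
        name=name,
        machine_type=f"zones/{ZONE}/machineTypes/e2-standard-4",
        disks=[boot_disk],
        network_interfaces=[
            compute_v1.NetworkInterface(network="global/networks/default")
        ],
        # The key setting: preemptible capacity is heavily discounted but can
        # be reclaimed at any time, which is acceptable for independent batches.
        scheduling=compute_v1.Scheduling(preemptible=True),
    )
    op = compute_v1.InstancesClient().insert(
        project=PROJECT, zone=ZONE, instance_resource=instance
    )
    op.result()  # block until the create operation completes

if __name__ == "__main__":
    create_preemptible_worker("video-batch-worker-0001")
```

In practice such workers would be launched in bulk, each pulling batches from a shared queue, so an interrupted VM’s unfinished batch is simply redelivered to another worker.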
The team learned that a Google Cloud-based parallel-processing system can be launched and then deprovisioned in just a few hours, and that it can process data efficiently at a lower cost than many on-premises HPC systems. “Most hurricane evacuations won’t require the full 2-million-plus virtual CPUs used here to process evacuation data in real time, but it’s reassuring to know that it’s possible,” says Kevin Kissell, Technical Director for HPC in Google’s Office of the CTO. “And it’s a source of some pride to me that it’s possible on Google Cloud.” For Apon, this experiment also confirms that cloud computing is democratizing access to technology: “It’s the advent of a new age in parallel supercomputing. Now even a researcher at a small college or business can run a parallel application in the cloud.”
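The “launch, then deprovision” point is what keeps costs bounded: once the batches are done, the fleet can be torn down with a single call. As a hedged sketch, assuming the workers run in a managed instance group (the group name below is a placeholder, not the team’s actual setup), deprovisioning amounts to resizing the group to zero.

```python
# Sketch: deprovision a fleet of batch workers by shrinking a managed
# instance group to zero. Names are illustrative placeholders.
from google.cloud import compute_v1

PROJECT = "my-traffic-project"
ZONE = "us-east1-b"
GROUP = "video-batch-workers"    # assumption: workers run in this instance group

def scale_to_zero() -> None:
    mig_client = compute_v1.InstanceGroupManagersClient()
    op = mig_client.resize(
        project=PROJECT,
        zone=ZONE,
        instance_group_manager=GROUP,
        size=0,  # deletes every VM in the group; billing stops once they are gone
    )
    op.result()

if __name__ == "__main__":
    scale_to_zero()
```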
For more technical details about Clemson’s experiment, see this blog post.