Since 2014, the Allen Institute for Artificial Intelligence (AI2) has been using artificial intelligence (AI) to build tools and programs to improve the everyday work of researchers. Michael Schmitz, Director of Engineering, explains that their mission is “to contribute to humanity through high-impact AI research and engineering. We have multiple strategies to achieve this, such as producing impactful research discoveries, publishing in peer-reviewed journals, and building tools to help accelerate the pace of research and enable research that was previously prohibitive.” Their programs have included Semantic Scholar, a natural language processing tool that makes it easy to mine vast amounts of scientific literature for insights, and Aristo, a multidisciplinary project that draws on AI to reason about science.
In 2017, Marc Millstone, Manager and founder of the Beaker team, partnered closely with a group of researchers at AI2 to observe and understand how they work. It soon became clear, he says, that “the process of trying out new ideas was held back, not by the new ideas, but due to the cognitive overhead in infrastructure and collaboration—running experiments, tracking their results and then sharing them.” Reproducibility in the sciences is crucial for validating results and ensuring that they can be duplicated in another lab and built upon, but modern deep learning systems are complicated, with large numbers of parameters to track and complex dependencies to maintain.
"We migrated to Google Kubernetes Engine because Google simply provides the best offering for managed Kubernetes between the three big cloud providers."
Marc Millstone, Manager and founder of the Beaker team, Allen Institute for Artificial Intelligence
Making research easier
To solve this problem, AI2 formed a project called Beaker, a Kubernetes-based platform for running reproducible experiments at scale. “Within computational research, researchers are frequently modifying their code, their data, the parameters to run their code, and even the environments in which their code runs. Each of these dimensions can affect the underlying results and models, sometimes significantly. With Beaker, we track all of this for the user automatically. We also make using Google Cloud and Kubernetes as easy as running a simple command, managing all the underlying details,” Millstone says. The program records the variables, results, and associated metadata for every experiment and makes them shareable with a link. By enabling researchers to return to the exact procedure and repeat or modify it, Beaker fosters collaboration and helps them run more experiments more efficiently.
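The kind of bookkeeping described here can be sketched in a few lines of Python. This is an illustrative assumption about what such tracking involves, not Beaker's actual implementation: each run's code, data, and parameters are fingerprinted and stored alongside its results, so an identical record means a reproducible run.

```python
# Illustrative sketch of experiment tracking (not Beaker's actual code):
# fingerprint the inputs of a run so it can be reproduced or audited later.
import hashlib
import json
import time

def fingerprint(payload: bytes) -> str:
    """Content hash: identical inputs always map to identical IDs."""
    return hashlib.sha256(payload).hexdigest()[:12]

def record_experiment(code: bytes, data: bytes, params: dict, results: dict) -> dict:
    """Bundle everything needed to rerun or audit an experiment."""
    return {
        "code_hash": fingerprint(code),
        "data_hash": fingerprint(data),
        "params": params,
        "results": results,
        "timestamp": time.time(),
    }

# Hypothetical run: the code, data, and parameter values are placeholders.
record = record_experiment(
    code=b"def train(): ...",
    data=b"training-set-v1",
    params={"lr": 0.001, "epochs": 10},
    results={"accuracy": 0.92},
)
# In a real system this record would be stored centrally and shared via a link.
print(json.dumps(record, indent=2))
```

Because the hashes are content-derived, two researchers who see the same record know they are looking at exactly the same code, data, and settings.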
Millstone adds that building Beaker on Kubernetes was an obvious choice, first on self-managed clusters and later in the cloud: “We chose Kubernetes as it is the leading container orchestration framework and allows us to focus on higher level problems for our users. Then last year, we decided that managing our own Kubernetes cluster was too much overhead. We migrated to Google Kubernetes Engine because Google simply provides the best offering for managed Kubernetes between the three big cloud providers. It's been a great experience. Google engineering and support have been wonderfully responsive and allowed us to focus on delivering customer value as opposed to scaling the underlying master nodes.” Schmitz adds that “stability has dramatically improved since we moved off of a cluster that we managed ourselves. We've also appreciated Google's customer-first billing strategy (i.e. sustained usage discounts) and their willingness to engage with us directly to improve their products.”
Migrating to GCP for speed and transparency
At AI2, they want to move fast to help researchers move even faster. Darrell Plessas, Engineering Manager, explains that to that end, “every project is set up to be autonomous so research can proceed at maximum speed. We don’t dictate their tools. That said, the majority of projects we have are based on Kubernetes and Google Cloud Platform (GCP).” Kubernetes automates container management at scale, and GCP provides the advantages of managed cloud infrastructure, so it made sense to migrate their whole technology stack systematically. It was “very straightforward,” Plessas remembers, and “Google makes it easier to use containers in a managed way.” AI2 needed an infrastructure platform that was both fast and secure, simple to learn, and flexible, and he found it in GCP: “With GCP if you want to run fifty experiments you can do so easily and customize. And it was very easy to see what we’re spending and our GPU usage over time,” Plessas says.
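The "fifty experiments" idea Plessas describes is typically a hyperparameter sweep: enumerate every combination of settings and launch one containerized run per combination. A minimal sketch of generating such a batch, with made-up hyperparameter values (the actual submission mechanism is not shown here):

```python
# Illustrative sketch: enumerating fifty experiment configurations for a
# hyperparameter sweep. The parameter names and values are hypothetical.
from itertools import product

learning_rates = [0.1, 0.01, 0.001, 0.0001, 0.00001]
batch_sizes = [16, 32, 64, 128, 256]
seeds = [0, 1]

# One config per combination: 5 learning rates x 5 batch sizes x 2 seeds = 50 runs.
configs = [
    {"lr": lr, "batch_size": bs, "seed": seed}
    for lr, bs, seed in product(learning_rates, batch_sizes, seeds)
]

# Each config would then be submitted as its own containerized job.
print(len(configs))
```

With a platform handling scheduling and tracking, the researcher's job reduces to writing this loop; the infrastructure fans the runs out across the cluster.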
Beaker is currently an internal tool used only by AI2 researchers, but it’s already had a significant impact: Millstone reports that “in 2018, our researchers ran over 60,000 experiments in Beaker, totaling nearly 190,000 GPU compute hours. Our growth is simply incredible. Our researchers have submitted at least twenty papers to top journals with models trained using Beaker. In the first two months of 2019 we have already run 60,000 compute hours and 33,000 experiments through the system.” The more experiments they can run, the more they can learn. Plessas concludes, “We want to get out of the way and make it as easy as we can for researchers to do groundbreaking research. Our researchers are not meant to be superstar Site Reliability Engineers. We need them to think about bigger things.”
"With GCP if you want to run fifty experiments you can do so easily and customize. And it was very easy to see what we’re spending and our GPU usage over time."
Darrell Plessas, Engineering Manager, Allen Institute for Artificial Intelligence