Case Studies: CMU GCP - Google for Education

At Carnegie Mellon University, machine learning gets social

Carnegie Mellon’s Articulab wanted to understand how robotic assistants could collaborate with humans to complete tasks and build relationships, rather than merely replacing the work of human assistants. To study robotic interactions with humans and train their agents with social awareness, they used Google Cloud’s Machine Learning Engine.

Giving artificial intelligence some social capabilities

The Annual Meeting of the New Champions 2016 in Tianjin, China, featured, among other things, a remarkable debut: SARA, the Socially Aware Robot Assistant that could interact with people in a whole new way. Rather than merely replacing the role of a human assistant, or processing and delivering information in an impersonal manner, SARA was different: intuitive, friendly, engaging and designed to “collaborate” with human users, recognize and respond to their facial expressions, learn preferences, and improve task performance based on the users she encountered. She was also programmed to learn certain social cues, nodding her head as a user spoke, and understanding different intonations.

Half a year later, in January 2017, the project was presented at the World Economic Forum in Davos, Switzerland, where it was the only demo to be featured in the Davos Congress Center. SARA served as a virtual personal assistant, offering attendees information about the sessions being presented, introducing them to relevant fellow attendees, recommending places to get food, and more.

Initially, SARA served as a virtual personal assistant with a specific application—offering help at the conference and interacting with guests. She was able to learn about the interests and goals of global leaders, and then helpfully recommend sessions that they might want to attend. Even better, SARA could use her conversations to build relationships with each person who spoke with her, learning more about their preferences and goals—and having done so, she could improve task performance in future conversations by offering even more personalized help.

She was the creation of the Articulab, a small team based at Carnegie Mellon University whose mission involves studying human interaction in social and cultural contexts as an input into computational systems, which in turn help us better understand human interaction. How do people communicate with technology, and how might that communication be enhanced over time? Cultivating social bonds is key, just as those bonds are crucial between people. As the Articulab team has noted about SARA: “Rather than ignoring the socio-emotional bonds that form the fabric of society, SARA depends on those bonds to improve her collaboration skills.”

"Google Cloud is accelerating academic AI research."
Yoichi Matsuyama, post-doctoral fellow at the Language Technologies Institute and SARA project lead

Using Google Tools to build SARA

Led by Justine Cassell, Associate Dean for Technology Strategy and Impact in the School of Computer Science at Carnegie Mellon University, Articulab was familiar with Google Cloud prior to SARA, thanks to tools and funding for other research projects. “Because we had been using TensorFlow for a number of Machine Learning tasks, it was a natural transition to start using Google Cloud for our recent Deep Learning projects,” says Yoichi Matsuyama, post-doctoral fellow at the Language Technologies Institute, and the project lead on SARA. “We’ve also been using a number of Google APIs, such as Google Speech API (Speech Recognition) for our conversational agents, and Firebase for crowdsourced data collection frameworks.” The use of Google Cloud continues as SARA expands into new domains and applications. “We are still in the deployment phase,” he says, noting that “Google Cloud is accelerating academic AI research.”

Matsuyama says the lab is “heavily using” Compute Engine, including GPU instances with four NVIDIA Tesla K80 GPUs, together with TensorFlow. This year the team has been working on models such as Deep Reinforcement Learning-Based Social Reasoning in Task Contexts, and Socially Conditioned Natural Language Generation.

Assessing what might be described as SARA 1.0 (the rollout at the World Economic Forum), Matsuyama says, “We had over 250 attendees try to use SARA over a four-day conference. So overall, that was successful. But we are still trying to analyze the findings, what’s good, what’s bad.” He adds: “One major finding out of that data was that what we are calling ‘rapport’ (interpersonal relationships) is actually correlated with task performance, and in this case it affected recommendation acceptance. When rapport was high, and SARA established the relationship with the user well, they were likely to accept her recommendation results. That’s our major finding so far, but we are continuing to analyze the data.”
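To make the rapport finding concrete, here is a minimal sketch of how a correlation between a rapport rating and recommendation acceptance could be computed. It is not the Articulab's analysis: the function name `pearson`, the rating scale, and every number below are assumptions made up for illustration, not data from the SARA deployment. With one variable coded as 0 or 1, the Pearson coefficient is the point-biserial correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for illustration only (NOT data from the SARA study):
# one rapport rating per conversation, and whether the user accepted
# the recommendation (1) or not (0).
rapport_scores = [2.0, 3.5, 5.0, 6.0, 1.5, 4.5, 6.5, 3.0]
accepted = [0, 0, 1, 1, 0, 1, 1, 0]

r = pearson(rapport_scores, accepted)
print(f"correlation between rapport and acceptance: {r:.2f}")
```

A positive coefficient here only says that higher rapport ratings and accepted recommendations tend to occur together in the toy data; it says nothing about causation.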

Expanding into new domains—including education

It appears that SARA’s work has only begun. Other applications of Articulab’s “socially-aware artificial intelligence” thus far have included education, such as supporting children in public schools with few resources and encouraging collaboration among peers (which has been shown to be critical to learning gains), and assisting high-functioning autistic children, and those with Asperger’s, in practicing interactive social skills to improve peer relationships.

Michael Madaio, a doctoral student at the Human-Computer Interaction Institute and the project lead for the “Rapport-Aware Peer Tutor” (RAPT) project, notes that in the “human-human” peer tutoring data they have collected, “the rapport between the students who were collaborating is highly associated with their engagement in the task, with their problem-solving, and ultimately with their learning.” In other words, working together, socially, can benefit all.

As they develop their work on understanding rapport in learning into educational applications, Madaio notes that they want to provide tools that do more than just help students learn. “There are already learning platforms out there,” he says, “but what education research has shown is that students aren’t just information processing machines who compute numbers. There is this social grounding. And to learn, it’s important to build that bond with other students. It’s valuable too for when a virtual tutor gives feedback: If it has to tell the kid they’re wrong, how does it do that? Maybe when you start out, you’re more polite, more indirect, to soften the blow. But over time, you can build up this relationship, and be a little more direct—and actually give them specific feedback that will really help them.”
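The idea Madaio describes, softer feedback early on and more direct feedback once a relationship exists, can be sketched as a simple rule. This is only an illustration of the idea, not the RAPT system: the function name `feedback_for_wrong_answer`, the rapport estimate on a 0-to-1 scale, the threshold, and the phrasings are all hypothetical.

```python
def feedback_for_wrong_answer(rapport, hint, threshold=0.5):
    """Phrase feedback on a wrong answer based on estimated rapport.

    rapport: hypothetical estimate in [0, 1] of the tutor-student relationship.
    hint: the specific correction the tutor wants to convey.
    threshold: assumed cutoff above which the tutor speaks more directly.
    """
    if not 0.0 <= rapport <= 1.0:
        raise ValueError("rapport must be between 0 and 1")
    if rapport < threshold:
        # Low rapport: polite and indirect, to soften the blow.
        return f"Nice try! You might want to take another look: {hint}"
    # Higher rapport: direct, specific feedback.
    return f"That's not right. {hint}"


print(feedback_for_wrong_answer(0.2, "check the sign when you move 3 across."))
print(feedback_for_wrong_answer(0.8, "check the sign when you move 3 across."))
```

A real tutor would need to estimate rapport from the conversation itself and would likely vary its phrasing far more gradually than a single threshold allows.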

A successful virtual tutor can increase “the likelihood they will want to come back,” he adds. That matters, and so does how engaged students are. “It’s not just, are the students returning for help, but how do they act when they are interacting with the virtual tutor? Are they more forthcoming? Do they feel comfortable sharing more of their learning goals and anxieties?” It’s a bond, like any other, that must build up over time. But the extraordinary thing, as SARA has shown, is that such a bond can occur.

So far, the response in educational applications has been positive. But, says Madaio, “We haven’t yet done an in-school deployment. Part of our design challenge for this year is to figure out the nature of a large-scale deployment.” They are trying to imagine what that future deployment might look like, such as creating homework support buddies or literacy tutors for students struggling to read.

Although there are no plans for a virtual tutoring system within Carnegie Mellon, the team might implement a personal assistant that could help students find out about upcoming talks on campus, offer event recommendations, and more. Perhaps the most ambitious Articulab goal is to create a version of SARA that can work not just at a four-day conference, but seven days a week, 24 hours a day, in multiple domains. It’s a challenging yet exciting idea to contemplate, filled with boundless possibilities.

"It’s not just, are the students returning for help, but how do they act when they are interacting with the virtual tutor? Are they more forthcoming? Do they feel comfortable sharing more of their learning goals and anxieties?"
Michael Madaio, doctoral student, Human-Computer Interaction Institute