Chicago researchers measure representation in children’s books with Google Cloud’s Vision AI

A team at the University of Chicago is using machine learning to identify and measure representation in over 1,000 children’s books. They have trained Google’s visual and natural language AI tools to recognize the race, gender, and age of characters in order to better understand how different demographic groups are represented and how to address any disparities.

As a Senior Computational Scientist at the University of Chicago, Dr. Teodora Szasz helps faculty members design and execute innovative projects involving machine learning and image recognition. Two years ago she teamed up with Dr. Anjali Adukia, Assistant Professor at the University of Chicago Harris School of Public Policy, to use visual AI to measure representation in award-winning children’s books. By identifying and analyzing characters in the books by age, gender, and race, they aimed to better understand how demographic categories affect children’s developing awareness of themselves and the world.

Images are important even before kids can read. When kids recognize themselves or others in different characters, that helps them imagine themselves or others in different futures.

Dr. Teodora Szasz, Senior Computational Scientist, University of Chicago

Diversity in children’s books is still a work in progress

Working with a team at the interdisciplinary Messages, Identity, and Inclusion in Education lab (MiiE), Adukia and Szasz divided over 1,000 children’s books published from 1920 to 2010 into mainstream and diversity categories. They developed new machine-led methods for systematically converting their texts and images into data. By using Google’s Vision AI to identify faces in over 200,000 files, along with established text analysis methods, they were able to measure the representation of race, gender, and age in children’s books commonly found in U.S. schools and homes over the last century. “Our research found that diversity is not yet mainstream,” Adukia reports, “with people of color underrepresented – and White males overrepresented – relative to their share in the U.S. population. Our text and image analysis also shows that females are more likely to be included in images than in text, suggesting symbolic inclusion in pictures rather than substantively in the actual story. We found that children were more likely to be shown with lighter skin than adults, even though there is no reason why that should systematically be the case.”
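To make "underrepresented relative to their share in the U.S. population" concrete, here is a minimal sketch of one way such a comparison could be computed. It is not the MiiE team's method. The function name `representation_ratio`, the group labels, and every number are hypothetical placeholders chosen only for illustration. A ratio below 1 means a group appears less often among detected characters than its population share would suggest, and a ratio above 1 means it appears more often.

```python
def representation_ratio(character_counts, population_share):
    """Return each group's share among detected characters divided by
    its share in the reference population (hypothetical helper)."""
    total = sum(character_counts.values())
    if total == 0:
        raise ValueError("no characters were counted")
    return {
        group: (count / total) / population_share[group]
        for group, count in character_counts.items()
    }


# Made-up counts of detected characters per group (illustration only).
counts = {"group_a": 60, "group_b": 30, "group_c": 10}
# Made-up population shares for the same groups (illustration only).
shares = {"group_a": 0.50, "group_b": 0.30, "group_c": 0.20}

ratios = representation_ratio(counts, shares)
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

In this toy data, `group_c` makes up 10% of characters but 20% of the population, so its ratio falls below 1.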

Vision AI accelerates results, with over 93% precision

The MiiE team chose to use Google tools, Szasz says, because “some of the team members were already familiar with them, but also we wanted quick results. We had to build and optimize models from scratch, and Google Cloud does that automatically so it was easy to use.” Training a machine-learning model to recognize faces in illustrations and cartoon characters was challenging, but Szasz and her team compared results for different open-source tools in optical character recognition (OCR) and found that the model she trained with Google’s Vision AI detected around three times more faces than the other tools. Content analysis has traditionally been done “by hand,” but manually coding images is labor- and time-intensive. Using AI made it possible for the team to analyze many more images more quickly and more cost-effectively. In comparing their training dataset to a manually coded dataset, they achieved over 93% precision; in other words, over 93% of their predicted labels were correct.
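As a rough illustration of what a precision figure measures, the sketch below compares model-predicted labels with manually coded labels and reports, for each label, the fraction of predictions that matched the manual coding. The function name `precision_per_label` and the toy labels are assumptions made for this example. They are not the team's code or data.

```python
from collections import Counter


def precision_per_label(predicted, manual):
    """Precision for each label: correct predictions of that label
    divided by all predictions of that label (hypothetical helper)."""
    if len(predicted) != len(manual):
        raise ValueError("predicted and manual must have the same length")
    predicted_counts = Counter(predicted)
    correct_counts = Counter(p for p, m in zip(predicted, manual) if p == m)
    return {
        label: correct_counts[label] / n
        for label, n in predicted_counts.items()
    }


# Toy labels for five detected faces (illustration only).
predicted = ["adult", "child", "child", "adult", "child"]
manual = ["adult", "child", "adult", "adult", "child"]

result = precision_per_label(predicted, manual)
for label, value in result.items():
    print(f"{label}: {value:.2f}")
```

Here "child" was predicted three times and matched the manual coding twice, so its precision is two thirds. Precision says nothing about faces the model missed entirely, which is what recall would measure.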

Leveraging AI for collaboration and social good

In 2020 Szasz joined a cross-disciplinary group of 33 Google Research Innovators, who receive extra support, training, and access to Google experts for their research. That helped her meet peers at other institutions doing similar work with Google tools, which helped her generate new ideas and public interest. “There’s been so much media interest in our work,” she reports. “I’m so grateful to Google. This wouldn’t be possible otherwise.”

The Chicago team’s results have been published in a Computer Vision Foundation article and a National Bureau of Economic Research working paper with co-authors Alex Eble, Emileigh Harrison, Ping-Jung Liu, Ping-Chang Lin, and Hakizumwami Birali Runesha. In the working paper, the authors argue that this research can show the potential for using AI for social good: “These tools can facilitate broader and more cost-effective measurements of racial constructs, gender identity, and age in images and text in a larger set of content than could be analyzed by any one individual or institution.” Adukia adds, “We see a world of opportunities here. These AI tools have the potential to transform how we think about representation, and the messages we’re sending to children.”

Dr. Adukia discusses this research in the closing keynote of Google’s Public Sector Summit. Researchers who want to get started with Google Cloud can apply for free credits toward their research.

Through collaboration we can improve these tools and open them to other domains beyond children’s books, like representation on television. AI can help us measure fairness.

Dr. Anjali Adukia, Assistant Professor, Harris School of Public Policy, University of Chicago