What do you know about the characters in your lists? Using data to reveal biases in contemporary fiction


Andrew Piper is Professor and William Dawson Scholar in the Department of Languages, Literatures, and Cultures at McGill University. His work focuses on using the tools of data science, machine learning, and natural language processing to promote a more inclusive understanding of culture and creativity. His research is grounded in the history of reading technologies and how they have shaped human cultures.


Eve Kraicer is Senior Researcher at .txtLAB, a cultural analytics laboratory at McGill University. She graduated from McGill in 2017 with a BA in Cultural Studies. For her honours thesis, she analyzed patterns of linguistic contagion in contemporary plague fiction. Her current work at .txtLAB applies machine learning and social network analysis to measures of cultural inequality, with particular focus on issues of gender bias and intersectionality in recent texts.

Andrew Piper and Eve Kraicer will be speaking at Tech Forum 2018 in a session called Using Data to Reveal Gender Bias in Contemporary Fiction.

Characters are the bedrock of any good work of fiction. They make us laugh, cry, and dream. We think about them, but also through them, when we read. As we imagine new worlds, characters anchor our experiences.

Until recently, we could only ever talk about characters as individuals, one by one, book by book. New techniques in text and data mining, however, now allow us to explore characters more broadly in order to better understand the kinds of worlds publishers are presenting to their readers. In our Tech Forum talk, we'll walk you through how we use data in our lab to assess where fiction is going. What messages are you sending to readers through the characters on your list? And what might you want to change?

How distinct are your characters? 

Did you know that Jane Austen invented a new type of character? In the nineteenth century, female novelists were breaking into the literary scene in growing numbers. To do so, they focused increasingly on main characters who were psychologically "deep," spending much of their time thinking, feeling, and wondering. It marked the invention of the literary introvert, and Jane Austen was one of her most ardent promoters.

In the graph below, you can see a plot of main characters from well-known books based on their qualities of extroversion versus introversion. The further right and up you go, the more sociable and communicative the book's protagonist (i.e., extroverted); the further left and down you go, the more cogitative and perceptual (i.e., introverted). Prize-winning novels tend to have more cogitative characters, while bestselling novels like The Martian tend to have more talkative and social heroes and heroines. Plots like this one let us see the personality dispositions of fictional characters. They can help you see whether your novels are creating something new, like Jane Austen once did, or whether you're following the pack.

Do you want to break the mold or conform to expectations? Data can help you understand what the books on your list are doing.

How diverse are your characters?

We can also try to ask questions about diversity and representation. What's the likelihood of finding a strong female protagonist on your list? Do your characters tend to be representative of one type of social class, race, or ethnic background? What about sexuality?

The story we have to tell about these issues during our Tech Forum talk is unfortunately not a pretty one. Across all seven genres we surveyed, which encompass over 1,300 works of fiction published in the past decade, women have been routinely under-represented in novels. The likelihood of encountering a female protagonist is about 39%. The likelihood of encountering two leading female characters? 12%. The further down the list of importance you go in a novel, the fewer women you are likely to encounter. Even the extras are biased.

However, the point of our talk is not all doom and gloom; we want to emphasize that we can use this knowledge to do something about these problems. Data is useful because it makes us more self-aware about behaviour that, though habitual, produces outcomes we wouldn't actively want to strive for. In other words, we know we can do better.

So what's in your list?

To hear more from Andrew Piper and Eve Kraicer on Using Data to Reveal Gender Bias in Contemporary Fiction, register for Tech Forum, March 23, 2018 in Toronto. You can find more details about the conference here, or sign up for our mailing list to get all of the conference updates.