Machines telling stories: Reader perspectives on synthetic audiobook voices

18.11.2025, 14:20
20m
15-minute research paper Audiobooks and AI

Speakers

Karl Berglund (Uppsala University) Sarah Hedman-Dybeck (Uppsala University)

Description

The audiobook boom has created a new profession: the performing audiobook narrator, with its own forms of popularity and prestige. Prizes are awarded for best audiobook narration, and publishers are currently lining up for the star narrators whose voices large groups of readers appreciate.

Rapid technological development is, however, about to cause the next rupture in contemporary book publishing: AI-narrated audiobooks. Advances in text-to-speech (TTS), a subfield of generative AI, now make it possible to create realistic synthetic voices that mimic dialects and sociolects, breathing sounds, intonation, and emotions, including voice clones of specific human voices. Commercial services that convert ebooks into audiobooks are already available to publishers, and book streaming services are experimenting with AI voices in various ways (Berglund 2024: 116–118). Within the next few years, the number of AI-narrated audiobooks will skyrocket. In ten years, it is not unlikely that human-narrated audiobooks will be outnumbered by machine-narrated ones.

This research paper presents tentative results from an ongoing empirical study of the audiobook streaming service Storytel’s newly launched function “VoiceSwitcher” (Storytel 2023), which lets the user choose between the original human narration, the AI replica voice of a Swedish star narrator, and a number of generic but still stereotypical AI voices: “mature, calm Carin”, “young, energetic Amanda”, “middle-aged, honest Martin”, “soft, masculine Erik”, etc. Methodologically, the study combines data analysis of logged sessions of audiobook streaming using the VoiceSwitcher with large-scale surveys of the same individuals, focusing on their attitudes to and experiences of listening to AI voices.

Karl Berglund is an assistant professor of literature at Uppsala University, Sweden. His research lies at the intersection of the sociology of literature and cultural analytics, and spans popular genre fiction, publishing and reading studies, translation studies, and computational literary analysis.

In 2020–2024, Berglund was PI of the cross-disciplinary, SRC-funded project “Patterns of Popularity”, in which he investigated the ongoing audiobook boom in the Nordic countries from various angles, in particular through large-scale analysis of streamed audiobook data. In 2025–2030, he is a project member of the SRC-funded research environment “VOICE. AI-created voices. Legal and societal perspectives”, where he will study the implications of the introduction of synthetic voices for audiobook reading and publishing.

He is the author of Reading Audio Readers: Book Consumption in the Streaming Age (Bloomsbury Academic 2024), the first book to examine audiobooks from within the world of book streaming and user data. His writing has appeared in PMLA, Translation Studies, Journal of Cultural Analytics, European Journal of Cultural Studies, Public Books, and other publications.

Sarah Hedman-Dybeck is a PhD student in Comparative Literature at Uppsala University. Her research interests lie in the sociology of literature, reading studies, and publishing studies. She is especially engaged in questions concerning AI and its impact on reading practices, literary production, and distribution.

Primary authors

Karl Berglund (Uppsala University) Sarah Hedman-Dybeck (Uppsala University)
