Automatic speech recognition (ASR) is increasingly used, e.g., in emergency response centers, domestic voice assistants, and search engines. Because spoken language plays such a central role in our lives, it is critical that ASR systems can handle the variability in the way people speak (e.g., due to speaker differences, demographics, speaking styles, and differently abled users). ASR systems promise to deliver an objective interpretation of human speech. Practice and recent evidence, however, suggest that state-of-the-art (SotA) ASR systems struggle with the large variation in speech due to, e.g., gender, age, speech impairment, race, and accent. The overarching goal of our project is to uncover bias in ASR systems in order to work towards proactive bias mitigation. In this talk, I will present systematic experiments aimed at quantifying, identifying the origin of, and mitigating the bias of SotA ASR systems on speech from different, typically low-resource, groups of speakers, with a focus on bias related to gender, age, regional accents, and non-native accents.