Rachid Riad (ENS)

12 April 2019, 14h00-15h30

This paper discusses a new framework for evaluating disfluency detection in speech and natural language processing. We argue for the supervised identification of two tracks of communication, the primary and collateral tracks, inspired by the theory of performance in H. Clark's Using Language.
This setting enables a direct quantitative comparison between the detection techniques of the Natural Language Processing (NLP) and Speech Technologies (ST) communities. It also provides metrics for comparing models that can be used by speech pathologists, HCI engineers, and (psycho)linguists, who may each have different needs with respect to the model predictions.
Finally, we tackle disfluency identification in the stuttered speech of adults in the context of semi-directed interviews. We compare word-based and frame-based predictions, using semantic, span and acoustic-prosodic information.
