Event category: SRPP
SRPP 22/05/2026 Katia Chirkova
SRPP 10/04/2026 Megan Dailey
SRPP 27/03/2026 Rasmus Puggaard-Rode
SRPP 20/03/2026 Claire Njoo
SRPP 13/03/2026 Christophe Corbier
SRPP Beyond reaction time: Articulatory evidence of perception-production link in speech using the Stimulus-Response Compatibility paradigm.
This talk will introduce our ongoing project investigating the link between perception and production in speech. In the Stimulus-Response Compatibility (SRC) paradigm, participants are typically prompted to produce a target syllable while being presented with either congruent or incongruent distractors. Responses tend to be slower in incongruent trials (i.e., the covert imitation effect), reflecting competition between perception-driven and goal-driven motor plans. The short response-distractor time lag in the SRC task design makes it well suited to studying motor system engagement upon speech perception during speech planning. Our aim is to obtain finer-grained insights into the nature of the perception-production link using electromagnetic articulography (EMA).
The discussion will draw on preliminary analyses of a subset of data from ten L1 British English speakers, using /ɹa/ and /va/ as prompt and distractor syllables. Reaction time (RT) analysis based on acoustic data shows a clear covert imitation effect for /ɹa/ but not for /va/. The timing of maximal displacement of the tongue tip (TT) for /ɹa/ and the lower lip (LL) for /va/ followed a similar pattern. Time-varying position trajectories and tangential velocity profiles, however, show evidence of TT gestural intrusion in the production of /va/ in incongruent trials (i.e., with the distractor /ɹa/). Such a between-condition difference in TT activity, despite the lack of clear congruency effects in the RT measurement, might result from a greater degree of TT activation during speech planning due to the perception of the distractor stimulus, demonstrating that motor patterns activated by observation alone might in part be executed. The implications of the behavioural results will be discussed in the light of articulatory complexity, multimodal speech perception, and cognitive sensorimotor theories.
SRPP Elements of Beja prosody (Cushitic, Sudan)
Beja, a Cushitic language of Sudan, is by now a well-known and well-described variety (Wedekind et al. 2005, Vanhove 2017, etc.). However, described as tonal by some authors (Mous 2012, Hellmuth & Pearce 2020) and as accentual by others (Hudson 1973, Mous 2022), its prosodic system remains poorly understood beyond some of its distributional properties, especially at levels above the prosodic word.
In this highly exploratory study, we offer a first descriptive analysis of the different levels of the Beja prosodic hierarchy, from moras to intonational phrases, by way of phonological phrases. Among the questions addressed along the way, beyond that of the accentual or tonal status of Beja (which will be assessed against the typology of Hyman (2006, 2009, etc.)), we will touch on some of the parameters conditioning the surface length of vowels (the language has a phonemic contrast between short and long vowels), the role that vowel sonority might play in the correlates of prominence, and the location of boundary tones. At the end of the talk, the intonation associated with questions and with focused elements will be briefly discussed.
SRPP Flapping vs. tapping in the Japanese rhotic: Evidence from X-ray microbeam and EPG corpora
One highly debated question in Japanese phonetics is whether the rhotic ‘/r/’ should be classified as a flap or a tap (Vance, 1987; Okada, 1999; Arai et al., 2007) – consonants distinguished by tangential vs. direct movement trajectories of the active articulator with respect to the passive articulator (Ladefoged & Maddieson, 1996; Derrick & Gick, 2011). Although the flap is considered to be the primary allophone in some descriptive accounts (e.g., Okada, 1999), articulatory evidence for flapping has been limited (Sudo et al., 1972 based on electropalatography, EPG) or appeared to be absent altogether (Maekawa, 2023 based on dynamic MRI). The methods used in these and other studies, however, did not always allow for an examination of the tongue tip movement or its contact with the palate with sufficient temporal and/or spatial resolution. These studies have also largely considered /r/ flanked by vowels and thus strongly coarticulated (cf. Katz et al., 2018).
In this talk, I revisit the question of flapping vs. tapping by examining fine-grained dynamics of the Japanese rhotic productions in two corpora of read speech – the X-ray microbeam (XRMB) speech production database of Japanese (Hashi, 2000, 19 speakers, 144 tokens) and the EPG Cross-Language Articulatory Database (Kochetov et al., 2017, 5 Japanese speakers, over 1800 tokens). The results show that flap realizations of the Japanese rhotic are considerably more common than previously reported. This was specifically the case in utterance-initial position, where all speakers in the XRMB corpus produced at least some rhotic tokens with a preparatory raising/retraction of the tongue tip, followed by a rapid downward/fronting movement of the articulator in the proximity of the alveolar ridge. Similarly, three out of five speakers in the EPG corpus produced many word-initial rhotics with a closure advancing from the postalveolar to front alveolar regions. In both sets of data, the following back vowels favoured flapping, while the following front vowels, as well as the intervocalic position, favoured tapping.
Based on these results, we may conclude that the Japanese rhotic is inherently a flap (cf. Okada, 1999), as this configuration is presumably its intended articulatory target. The tap allophone appears in contexts less favourable for flapping due to the stronger overlap with neighbouring vowel gestures. This is reminiscent of the variation found for North American English flap/tap allophones of /t, d/ next to rhotics and non-rhotic vowels (Derrick & Gick, 2011) and is broadly similar to the cross-linguistically common appearance of tap/approximant realizations of phonemic trills in aerodynamically unfavourable contexts (Ladefoged & Maddieson, 1996).
SRPP A long-form single-speaker real-time MRI speech dataset and benchmark
We release the USC Long Single-Speaker (LSS) dataset containing real-time MRI video of the vocal tract dynamics and simultaneous audio obtained during speech production. This unique dataset contains roughly one hour of video and audio data from a single native speaker of American English, making it one of the longer publicly available single-speaker datasets of real-time MRI speech data. Along with the articulatory and acoustic raw data, we release derived representations of the data that are suitable for a range of downstream tasks. This includes video cropped to the vocal tract region, sentence-level splits of the data, restored and denoised audio, and regions-of-interest timeseries. We also benchmark this dataset on articulatory synthesis and phoneme recognition tasks, providing baseline performance for these tasks on this dataset which future research can aim to improve upon.


