Lecture 3: May 31, 2 to 4 pm.
Foreign Accent Syndrome as a motor speech disorder.
Foreign Accent Syndrome (FAS) has been regarded as a speech production disorder defined in perceptual terms: an individual’s articulation changes suddenly, for example as a result of damage to the central nervous system, and their speech is perceived as foreign-accented by speakers of the patient’s own speech community. The first patient with FAS was reported in 1907 by the French neurologist Pierre Marie, who described a patient whose original Parisian French accent had changed into an Alsatian accent after a stroke affecting the left hemisphere of the brain. Since then, approximately 150 patients with this syndrome have been reported in the scientific literature.
This lecture will discuss the history of Foreign Accent Syndrome and illustrate the different taxonomic subtypes that have been distinguished. It will furthermore explore some of the salient neurological aspects of the different types. It will be argued that FAS arises from listeners misinterpreting markers of ‘state’ as ‘speech community’ markers.
In this talk I explore Breton stress from three perspectives: theoretical, experimental and historical. I start with a theoretical analysis of stress patterns across speakers from different linguistic backgrounds, encompassing both so-called ‘traditional’ and ‘new’ speakers, and challenging the notion that ‘new’ speakers use a French stress pattern when speaking Breton. I then use experimental methods to explore the concept of ‘stress deafness’, a term which was first applied to French, and consider whether speakers of Breton, a minority language, perceive and store stress patterns with greater ease than monolingual speakers. Finally, I examine a phenomenon which seems to have undergone change in the recent history of Breton, namely the placement of stress on proclitics.
In this presentation, the focus will be on the speech characteristics of people with Parkinson’s disease, highlighting the fact that speech changes should no longer be considered a late symptom but rather an early marker of motor impairment. Given the heterogeneous presentation of both gross motor and speech motor symptoms, there is considerable variability in the data. The individual progression of speech impairments and the diverse responses to treatment options continue to make it challenging to comprehensively characterize the ability to speak in people with Parkinson’s disease.
Advancements in machine learning and deep learning techniques have notably improved the capabilities of automatic speech analysis systems, enabling more precise and nuanced assessments of speech. Consequently, leveraging artificial intelligence could address recent research challenges by integrating speech features with clinical characteristics to establish robust, reliable, and objective speech-based biomarkers. This presentation will explore the potential of AI in addressing these challenges, elucidate potential data analysis methodologies, and discuss future applications in this domain.
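The fusion of speech features with clinical characteristics mentioned above can be illustrated with a minimal sketch. This is not the presenters’ pipeline: the jitter measure is a standard local-jitter formula, while the clinical score and the feature layout are hypothetical placeholders, chosen only to show what a fused feature vector for a downstream model might look like.

```python
# Hedged sketch: fusing a simple acoustic feature with a clinical
# characteristic into one feature vector that an ML model could consume.
# The clinical score and feature layout are hypothetical.
from statistics import mean

def jitter(periods):
    """Mean absolute difference between consecutive glottal periods,
    normalised by the mean period (a classic local-jitter measure)."""
    diffs = [abs(a - b) for a, b in zip(periods, periods[1:])]
    return mean(diffs) / mean(periods)

def fused_features(periods, clinical_score):
    """Concatenate acoustic features with a clinical characteristic
    (e.g. a motor-scale rating); the fusion itself is the point."""
    return [jitter(periods), mean(periods), clinical_score]

# Example: slightly irregular glottal periods (in ms) plus a clinical score.
vec = fused_features([5.0, 5.1, 4.9, 5.2, 5.0], clinical_score=23)
```

In a real system the acoustic part would of course contain many more features (spectral, prosodic, articulatory), but the principle of concatenating speech-derived and clinical dimensions is the same.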
(1) An acoustic-articulatory description of French ‘R’
Andrés Felipe Lara (Laboratoire de Phonétique et Phonologie)
The acquisition of rhotics poses significant challenges for individuals of all age groups, including children and adults. These challenges arise from the variable acoustic properties and intricate articulation processes associated with rhotics. The aim of my research is to tackle this complexity by establishing a quantifiable description of the production of French ‘R’ among native French speakers. Through the application of quantitative acoustic and articulatory analysis, my objective is to lay the groundwork for future comparisons with bilingual adult speakers and late learners of French. In this presentation, I will strive to establish a correlation between specific acoustic characteristics of French rhotics and the corresponding articulatory gestures necessary for their production. This analysis will provide valuable insights into the difficulties that bilingual speakers and language learners may encounter when acquiring the French rhotic.
(2) Processing of prosodic boundaries in French: behavioural and electroencephalographic studies
Lei Xi (Laboratoire de Phonétique et Phonologie)
Prosody serves many important functions in verbal communication, including segmentation and disambiguation. In this talk, we will present two experiments on prosodic disambiguation of syntax in French.
In the behavioural experiment, we asked 20 native French speakers to complete a series of locally ambiguous sentences carrying two different prosodic closures (early and late), in order to determine whether they could correctly assign the target words to their syntactic functions on the basis of the available prosodic cues. The behavioural data showed that the participants, although native speakers of French, had difficulty establishing the syntactic function of the target word, particularly in the early-closure condition.
To further investigate the processing of the two different prosodic boundaries, we then conducted an electroencephalographic (EEG) experiment in which the continuous electrophysiological signal of 20 native French speakers was recorded while they listened to the same ambiguous sentences. The neurocognitive data suggest that the prosodic boundaries were parsed, as attested by the Closure Positive Shift (CPS) evoked potential, which peaked around 400 to 500 ms after the onset of the last syllable preceding the prosodic boundary. Moreover, the analysis of CPS peak latency points to the same neurocognitive sensitivity for the processing of early vs. late closure.
Taken together, the data were interpreted in light of the predictions of the Late Closure Preference and the Informative Boundary Hypothesis, and we underline the importance of a complete and informative prosodic context in speech perception.
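The CPS peak-latency measure used in this kind of analysis can be sketched very simply: find the maximum of the time-locked ERP inside a post-onset window. This is an illustrative toy, not the authors’ analysis code; the sampling rate and window bounds are assumptions for the example.

```python
# Illustrative sketch: locating the peak of an ERP component (e.g. the
# CPS) inside a time window after the syllable onset. Sampling rate and
# window values are invented for the example.

def peak_latency(erp, srate_hz, t_min_s, t_max_s):
    """Return (latency_s, amplitude) of the maximum of `erp` (one value
    per sample, time-locked to the onset) inside [t_min_s, t_max_s)."""
    i0 = int(t_min_s * srate_hz)
    i1 = int(t_max_s * srate_hz)
    window = erp[i0:i1]
    k = max(range(len(window)), key=window.__getitem__)
    return (i0 + k) / srate_hz, window[k]

# Toy ERP sampled at 100 Hz with a positivity peaking at 450 ms.
erp = [0.0] * 100
erp[45] = 2.5  # 450 ms post-onset
lat, amp = peak_latency(erp, srate_hz=100, t_min_s=0.4, t_max_s=0.5)
```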
In various languages such as Italian and Japanese, consonant duration plays a contrastive role, as seen in examples like Italian /pipa/ ‘pipe’ versus /pipːa/ ‘pipsqueak’. Traditionally, research on singleton/geminate contrasts has focused on acoustic analyses, revealing that geminate consonants exhibit longer closure and total duration than singletons. This has led to proposals representing geminates in terms of features such as [±long], association with multiple timing units, or gestures with longer activation intervals. However, recent articulatory studies, particularly those employing motor control-oriented approaches, have challenged the notion that geminates are merely longer versions of singletons. Research by Löfqvist (2005) and Tilsen & Hermes (2020) has suggested that geminates may have distinct targets and control regimes, indicating a need for a more nuanced understanding. Conversely, phonologically oriented work has taken seriously the idea that geminates may be constituted by two (overlapping) articulatory gestures (e.g., Di Benedetto et al. 2021).
In this presentation, I will share the findings of an articulatory (3D EMA) study of Italian (bilabial) singleton and geminate consonants produced at a variety of rates. The results of this work challenge the notion that geminates are purely longer versions of singletons. Even in the face of variation in speaking rate, Italian geminates display distinct kinematic profiles and kinematic parameters compared to singletons; intriguingly, geminates also display a different timing organization relative to surrounding segments. I present preliminary modeling evidence that considers the distinct spatial and temporal features of singleton and geminate stops and attempts to integrate them into a unified model of phonology and speech production. Finally, I will broaden the picture and discuss other ways in which geminates can be instantiated that are more reminiscent of longer singletons, drawing on an ongoing analysis of articulatory work on Japanese geminates.
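Two of the kinematic parameters typically compared in such studies, peak velocity and movement duration, can be computed from a sampled articulator trajectory as follows. This is a minimal sketch with toy trajectories, not EMA data from the study; the numbers merely illustrate a longer, slower geminate gesture.

```python
# Hedged sketch: peak absolute velocity and duration of a movement,
# computed from equally sampled position values. The trajectories are
# toy data (arbitrary units), not actual EMA recordings.

def kinematics(position, srate_hz):
    """Return (peak |velocity| in units/s, duration in s) for a
    movement given as equally sampled position values."""
    vel = [(b - a) * srate_hz for a, b in zip(position, position[1:])]
    return max(abs(v) for v in vel), len(position) / srate_hz

# Toy lip-aperture closing gestures sampled at 200 Hz.
singleton = [10, 8, 5, 2, 1, 1]
geminate = [10, 9, 7, 5, 3, 2, 1, 1, 1, 1]
pv_s, dur_s = kinematics(singleton, 200)
pv_g, dur_g = kinematics(geminate, 200)
```

In this toy example the geminate gesture is both longer and reaches a lower peak velocity, the sort of profile difference (rather than pure lengthening) that the kinematic comparison targets.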
I conclude by briefly outlining the importance of physiological work on geminates for developing representations that can bring phonology, phonetics, and motor control closer together.
During interaction, speakers tend to adjust the amount of coarticulatory cues to increase or decrease perceptual distances between competing speech units. Anticipatory coarticulation has also been observed in the visual-gestural modality. Despite this, little is known about the use of coarticulatory strategies in sign language. We conducted the first study investigating coarticulation in French Sign Language (LSF) using 3D Electromagnetic Articulography (EMA) to provide precise kinematic measurements of sign production. In this novel approach, a deaf native signer was recorded (EMA/video) producing phonological pairs of signs composed of ‘1’- and/or ‘3’-handshapes. Our findings demonstrate that kinematic data allow for the detection of coarticulation in various discourse contexts. Temporally, we observe the anticipation of the ‘3’-handshape before the end of its immediately preceding ‘1’-handshape sign (and vice versa). Spatially, the (repetitive movement of the) sign is affected by reduction/truncation if followed by another sign. Within a dynamical approach (Articulatory Phonology), we analyze the kinematics of our sign data as the result of systematic patterns of overlapping organization triggered by the phonological system. Based on this view, we attempt to take a step towards integrating gradient and categorical processes such as coarticulation and assimilation.
Modelling the process by which a listener derives the words intended by a speaker requires a hypothesis about how lexical items are stored in memory. This work aims to develop a system that imitates humans in identifying words in running speech and, in this way, to provide a framework for better understanding human speech processing. We build a speech recognizer for Italian based on the principles of Stevens’ model of Lexical Access, in which words are stored as hierarchical arrangements of distinctive features (Stevens, K. N. (2002). “Toward a model for lexical access based on acoustic landmarks and distinctive features,” J. Acoust. Soc. Am., 111(4):1872–1891). Over the past few decades, the Speech Communication Group at the Massachusetts Institute of Technology (MIT) developed a speech recognition system for English based on this approach. Italian is the first language beyond English to be explored; the extension to another language provides the opportunity to test the hypothesis that words are represented in memory as predicted by the theory, and to reveal which of the underlying mechanisms may be language-independent. Future developments will test the hypothesis that specific acoustic discontinuities, called landmarks, that serve as cues to features are language-independent, while other cues may be language-dependent, with powerful implications for understanding how the human brain recognizes speech. A new Lexical Access corpus, the LaMIT database, created and labeled specifically for this work, will be described; it is provided freely to the speech research community. Furthermore, as will be presented, a legacy software tool named xkl, with superior capabilities for detailed acoustic analysis of speech, developed in the 1980s by the late Dennis Klatt at MIT, was revamped and adapted to modern computing platforms. Finally, we will address a peculiar property of Italian, lexical vs. syntactic consonant gemination, as an exemplar case of the adopted research method.
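The idea of landmarks as abrupt acoustic discontinuities can be conveyed with a toy illustration. This is emphatically not the MIT system: here a “landmark” is simply a frame where short-time energy jumps by more than a threshold between adjacent frames, and the energy contour and threshold are invented for the example.

```python
# Toy illustration (not the MIT landmark detector): treat frames where
# short-time energy changes abruptly between adjacent frames as
# candidate landmarks. Threshold and contour values are invented.

def find_landmarks(energy_db, threshold_db):
    """Return frame indices where the frame-to-frame energy change
    meets the threshold, a crude stand-in for landmark detection."""
    return [i + 1 for i, (a, b) in enumerate(zip(energy_db, energy_db[1:]))
            if abs(b - a) >= threshold_db]

# Toy energy contour (dB per 10-ms frame): silence, a vowel, a closure.
energy = [20, 21, 60, 61, 62, 61, 25, 24]
landmarks = find_landmarks(energy, threshold_db=20)  # -> [2, 6]
```

A real landmark detector operates on band-specific energies and distinguishes landmark types (e.g. glottal, burst, syllabic), but the underlying intuition, locating abrupt spectral discontinuities that cue distinctive features, is the one sketched here.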
This presentation will deal with a specific case of communication disorder, namely speech and voice disorders. After defining this specific context, we will focus on the assessment of this type of disorder, which is necessary in the clinical field, and on how automatic approaches can overcome the limitations of perceptual assessment, particularly in terms of subjectivity and reproducibility. We will briefly review the classical machine learning approaches used since the 1990s and, more recently, the application of deep learning. Finally, we will look at the concept of interpretability in deep learning (as we define it) and how it can be used to provide useful information to clinicians.