SRPP: Language- and speaker-specific variability in anticipatory coarticulation

In this talk, I will report on our recent work, which seeks to advance our understanding of the temporal extent of anticipatory coarticulation, its cross-linguistic variation, and its variation between individuals of the same language community. Some studies have linked language-specific variation in coarticulation to phonological contrast (e.g., Manuel 1990), yet the evidence for this has been equivocal (Scarborough et al. 2015). Moreover, it is not well understood how language-specific effects relate to variability in coarticulation between individual speakers. For instance, Noiray et al. (2011), in a study on anticipatory lip rounding in English and Canadian French, found no evidence of systematic patterns by language, even though French, unlike English, contrasts rounding on vowels. They therefore concluded that the implementation of anticipatory lip rounding is speaker-dependent rather than language-specific.

We have undertaken a large-scale comparative study of contextual vowel nasalization and lip rounding in three languages. We recorded nasalance and ‘blue lip’ video data (Lallouache 1991) for French, German, and American English, with 27-30 speakers per language. In English, neither nasality nor rounding is contrastive for vowels; in French, both are contrastive; German contrasts lip rounding only. This allows us to study, for the same speakers, the coarticulatory behavior of two independent articulators in the presence or absence of a phonological contrast (across languages). Our results confirm the presence of language-specific effects for both articulators, despite a considerable range of individual variation within each language. Contrast possibly serves to constrain between-speaker variability rather than the temporal extent of coarticulation. Notably, in all three languages, the temporal extent of coarticulation exceeds what current theories of coarticulation allow us to predict.

References
Lallouache, M. T. (1991). Un poste « Visage-parole » couleur. Acquisition et traitement automatique des contours des lèvres. PhD thesis, Institut National Polytechnique de Grenoble.
Manuel, S. (1990). The role of contrast in limiting vowel-to-vowel coarticulation in different languages. Journal of the Acoustical Society of America, 88(3), 1286-1298.
Scarborough, R., Zellou, G., Mirzayan, A., & Rood, D. S. (2015). Phonetic and phonological patterns of nasality in Lakota vowels. Journal of the International Phonetic Association, 45(3), 289-309.

Phonetics and phonology in Romance and beyond

This workshop is dedicated to studies at the phonetics/phonology interface in any language, inspired by approaches to Romance.

Its goal is to highlight the contribution of research on the phonetics and phonology of Romance languages to the broader empirical and theoretical development of the field, extended typologically beyond the Romance family. In particular, presentations and discussions during the workshop should showcase the contributions of research on speech and sound structure in the Romance languages to the advancement of the laboratory phonology program.

Location
17 rue de la Sorbonne 75005 Paris
Amphithéâtre Louis Liard

The workshop is free, but registration is required. Please follow the instructions under the ‘Registration’ tab on the conference website.

Speakers

  • Margaret E. L. Renwick – University of Georgia: « Impacts of structure, usage and phonetics on Italian mid vowels »
  • Marc Brunelle – University of Ottawa: « An ultrasound study of cavity expansion during Canadian French voiced obstruents »
  • Ander Egurtzegi – CNRS, IKER UMR 5478: « An « impossible » opposition: /h/ vs. /h̃/ in North-Eastern Basque »
  • Juliette Blevins, Michela Cresci – CUNY Graduate Center ; Liceo « Camillo Golgi »: « Variant patterns of sibilant debuccalization in Camuno: Phonetic and phonological implications of *s > h in Valcamonica »

Local committee
Ioana Chitoran
Cécile Fougeron
Anne Hermes

SRPP: Vowel frication in two Cameroonian languages

By most definitions, vowels are produced by vocal tract configurations which do not produce supralaryngeal frication noise when they are voiced. This definition has persisted even though high vowels with apparent aperiodic noise generated upstream of the vocal folds are attested. Beyond those cases occurring intermittently and incidentally due to coarticulation with consonants, vowels which are produced with strong frication, and seemingly articulated to enhance this frication, are also attested (e.g. “apical vowels”, Shao & Ridouane 2023; “fricative vowels”, Connell 2007). In short, some vocoids appear to have targets for the production of frication. In this talk, we present data from two Bantoid languages of Cameroon, Kom and Mundabli, which exhibit vowel quality contrasts that seem to be based in part on aperiodic noise. These are the first acoustic-phonetic studies of either language, focusing on the implementation of contrasts between canonical high vowels such as /i ɨ u/ and the phonemically contrastive fricated vowels, which range from more vowel-like /i̝ u̝/ as in Mundabli to more fricative-like /z̩(ɨ) v̩(ɨ)/ as in Kom. We use zero-crossing rate (ZCR) to characterize the frication, in addition to providing formant frequency and bandwidth data. Results show that each language’s contrasts involve both formant- and frication-related distinctions, with substantial between-language differences in the degree and timing of frication production, and with formant frequencies seemingly unreliable in some cases. We take these data as further evidence that some vocoids have inherent targets for frication, and that languages may exhibit a range of different frication targets on vocoids, not unlike other fine-grained subphonemic differences observed across vowel systems (Disner 1983).
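
As a rough illustration of the zero-crossing rate mentioned above, the following is a minimal sketch of a generic, textbook-style ZCR computation. The abstract does not specify how ZCR was computed in the study (frame size, windowing, filtering), so the function name `zero_crossing_rate`, its parameters, and the toy signals are assumptions made for illustration only. The intuition is that noisy, fricative-like frames tend to change sign far more often than periodic, vowel-like frames.

```python
import math


def zero_crossing_rate(samples, sample_rate):
    """Return sign changes per second for one frame of samples.

    Hypothetical helper: a plain textbook definition, not the
    procedure used in the study described above.
    """
    if len(samples) < 2:
        return 0.0
    crossings = 0
    prev = samples[0]
    for s in samples[1:]:
        # Count a crossing when adjacent samples lie on opposite sides
        # of zero (a value of exactly zero is treated as non-negative).
        if (prev < 0 <= s) or (s < 0 <= prev):
            crossings += 1
        prev = s
    # N samples span N - 1 sampling intervals.
    duration = (len(samples) - 1) / sample_rate
    return crossings / duration


# Toy comparison (synthetic signals, not speech data):
sr = 8000
# A 100 Hz sinusoid stands in for a periodic, vowel-like frame.
tone = [math.sin(2 * math.pi * 100 * n / sr + 0.25) for n in range(sr)]
# A sample-by-sample sign flip stands in for an extremely noisy frame.
flip = [1.0 if n % 2 == 0 else -1.0 for n in range(sr)]

print(zero_crossing_rate(tone, sr))
print(zero_crossing_rate(flip, sr))
```

In practice, ZCR is usually computed over short overlapping frames so that the timing of frication within a vocoid can be tracked; the single-frame version here only shows the core calculation.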

SRPP: Synthesized speech as a tool in speech perception research

Much of the research investigating the impact of prosody on the perception of a speaker’s attitudes and characteristics has relied either on attempts to elicit prosodic features in read speech or on artificial manipulation of recorded audio. Our novel method of implicitly controlling prosody in synthesized spontaneous speech provides a powerful tool for studying speech perception. It can provide better insight into the interacting effects of prosodic features on perception, while also paving the way for conversational systems that can engage in and respond to social behaviors more effectively. I will discuss the work presented at Interspeech 2022, which examined the combined impact of filled pause location, speech rate and f0 on the perception of speaker confidence. I will also discuss some more recent work on the perceptual impact of disfluencies, as well as goals and ideas for future work.

SRPP: Prosodic cues in broad, narrow and contrastive focus in Persian

Focus and its effects on sentence prosody have been the subject of numerous studies in different languages. Prosodic focus in Persian has been addressed in some previous experimental studies. These studies, however, investigated the influence of only one focus condition, namely contrastive focus, on the production of an utterance. In addition, the methods used to elicit this focus condition are mostly based on read speech and may not mirror the processes involved in spoken interaction. To fill this gap, the current research examines the prosodic cues in different focus types in Persian, namely broad, narrow, and contrastive focus (accented), in relation to the background condition (unaccented). We address this issue in a detailed analysis concentrating on both laryngeal articulation (intonational f0 movements) and supralaryngeal articulation (lingual and labial articulation of consonants and vowels). In this colloquium, the initial results from the acoustic part of the project will be presented and discussed in light of the existing literature. This collaborative study is part of SFB 1252 Prominence in Language.

SRPP: The role of phonetic reduction in the expression of social closeness: An acoustic study of Quebec French oral vowels in different communicative situations

This thesis analyzes how speakers adapt their speech to the interlocutor, through the lenses of hyper- and hypoarticulation and of accommodation. We analyze 140,436 vowel tokens produced by 20 speakers of Quebec French in order to observe where they fall in terms of phonetic reduction, depending on the task performed and the interlocutor with whom it is performed. Each of the 20 speakers read words in isolation and in carrier sentences, and performed a spot-the-difference task on pairs of images alone, with their partner, with an unfamiliar male interviewer from Quebec, and with an unfamiliar female interviewer from France. Analysis of the vowels using four spectral measures and one temporal measure shows the following overall patterns (from most to least hypoarticulated): Interaction < Alone < Reading, and Couple < Local Stranger ≤ French Stranger. An analysis of four types of rhythmic measures also shows that speech is slower when the social distance between interlocutors is greater and in the least interactive tasks. Hypoarticulation therefore appears to be a possible means of conveying information about social distance, through adaptation to the type of interlocutor with whom the interaction takes place.

SRPP: Resolving the bouba-kiki effect enigmas

One central property of natural languages is the arbitrariness of the sign: the sounds of words do not generally inform about their meaning. Still, systematic sound symbolic exceptions to this principle, found in a wide range of languages’ lexicons, have always intrigued scientists and non-scientists. The so-called “bouba-kiki effect” is the textbook case for sensitivity to iconic sound symbolism. When presented with a sound–shape matching task, perceivers almost systematically associate auditory pseudowords such as “bouba” with round or smooth shapes, and others, such as “kiki”, with spiky or angular shapes. Since its first report, the bouba-kiki effect has been robustly replicated across languages, cultures and stimuli, suggesting that it relies on universal cues.
Yet two intriguing problems remain. First, the stimulus parameters involved in this effect are still unclear. I will first show, in a series of experimental studies in adults, that both consonants and vowels appear to play a role in the bouba-kiki effect, though consonants to a larger extent. Then, by combining a meta-analysis of independent findings with computational modeling, I will show that this effect mostly relies on two independent non-speech acoustic parameters.
The second issue concerns the underlying mechanisms that could explain such audiovisual associations, and whether the bouba-kiki effect is innate or can be learned through exposure to audiovisual regularities in the environment. Although the effect is very robust in adults and shared by most languages and cultures, I will present a series of cross-modal experiments and a second meta-analysis on infants suggesting that it is learned. Then, I will present a mathematical demonstration combined with a series of intuitive physics experiments in adults, evidencing the physical principles that create universal audiovisual regularities in the environment and underlie the bouba-kiki effect. Finally, I will discuss how this causal mechanistic account provides a complete and coherent resolution of the bouba-kiki effect enigma. I will conclude by raising some new perspectives on how language and human multimodal perception of natural scenes are intrinsically linked.

SRPP: Foreign language acquisition of novel L2 segments

In this talk, I will discuss results from two studies that look at how learners acquire perceptually similar sibilant fricatives. When language learners acquire a new language, they are also charged with the task of learning novel segments. In cases where the novel segment is perceptually similar to an L1 segment, acquisition is often slowed or even blocked (Flege 1995; Best & Tyler 2007). My goal for this research project is to better understand L2 acquisition processes and to examine how the link between perception and articulation plays a role in second language acquisition.

In the first study, I examined the acquisition of the three-way sibilant contrast, /s, ʂ, ɕ/, by L2 learners of Lower Sorbian. In this context, there are no remaining L1 speakers who are teachers, the L1 population is extremely small, and L2 learners often have no access to L1 speakers. In the second study, I examined the acquisition of the three-way sibilant contrast, /s, ʂ, ɕ/, by L2 learners of Polish. In this context, regular pronunciation training has taken place in an attempt to correct speech errors. The results will be discussed in the context of models of L2 acquisition and of foreign language acquisition settings.