SRPP: Coarticulatory variability of vowels in Hungarian as measured in quality shift and across context variability

Collaborative work with Márton Bartók (1,2), Tamás Gábor Csapó (2,3), Tekla Etelka Gráczi (2,4), Kornélia Juhász (1,2) and Alexandra Markó (1,2)

(1) Eötvös Loránd University, (2) MTA-ELTE "Lendület" Lingual Articulation Research Group, (3) Budapest University of Technology and Economics, (4) Research Institute for Linguistics HAS

In most studies, the variability that vowel-to-vowel coarticulation induces in vowels is quantified as the distance between vowel realisations (expressed in articulator positions or formant values) in a coarticulated context, e.g., [ihu], and a neutral context, e.g., [uhu], that is, as quality shift. This so-called distance measure has been found to be conditioned by several parameters, e.g., the quality of the target vowel (especially openness), the prosodic prominence of the target and the trigger vowels (which modulate coarticulatory resistance and coarticulatory aggression, respectively), and the direction of coarticulation. Results on other factors, for instance the density of the (phonological or phonetic) vowel space of a given language, seem to be contradictory. Theoretically, however, contextual variation of vocalic segments may just as well be detected from the distribution of the values of some acoustic and/or articulatory parameter across contexts, that is, from the standard deviation, which is also commonly used to visualize vowel distribution in the two-dimensional (acoustic and, in some cases, articulatory) vowel space. In my talk I present a series of studies in which this second type of variance measure, across-context variability or, as we refer to it, the dispersion measure, was explored and compared to the distance measure in Hungarian. In these studies, we tested whether dispersion is also modulated by the prosodic prominence of the target and the trigger vowels, and by the direction of coarticulation. We obtained simple acoustic and articulatory measures of /i/ and /u/ realisations and analysed them in parallel. Our findings suggest that across-context variability (i.e., dispersion) captures a different aspect of coarticulatory variation than the distance measure does. The diverging results that emerge from the analysis of the two measures raise the question of which of the two is sufficient to capture coarticulatory resistance and coarticulatory aggression reliably.
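
To make the contrast between the two measures concrete, the following is a minimal sketch and not the authors' actual procedure. It assumes that vowel quality is summarised as (F1, F2) midpoint values in Hz, that the distance measure is the Euclidean distance between the context centroids, and that the dispersion measure is the per-formant standard deviation over tokens pooled across contexts. The token values and the function names are invented for illustration.

```python
import math
import statistics

# Hypothetical (F1, F2) values in Hz for target /u/; invented for illustration only.
tokens = {
    "uhu": [(310, 800), (320, 780), (300, 820)],    # neutral context
    "ihu": [(300, 1000), (310, 960), (320, 1040)],  # coarticulated context
}

def centroid(points):
    """Mean F1 and mean F2 of a list of (F1, F2) tokens."""
    return (statistics.fmean(p[0] for p in points),
            statistics.fmean(p[1] for p in points))

def distance_measure(neutral, coarticulated):
    """Quality shift: Euclidean distance between the two context centroids."""
    (f1_n, f2_n), (f1_c, f2_c) = centroid(neutral), centroid(coarticulated)
    return math.hypot(f1_c - f1_n, f2_c - f2_n)

def dispersion_measure(contexts):
    """Across-context variability: sample SD of F1 and F2 over all pooled tokens."""
    pooled = [p for points in contexts.values() for p in points]
    return (statistics.stdev(p[0] for p in pooled),
            statistics.stdev(p[1] for p in pooled))

shift = distance_measure(tokens["uhu"], tokens["ihu"])
sd_f1, sd_f2 = dispersion_measure(tokens)
print(f"distance: {shift:.1f} Hz, dispersion: F1 {sd_f1:.1f} Hz, F2 {sd_f2:.1f} Hz")
```

The distance measure needs a designated neutral context, whereas the dispersion measure treats all contexts alike, which is one reason the two may pattern differently.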

SRPP: The functional load of stress in Kambaata (Cushitic)

It is generally agreed that Cushitic languages have grammatically determined tonal accent or stress. However, in the available grammars of individual languages, the functional load of accent or stress is often described only superficially, and at times the description does not go beyond a few illustrative minimal pairs. This paper sets out to describe the stress system of Kambaata, a Highland East Cushitic language of Ethiopia, in more detail. Every Kambaata word has one prominent syllable. Stress has (almost) no lexical importance (exception: stress on interjections and ideophones). Instead, the realisation of stress on nouns, verbs and adjectives is determined by the inflectional categories and values for which a word is marked. The stems of nouns, verbs and adjectives are unspecified for stress; stress is imposed by inflectional morphemes. All inflectional morphemes in Kambaata but one have a segmental as well as a suprasegmental realisation. In my talk, I propose a typology of Kambaata inflectional morphemes according to where they realise stress in a word. After presenting the general features of the Kambaata stress system, I present two case studies: (i) I demonstrate the importance of stress for case marking and (ii) I discuss relativisation in the imperfective and perfective aspect, which is marked by a stress-only morpheme.

SRPP: Variability and central tendencies in speech production

Speech is notoriously variable, but our understanding of this variability continues to evolve. Variability has typically been taken as an indication of failure to reach a desired target due to physical or neurological limits. However, it is likely that some variability is beneficial, an effect that has been found in other domains. Part of the effort to separate beneficial from destructive variability must be to understand the distribution of values around a speech target. One aspect that is commonly measured is the standard deviation of some objective aspect of speech. The standard deviation is most meaningful for normal distributions, and the assumption in speech research has been that values are indeed normally distributed. This has not been rigorously tested, however, as a test of normality requires a large number of samples (some studies suggest a minimum of 200) to determine whether the data are normally distributed or not. Speech research (and, indeed, most research with humans) seldom reaches such numbers for a consistent environment. Here, an initial estimate based on 300 repetitions of English words by a single speaker is presented. The words were pseudo-randomized with an equal number of filler items, so that immediate repetitions (and the neural and physical fatigue that repetition can cause) were avoided. One hundred trials were collected on each of 3 days. Words were chosen to have very little coarticulatory influence (“heed,” “ode”/“owed”) or sizable coarticulatory influence (“geek,” “dote”). Measurements of vowel formants at acoustic midpoints indicated that the distributions were indeed normal. This was true even of the high coarticulatory environment, which some theories predict should be skewed by the vowel’s reaching the edge of an acceptable region. The current results indicate that vowel targets are consistent for different environments. Further, the range of the distributions was quite similar across the two types of environment, being, for example, about 100 Hz for F1. The amount of variability is fairly substantial but can be presumed to be beneficial, as all items were heard correctly. The normality of the distribution nonetheless indicates a control structure that accommodates the coarticulatory environment at the level of planning.
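
The abstract does not say which normality test was applied. As an illustration only, the sketch below uses the Jarque-Bera test, which checks whether sample skewness and excess kurtosis are compatible with a normal distribution. The simulated F1 values, the seed and the function name are assumptions made for the example, not data from the study.

```python
import math
import random

def jarque_bera(values):
    """Return (JB statistic, asymptotic p-value) for a sample.

    JB = n/6 * (S^2 + K^2/4), with S the sample skewness and K the excess
    kurtosis. Under normality JB is asymptotically chi-square with 2 degrees
    of freedom, whose survival function is exp(-x/2).
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)

# Simulated stand-in for 300 F1 midpoint measurements (Hz); not real data.
rng = random.Random(0)
f1_values = [rng.gauss(300.0, 20.0) for _ in range(300)]

jb, p = jarque_bera(f1_values)
print(f"JB = {jb:.2f}, p = {p:.3f}")  # a small p would argue against normality
```

Because the p-value is asymptotic, it is only trustworthy for large samples, which is consistent with the abstract's point that normality testing needs a few hundred tokens per environment.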

SRPP: Prosodically-driven harmony in Strict CV: a Celto-Semitic case

In Barra Gaelic (BG; Bosch & de Jong 1997), stress (underlined) is generally word-initial and correlates with a high tone: [aHran] ‘bread’. A harmonic (or “copy”) epenthetic vowel is inserted in the environment /#(C)VC1_C2(…)/, where C1 is any sonorant and C1 and C2 are hetero-organic: e.g. /t̪ɔrɣ/ => [t̪ɔrɔHɣ] ‘fishing line’. As shown in the transcription, this epenthetic vowel is doubly interesting: i. it carries the high tone despite being peninitial, and ii. it is as stressed as the initial vowel.

In Modern Hebrew (MH), stress is generally final: [mufsak] ‘begin.pass.prtc’. A harmonic process transforms [a] to [e] before a word-final unstressed sequence [eC]: [mufsek-et] ‘begin.pass.prtc-fm’. The unstressed [e] must be analyzed as epenthetic/weak. Like BG peninitial epenthesis, the MH case is typologically strange: the weak, unstressed vowel triggers harmony on the lexical stressed one.

We propose a Strict CV account of both patterns that highlights their similarities. In both languages, a prosodic domain must be edge-aligned (to the left in BG, to the right in MH). When epenthesis obliges the aligned domain to span two V-slots, two effects follow. First, its non-aligned edge must also be marked: this is done by H in BG and by stress in MH. Second, the span of two V-slots must be signaled by harmony.

SRPP: Clear, clearer, charismatic? – On the role of pronunciation for perceived speaker charisma

It is one of the oldest assumptions in rhetoric that clear pronunciation is positively correlated with a speaker’s persuasive impact on listeners. For example, both rhetorical manuals and trainers emphasize the importance of “a clear enunciation of every word and every syllable” (Jahnke 2011: 91, Mortensen 2011) and urge speakers to pay special attention to consonants in this matter (Barker 2011). The present talk contrasts this traditional rhetorical assumption with experimental phonetic evidence. Results suggest that the rhetorical assumption is not fundamentally wrong, but oversimplified in various respects. Overly clear pronunciation (e.g. the lack of even common phenomena of speech reduction) can backfire on the speaker, making him/her sound arrogant and vain rather than charismatic. In addition, vowels are by no means negligible, and the way their pronunciation relates to the speaker’s perceived charisma is complex and formant-specific. On this basis, the talk concludes by presenting a new tool (MARRYS, mandibular action related rhythm signals) that is currently being developed and tested for pronunciation-oriented speaker-charisma training.

Why is speech special? An efficient neural coding perspective

Due to COVID-19, this SRPP will take place online

It has been proposed that, over the course of evolution, perceptual systems have come to encode external signals in the most efficient way possible. This has been amply demonstrated in the visual system, and more recently, evidence is emerging that the auditory system might also obey principles of efficiency. This talk will present a series of behavioral and brain imaging studies in which we investigate different aspects of efficient coding of speech and other natural sounds in the developing human auditory system. I will also present information-theoretic analyses of the statistical structure of speech in different languages, suggesting that linguistically relevant properties have important acoustic correlates in the speech signal when the latter is analyzed through principles of efficiency.

Distinctive features and articulatory gestures of Hadza and Iraqw non-pulmonic consonants

Due to COVID-19, this SRPP will take place online

Hadza (a Khoesan language) and Iraqw (a Cushitic language) have non-pulmonic consonants, clicks and ejectives, in their phonetic/phonological inventories. These consonants call for a precise articulatory and acoustic description so that they can be interpreted and formalized in terms of articulatory gestures and distinctive features. A further aim is to understand the mechanisms by which these segments are produced, and how this type of data tests, or extends, the limits of our knowledge about the diversity of speech production across languages.

Sands, Maddieson and Ladefoged (1996) and Sands (2013) described Hadza as having 9 clicks, whereas Miller (2008) suggests that there are 12. Recent data show that the language has 16: [ʘ̰, |, |ʔ, |h, ŋ|, ŋ|ʔ, !, !ʔ, !h, ŋ!, ŋ!ʔ, ‖, ‖ʔ, ‖h, ŋ‖, ŋ‖ʔ]. One of the fundamental questions raised by clicks is how to describe and formalize them from an articulatory and acoustic point of view. The 4 Hadza click types, bilabial, dental, alveolar and lateral [ʘ, |, !, ‖], can be accompanied, contrastively, by aspirated, glottal and nasal features, and these features are sometimes even combined. A click can thus be a sequence and a superposition of gestures and features, for example nasal, lateral and aspirated, [ŋ‖̰h]. Acoustically, clicks are described with 2 features, [grave vs. acute] and [abrupt vs. noisy], following a proposal by Traill (1985). The Hadza dental [|] and alveolar [!] clicks are [grave] and noisy [|] or abrupt [!]. The lateral click [‖] is [grave and acute]. This description is derived from an examination of acoustic spectra taken at the release of the clicks.
An important point concerns the biomechanics of click production, in which the articulator movements resemble swallowing gestures, but without a food bolus. The opposite gesture is that of ejective consonants, in which the larynx rises with the glottis closed. Hadza has 6 of them: [pʼ, tsʼ, tʃʼ, cʎ̥ʼ, kxʼ, kχʷʼ]. Acoustically, the high intensity of the release noise of ejectives makes them comparable to clicks. Their acoustic features follow a gradation from [grave] to [acute], with [±noisy] variation in the release. The labialization of the ejective [kχʷʼ] is an intrinsic characteristic and not a secondary articulation. The gesture of the labialized ejective (like that of the other labialized consonants of Hadza and Iraqw) differs from the rounding and protrusion found with the labio-velar approximant [w]. The labialized ejective [kχʷʼ] shows a lip approximation gesture, whereas the labio-velar approximant [w] shows protrusion and rounding. Comparing the Hadza and Iraqw ejectives reveals important details of the production mechanisms of these sounds. The articulatory gesture of Iraqw ejectives involves an initial, almost horizontal movement of the larynx, caused by the activity of the inferior pharyngeal constrictor. This movement precedes the raising of the larynx, which is often very marked in this language. Palatographic data further suggest a reinforcement of the supralaryngeal constriction, as for the alveolar lateral ejective [tɬʼ] of Iraqw and the palatal lateral ejective [cʎ̥ʼ] of Hadza. The temporal organization of the gestures involved in producing Hadza and Iraqw non-pulmonic consonants makes it difficult to describe these segments in terms of articulatory gestures alone. Combining the gestures involved in their production with an acoustic description of their main features allows this type of segment to be categorized more robustly.

Miller, K. (2008). Hadza grammar notes. Riezlern.
Sands, B. (2013). Hadza. In R. Vössen (ed.), The Khoesan Languages. London: Routledge.
Sands, B., Maddieson, I. & Ladefoged, P. (1996). The phonetic structures of Hadza. Studies in African Linguistics 25(2), 171-204.
Traill, A. (1985). Phonetic and phonological studies in !xóõ. Hamburg: Buske.

Prosodic structure as an interface between rhythmic and intonational patterns

Most studies on prosodic structure assume two or three levels of constituency above the prosodic word: the accentual phrase (also named minor phrase or clitic group), the intermediate phrase (also named major phrase or phonological phrase) and the intonational phrase. The different names assigned to these units often reflect distinct perspectives on prosodic structure, among which we may distinguish an intonation-based approach and a grammatically-driven approach. Because of these differences, there are endless debates on the validity of the various units.

In this talk, based on an analysis of French prosody and on an examination of the intermediate phrase, we will argue for an approach that clearly distinguishes between metrically and intonationally-based prosodic units. First, we will clarify the extension and status of the intermediate phrase so that it can be considered essentially a metrically-driven prosodic unit. Second, a distinction will be made between this metrically-driven phrase and two types of intonational phrases on the basis of the intonational contours occurring at their right edge.

This proposal is based on (a) the inventory and possible realisations of the contours at the right edge of these phrases, and (b) their relation with the morpho-syntactic and semantic structures. Note that our proposal accounts for phrasing and intonation contour choice at the underlying phonological level, the way the contours are realized being seen as resulting from choices made in other parts of the grammar and from performance factors.

Relative vs. absolute orientation in sign language: The case of two-handed signs

The segmental phonology of sign language is currently modeled with feature geometry and dependency relations. These models typically assume three phonemic classes as primitives (handshape, place of articulation and movement), and derive a fourth, orientation, from the interaction between handshape and place of articulation. Current sign language models treat orientation as a relation between a hand-part and a plane of articulation. Defining orientation in this relative way makes it possible to dispense with reference to the body as a landmark.
The goals of this study are i) to provide evidence that absolute orientation is needed in addition to relative orientation in order to capture the phonology of some signs, and ii) to minimally enrich current models, which are based only on relative orientation, so that the phonology of these “exceptional” signs is also accounted for.

We use French Sign Language symmetrical two-handed signs produced on the body, such as BELT, BONE, TABOO and UNEMPLOYMENT, as a case study. We show that relative orientation does not meet descriptive adequacy when the two hands contact each other. Relative orientation can capture either the contact between the hands or the contact with the body, but not both. We propose secondary planes as a formal step to model orientation for these signs. While implementing this solution requires minimal changes to current theories, its impact on the theory of segmental phonology for sign as a whole is considerable. The core conceptualization of orientation as a purely relational phonemic class no longer holds (at least not for these signs), as secondary planes impose geometrical restrictions that force absolute orientation.

The phonetic basis of speech preparation

Silent phases before speech initiation are often seen as the time interval during which the utterance is planned. In most studies on pauses, the focus is on cognitive and linguistic factors such as word frequencies or utterance complexity. The aim of our study is to investigate how phonetic factors affect these silent phases. In particular, we are interested in the physiological aspects of speech initiation such as breathing, articulatory posturing and the coordination of breathing and oral gestures. Pilot studies from three areas will be presented here: (1) the effect of breathing on reaction time, (2) the coordination of respiratory activity and breathing during interspeech pauses and (3) the effect of answer type on gap duration in dialogues.