In the first part of this talk, I will introduce DoReCo, an initiative to create a multilingual reference corpus comprising at least 10,000 words for each of at least 50 languages. From fieldwork-based language documentation collections, DoReCo extracts narrative texts that are already transcribed, translated into a major language, and morphologically analyzed. Within DoReCo, we convert these data to a common file format and time-align them at the phoneme level using the MAUS software. In the second part of this talk, I will present two cross-linguistic studies on a subset of this corpus. The first investigates word lengthening as a function of utterance-final position. The second, still ongoing, investigates pause probabilities before nouns vs. verbs and relates the findings to the fact that, typologically, nouns carry fewer prefixes than verbs.


