Representational considerations in models of sound change
Rebecca Morley (Ohio State)
Tuesday 25 April 2017, 11:00–12:30
1.17 Dugald Stewart Building
One view of phoneme split takes it to be the result of divergent phonetic variants (e.g., Janda and Joseph 2003). Closely tied to this view is the hypothesis of iterativity: socially motivated phonetic exaggeration accumulating over successive generations (e.g., Labov 1972, Guy 1980), or progressive reduction of frequent words over time (Phillips 1984, Bybee 2002). Iterativity is often assumed to be an inherent property of exemplar models. In a typical scenario, production starts with the selection of a token from the desired category. The token is then reduced, lenited, or otherwise altered in some way, resulting in a new phonetic token. The new token is added back to the cloud of stored tokens, and the process starts over again (see Pierrehumbert 2001). Via this production-perception loop, words can be reduced two or more times with respect to the originating token. As more frequent words are more often produced, the chances of multiply reduced tokens are higher.

However, contrary to expectation, this mechanism does not consistently result in shorter word lengths for high-frequency than for low-frequency words. If frequency of occurrence is expressed in number of tokens, and sampling for production is random, then producing a less reduced token is also more likely in high-frequency than in low-frequency categories. And regardless of whether tokens decay or are replaced, the low-frequency category will eventually ‘catch up’ with the higher-frequency category, and all words will achieve some optimal length. In fact, the production side of this model makes even more problematic predictions. If phoneme-level tokens are selected at random from a phonetically detailed exemplar cloud, then egregious mismatches are possible; e.g., an [æ] originally followed by an [m] being selected for a pre-[b] context. The same holds at the word level: a word token originally produced in a frequent collocation may be selected for a low-frequency context, and so on. Indexing exemplar clouds with all the necessary contextual information, however, results in an explosion of categories and a depletion of category members. In the limit, each category would contain a single member.
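The catch-up problem can be made concrete with a toy simulation of the loop just described. The sketch below is a minimal illustration, not a reimplementation of Pierrehumbert (2001); the reduction factor, cloud size, frequency ratio, and the articulatory floor that stands in for an ‘optimal length’ are all illustrative assumptions.

```python
import random

# Illustrative parameters (assumptions, not values from the literature)
REDUCTION = 0.95          # each production shortens the sampled token by 5%
FLOOR = 40.0              # articulatory floor standing in for an 'optimal length'
CLOUD_SIZE = 200          # oldest exemplars decay once the cloud is full
INITIAL_DURATION = 100.0  # starting duration for every stored token

def simulate(productions):
    """Run the production-perception loop for one word category and
    return the mean duration of its exemplar cloud."""
    cloud = [INITIAL_DURATION]
    for _ in range(productions):
        token = random.choice(cloud)                 # random sampling may pick
        new_token = max(FLOOR, token * REDUCTION)    # a barely reduced old token
        cloud.append(new_token)                      # store the new production
        if len(cloud) > CLOUD_SIZE:
            cloud.pop(0)                             # decay of the oldest token
    return sum(cloud) / len(cloud)

random.seed(1)
for horizon in (500, 50_000):                  # early vs. late in the history
    high = simulate(productions=horizon * 10)  # frequent word: 10x more tokens
    low = simulate(productions=horizon)        # rare word
    print(f"after {horizon:>6} low-frequency productions: "
          f"high-freq mean = {high:5.1f}, low-freq mean = {low:5.1f}")
```

Early in the simulated history the frequent word is shorter, but with enough productions both categories drift to the same floor, so the duration difference is transient rather than a stable frequency effect.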
Developing a model for the interaction of synchronic variation and diachronic change requires resolving these and other representational issues, some of which only surface when the entire trajectory of change is considered. Thus, while existing models can capture category shift and merger (Pierrehumbert 2001), or contrast stability and dispersion (Garrett and Johnson 2013, Wedel 2004), there are few that can capture both. The model of Sóskuthy (2013) can generate phoneme split, no-change, and no-split with phonetic shift, as the result of vowel lengthening before voiced obstruents. However, these outcomes require a representational structure in which vowel categories contain at least two sub-categories: pre-voiced-obstruent and pre-voiceless-obstruent. Crucially, these sub-categories are semi-permeable, and greater frequency of occurrence can cause one sub-category to subsume the other. This scenario raises another unresolved question in exemplar modeling: the interaction between higher- and lower-level categories. Most models work exclusively at one level and take the others as given. But the process by which the necessary categories at the sub-word level are generated from the word level (or vice versa) is non-trivial, and may not be consistent with model assumptions. A category as abstract as “vowels followed by a voiced obstruent” requires a massive amount of generalization: over words with different syllable structures, over obstruents at different places of articulation, and so on. And if speakers create categories such as this, then they can be expected to create categories such as “vowels before coronals” as well. It is not at all clear that existing models will be able to ‘scale up’ adequately under this added complexity.
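To make the representational question concrete, the sketch below stipulates the kind of structure these outcomes seem to require: a vowel category holding two semi-permeable sub-clouds indexed by the voicing of the following obstruent, with a frequency-based subsumption step. It is a schematic illustration under my own assumptions (including the subsumption threshold), not Sóskuthy’s implementation.

```python
from collections import defaultdict

SUBSUME_RATIO = 5.0   # illustrative threshold, not an empirical value

class VowelCategory:
    """A vowel category whose exemplars are indexed into semi-permeable
    sub-clouds by the voicing of the following obstruent."""

    def __init__(self):
        # e.g. {"pre-voiced": [...durations...], "pre-voiceless": [...]}
        self.subclouds = defaultdict(list)

    def store(self, duration, context):
        """Add a new exemplar under its contextual index."""
        self.subclouds[context].append(duration)

    def maybe_subsume(self):
        """If one sub-cloud vastly outnumbers the other, re-index the
        minority tokens under the majority label: the sub-categories
        are permeable to frequency pressure."""
        sizes = {k: len(v) for k, v in self.subclouds.items()}
        if len(sizes) < 2:
            return
        big = max(sizes, key=sizes.get)
        small = min(sizes, key=sizes.get)
        if sizes[small] and sizes[big] / sizes[small] >= SUBSUME_RATIO:
            self.subclouds[big].extend(self.subclouds.pop(small))
```

Note that the sketch simply takes the sub-word categories as given; generating them from word-level exemplars, and deciding which of the many possible contextual indices (place of articulation, syllable structure, collocation) to generalize over, is exactly the non-trivial step at issue.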
This work gives a formal account of the representational commitments and assumptions of a range of models, and an assessment of their self-consistency. The claim is that the resolution of outstanding problems lies in determining the division of labor between representations and processes. I argue, on the one hand, that phonetic effects such as “vowel lengthening” or “vowel nasalization” are not processes themselves but reside at the representational level. On the other hand, speaking rate must be able to apply after exemplar selection, compressing or expanding tokens as necessary to match the speed of production. I consider prosodic effects, such as phrase-final lengthening, to be necessarily processual as well. The ramifications of these representational choices are discussed with respect to the necessary constraints on a model deriving categorical sound change from existing synchronic variation.
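A minimal sketch of this division of labor, under my own assumptions (the function names, the multiplicative rate scaling, and the durations are illustrative, not drawn from any published model): context-conditioned duration differences live in the stored exemplars themselves, while speaking rate and phrase-final lengthening apply as processes after a token has been selected.

```python
import random

# Representational level: the exemplar clouds already encode, e.g., the longer
# vowel durations found before voiced obstruents (all values are illustrative).
clouds = {
    ("ae", "pre-voiced"):    [132.0, 140.0, 128.0],
    ("ae", "pre-voiceless"): [ 95.0, 101.0,  98.0],
}

def produce(vowel, context, rate=1.0, phrase_final=False):
    """Select a stored token, then apply post-selection processes."""
    token = random.choice(clouds[(vowel, context)])   # representation
    token /= rate                                     # process: speaking rate
    if phrase_final:
        token *= 1.3                                  # process: final lengthening
    return token

print(produce("ae", "pre-voiced", rate=1.2))               # fast speech
print(produce("ae", "pre-voiceless", phrase_final=True))   # phrase-final token
```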