|dc.description.abstract||Lexicostatistics is decades old, but computational approaches to historical linguistics have attracted renewed attention with the rise of more sophisticated methods of data handling. For example, Gray and Atkinson (2003) claim to have established, using lexicostatistics and a Bayesian (MCMC) model, an authoritative Stammbaum for the Indo-European language family, including absolute chronologies of its branching. Others have argued that such methods, while valid for biology, cannot yield authoritative dates for language data (Atkinson 2009: 128).
The present paper examines a smaller subset of languages, Slavic, using new lexicostatistical methods in an attempt to compare the computational results with received analyses. We assume that examining a group of languages closer in time to the present, where the splits are more easily verifiable, allows quantitative methods to be tested. If a close fit can be found between a lexicostatistical approach and traditional analysis in Slavic, this should allow the approach to be extended to greater time depths and larger families such as Indo-European.
The present paper applies several methods to two corpora: one is the Slavic subset of Indo-European in Dyen, Kruskal and Black (1992), and the other is the Slavic text-token set in Mańczak (2004).
References
Atkinson, Quentin D. 2009. Review of Language Classification by Numbers. By April McMahon and Robert McMahon. Oxford: Oxford University Press, 2005. Pp. xvii, 265. Diachronica 26/1: 125–133.
Dyen, Isidore, Joseph B. Kruskal, and Paul Black. 1992. An Indoeuropean Classification: A Lexicostatistical Experiment. Philadelphia: American Philosophical Society.
Gray, Russell D. and Quentin D. Atkinson. 2003. Language-Tree Divergence Times Support the Anatolian Theory of Indo-European Origin. Nature 426: 435–439.
Mańczak, Witold. 2004. Przedhistoryczne migracje Słowian i pochodzenie języka staro-cerkiewno-słowiańskiego. Cracow: PAU.||en_US