A corpus-based study of 16th-century Slovene clitics and clitic-like elements

Jelovšek, Alenka; Erjavec, Tomaž

dc.contributor.author	Jelovšek, Alenka
dc.contributor.author	Erjavec, Tomaž
dc.date.accessioned	2019-10-29T19:35:39Z
dc.date.available	2019-10-29T19:35:39Z
dc.date.issued	2019
dc.identifier.uri	http://hdl.handle.net/1808/29671
dc.description.abstract	This paper undertakes a corpus-based linguistic investigation of spelling variation in 16th-century Slovene from both the diachronic and the synchronic points of view. The investigation is based on a manually annotated sample (approx. 14,000 word tokens) from Primož Trubar’s Ta pervi deil tiga Noviga teſtamenta, 1557, and Hiſhna poſtilla, 1595 (TPo 1595), and Jurij Juričič’s Poſtilla, 1578 (JPo 1578), and it concentrates on clitics and clitic-like elements. The statistical analysis compares the spelling conventions of the early modern period with those of contemporary Slovene using normalised forms of the originals, recording cases where one orthographic word is nowadays written as two or more words (1–n mapping) or vice versa (n–1 mapping). It shows that the overall percentage of split and joined word tokens is 5.7%, with JPo 1578 having the highest percentage and TPo 1595 the lowest, less than half that of JPo 1578. Of these, the vast majority are cases where a word is now split. The most frequent among the bound words are the non-syllabic prepositions v ‘in(to)’, k ‘to’, and z ‘with’, followed by the negative proclitic ne ‘not’, the enclitic particle li ‘whether, if’ and, in rare instances, the conditional particle bi, the reflexive particle se, and the prepositions na ‘on’, ob ‘at, by’, pri ‘at, beside’ and za ‘for, behind’. The absolute numbers of specific clitics partially correlate with the prevalence of their bound variants over their freestanding variants: the most frequent clitics are predominantly bound, while the least frequent are predominantly freestanding. Individual instances of two accented words written together can be attributed to German influence (figino_drevo, der Feigenbaum ‘fig tree’).
The cases where one modernised word corresponds to two original words are, with the exception of the superlative adjective/adverb prefix naj-/nar- ‘the most’, which is orthographically bound with its root in about 25% of instances, sporadic or can be identified as errors in the original books. Of interest are also cases where beginnings of words that are homonymous with non-syllabic or monosyllabic prepositions are separated from the remainder of the word with an apostrophe (e.g. s’_nameinja ‘signs’, s’_derſhati ‘to endure’, do_bruta ‘goodness’, sa_doſti ‘enough’). The normalisation also enables the identification of the orthographical variants of the most commonly bound clitics, i.e. the non-syllabic prepositions k, z and v. K and its allomorph /h/ have 5 attested spelling variants, one of which, <q_>, is limited to hosts beginning with v-. For z, with a voiced allomorph /z/ and a voiceless allomorph /s/, three variant spellings were found that only partially correspond to the voiced/voiceless distinction in the initial sound of the host word, and cases of merging with a host that begins with s-/z- were identified. Additional positional spellings probably represent other allomorphs: <sh/ſh/s’h> for palatalized /ž/ in front of a palatal ń, and <ſa> and <ſo/so> for syllabified /za/ and /zo/. The preposition v shows the highest degree of orthographical variation of all analysed words, with 10 different spellings: general bound <v_> and <u_> and freestanding <v’_>; <uv_>, <uv’_> and <u’_v> in front of a vowel; <u’_> and <va_>, attested only in front of v-; and <v_> and <v’> merged with the initial v- of the host. The analysis of spelling variation in non-syllabic prepositions showed that even a relatively limited, hand-corrected annotated sample enabled the identification of the majority of the spelling variants identified in previous works, while the noSketch Engine tool provided further information about their relative frequency and distribution.
As the hand-corrected corpus is expanded, such research will yield even more relevant information for the study of the 16th-century Slovene literary language, significantly supplementing existing findings (based on traditionally collected examples) with a large amount of statistically relevant data.	en_US
dc.title	A corpus-based study of 16th-century Slovene clitics and clitic-like elements	en_US
dc.type	Article	en_US
kusw.oaversion	Scholarly/refereed, publisher version	en_US
kusw.oapolicy	This item meets KU Open Access policy criteria.	en_US
dc.rights.accessrights	openAccess	en_US
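
The 1–n / n–1 comparison described in the abstract can be illustrated with a minimal sketch. It assumes the sample is available as aligned (original, normalised) pairs in which spaces separate orthographic words; the function names and the toy pairs are hypothetical placeholders, not the authors' tooling or corpus data, and the paper's exact counting unit may differ.

```python
from collections import Counter


def classify(original: str, normalised: str) -> str:
    """Classify one aligned pair by the number of orthographic words on each side."""
    n_orig = len(original.split())
    n_norm = len(normalised.split())
    if n_orig == 1 and n_norm > 1:
        return "1-n"  # one original word, nowadays written as several
    if n_orig > 1 and n_norm == 1:
        return "n-1"  # several original words, nowadays written as one
    return "1-1"      # everything else is treated as unchanged here


def mapping_shares(pairs):
    """Return the share of each mapping type among the aligned pairs."""
    counts = Counter(classify(orig, norm) for orig, norm in pairs)
    total = sum(counts.values())
    if total == 0:
        return {"1-1": 0.0, "1-n": 0.0, "n-1": 0.0}
    return {kind: counts[kind] / total for kind in ("1-1", "1-n", "n-1")}


# Invented schematic pairs (NOT taken from the corpus): capital letters
# stand in for host words, lower-case prefixes for clitics.
toy_pairs = [
    ("vA", "v A"),       # bound preposition, now split (1-n)
    ("kB", "k B"),       # bound preposition, now split (1-n)
    ("naj C", "najC"),   # separated prefix, now joined (n-1)
    ("D", "D"), ("E", "E"), ("F", "F"), ("G", "G"),
    ("H", "H"), ("I", "I"), ("J", "J"),
]

shares = mapping_shares(toy_pairs)
for kind, share in shares.items():
    print(f"{kind}: {share:.1%}")
```

Summing the "1-n" and "n-1" shares gives an overall figure analogous to the split-and-joined percentage reported in the abstract; running the same function on each book separately would give per-book figures.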

Files in this item

Name: Jelovsek_Erjavec-3-19.pdf
Size: 356.6Kb
Format: PDF
