Historical linguistics blog - even weekends

Words for Easter in Eurasian languages, defined by their meaning.
This week, also known as Holy Week, is part of the holiday that in English goes by the name of Easter. Easter, which is celebrated throughout the Christian world, is one of the most important festivities of the year, marking the beginning of spring or summer and the resurrection of Christ. Like most Christian holidays, Easter has roots that go back to pagan times. In Northern Europe in particular, many of the mysterious customs of an ancient spring festival have survived until today. Children chase an invisible Easter hare, which puts candy-filled eggs in the grass. Birch twigs are gathered, taken indoors, and decorated with painted eggs and feathers. Children also dress as witches or 'easterhubbies' (the difference is whether you wear a scarf or a hat), paint their faces with red dots, and go from door to door asking for candy. Afterwards, they are supposed to fly on their brooms to Brocken. Fires and fireworks are lit, and, most importantly, enormous quantities of egg, fish, meat, and candy are consumed.

So, which terms do we use for this festival? Most languages have a form of the Greek (via Latin) word paskha, itself borrowed from Aramaic (Hebrew Pesach), meaning 'passover'. The West Germanic terms, such as English Easter and German Ostern, go back to a Common Germanic goddess of spring, Old English Eastre, which is identical to the Indo-European goddess of dawn *h2éus-ōs (Sanskrit uṣās, Latin aurōra). Other languages have words that in various ways relate to the basically biblical rituals of Easter, including 'sacrificial animal', 'taking of the meat', 'resurrection', 'great day' or 'great night', or 'liberation'.

Just as with the Christmas words (see http://www.gerdcarling.se/i/a32842142/2018/12/), the map of meanings of Easter unveils important information about various cultural spheres, as well as exceptions in the form of islands of different usage.

With this little etymological overview I would like to wish you all a Happy Easter!

Lubotsky, Alexander. Brill Online Dictionaries: Indo-European Etymological Dictionaries Online (https://dictionaries-brillonline-com.ludwig.lub.lu.se/iedo). Accessed 2019-04-17.
Troels-Lund 1932. Dagligt liv i Norden på 1500-talet. VII Årets fester. Stockholm: Bonniers.
Andersson et al 1968. Kulturhistoriskt Lexikon för Nordisk Medeltid XIII. Malmö: Allhems förlag.

I thank Ante Petrović for assistance with compiling/checking data for the Easter map.

Wikipedia has an excellent overview of names of Easter: https://en.wikipedia.org/wiki/Names_of_Easter
Heatmap of frequency of source and target language of loan events in our data, defined by language power and population size (from 1-5). Graph by Johan Frid.

In the previous blogpost, I started a compilation of safe loans from and into Tocharian. I will continue this work in the next post. In this post, I will talk about loan directionality, since I am currently completing a paper (with several co-authors) on lexical borrowability in Eurasian languages. I want to say a few words about this project.

We have compiled and extracted all loan events in the lexical database and tested various statistical measures on this data. Worth noting are the directionality of loans in relation to language power and the differing source languages of the families. As I have described in recent posts, our lexical data set compiles culture concepts, i.e., words for farming, technology, hunting, and war, which have a presumed age that goes back at least to the Chalcolithic. This means that this vocabulary is not representative of the entire lexicon, only of these specific domains. The loans are also spread over long periods, reaching back at least to antiquity. If we look at the source languages, we notice that they differ between families. In Indo-European, Latin is most frequent, followed by Middle Low German, French, Old French, Slavic, and Classical Greek. In Caucasian, Turkic languages dominate, followed by Persian, Georgian, and Arabic. In Uralic, Scandinavian languages dominate, which is mainly due to the fact that Fenno-Ugric languages dominate our Uralic data (see pictures below).

The correlation between loan directionality, language power, and population size is also noteworthy. We define the power of a language by a quantitative rank based on several features, including literary power, economic power, and population size. We plot this rank against the occurrence of languages as source and target in loan events (see graph above). All languages are equally likely to be target languages, but the most powerful languages are more likely to be source languages. This correlation is significant. The most frequent loan event is from a very powerful language to a very weak one. The second most frequent is from a medium-powerful language to a weak one. The third most frequent is from a medium-powerful language to another medium-powerful language. In scrutinizing the data, we observe that this third type of loan event is almost entirely restricted to the Middle Ages, which is also an interesting result. Inequality between languages seems to be specific to the ancient and modern periods, whereas language contact in the Middle Ages was more evenly distributed between languages of equal power.
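The cross-tabulation behind a heatmap of this kind can be sketched in a few lines of Python. This is only an illustration: the list `loan_events` and the helper names `cross_tabulate` and `marginals` are invented here, the numbers are made up, and the sketch assumes each loan event has already been reduced to a pair of power ranks (source, target) on the 1-5 scale.

```python
from collections import Counter

# Hypothetical loan events as (source_rank, target_rank) pairs,
# where 1 = very weak and 5 = very powerful. Invented for illustration.
loan_events = [
    (5, 1), (5, 1), (5, 1), (5, 2), (3, 1),
    (3, 1), (3, 3), (4, 2), (2, 2), (5, 3),
]

def cross_tabulate(events, ranks=range(1, 6)):
    """Count loan events per (source rank, target rank) cell of the heatmap."""
    ranks = list(ranks)
    counts = Counter(events)
    return [[counts[(s, t)] for t in ranks] for s in ranks]

def marginals(matrix):
    """Row sums give frequency as source; column sums give frequency as target."""
    as_source = [sum(row) for row in matrix]
    as_target = [sum(col) for col in zip(*matrix)]
    return as_source, as_target

matrix = cross_tabulate(loan_events)
as_source, as_target = marginals(matrix)
most_frequent_cell, most_frequent_count = Counter(loan_events).most_common(1)[0]

for rank, row in enumerate(matrix, start=1):
    print(f"source rank {rank}: {row}")
print("as source:", as_source)
print("as target:", as_target)
print("most frequent (source, target):", most_frequent_cell, most_frequent_count)
```

Each row of `matrix` corresponds to one source rank and each column to one target rank, so the row and column sums show how often a rank acts as donor and as recipient.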

Graph illustrating the most frequent source languages in Indo-European (top), Caucasian (middle), and Uralic (bottom) families.
European-looking farmers or traders in a Chinese tomb from 2nd c. BCE, Hunan province. From Hunan Provincial Museum. Photo: Gerd Carling

I was asked by my friend and colleague Victor Mair (University of Pennsylvania) to come up with my 'safe list' of loans from and into Tocharian. This is a very interesting and challenging topic, which I will continue working on in a couple of coming posts. I will start with the trickiest one: Tocharian loan contacts with Chinese.
Establishing Tocharian loans from and into Chinese is particularly complex for two reasons: first, the reconstruction of Chinese phonology at various stages of Chinese prehistory is connected with many uncertainties and a large amount of debate; second, the reconstruction of Tocharian phonology is itself particularly tricky and complex. The fundamental question is: how can we be certain that a specific word was borrowed at a certain stage from one reconstructed language into another? The prehistory of both languages can be stratified into various stages: Pre-Proto- and Proto-Chinese, Old Chinese (Early and Late), and Middle Chinese on the one hand, and Pre-Proto- and Proto-Tocharian, Common Tocharian, Pre-A and Pre-B, and Tocharian A and B on the other. Beyond that, we have the proto-languages Proto-Indo-European and Proto-Sino-Tibetan, which can be further stratified into stages on their way to Proto-Tocharian and Proto-Chinese.
How can we know that a word that obviously looks as if it was borrowed from Indo-European was borrowed from Tocharian? The answer is that we have to show that specific Tocharian sound changes have taken place in the specific borrowed lexeme. These changes also have to be identified in the target language from the corresponding period. The process is very tricky, and the result is very few certain loans, somewhat more plausible loans, and a huge number of uncertain loans.

Tocharian loans from Old Chinese (before 2nd c. BCE)
Toch. AB klu ‘rice’ was borrowed from Old Chinese: Mod. Ch. dào, Mid. Ch. *dawX, Old Ch. *C-luu-? ‘rice, rice-paddy’ (GSR 1078). In Middle Chinese, the initial cluster Old Ch. *gl- was simplified to d-.
Toch. B rapaññe ‘of the last month of the year’ (LP 12 a2 rapaññe meṃne ikäṃ-wine ‘on the second day of the month rapaññe’), an adjective formed on a noun *rāp, from Old Chinese: Mod. Ch. là, Mid. Ch. *lap, Old Ch. *raap (GSR 637j) ‘winter sacrifice’. It is likely that an earlier meaning of the Chinese word is reflected in Tocharian.
Toch. A ri B rīye 'town' < Common Toch. *riye matches the Old Chinese reconstruction of Mod. Ch. lĭ, Mid. Ch. *liX, Old Ch. *r̯ǝ-? (GSR 978a) ‘walled city’. The word may also be a Tocharian loan in Old Chinese.
Further loans include Toch. A truṅk, Toch. B troṅk 'cave'.

Tocharian loans from Early Middle Chinese (possibly 3rd-4th c. CE)
TA ṣoṣtäṅk ‘tax collector, banker’ (Skt. śreṣṭhin-) corresponds to Niya ṣoṭhaṃga ‘tax collector’, Bactr. σωταγγο < *šoštaṅgV. A possible source is Mod. Ch. shōucáng, Mid. Ch. *syuw+dzang, Old Ch. *xiw-N-s-(h)raŋ (GSR 1103a+727g´) ‘receive, accept, gather’ + ‘conceal, store’.
TA ṣukṣ ‘(smaller) village’, TB kwaṣo* ‘village’. Parallel Mod. Ch. sù, Mid. Ch. *sjuwk, Old Ch. *suk (GSR 1029a) ‘lodge, mansion’. Itō & Takashima (1996:401) reconstruct Old Ch. *sjәkw-s with a final *-s (that has a function of localisation and production of nomina actionis etc.).
Toch. A āṅk* ‘seal, stamp’, Mod. Ch. yìn, Mid. Ch. *ʔjinH, Old Ch. *ʔin-s (GSR 1251f), *ʔi̯əɳ (Takashima) ‘seal, stamp’.

Further loans include:
Toch. B cāk, tau '(dry measures)', Toch. B cāne 'money', Toch. B śakuse 'brandy', Toch. B ṣaṅk '(measure of volume)', TA yāmutsi, TB yāmuttsi 'waterfowl' < 'parrot', Toch. B ṣitsok 'millet alcohol', Toch. B ṣipāṅkiñc 'abacus', Toch. A, Toch. B cok 'lamp', Toch. A lyäk, Toch. B lyak 'thief', Toch. A < Toch. B tseṃ 'blue', Toch. A nkiñc, Toch. B ñkante 'silver'.

These words give important indications of the impact of Chinese culture on Tocharian. I will continue this thread in coming posts.

Carling, Gerd. Proto-Tocharian, Common Tocharian, and Tocharian – on the value of linguistic connections in a reconstructed language. In: Jones-Bley, Karlene, Huld, Martin E., Volpe, Angela Vella, Dexter, Miriam Robbins (eds.), Proceedings of the Sixteenth Annual UCLA Indo-European Conference. Journal of Indo-European Studies. Monograph Series (Institute for the Study of Man) 50, 47-70.
Kim, Ronald. (1999). Observations on the absolute and relative chronology of Tocharian loanwords and sound changes. Tocharian and Indo-European studies, 8, p. 111–138.
Lubotsky, Alexander, & Starostin, Sergei. (2003). Turkic and Chinese loan words in Tocharian.
Židek, Jan. (2017). Tocharian Loanwords in Chinese [Dissertation]. Praha: Univerzita Karlova.

What is the relation between universal patterns, frequency of words and forms, and language evolution and change? This is a question that is very little researched.
I have decided to move the updating of this blog to even weekends instead of Thursdays. Thursday is very often an extremely busy day, with no time left to update or complete blogposts for publication.

In this blogpost I will continue the previous topic of principles of language change. In historical linguistics, the principle of the particular status of the most frequent words and grammatical forms of a language is well known. The most frequent lexemes and grammatical categories are more resistant to change. Lexemes such as kinship words, body parts, numerals, fire, water, liver, and so forth typically preserve more archaic paradigms, which may resist change for millennia. The most frequent adverbials and particles even resist phonological erosion and change. The most frequent verbs, such as 'to be' or 'to become', are typically irregular, and archaic inflection patterns and archaic categories, such as tenses, modalities, and aspectual categories, survive in these verbal stems. On the other hand, less frequent words, such as various verbs, nouns, and adjectives, are much more frequently affected by analogy and other types of change that harmonize and simplify language structures, making them easier to memorize.

However, few studies investigate this from an evolutionary perspective, using phylogenetic methods. As shown by Pagel et al (2007) there is a correlation between lexical substitution and frequency in basic vocabulary. The most frequent words have generally lower substitution rates.

Frequency is very important in explaining cross-linguistic universal patterns, among others in morphological marking hierarchies in languages. More frequent categories, such as singular (in relation to plural), agent (in relation to object), and present (in relation to past), are unmarked in relation to their less frequent counterparts, which are marked. This theory, known as markedness theory (which has a lot of exceptions in languages), can to a large degree be explained by frequency (Greenberg 1966, Croft 1993, 2003).

In a current study I wanted to investigate the correlation between frequency and change rates of grammar, focusing on the Indo-European family. I compiled a sample of grammatical categories of word order, nominal morphology, verbal morphology, and tense, and organised the properties into hierarchical pairs according to the relations present < past, pronoun < noun, agent < object, and masculine/feminine < neuter, which are well-known, universal, hierarchical relations observed in a large number of languages. By means of an evolutionary model (run by Chundra Cathcart), in which transition rates between property states were reconstructed over a tree, we extracted the average number of transitions (per 1000 years) between each grammatical property in our data.
When the results were split up into pairs of the marking hierarchy, as mentioned above, it turned out that the rates of change in the lower categories (i.e., the less frequent ones from a universal perspective) were higher, and the rates in the higher categories (i.e., the more frequent ones from a universal perspective) were lower. The difference was statistically significant (p ≤ 0.005). Even if this study is based on only one family (Indo-European), 149 languages, and about 100 properties, it seems likely that frequency impacts language change also in the grammar. This explains why more frequent grammatical categories preserve more archaic patterns over time.
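One way to picture the comparison of higher and lower categories is a paired test on the transition rates. The sketch below is an assumption for illustration only: the post does not say which test was used, the rates in `rate_pairs` are invented, and the function name `sign_flip_p_value` is hypothetical. It runs an exact one-sided sign-flip permutation test, which asks how often randomly swapping "higher" and "lower" within each pair would give a difference at least as large as the observed one.

```python
from itertools import product

# Hypothetical transitions per 1000 years for eight hierarchical pairs:
# (rate of the higher, more frequent category; rate of the lower, less frequent one).
# These numbers are invented and are not results from the study.
rate_pairs = [
    (0.10, 0.25), (0.08, 0.20), (0.12, 0.18), (0.15, 0.30),
    (0.05, 0.11), (0.20, 0.27), (0.09, 0.22), (0.14, 0.19),
]

def sign_flip_p_value(differences):
    """Exact one-sided paired permutation test.

    Under the null hypothesis the sign of each paired difference is arbitrary,
    so every combination of sign flips is enumerated and the share of
    combinations with a sum at least as large as the observed sum is returned.
    """
    observed = sum(differences)
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(differences)):
        total += 1
        flipped = sum(s * d for s, d in zip(signs, differences))
        if flipped >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / total

# Positive difference = the lower category changes faster than the higher one.
differences = [lower - higher for higher, lower in rate_pairs]
p_value = sign_flip_p_value(differences)
print(f"mean difference: {sum(differences) / len(differences):.3f}")
print(f"one-sided p-value: {p_value:.5f}")
```

Enumerating all sign combinations is only feasible for a small number of pairs (2^n cases); for about 100 properties a random sample of sign flips would be the usual substitute.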

Text has been updated 2019-03-11
Marking hierarchies of grammatical properties observed in the literature. After (Bickel 2008; Comrie 1981; Croft 2003; Dixon 1979)
I am currently travelling, so this blogpost will only very briefly discuss the topic of my current research in grammar reconstruction: the role of marking hierarchies in language change.
The notion of marking hierarchies has its roots in the markedness theory of Roman Jakobson and implies that grammatical categories (e.g., singular - plural) typically are in a mutual, hierarchical relation, where one of the categories is morphologically unmarked, whereas the other is morphologically marked. The unmarked category thus has a higher position within a hierarchy of grammatical properties (singular < plural). These grammatical relations are, according to some authors, general, or "universal", anchored in our inborn grammatical system. However, we know that this is a problematic notion: there is a substantial number of languages where the actual morphological marking contradicts the proposed markedness hierarchies. Further, not all languages have morphology. Morphological marking alone cannot be the identifier of marking hierarchies.
On the other hand, there is an obvious connection between the observed marking hierarchies and frequency. Superior categories, "unmarked" in traditional markedness theory, are more frequently used in speech and in text. Again, the definition may be problematic, since not all languages have corpora that enable a detailed study of category frequency. Also, marking hierarchies based on frequency may contradict marking hierarchies based on general observations of morphological marking.
My current study on grammar reconstruction, which I have been writing about in several blogposts, indicates a clear correlation between change rates and marking hierarchies: superior categories, which are more frequent in grammar and most likely to be unmarked grammatically, have substantially lower change rates (and a slower pace of change) than inferior categories, which have higher change rates (and a faster pace of change). I will follow up on this topic in a coming blogpost.

Bickel, Balthasar (2008), 'On the scope of the referential hierarchy in the typology of grammatical relations', in Greville G. Corbett and Michael Noonan (eds.), Case and Grammatical Relations. Studies in honor of Bernard Comrie (Amsterdam - Philadelphia: John Benjamins), 191-210.
Croft, William (2003), Typology and universals (Cambridge textbooks in linguistics, 99-0104661-0; Cambridge: Cambridge Univ. Press).
Comrie, Bernard (1981), Language universals and linguistic typology : syntax and morphology (Oxford: Blackwell).
Dixon, Robert M V (1994), Ergativity [Electronic resource] (Cambridge: Cambridge University Press).
--- (1997), The Rise and Fall of Languages [Electronic resource].
--- (2010a), Basic linguistic theory. Vol. 2, Grammatical topics (Oxford: Oxford University Press).
--- (2010b), Basic linguistic theory [Electronic resource]. Vol. 2, Grammatical topics (Oxford: Oxford University Press).
Tocharian B text THT 496, in cursive script, containing a literary poem, "Love letter". From CEToM database.
This blogpost will give an overview of my popular lecture earlier this week on the role of patterns in syntax, grammar and literature for the deciphering of ancient languages (link to the lecture below, in Swedish).

My own experience of deciphering ancient languages is basically restricted to Tocharian. Even so, Tocharian texts can be very difficult to understand, in particular if parallel texts in Sanskrit, Khotanese, or Uighur (the most frequent translation languages for Tocharian) are absent.

Deciphering of ancient languages basically uses three instruments: script, language (lexicon and grammar), and literature. Reading the script is fundamental to understanding the content, and even when the content of a manuscript is known, there is often reason to go back to the manuscript and check the reading, which may open up new interpretations and a renewed understanding of the content of the text. In the case of Tocharian, the script (North-Turkestanic Brahmi script) is relatively well known, even though there are some Tocharian B texts in cursive script that are very complex and difficult to interpret. On the other hand, almost all Tocharian texts are fragmentary in some respect (burned, broken, etc.), which means that lacunae have to be completed and reconstructed. Part of this reconstruction is to interpret the characters at manuscript edges, which may be cut or damaged. This means that even if the script is known, the work of a philologist still involves a substantial amount of manuscript reading.

Interpreting lexicon and grammar may involve substantial problems if the language is not well known. In the case of Tocharian, the broken contexts, again, create large difficulties when we study syntax. Morphology is easier: paradigms can be established and reconstructed from forms found in texts, and few forms are missing from the Tocharian paradigms. However, syntactic constructions require a larger corpus of complete sentences, and in a language such as Tocharian, it is often a problem to find enough complete sentences (that are not restored) for certain constructions, for instance in combination with a specific verb.
The lexicon has its own difficulties. In a language like Tocharian, the absence of close relatives is a problem (Tocharian descends immediately from the Indo-European proto-language). If an unknown word is found in a text, we may assume a meaning based on the meaning of a presumed cognate in another Indo-European language. However, the connection to the presumed cognate may be a complete mistake, and the meaning of the lexeme, as well as the etymology, may instead be something entirely different.

This brings us to the third category, literature. Besides script, literature is probably the most important of the instruments mentioned at the beginning of this text. The exact meaning of words, which forms the basis for a correct interpretation of a text, is highly dependent on the possibility of "proving" the content by a parallel or bilingual text. Most Tocharian texts are translations from Sanskrit, but besides that, Tocharian had its own literary tradition. Therefore, the exact source of a text can be difficult to trace. Some texts do not have any source texts at all. Since Tocharian, like any other literary language, is constrained by its literary tradition, the identification of parallel patterns in, e.g., Sanskrit literary sources is highly important to a proper understanding of the content and a correct translation of the lexical meanings and the syntax.

Link to a public lecture at Filosoficirkeln, Lund, about deciphering ancient languages.
This week's blogpost will continue the thread about grammatical reconstruction, with some thoughts on lineage versus areality in grammar change. 

In general, change of grammar is supposedly cyclic (or spiral, according to some researchers): over time, typological organizations of features in systems recur or are re-established. We may look at this issue from both a long-term and a short-term perspective. One question is whether a feature is inherently homologous (a similarity depends on inheritance only) or homoplastic (a similarity depends on internal or areal pressure, caused by various factors). Another question is whether a similarity is caused by areal pressure or by lineage. A construction or a feature may be indicative of all these processes. For instance, a feature like word order is by nature homoplastic (similarities in word order may be due to areal or internal pressure, such as a change in the order of meaningful elements), but even then, a word order feature may be due to lineage: it has been inherited generation after generation, or it is a critical innovation restricted to a specific sub-branch of a tree. Take for instance the verb-initial order in Celtic languages: it is likely that this feature was caused by internal pressure in the verbal paradigm (McCone 1987). Because of this, verb-initiality is a feature which is restricted to the Celtic sub-branch and therefore a homologous innovation of this specific branch, not caused by areal pressure. The feature is entirely independent of other Eurasian verb-initiality. Another example is the Germanic have-perfect. It is a homoplastic typological feature (expressing the perfect by an auxiliary construction), which still uses the same cognate root as the auxiliary, the verb *haban. The process took place independently in all Germanic languages, due to parallel drift and possible areal pressure. As before, it is difficult to distinguish areality from lineage.

Very interesting is the process of Indo-European alignment change, from the proto-language to the daughter branches. It is quite evident that the reconstructed language bears morphological traces of a semantically based system, similar to active-stative systems, as has been suggested by several scholars (Bauer 2000). But does this mean that Proto-Indo-European was an active language? Probably not. This concerns the question of the stability of systems in general versus language-internal variation in tendencies towards other systems. Indo-European alignment took three pathways of change: towards ergativity in the South-East, neutral marking in the West, and a preservation of the ancient system in between (roughly). What is the areal pressure component here, what changes depend on internal processes in languages, and what is the role of the residual morphology? These are questions that remain to be answered.

McCone, Kim (1987), The early Irish verb (Kildare: Maynooth). 
Bauer, Brigitte (2000), Archaic syntax in Indo-European : the spread of transitivity in Latin and French (Trends in linguistics. Studies and monographs, 99-0115958-X ; 125; Berlin: Mouton de Gruyter).
Berthold Delbrück (1842-1922)
The current post is about something that I am involved in right now: the reconstruction of grammar. In comparative linguistics, grammar can be reconstructed to a proto-language on the basis of the forms and functions in daughter languages. For instance, if there is a dative case in several languages with a specific marker that can be reconstructed to the joint proto-language, and this form has the function of dative in all languages, then it is likely that the function of this marker was dative also in the proto-language. However, the reality is often much more complex than that. Often, the function of a marker is different in various daughter languages: in our case above, we may have genitive or ablative instead of dative, and since we don't know if a genitive is more likely to become a dative or the other way round, we cannot reconstruct the original, proto-language function of this specific marker. The problem is known as the "correspondence problem" and is a matter of controversy in syntactic reconstruction in general (Roberts 2007) (see picture below).
The issue is particularly prominent in the reconstruction of Proto-Indo-European syntax, where many categories of the ancient languages, such as Sanskrit, Tocharian, and Greek, are absent in Anatolian, which, on the other hand, has a high number of other categories considered to be highly archaic.

In recent years, scholars have tried to approach this problem by using evolutionary and phylogenetic methods (Maurits and Griffiths 2014, Dunn et al. 2017, Cathcart et al. 2018). The probability of presence of a specific feature at ancestral nodes is estimated, based on gains (0 -> 1) and losses (1 -> 0) of features over a reference tree (lexical or hand-crafted). As expected, the method requires some adjustment to give reliable and reproducible results. One adjustment is to treat grammatical properties as logically dependent (which is a very tricky and complex matter); another is to use ancestry and clade constraints on trees, in order to avoid unnecessary noise in the results.

However, even if evolutionary and phylogenetic methods are much more sophisticated than traditional methods in terms of amounts of data and number of calculations, the principle behind the programs rests on the same reasoning as the correspondence problem. If most of the daughter languages have a specific property, then it is likely that this property was also there in the proto-language. If there is a rooted outgroup with another function, then the probability of presence of this function at the proto-language state is increased.
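The reasoning about daughter languages and outgroups can be made concrete with a toy calculation. The sketch below is a simplified stand-in and not the model used in the studies cited: it assumes a single binary feature (0 = absent, 1 = present), fixed gain and loss rates instead of rates estimated from data, and a hand-made tree. The function names and the two example trees are invented. It applies the standard pruning algorithm for a two-state Markov chain to obtain the probability that the feature was present at the root.

```python
import math

def transition_probs(gain, loss, t):
    """P[i][j] = probability of state j after time t, starting in state i,
    for a two-state Markov chain with rates gain (0 -> 1) and loss (1 -> 0)."""
    total = gain + loss
    change = 1.0 - math.exp(-total * t)
    p01 = gain / total * change
    p10 = loss / total * change
    return [[1.0 - p01, p01], [p10, 1.0 - p10]]

def partial_likelihood(node, gain, loss):
    """Return [L(0), L(1)]: likelihood of the observed tips below `node`,
    given that `node` is in state 0 or 1.

    A tip is an int (observed state); an internal node is a list of
    (child, branch_length) tuples.
    """
    if isinstance(node, int):
        return [1.0 if node == state else 0.0 for state in (0, 1)]
    result = [1.0, 1.0]
    for child, branch_length in node:
        p = transition_probs(gain, loss, branch_length)
        child_like = partial_likelihood(child, gain, loss)
        for state in (0, 1):
            result[state] *= p[state][0] * child_like[0] + p[state][1] * child_like[1]
    return result

def root_presence_probability(tree, gain, loss):
    """Posterior probability of state 1 at the root, using the chain's
    stationary distribution as the prior on the root state."""
    like = partial_likelihood(tree, gain, loss)
    prior_present = gain / (gain + loss)
    weighted_present = prior_present * like[1]
    weighted_absent = (1.0 - prior_present) * like[0]
    return weighted_present / (weighted_present + weighted_absent)

# Three daughter languages with the feature form one clade; a fourth
# language is the outgroup. Branch lengths are in arbitrary time units.
ingroup = [(1, 1.0), (1, 1.0), (1, 1.0)]
tree_outgroup_agrees = [(ingroup, 1.0), (1, 2.0)]
tree_outgroup_differs = [(ingroup, 1.0), (0, 2.0)]

prob_agrees = root_presence_probability(tree_outgroup_agrees, gain=0.2, loss=0.2)
prob_differs = root_presence_probability(tree_outgroup_differs, gain=0.2, loss=0.2)
print(f"outgroup has the feature:   {prob_agrees:.3f}")
print(f"outgroup lacks the feature: {prob_differs:.3f}")
```

In both trees three of the four languages have the feature, yet the root probability drops when the outgroup lacks it, because the outgroup is attached directly to the root while the three ingroup languages share a single line of descent.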

Currently, I am working with a dataset for Indo-European, which reconstructs probabilities of grammatical features being present at the ancestral state of Proto-Indo-European (the statistics have been performed by Chundra Cathcart, University of Zurich). The results are astonishing: with very few exceptions, the program reconstructs high probabilities for grammar features that were reconstructed to Proto-Indo-European by the Neogrammarians (Brugmann & Delbrück 1893, 1897, 1900). The reconstruction of Proto-Indo-European grammar by the Neogrammarians was done before the discovery of Hittite and Tocharian, which changed the preconditions for the typological reconstruction of the proto-language grammar to a high degree. Even though Tocharian and Anatolian are included in the data, this does not change the Neogrammarian reconstruction of Proto-Indo-European grammar. I will have reason to come back to this issue in further blogposts.

Brugmann, Karl and Delbrück, Berthold (1893), Grundriss der vergleichenden Grammatik der indogermanischen Sprachen : kurzgefasste Darstellung der Geschichte des Altindischen, Altiranischen (Avestischen u. Altpersischen), Altarmenischen, Altgriechischen, Albanesischen, Lateinischen, Oskisch-Umbrischen, Altirischen, Gotischen, Althochdeutschen, Litauischen und Altkirchenslavischen. Bd 3, Vergleichende Syntax der indogermanischen Sprachen, T. 1 (Strassburg: Trübner).
--- (1897), Grundriss der vergleichenden Grammatik der indogermanischen Sprachen : kurzgefasste Darstellung der Geschichte des Altindischen, Altiranischen (Avestischen u. Altpersischen), Altarmenischen, Altgriechischen, Albanesischen, Lateinischen, Oskisch-Umbrischen, Altirischen, Gotischen, Althochdeutschen, Litauischen und Altkirchenslavischen. Bd 4, Vergleichende Syntax der indogermanischen Sprachen, T. 2 (Strassburg: Trübner).
--- (1900), Grundriss der vergleichenden Grammatik der indogermanischen Sprachen : kurzgefasste Darstellung der Geschichte des Altindischen, Altiranischen (Avestischen u. Altpersischen), Altarmenischen, Altgriechischen, Albanesischen, Lateinischen, Oskisch-Umbrischen, Altirischen, Gotischen, Althochdeutschen, Litauischen und Altkirchenslavischen. Bd 5, Vergleichende Syntax der indogermanischen Sprachen, T. 3 (Strassburg: Trübner).
Cathcart, Chundra, et al. (2018), 'Areal pressure in grammatical evolution.', Diachronica, 35 (1), 1-34.
Dunn, Michael, et al. (2017), 'Dative Sickness: A Phylogenetic Analysis of Argument Structure Evolution in Germanic', Language: Journal of the Linguistic Society of America, 93 (1), e1-e22.
Harris, Alice C. and Campbell, Lyle (1995), Historical syntax in cross-linguistic perspective (Cambridge studies in linguistics, 0068-676X ; 74; Cambridge: Cambridge Univ. Press).
Maurits, Luke and Griffiths, Thomas L. (2014), 'Tracing the roots of syntax with Bayesian phylogenetics', Proceedings of the National Academy of Sciences, 111(37), 13576-81.
Roberts, Ian G. (2007), Diachronic syntax (Oxford textbooks in linguistics, 99-2380132-2; Oxford: Oxford University Press).
The principle of evolutionary reconstruction. Gains and losses are measured against a reference tree (lexical/hand-crafted), resulting in a probability of presence at ancestral nodes.
Representation of the correspondence problem. In the figure at the top, A is more likely than B, but in the figure below, B is more likely, despite A being more frequent. This principle is applied by evolutionary methods.
Currently, at least if you are in the northern hemisphere, the darkest time of the year is approaching. This is also when we celebrate one of our most awaited festivities, which in English goes by the name Christmas. How old is this custom? It is highly likely that a festival during the darkest time of the year, the winter solstice, has a very long history, predating the introduction of Christianity and probably going all the way back to Neolithic times, when the return of the sun was important for the preparation of the growing season. The festival has many forms in various cultures; among Jews it is represented by Hanukkah, a feast of light, which is celebrated somewhat earlier than Christmas.

In some northern cultures, the winter solstice marks the beginning of winter; in other, Central European cultures, the winter period begins earlier. In Indo-European languages, winter, the cold and rainy season, goes by the name of *ǵh(e)im- 'winter', also 'snow', a root that is found with the meaning 'cold season' in most languages, including Indo-Aryan. Germanic languages use another word for the cold season, *wintru-, which has two possible origins: either it is related to Latin unda 'wave', referring to 'the wet time of the year', or it is related to Gaulish vindo 'white', meaning 'the white time of the year'.

The festival that marks the winter solstice, 'Christmas', goes by different names in different languages. However, the symbols and the cultural habits show striking similarities between cultures. Important components of the festivities are, besides excessive eating and drinking and the giving of gifts, the presence of death and the return of dead ancestors, the equality of humans, and a celebration of light. In ancient Rome and other parts of the Mediterranean, the winter solstice festival had the name Saturnalia, a festival devoted to the god of the earth, Saturnus. An important component of the festival, besides excessive eating, drinking, visiting friends and giving gifts, was that the slaves were supposed to sit and eat in the company of their masters. This is paralleled by the habit in northern cultures, where servants and householders were supposed to eat together in the kitchen during Christmas.

Even though we have little information about celebrations of the winter solstice in older cultures without written sources, the words for 'Christmas' may give us important indications of the purpose of the feast.

Many Germanic languages have preserved an ancient and obscure word for the feast: Swedish jul, Old Swedish iūl, Icelandic jól, Danish jul, Old English geohhol, géol, English yule, Gothic (fruma) jiuleis 'the month of Christmas'. From Proto-Norse, the word has also been borrowed into Finnish joulu and Estonian jõulud. The meaning of this word is uncertain, but there are two alternatives: either the word is derived from a root related to Old Icelandic él 'storm', referring to the time of winter storms, or it is derived from the Indo-European root *jek- 'speak out loud', whose reflexes in many languages, such as Latin iocus 'joke', have the meaning 'joke, amusement'.
The Indo-European word for 'winter', *ǵh(e)im-, recurs in Latvian Ziemassvētki.

Another group of words relates to meanings of 'holiness', such as German Weihnacht, Middle High German wīhenahten (known since the 12th century), meaning 'holy night', or to the word for 'God', Slavic bȏgъ, as in Polish Boże Narodzenie, Bosnian Božić, Croatian and Serbian Božić, Macedonian Božiḱ. Lithuanian has preserved an ancient word in its term Kalėdos, which is from the name of the pagan god Koliada, who personifies the newborn winter.

An important set of Christmas words relates to meanings of 'new' and 'birth' or 'rebirth'. We have derivations of Latin natīvitas in Spanish Navidad, and of Latin nātalīs in French Noël, Portuguese Natal, Italian Natale, borrowed into many languages, such as Marathi Nātāḷa or Turkish Noel, as well as Irish nollaig, Welsh nadolig, Scots Gaelic nollaig (borrowed from Latin natalicia 'nativity'). Alternatively, we have Russian rozhdestvo, Belarusian roždiestvo, derived from ród 'birth' and borrowed into, e.g., Kazakh Rojdestvo, Uzbek Rojdestvo.

Another group, to which we count the English Christmas, refers directly to the birth of Christ: Greek Χριστούγεννα, Dutch Kerstmis, Frisian Kryst, Luxembourgish Chrëschtdag, Albanian Krishtlindje. From English, the word has been borrowed into many languages, such as Hindi krisamas, Nepali Krisamasa, Malayalam krismas, Japanese kurisumasu, Samoan Kerisimasi, Tamil Kiṟistumas, Telugu Krismas, Swahili Krismasi, Thai Khris̄t̒mās̄, Xhosa Krisimesi, and so forth.

And with this little overview of Christmas words, I would like to wish you all a Merry Christmas!

-The text has been updated 2018-12-15-

Read the full post »
Phylogenetic tree, where Tocharian is second to branch off, after Anatolian (by Chundra Cathcart).
This post is related to what I am currently busy with: preparing an introductory course on Tocharian. There is a long-debated dilemma in Tocharian studies, which concerns the position of Tocharian within the Indo-European language tree. Due to its status as a centum language, most scholars of the early 20th century regarded Tocharian as a western Indo-European language (together with Celtic, Germanic, Italic and so forth) rather than an eastern language. This view is no longer supported, but the position of Tocharian still remains an enigma. Today, most scholars agree that Tocharian branched off from the Indo-European proto-language directly (and is thus not more closely related to any other branch). The disagreement among contemporary scholars is whether or not Tocharian branched off second, after Anatolian and before the other Indo-European branches. There are several arguments in favor of the second-to-branch-off theory. One argument is the occurrence of lexical archaisms in Tocharian, meaning that a handful of etymologies have preserved a more general meaning in Tocharian, whereas the other branches show a more specialized meaning. Examples are:
  • Toch. AB yäp- ‘enter’, Skt. yabh-, Greek oíphō, Russ. ebu ‘have intercourse’ < PIE *yebh- ‘enter’ (LIV:309). The original meaning of the verb is preserved in Tocharian.
  • TB kärweñe ‘stone, rock’, Skt. grāvan- ‘stone for pressing out soma’, Welsh breuan ‘handmill’, Old Ch. Slav. žrǔny ‘handmill’.
  • TB śrān-* ‘(adult) man’ < PIE *ǵerh₂-ōn, Skt. járant- ‘old, fragile’, Gr. géront- ‘geriatric’, Oss. zärond ‘old’ < PIE *ǵerh₂- ‘mature, grow’ (LIV:165). The meaning ‘old’, ‘geriatric’ is an innovation of the non-Tocharian languages.
The idea of lexical archaisms is not totally irrelevant: as I wrote in my previous blog post, we know from statistical testing that specialization is more frequent than generalization.
The other argument is from phylogenetics. In phylogenetic trees, Tocharian consistently branches off second, after Anatolian. Again, this argument is based on lexical data, but from a completely different angle.
What about grammar? The arguments in favor of Tocharian being second to branch off are complicated, in particular since they depend on which type of system we reconstruct for Proto-Indo-European. Without going too much into detail, we have two types of reconstructions: one relatively simple system, more similar to Anatolian, from which the other branches developed their systems, and one more complex reconstruction, more similar to Sanskrit and Classical Greek, from which Anatolian lost most of its grammar. The position of Tocharian here is not clear. It is obvious that Tocharian rearranged and rebuilt most of its nominal system, and partly also its verbal system, and this complicates the picture. The Tocharian restructuring of the system was partly done with morphological material that is also found in the other branches, partly in Anatolian but also in Old Indic and Classical Greek.
The enigma waits to be solved.
Read the full post »

You are welcome to visit the DiACL infrastructure and lab. All data are open access and free for everyone to use!