Historical linguistics blog - even Thursdays

Phylogenetic tree, where Tocharian is second to branch off, after Anatolian (by Chundra Cathcart).
This post is related to what I am currently busy with: preparing an introductory course on Tocharian. There is a long-debated dilemma in Tocharian studies, which concerns the position of Tocharian within the Indo-European language tree. Due to its status as a centum language, most scholars of the early 20th century regarded Tocharian as a western Indo-European language (together with Celtic, Germanic, Italic and so forth) rather than an eastern language. This view is no longer supported, but the position of Tocharian remains an enigma. Today, most scholars agree that Tocharian branched off from the Indo-European proto-language directly (and is thus not more closely related to any other branch). What contemporary scholars disagree on is whether Tocharian branched off second, after Anatolian and before the other Indo-European branches. There are several arguments in favor of the second-to-branch-off theory. One argument is the occurrence of lexical archaisms in Tocharian, meaning that a handful of etymologies have preserved a more general meaning in Tocharian, whereas the other branches show a more specialized meaning. Examples are:
  • Toch. AB yäp- ‘enter’, Skt. yabh-, Greek oíphō, Russ. ebu ‘have intercourse’ < PIE *yebh- ‘enter’ (LIV:309). The original meaning of the verb is preserved in Tocharian.
  • TB kärweñe ‘stone, rock’, Skt. grāvan- ‘stone for pressing out soma’, Welsh breuan ‘handmill’, Old Ch. Slav. žrǔny ‘handmill’.
  • TB śrān-* ‘(adult) man’ < PIE *ģerh₂-ōn, Skt. járant- ‘old, fragile’, Gr. géront- ‘geriatric’, Oss. zärond ‘old’ < PIE *ģerh₂- ‘mature, grow’ (LIV:165). The meaning ‘old’, ‘geriatric’ is an innovation of the non-Tocharian languages.
The idea of lexical archaisms is not totally irrelevant; as I wrote in my previous post, we know from statistical testing that specialization is more frequent than generalization.
The other argument is from phylogenetics. In phylogenetic trees, Tocharian consistently branches off second, after Anatolian. Again, this argument is based on lexical data, but from a completely different angle.
What about grammar? The arguments in favor of Tocharian being second to branch off are complicated, in particular since they depend on which type of system we reconstruct for Proto-Indo-European. Without going too much into detail, we have two types of reconstructions: one relatively simple system, more similar to Anatolian, from which the other branches developed their systems, and one more complex system, more similar to Sanskrit and Classical Greek, under which Anatolian lost most of its grammar. The position of Tocharian here is not clear. It is obvious that Tocharian rearranged and rebuilt most of its nominal - and partly also verbal - system, and this complicates the picture. Tocharian rebuilt its system partly with morphological material that is also found in the other branches, partly in Anatolian but also in Old Indic and Classical Greek.
The enigma waits to be solved.
An evolutionary reconstruction of meanings of the cognacy tree of Proto-Indo-European *st(e)h₂w-ro- ‘big cattle’ (?) (by Harald Hammarström)
I have not shared anything in a month, since I have been on a 'road-trip', first to Arizona for the CES conference, and then to Beijing and Changsha (Hunan Province) for a lecture series on historical and evolutionary linguistics.
In Arizona, Harald Hammarström, Sandra Cronhamn and I presented some results of evolutionary semantic studies on the culture vocabularies of our corpus, including data from Indo-European, Caucasian families, Turkic, Uralic, Basque and ancient Semitic (book of abstracts is found here). This study has two aspects: the first is the causes of change rates, the second the directionality of semantic change.
In this post, I will focus on the first aspect, the causes of change rates. As our data, we used the 100-item list of cultural words for farming, pastoralism, hunting, war, technology, and industry that we have in our database DiACL. We built an evolutionary model in which we measured gain and loss rates of 21,874 meaning tokens (6,224 types) within cognate trees, contrasted against Glottolog reference trees. After adjustment for transition frequency, 3,442 meanings remained. We then tested the gain and loss rates (given as probabilities) against various metrics. We have some preliminary results, but the issue is still being researched. Previous research on lexical change rates (e.g., Pagel et al, Nature 449, Vejdemo et al, PLOS 2016 11,1) has indicated a connection to word frequency (the more frequent a word is, the lower its change rate), as well as to age of acquisition, synonyms, arousal, imageability and average mutual information. However, this research has been performed on basic vocabulary only, and we expect most of these factors to be less relevant to a vocabulary such as ours. Frequency, for instance, showed no correlation at all with our results. However, we found a negative correlation with borrowability, which is highly noteworthy: apparently, lexemes that are frequently borrowed have slower change rates. Further, we found a correlation with colexification tendency, as well as with cognacy productivity, which is to be expected (words that change their meaning often and which are geographically diverse are expected to have high change rates). Currently, we are testing various semantic properties of the lexemes, and this is where the interesting part begins: it is evident that inherent properties that are said to impact gender and classifiers, such as animacy, shape, mass/count etc., have no correlation with change rates. Cultural aspects, however, such as labour intensity, processability, and the possibility to control and change, do have an impact.
I am still testing various properties and aspects, and hopefully, results can soon be made ready for submission.   
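As a rough illustration of the kind of test mentioned above, the sketch below computes a Spearman rank correlation between a borrowability score and a change rate for a handful of meanings. The function names and all numbers are invented for the example (they are not values from our data or from DiACL), and rank correlation is only one possible measure, not necessarily the one used in the study.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical toy values, one entry per meaning (NOT real results).
borrowability = [0.05, 0.10, 0.20, 0.35, 0.50]
change_rate = [0.40, 0.30, 0.35, 0.15, 0.10]

rho = spearman(borrowability, change_rate)
print(f"Spearman rho = {rho:.2f}")
```

With these made-up values, rho comes out clearly negative, which is the shape of result described above: the more often a lexeme is borrowed, the lower its change rate.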

Today, I am busy preparing my talk for the Cultural Evolution Society Conference at ASU Tempe, Arizona, and will not have time for a blog post. In two weeks I will return with an overview of our talk on semantic evolution!
Visit the CES 2018 homepage, and follow on Twitter or Facebook.
 
Read also our recent paper on the DiACL database on ancient language typology.
 

Drinking party among northern people according to Historia de gentibus septentrionalibus by Olaus Magnus (1555).

Liquids are not just vital to our survival, they also form a central part of our culture. Most human gatherings have drinking as their common denominator, be it water, wine, beer, tea, or coffee. This post is about ancient drinking and words for drinking in languages (coffee and tea will be covered in a later post).
The two most vital liquids to humans – as well as to mammals in general – are water and milk. Water we drink all our lives; without water we cannot survive. Milk we drink our first year; during this period, milk represents our entire need for nourishment. In many cultures, individuals continue to drink milk from cows, goats or sheep, either in the form of fresh milk or as cheese or yoghurt. In other cultures, milk is not a natural part of the diet later on in life.
The words for water and milk are both highly conservative words that belong to the basic vocabulary of languages. In Indo-European, both words can be reconstructed to the proto-language, and their forms have not changed much during the family’s history. The Proto-Indo-European word for water, *wód-r-/wéd-n-, is in its earliest attestation, Hittite watar, strikingly similar to English water several millennia later. The root for milk, Proto-Indo-European *h₂melǵ-, is not very different from the form in Russian molokó, Tocharian B malkwer, Old Norse mjǫlk, or English milk. Fresh milk as a drink is most frequent in Europe and less frequent in other parts of Eurasia, and the ability to drink milk, lactose tolerance, is due to a genetic mutation that goes back 6,000 years in Central Europe. The mutation is not unique to Europe: other independent epicentres are found in Saudi Arabia and Western Africa.
At least in Europe, there is a popular notion of ‘drinking belts’, which is sometimes used to generalize about the mentality of various peoples: typically the ‘wine belt’ and the ‘beer belt’, often also the ‘vodka belt’, and sometimes a ‘milk drinking zone’. Beer and wine are both very ancient and central drinks in all of Eurasia. Another important drink is mead, which is tightly connected with beekeeping. Mead has lost its importance in recent millennia, probably due to the more efficient production of beer and wine. Vodka, whiskey and other distilled drinks have a short history: they are a result of distillation, which is a relatively modern process.
Of beer and wine, beer is the more archaic drink, and it appears in many lexical forms. The preparation of an intoxicating, fermented drink based on cereals was invented by the earliest Neolithic farmers in West Asia and Anatolia 10,000 years ago. With the preparation of beer also came the practice of the cultic feast: occasions where people worshipped the gods, ate, drank, sacrificed, and got (probably very) drunk. A common word for beer can be reconstructed for Indo-European, *h₂el-u-, but it is frequently replaced (as in English beer): most likely, the production of beer differed between cultures, with many local variants, and for this reason many languages replaced their beer words.
Wine has a different story. The production of wine is tied to the farming of the domesticated grape, a practice that began in the Caucasus region about 8,000-7,000 years ago. The word for wine is also largely the same across languages, and most likely the word spread through the languages at an early stage, together with the invention of wine. Grapes cannot be grown in the northern parts of Eurasia, yet all languages have a word for wine. The ultimate source of the wine root is not clear. Often, Proto-Semitic or Proto-Kartvelian is believed to be the source of the word (PIE *woh₁i-no- ‘wine’ < PIE *weh₁-i- ‘to turn, wind’; Proto-Kartvelian *ɣwin- ‘wine’, Proto-North-West Caucasian *ωwə- ‘wine; alcoholic drink’, Proto-Dagestanian *ωun- ‘wine; one-year-old vine shoot’, also found in early Semitic languages, Old Testament Hebrew yayin, Ugaritic yn). The Indo-European root, on the other hand, is derived from a verb meaning ‘to wind’ (referring to the vine), which to some indicates an Indo-European origin. However, this may be a secondary adaptation in Indo-European, so we cannot be certain about the origin of the word wine.
Cognacy map of words for WINE in modern (top) and ancient (bottom) languages. One root dominates almost the entire map.
Ovid's Golden and Silver Age, as interpreted by the French Renaissance painter Bernard Salomon (1506-1561)

In Greek and Roman mythology, as carefully described by Ovid, there is a tale about the four ages, which are named after metals: the ‘golden age’, the ‘silver age’, the ‘bronze age’, and the ‘iron age’. Of the four ages, the golden age represents the perfect state, where gods dwell among humans, all humans are equal and respect each other, food and clothes are found in abundance, and war is absent. Each of the following ages represents a decay in the conditions and in the morality of humans, and during the final stage, the iron age, humans are at constant war, friends and brothers kill each other, families are separated, and hunger dominates. In the end, evil rules the entire earth. This myth is paralleled in the Old Indic myth about the yuga, the world ages, which are also described in terms of metals. Even today, for instance in a competition or a race, or in the counting of wedding anniversaries, we value the metals in the order gold, silver, bronze, iron, even though bronze, for instance, has been out of industrial use for a long time.
So, the question remains: does this reflect a historical development of metallurgy or is it simply a mirror of humans’ value of the various metals? And how is this ancient evaluation of the metals reflected in language?
Like many other things, metallurgy emerged in West Asia and Anatolia 10,000-8,000 years ago, together with the development of agriculture. In the earliest phase, metals were used to decorate pottery, and only later did a proper industrial use of metals emerge. The earliest metal objects, which were made of copper, were various tools, in particular small knives or daggers. Around 6,000-5,000 years ago, the technique of smelting copper in furnaces was found in Mesopotamia, Egypt, Anatolia, Central Europe, the Caucasus, and the Steppe area, and from the Balkans the new technique spread quickly over all of Europe. This marks the beginning of a new era, labelled the Chalcolithic, which also saw the emergence of many other important techniques, such as the wheel, the plow, and the domestication of the horse. An important result of metallurgy, not just in mythology, was the emergence of large-scale war parties – more like ‘wars’ in a modern sense, with organized armies and massive killings on battlegrounds.
Around 4,000 years before present, in Anatolia, the processing of a new metal would bring about a historical change in the entire area and pave the way for the industrial revolution millennia later: iron. Since this metal requires higher temperatures for smelting, an improvement of the furnaces and technology was necessary to enable the smelting and working of iron objects. This marked the beginning of a new era, the Iron Age, in which tools, ploughs and weapons were hardier, sharper, and more efficient. Again, iron affected both agriculture and warfare, something that we may suppose lies behind the mythical interpretation of the Iron Age.
Gold and silver, which are soft and shiny metals, are worthless for industrial use. Nevertheless, they are more appreciated than copper and iron. Why? The answer is complex. Iron and copper are associated with warfare and technology, gold and silver with shiny objects and wealth. Already in the Neolithic, and increasingly so during the Chalcolithic, gold was used for ornaments on weapons and objects, typically hammered out into very thin layers – something that indicates the high value and exclusiveness of this metal already at an early stage. Only in early antiquity did gold and silver become what they still are (at least until the abandonment of the gold standard after WWI): a standard for measuring the value of all other objects.
What does language tell us about the history of metals? Quite a lot, actually. All the basic metals can more or less be reconstructed to the Indo-European proto-language, just as for the Caucasian families, and the reconstructed forms, as well as the meaning changes of the metal words, give us important information about their emergence and early use.
Gold in Proto-Indo-European is derived from a root with the meaning ‘yellow, green’ as well as ‘shining’ (PIE *ǵhelh₃-to- ‘gold’ < PIE *ǵhlh₃- ‘green, yellow’). In languages, gold words typically change to meanings ‘shining’, but also to ‘coin’, ‘wealth’, and ‘currency’.
Silver has a different story. The main word for silver in most languages, including non-Indo-European ones, is a migratory root of obscure origin, *silubhr-, which spread secondarily at a very early stage (probably at the proto-language level) and became the standard word in most Eurasian languages. Again, some languages formed words from roots meaning ‘white, shining’ (PIE *h₂erǵ- ‘brilliant, white’, which among other things underlies the Tocharian A speakers’ word for themselves, ārśi). This indicates that gold was an indigenous metal in Indo-European and Caucasian, whereas silver spread secondarily through the early stages of the proto-languages.
Copper, bronze and iron have different stories. A term for copper can be reconstructed to Proto-Indo-European (PIE *h₂éi-es- ‘metal, copper, bronze’) as well as to Caucasian proto-languages, indicating that these people were familiar with copper and likely produced copper industrially. Words for copper and bronze are completely intertwined in various languages, indicating that people of the past did not distinguish these metals by their names. Iron cannot be reconstructed to any proto-language. In languages, iron-words are derived either from a root ‘red, bloody’ (Proto-Indo-European *h₁ēsh₂r-no- ‘bloody, red’, underlying, e.g., English iron, Swedish järn), or they are initially designations for other metals, which have received the meaning ‘iron’ (Proto-Uralic *waśke ‘copper, metal’). The Proto-Finnic word for iron is a loan from Proto-Germanic, *rauðan- ‘bog iron’, which originally meant ‘red’.
The value and cultural function of the various metals emerge from the patterns of how words for metals change their meaning. Gold and silver words change their meaning to (or colexify with) ‘wealth’, ‘currency’, ‘coin’, ‘shining’, ‘bright’, gold with the colours ‘yellow’, ‘blonde’, and silver with ‘white’. Words for copper change their meaning to (or colexify with) ‘tin’, ‘zinc’, ‘lead’ or other metals, iron with ‘sword’, ‘weapon’, ‘tool’, ‘toughness’, but also ‘red’ and ‘blood’.

Semantic network of colexifications (blue) and coetymologizations (red) or both (purple) of Indo-European metallurgy words (green). Graph by Niklas Johansson.


All languages contain loanwords. Some languages, such as English, have more of them; other languages, such as German or Icelandic, have few; and others again, such as Swedish, have a moderate level of loans. Why is this the case? The answer is very complex and is a combination of past and present history, geography, language size and power, and language structure. As a rule, languages are affected by contact with their neighbours. No language – and no part of language – is totally “loan-proof”. Any word in a language can potentially be replaced by a word from another language. But why? Why would a language replace a word for ‘mother’, ‘finger’ or ‘sky’, when it already has a word for this concept (which all languages actually have)?
There are large differences between languages. Languages with more grammar, such as German or Icelandic, are more resistant to borrowing: the grammatical system of these languages runs the risk of breaking down if the influx of loans is too large. Languages with less grammar are more open to borrowing.
There are also large differences between words. Words for modern cultural phenomena, such as computer, tea, or latte, are loanwords in almost all languages. The languages that are exceptions are typically minor languages, which naturally do not like to be overwhelmed by foreign words: these languages run a risk of disappearing anyhow. An example is the word for ‘radio’, for which Swedish borrowed the English term radio, whereas Icelandic rendered the word as útvarp ‘out-throw’. The word for ‘airplane’ in Scandoromani comes out as sasster-tjirklo ‘iron-bird’, and ‘webpage’ as khereske-rigg ‘house-of-side’.
On the other hand, we have words that are almost never borrowed. Here, we find kinship terms, body parts, numerals (mostly), words for basic bodily functions and sense perception, and words for natural phenomena (‘sky’, ‘stone’, ‘ground’). Linguists are particularly fond of these words, the so-called basic vocabulary: they are good for almost everything in language, from investigating human cognition to establishing language families.
When it comes to cultural words apart from the modern ones, the issue is trickier. Words that are inherent from a Eurasian perspective, meaning that they are part of the vocabulary for the cultural system of farming and pastoralism that has been present in Eurasia since the Neolithic or the Chalcolithic, are typically borrowed in languages outside of Eurasia (due to colonization).
We wanted to look at these words and compare them inside and outside of Eurasia. We looked at 100 words from the vocabulary of farming, pastoralism, hunting, and technology, which we compiled for 160 languages of different periods (4,000 BP-now) in Eurasia (Europe, Caucasus, Central and South Asia). We selected words for items that had been in use at least since the Chalcolithic, such as ‘wheel’, ‘axe’, ‘cow’, ‘horse’. It turned out that the proportion of borrowed words in Eurasia was relatively high (12%), but that this level varied greatly between languages. Compared to languages outside of Eurasia (as reported by the WOLD project), the percentage of loans was low.
The most interesting result was a significant correlation between language size and the tendency to lend or borrow words in our corpus. Large languages were donor languages to all other languages, including other large languages. The level of mutual borrowing between languages decreased all the way down to the smallest languages, which were very frequent recipients of loans but almost never sources of loans, not even to other small languages. This demonstrates one of the most crucial causes of borrowing: power, defined not only by population size, but also by economic, material, cultural, or political power. Occasional cases of reverse borrowing of specialized words – think of lingonberry, fartlek, smorgasbord or ombudsman (Swedish loans in English) – remain infrequent exceptions in the larger perspective.
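To make the source-by-target comparison concrete, here is a minimal sketch of how loan events could be tallied into a matrix of source size against target size, which is the kind of table a heat map is drawn from. The event list, the variable names and the `tally` function are hypothetical and made up for illustration; they are not our 1308 loan events, and this is not the code behind the actual graph.

```python
from collections import Counter

# Hypothetical loan events as (source_size, target_size) pairs,
# with size classes from 1 (smallest) to 5 (largest). NOT real data.
loan_events = [
    (5, 5), (5, 3), (5, 1), (4, 1),
    (4, 2), (3, 1), (5, 1), (1, 5),
]

def tally(events, sizes=range(1, 6)):
    """Count events per (source, target) pair.

    Rows are source sizes and columns are target sizes, both ascending.
    """
    counts = Counter(events)
    return [[counts[(source, target)] for target in sizes] for source in sizes]

matrix = tally(loan_events)

# Row sums: how often each size class acts as donor.
donated = [sum(row) for row in matrix]
# Column sums: how often each size class acts as recipient.
received = [sum(column) for column in zip(*matrix)]

print("donated by size class 1-5: ", donated)
print("received by size class 1-5:", received)
```

In this toy data the largest class donates most often and the smallest class receives most often, mirroring the asymmetry described above.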
    

REFERENCE
Presentation by Gerd Carling & Sandra Cronhamn at SLE 2018, Tallinn.

Heat map demonstrating frequency of co-occurrence as Target (x) and Source (y) language, defined by language size (1-5, with 5 = largest and 1 = smallest), based on 1308 loan events in 160 Eurasian languages spanning over 4,000 years of documentation (graph by Johan Frid).
Liber Vagatorum "Book of Vagabonds" from 1510


The very concept of ‘secret’ languages appears as if it were taken out of a crime novel. We may think of military secret codes, jargons of criminal inmates, or suburban youth slang. However, are not all languages (except for, let us say, standard English) in some sense ‘secret’, as long as they are spoken by a closed group of people and unintelligible to outsiders? This is true in many cases: for instance, minority languages, immigrant languages, local languages or dialects, youth jargons, or ethnolects – these represent communication systems that are restricted to a closed group of speakers and not shared by outsiders. So what makes a language a ‘secret’ language? The answer is complex.
A secret language is no one’s mother tongue – this is probably the most important distinction from a ‘normal’ language. Rather, secret languages traditionally represent a jargon that was transferred from father to son, together with an occupation or a lifestyle, the purpose of which was to keep not just outsiders but also members of one’s own family out. Secret languages connected to various occupations are found in Europe as well as in Africa and South America. They are very often the idiom of occupations with a distinct social function, most typically occupations that are accepted within society but have a special, often low, status. In Europe, pedlars, dealers, chimney sweepers or circus people, but also various types of low-status occupations, such as the executioner's henchman or skinners, used to have their own secret languages. In Africa, to mention an example, we have documented secret languages among healers, skinners, and sandal flickers.
Linguistically, secret languages do not possess their own grammar, as ‘normal’ languages do. Their grammatical system relies on that of another language, usually the majority language of the country where they occur. The grammar is often simplified, and syntactic patterns are replaced by pidgin-like structures. A frequently occurring phenomenon is to borrow the ‘appearance’ of a language, by means of stress patterns, prosody, dialectal variation and gestures, but to switch all content words, sometimes the entire lexicon. The lexicon is either taken more or less completely from another language, or it is an ad hoc conglomerate of words from various adjacent source languages. Very often, secret languages ‘distort’ their words by various complex patterns of morphological transformation; for instance, they truncate words and add heavy suffixes, they reverse syllables or letters, or they add epenthetic vowels within words. The result is a language that ‘melts in’ – from a distance it appears to be a native or indigenous idiom, but not one single word is understandable to outsiders.

On Scandinavian soil, there are several traditional secret languages. One is the pedlars’ language, which in fact is two: one in the isolated county of Dalecarlia, gråmål ‘grey language’ or monsing, the main pedlars’ secret language, which during the 20th century transformed into a prisoners’ language. The vocabulary of monsing is based on multiple languages. Many words are borrowed from Scandoromani, the language of the indigenous Swedish Romani speakers; other words are from Low German, from Rotwelsch, the medieval secret jargon of European outsiders, from Finnish and Russian, as well as from Swedish. Swedish words are totally changed by linguistic distortion. Sources of monsing go back to the 17th century, and they give us a glimpse of the type of communication that monsing speakers had. Besides communication related to their occupation, much of the content is rude, such as talk about the farmers (who are supposed to be stupid) and in particular their wives and daughters (who are the target of their sexual interest).
Even though there are no ‘real’ speakers of these languages in Sweden anymore, monsing is still, together with Scandoromani and knoparmoj, the secret language of chimney-sweepers, a very important source for words in the Scandinavian vernacular languages.

Reference
Carling, Gerd, Lenny Lindell & Gilbert Ambrazaitis (2014) Scandoromani. Remnants of a Mixed Language. Boston: Brill

Samples of the Swedish secret language Monsing (from 18th and 19th century sources). Most of them have found their way into Swedish slang.


I found a fun map on Twitter, from the Foreign Service Institute (see below), which categorises the difficulty of learning a language, measured in number of weeks. According to the map (which applies to (American) English speakers), Swedish and French are supposed to be very easy to learn, whereas Russian, for example, is found among the more difficult languages. Even though it applies to English speakers, the map would not be very different for a speaker of Swedish or German. Why is that so? If you ask normal people (i.e., without a degree in linguistics), the answer would naturally be that languages like English and Swedish “have no grammar”. If you ask what they mean by “grammar”, many would come up with the answer “they have no cases”.
In learning a language like Russian, we have to start learning many case forms early on, and then learn the rules for how to apply them. This is difficult for most of us who use a language with prepositions (in, on, on top of, towards) rather than cases. But why do some languages have cases instead of prepositions? Or, to reverse the question, why do some languages have prepositions instead of cases? And are the uses of prepositions really easier to learn than the uses of cases? Very few languages (such as Hungarian, Ossetian or other exotic languages of the Caucasus mountains) have as many cases as any normal language such as Swedish or English has prepositions. The rules of English prepositions are also hard to learn, and speakers of Swedish, for example, often make mistakes in the use of prepositions.

However, if we compare the map of learning difficulties with the map of case system types below (data from the DiACL database), the correspondence between the two maps (in the parts that overlap) is striking. Analytical systems are the easiest, followed by fusional, and finally by agglutinating and other more complex systems. It would be very interesting to see what the map would look like for native speakers of Finnish or Russian.
 
Case systems are interesting, since they indicate that languages are cyclical in their evolution. Case systems are basically of three types:

  • isolating (or analytic), with no cases; relations between participants in an event are expressed by prepositions (or postpositions),
  • agglutinating, with cases expressed by affixes with a simple function (plural, dative), which are attached to the noun stem,
  • fusional, where paradigms are built by cases which may mark several functions, such as feminine + dative + plural.


Case systems are typically of one of these types, where isolating systems are small, with 0-2 distinctions (e.g., English, Swedish, Danish), fusional systems are medium-large (e.g., Russian, German), often with many different forms in the language, whereas agglutinating systems tend to be large (e.g., Finnish, Hungarian, Turkic). Agglutinating systems are ruled by the principle, one suffix – one function (e.g., plural, dative).
Systems are seldom ‘pure’: most languages have case systems that are partly isolating, partly agglutinating, partly fusional. That is what makes them difficult to learn.

Why is the case situation the way it is? The structure of case systems has multiple explanations, and linguists are not yet aware of all the details in the process and development of case systems. One important reason for the outcome is language change and the cyclical behaviour of case systems: fusional systems (e.g., Russian) tend to break down or erode into isolating systems (e.g., English), which in turn may merge their combinations of noun + adposition into an agglutinating system (e.g., Turkish). And agglutinating systems, again, may fuse their forms to become fusional. However, in this cycle, languages may become stuck for millennia between states, where various types of mixed, weird and complex systems, with many irregular forms, become standard.
Besides time and cyclic change, geography and language contact shape case systems. The situation is complex: case systems show clear tendencies to share similarities across language, branch and family boundaries. For instance, systems with no cases are more frequent in Western Europe, fusional cases are more frequent in Eastern Europe and in various conservative pockets (islands, forests) such as Iceland, the Faroe Islands, Germany, and Dalecarlia, and agglutinating cases are more frequent on the Asian landmass (except in the east, in China). But the map is complex: historical explanations compete with geographic explanations, which in turn compete with explanations based on typological cyclic behaviour, when we try to explain the structure of case systems.


 
 

From https://twitter.com/AmericanGeo/status/1010364347502059520
Distribution of types of nominal case systems in modern (top) and ancient (bottom) languages. Dark red (1) marks systems without case, green represents fusional types, and shades of pink/purple represent agglutinating systems.
Illustration of the morphological cycle of case systems. Tocharian is an example of a mixed system which has moved in the opposite direction, from fusional to agglutinating.
Pigs from "Historia de gentibus septentrionalibus" by Olaus Magnus (1555)


In Scandinavian folklore, there is a story about a lethal pig, the Gloso (‘glowing sow’), which kills lone hikers on their way home on Christmas Eve. The pig is black with glowing red eyes, and its back is a sharp saw: running between people’s legs, the creature cuts them in two. The only way to survive a Gloso is to jump into the ditch as soon as you spot the animal’s glowing eyes from a distance. Stories about lethal pigs are also found in Celtic mythology, in the tale about Mag Mucrime, pigs from the underworld, which haunt and ravage the lands, killing people and destroying the fields. However, the ancient Celts were also very fond of their pigs. Typically, the helmets and shields of Celtic warriors were decorated with boars, and we should not forget that a pig is at the centre of one of the most important Celtic epic tales, the story of Mac Da Thó’s pig. Likewise, in Germanic mythology, the pig is the animal of the god of fertility, Frey, and the boar Sæhrímnir, which can be eaten again and again, plays a central role as provider of meat to the dead warriors and the gods of Valhalla.
      
How is it that the most important protein source for ancient Neolithic farmers had such different roles in the various cultures of the Eurasian continent? Banned in some cultures, worshipped in others, and in others again associated with death and the netherworld: apparently, ancient people were not neutral towards the pig. Our answers are partly found in language.

Like the cow, goat and sheep, the pig belongs to the earliest domesticated animals, dating back to 10,000-11,000 BP in Anatolia and West Asia. Very likely, the first pigs were wild pigs attracted to human settlements by the waste. The early farmers, who must have understood the value of pigs as a protein source very quickly, gradually domesticated them by killing the males and keeping the females for reproduction. In fact, even today, pigs are the most efficient protein source in farming, besides chicken. The great danger associated with hunting wild boars must have contributed to the early farmers’ high esteem of pig domestication.
Domestication of pigs spread with the spread of farming, but for some reason (maybe because pigs are useless for herding or because they are prone to disease) it did not reach as far as the domestication of cow, goat, and sheep. Pigs are extremely unusual in Ancient Egypt, and pig domestication never reached Central Asia. In parts of West Asia and Anatolia, there was a decline in pig domestication already in early antiquity, something that was later transformed into a complete ban through religion, as in Judaism and later on also in Islam. In cultures where pig domestication continued (Eastern and Western Europe, and the Mediterranean), the pig acquired a dual role: it was both an animal associated with death and the underworld, used in chthonic sacrifices, and an animal symbolizing fertility and prosperity. This is found in Graeco-Roman, Celtic, and Germanic mythology alike.

Can linguistics help us solve this enigma? There are several ways of investigating cultural patterns through language: either to look at the origin of words and their etymology down to the proto-language, or to consider the colexification patterns (meanings that co-occur in one word of a language) and the meaning change patterns of words in genetically related languages. The stability and spread of cognates, as well as borrowing tendencies, are important sources of evidence as well.

If we look at linguistic reconstructions, the picture is complex and interesting. Pig words, including a general word for ‘pig’ (generic), which is often the same as ‘sow’, as well as ‘piglet’, can be reconstructed to Proto-Indo-European (PIE *suH- ‘pig’, PIE *porḱo- ‘young pig, piglet’). These lexical roots, which had the meaning of ‘pig’ and ‘piglet’ already in the proto-language, indicate that the Indo-Europeans had domesticated pigs. They are represented in a vast majority of pig words in Indo-European languages. In addition, some sub-branches replaced the forms or added new words for the pig terms. In Germanic languages, the word for the male pig was derived from a root meaning ‘infertile’ (PGm *galtan- ‘boar’ < *gald(j)a- ‘infertile’ < PIE *ghol-tó-), indicating that male pigs or boars were gelded rather than killed. Several languages created new lexemes by referring to the grunting sound of pigs, such as Lithuanian čiūkà, kūkà ‘pig’ (Balto-Slavic *kyaw-, *kyū- < PIE *kew-, *kū- ‘to howl’) or Old and Modern Irish cráin ‘sow’ (Proto-Celtic *krākni- ‘sow’). Some languages used the widespread Indo-European root for ‘young of animal’ (PIE *wetso- ‘young of animal’ < *wet- ‘year’).
The wild boar has its own root in Proto-Indo-European (PIE *h₁pr-o- ‘(wild) boar’), e.g., Latin aper, but very often this root comes to represent both the wild and the domesticated male pig, as in Croatian vȅpar, German Eber. Several languages use the Proto-Indo-European root PIE *h₂wŕ̥s-en- ‘male’ for the wild boar, such as Sanskrit varāha-, Hindi varāh, bā̆rāh. Otherwise, a combination of a root meaning ‘wild’ and the root PIE *suH- ‘pig’ is very frequent, as in Bulgarian díva svinjá, German Wildschwein. In general, words with the meaning ‘wild boar’ also frequently mean ‘(domesticated) boar’, which indicates that the wild boar was represented by the (male) boar, in contrast to the (female) sow, which represented the domesticated pig.
The Caucasian proto-languages (Proto-Kartvelian, Proto-North-West-Caucasian, Proto-Nakh, and Proto-Dagestanian) all have reconstructed words with the meaning ‘pig’ (PKv *ɣor- ‘pig’, PNWC *ɣaw- ‘pig, piglet’, PN *eɣ-ə ‘pig’; PD *bol’- ‘pig’, PKv *burw- ‘gilt (female pig, 3-12 months old); suckling pig’, PNWC *bl˜’-ə ‘sow, female pig’, PN *borl’- ‘colourful’). This clearly indicates that the early Caucasians domesticated the pig, something that we know they did early on.
Uralic languages, on the other hand, borrowed their pig words from Indo-European or Iranian (Proto-Finnic *sika ‘pig’, Proto-Finno-Ugric *porśas, *porćas, loan from Indo-Iranian), indicating that the early Uralic tribes did not domesticate the pig; they adopted pig domestication from Indo-European tribes.

However, the patterns of meaning change and colexification of pig words give an interesting picture. First, pig words often shift to denote other animals, often large and ‘chubby’ animals, such as ‘elephant’, ‘stallion’, or ‘camel’. This is particularly the case in Caucasian languages. Words for the domestic pig occasionally develop negative connotations, such as ‘filthy person’, ‘immoral person’, ‘fat’, or ‘greedy’. However, meaning changes and colexifications in the direction of power and fertility are frequent, such as ‘bull’, ‘hero’, ‘powerful’, ‘king’, ‘manly’, ‘chieftain’ and ‘husband’, in particular with the (wild) boar.
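For readers who like to see the idea spelled out, colexification can be sketched in a few lines of Python. This is only a toy illustration, not the procedure behind the semantic network shown with this post: the tiny lexicon contains nothing but the boar words mentioned above, the meaning labels are my simplified reading of them, and the function name `colexification_counts` is made up for the example.

```python
from collections import Counter
from itertools import combinations

# Toy lexicon (an assumption for illustration): each word form is mapped
# to the set of meanings it expresses, following the examples in this post.
LEXICON = {
    ("Latin", "aper"): {"wild boar"},
    ("Croatian", "vȅpar"): {"wild boar", "domestic boar"},
    ("German", "Eber"): {"wild boar", "domestic boar"},
    ("German", "Wildschwein"): {"wild boar"},
}


def colexification_counts(lexicon):
    """Count, for every pair of meanings, how many word forms express both."""
    counts = Counter()
    for meanings in lexicon.values():
        # Sorting gives each pair one fixed order, so (a, b) and (b, a)
        # are counted as the same colexification.
        for pair in combinations(sorted(meanings), 2):
            counts[pair] += 1
    return counts


counts = colexification_counts(LEXICON)
for (first, second), n in counts.items():
    print(f"{first} ~ {second}: colexified in {n} word form(s)")
```

In a real study the lexicon would cover many concepts and languages, and the counts would become the weighted edges of a semantic network.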

It is obvious that ancient people both worshipped and admired their pigs, but language indicates that they respected the wild boars most of all, probably because they were dangerous and hard to hunt. The domestic pigs were highly valued but also, apparently, looked down upon. The dangerous pigs we know from mythology have left little imprint on language.
 
(Carling To appear (2019); Gamkrelidze et al. 1995; Larson and Fuller 2014; Mallory and Adams 1997)
 
Carling, Gerd (To appear (2019)), Mouton Atlas of Languages and Cultures. Vol. 1: Europe, Caucasus, Western and Southern Asia (Berlin - New York: Mouton de Gruyter).
Gamkrelidze, Tamaz Valerianovič, Ivanov, Vjačeslav Vsevolodovič, and Winter, Werner (1995), Indo-European and the Indo-Europeans: a reconstruction and historical analysis of a proto-language and a proto-culture (Trends in Linguistics. Studies and Monographs, 80; Berlin: Mouton de Gruyter).
Larson, Greger and Fuller, Dorian Q. (2014), 'The Evolution of Animal Domestication', Annual Review of Ecology, Evolution & Systematics, 45, 115-36.
Mallory, James P. and Adams, Douglas Q. (1997), Encyclopedia of Indo-European culture (London: Fitzroy Dearborn).

Hyllested, Adam 2017. Again on ancient pigs in Europe.
 

Semantic network of colexifications (blue, purple) and meaning change in etymologies (red) of the core concepts (green) PIG, WILD BOAR, and PIGLET in 85 Indo-European languages. Graph by Niklas Johansson.
Google Earth view of Xinjiang, with the areas of Tocharian A and Tocharian B marked on the map


The Takla Makan desert in Western China is in the middle of nowhere. Being there feels more like having landed on a deserted Tatooine than on earth; most villages are very sparsely populated, and sand rocks, red desert sand, and dried-out salt rivers dominate the surroundings. The climate is horrible: winters are freezing, summers extremely hot and dry; springs and autumns are endurable, but temperatures between day and night often differ by 30 °C. In a village called Subashi I met a villager who had spent 20 years digging a well (by hand, I assume, considering the many years he had spent on the project). The well was obviously very deep, but it contained no water.
Nevertheless, French and German expeditions 100 years ago found the remnants of an Indo-European language in the sand-filled grottoes of this desert. The language, which was wrongly labelled ‘Tocharian’, after an Iranian tribe mentioned by the ancient Greeks, turned out to represent a branch of its own on the large Indo-European tree. In recent years, research has revealed new and interesting knowledge of this mysterious people, how they lived, where they came from, and what their language looked like.
During the first millennium CE, the Tocharian civilization flourished along the Silk Road. By that time, Tocharian had split into two languages, which for the sake of simplicity are labelled Tocharian A and Tocharian B. The Tocharian culture was in important aspects not very different from other early Eastern medieval civilizations: it had a warrior class, a nobility, royals, farmers, and a religious class of monks, who lived on alms from the working population. The Tocharians were Buddhists and learned to write from Buddhist missionaries from India, and the system they used to write their language was an adaptation of the Indic Brahmi script. Accordingly, most texts, which date from between 300 and 1100 CE, are of Buddhist content. A large part of the literary sources represents Tocharian adaptations of the Indian Buddhist canon, although parallels in Sanskrit cannot always be found. After the Islamic conquest of Central Asia and the closing of the Silk Road, the Tocharian kingdoms collapsed, the Tocharian language died out, the area was depopulated, and the desert sand quickly buried all traces of the Tocharian people and their language.
Even though our texts in Tocharian are of a relatively late date, at least compared to the ancient civilizations of the Mediterranean or the Fertile Crescent, archaeology, archaeogenetics and, most of all, language give us rich information about the prehistory of the Tocharians.
It is evident that the Tocharians left the Indo-European homeland very early and migrated towards the East. Even though Tocharian is a centum language and actually has more similarities with western than eastern Indo-European languages, it clearly forms its own branch on the Indo-European tree. The long separation from the Indo-European proto-language, together with a long period in isolation from other Indo-European languages, has resulted in two languages with very weird and complex structures. The languages have many case forms, like Uralic and Caucasian languages, and they have double causatives, like Turkic languages. But even though the Tocharian categories clearly show non-Indo-European impact in their typological structure, the inflectional forms themselves are all of Indo-European descent: the setup of verbs easily matches Greek or Sanskrit in its complexity and variety of forms. Most forms and categories reconstructed for Indo-European are there, but often in a reorganized structure and with changed use and meaning.
Even though most preserved texts are of Buddhist content, the language and the specific Tocharian version of Buddhism show many traces of a pre-Buddhist, pagan faith, not very different from what we assume was present in early Indo-European. We have a sun-god and a moon-god, as well as remnants of the so-called heroic myths and the concept of ‘eternal glory’, which is well represented in epic tales such as the Iliad, the Odyssey, or the Mahabharata.
Tocharians borrowed words from the Turkic Uighur language, from Chinese, and from Sanskrit; the latter in large amounts, as almost half of the Tocharian lexicon has its source in Sanskrit. Uighur also borrowed from Tocharian. However, if we move back in time, Tocharian also borrowed a substantial amount of vocabulary, often administrative terms, from Iranian. From 500 BCE onwards, Tocharian seems to have been basically a recipient language, something that indicates that Tocharian during this period was a less important regional language than, for instance, Chinese (in the East) or various Iranian languages (in the West). If we look earlier than that, we find interesting and striking language contacts of Tocharian. Early forms of Tocharian are found in Uralic languages, and very likely, a pre-form of Tocharian is responsible for the Indo-European borrowings into Early Chinese. Therefore we may assume that the Tocharians had a more important cultural role in the archaic period than in the antique period, when they were basically on the receiving end of borrowing.

The archaeological record in the Tocharian-speaking area is astonishingly rich: most famous are the well-preserved mummies, which look like Celts with their pointy hats, tattoos and red braids. Studies of their DNA indicate several origins, in the earlier layers mainly Western European haplogroups, in later layers predominantly Central Asian or Eastern haplogroups. The patrilineal DNA is mainly R1a1, a haplogroup associated with the Proto-Indo-European migration out of Eastern Europe.
However, there are many enigmas that still await a solution. One of the most complex issues is the large number of obscure lexemes in Tocharian. Even though the core vocabulary of Tocharian is completely Indo-European, most words of the lexicon (except for the many Sanskrit borrowings, of course) have either no etymology or a very uncertain etymology. It is possible that the Tocharians borrowed words from a long-lost substrate language, but what would that be? There are few traces of significantly different cultures in the area preceding the Tocharians. Alternatively, the Tocharians picked up words from several extinct, unrelated languages of Eurasia on their way from Eastern Europe to the Takla Makan desert. Very few reliable etymologies in Tocharian can be traced to any of the living language families of Asia.
 
Coming up next: Heroic, lethal, or filthy animal? The history of pig words
 
References: (Adams 2013; Carling 2005; Carling et al. 2009; Mallory and Mair 2000; Malzahn 2011-2018; Pinault 2008)
Adams, Douglas Q. (2013), Dictionary of Tocharian B: Revised and Greatly Enlarged (Amsterdam: Rodopi).
Carling, Gerd (2005), 'Proto-Tocharian, Common Tocharian, and Tocharian – on the value of linguistic connections in a reconstructed language', in Karlene Jones-Bley, et al. (eds.), Proceedings of the Sixteenth Annual UCLA Indo-European Conference (Journal of Indo-European Studies - Monograph Series; Washington: Institute of Man), 47-70.
Carling, Gerd, Pinault, Georges-Jean, and Winter, Werner (2009), Dictionary and thesaurus of Tocharian A (Wiesbaden: Otto Harrassowitz).
Mallory, J. P. and Mair, Victor H. (2000), The Tarim mummies: ancient China and the mystery of the earliest peoples from the West (London: Thames & Hudson).
Malzahn, Melanie (2011-2018), CEToM - A Comprehensive Edition of Tocharian Manuscripts.
Pinault, Georges-Jean (2008), Chrestomathie tokharienne : textes et grammaire (Leuven: Peeters).
