In Scandinavian folklore, there is a story about a lethal pig, the Gloso (‘glowing sow’), which kills lonely hikers on their way home on Christmas Eve. The pig is black with glowing red eyes, and its back is a sharp saw: running between people’s legs, the creature cuts them in two. The only way to survive a Gloso is to jump into the ditch as soon as you spot the animal’s glowing eyes from a distance. Stories about lethal pigs are also found in Celtic mythology, in the tale of Mag Mucrime, where pigs from the underworld haunt and ravage the land, killing people and destroying the fields. However, the ancient Celts were also very fond of their pigs. The helmets and shields of Celtic warriors were typically decorated with boars, and we should not forget that a pig is at the centre of one of the most important Celtic epic tales, the story of Mac Da Thó’s pig. Likewise, in Germanic mythology, the pig is the animal of the god of fertility, Frey, and the boar Sæhrímnir, which can be eaten again and again, plays a central role as the provider of meat to the dead warriors and the gods of Valhalla.
How did the most important protein source of the Neolithic farmers come to play such different roles in the various cultures of the Eurasian continent? Banned in some cultures, worshipped in others, and in yet others associated with death and the netherworld: apparently, the pig did not leave ancient people indifferent. Part of the answer is found in language.

Like the cow, goat and sheep, the pig is one of the earliest domesticated animals, with domestication dating back to 10,000–11,000 BP in Anatolia and West Asia. Very likely, the first pigs were wild pigs attracted to human settlements by the waste. The early farmers, who must very quickly have understood the value of pigs as a protein source, gradually domesticated them by killing the males and keeping the females for reproduction. In fact, even today, pigs are the most efficient protein source in farming, besides chicken. The great danger associated with hunting wild boars must have contributed to the high value the early farmers placed on pig domestication.
Domestication of pigs spread with the spread of farming, but for some reason (perhaps because pigs cannot be herded, or because they easily fall ill) it did not reach as far as the domestication of cow, goat, and sheep. Pigs were extremely rare in Ancient Egypt, and pig domestication never reached Central Asia. In parts of West Asia and Anatolia, pig domestication was already declining in early antiquity, a decline that later turned into a complete ban through religion, as in Judaism and later also in Islam. In cultures where pig domestication continued (Eastern and Western Europe, and the Mediterranean), the pig received a dual role: it was both an animal associated with death and the underworld, offered in chthonic sacrifices, and an animal symbolizing fertility and prosperity. This is found in Graeco-Roman, Celtic, and Germanic mythology alike.

Can linguistics help us solve this enigma? There are several ways of investigating cultural patterns through language: one can look at the origin of words and trace their etymology back to the proto-language, or one can consider the colexification patterns (meanings that are expressed by the same word in a language) and the patterns of meaning change of words in genetically related languages. The stability and spread of cognates, as well as borrowing tendencies, are important sources of evidence as well.

If we look at linguistic reconstructions, the picture is complex and interesting. Pig words, including a general word for ‘pig’ (generic), which is often the same as ‘sow’, as well as a word for ‘piglet’, can be reconstructed for Proto-Indo-European (PIE *suH- ‘pig’, PIE *porḱo- ‘young pig, piglet’). These lexical roots, which already had the meanings ‘pig’ and ‘piglet’ in the proto-language, indicate that the Indo-Europeans had domesticated pigs. They are continued in the vast majority of pig words in the Indo-European languages. In addition, some sub-branches replaced these forms or added new pig terms. In the Germanic languages, the word for the male pig was derived from a root meaning ‘infertile’ (PGm *galtan- ‘boar’ < *gald(j)a- ‘infertile’ < PIE *ghol-tó-), indicating that male pigs or boars were gelded rather than killed. Several languages created new lexemes by referring to the grunting sound of pigs, such as Lithuanian čiūkà, kūkà ‘pig’ (Balto-Slavic *kyaw-, *kyū- < PIE *kew-, *kū- ‘to howl’) or Old and Modern Irish cráin ‘sow’ (Proto-Celtic *krākni- ‘sow’). Some languages used the widespread Indo-European root for ‘young of animal’ (PIE *wetso- ‘young of animal’ < *wet- ‘year’).
The wild boar has its own root in Proto-Indo-European (PIE *h₁pr-o- ‘(wild) boar’), e.g., Latin aper, but very often this root comes to denote both the wild and the domesticated male pig, as in Croatian vȅpar and German Eber. Several languages use the Proto-Indo-European root *h₂wŕ̥s-en- ‘male’ for the wild boar, such as Sanskrit varāha- and Hindi varāh, bā̆rāh. Otherwise, a combination of a root meaning ‘wild’ and the root PIE *suH- ‘pig’ is very frequent, as in Bulgarian díva svinjá and German Wildschwein. In general, words with the meaning ‘wild boar’ also frequently mean ‘(domesticated) boar’, which indicates that the wild pig was represented by the (male) boar, in contrast to the (female) sow, which represented the domesticated pig.
The Caucasian proto-languages Proto-Kartvelian, Proto-North-West-Caucasian, Proto-Nakh, and Proto-Dagestanian all have reconstructed words with the meaning ‘pig’ (PKv *ɣor- ‘pig’, PNWC *ɣaw- ‘pig, piglet’, PN *eɣ-ə ‘pig’; PD *bol’- ‘pig’, PKv *burw- ‘gilt (female pig, 3-12 months old); suckling pig’, PNWC *bl˜’-ə ‘sow, female pig’, PN *borl’- ‘colourful’). This clearly indicates that the early Caucasians domesticated the pig, which we know they did early on.
The Uralic languages, on the other hand, borrowed their pig words from Indo-European or Iranian (Proto-Finnic *sika ‘pig’, Proto-Finno-Ugric *porśas, *porćas, loan from Indo-Iranian), indicating that the early Uralic tribes did not domesticate the pig themselves: they adopted pig domestication from Indo-European tribes.

However, the patterns of meaning change and colexification of pig words give an interesting picture. First, pig words often shift to denote other animals, typically large and ‘chubby’ ones, such as ‘elephant’, ‘stallion’, or ‘camel’. This is particularly the case in Caucasian languages. Words for the domestic pig occasionally develop negative connotations, such as ‘filthy person’, ‘immoral person’, ‘fat’, or ‘greedy’. However, meaning changes and colexifications in the direction of power and fertility are frequent, such as ‘bull’, ‘hero’, ‘powerful’, ‘king’, ‘manly’, ‘chieftain’ and ‘husband’, in particular with words for the (wild) boar.

It is obvious that ancient people both worshipped and admired their pigs, but language indicates that they respected the wild boars most of all, probably because they were dangerous and hard to hunt. The domestic pigs were highly valued but also, apparently, looked down upon. The dangerous pigs we know from mythology have left little imprint on language.
Semantic network of colexifications (blue, purple) and meaning change in etymologies (red) of the core concepts (green) PIG, WILD BOAR, and PIGLET in 85 Indo-European languages. Graph by Niklas Johansson.
The Takla Makan desert in Western China is in the middle of nowhere. Being there feels more like having landed on a deserted Tatooine than on Earth; most villages are very sparsely populated, and sand rocks, red desert sand, and dried-up salt rivers dominate the surroundings. The climate is horrible: winters are freezing, summers extremely hot and dry; springs and autumns are endurable, but temperatures often differ by 30 °C between day and night. In a village called Subashi I met a villager who had spent 20 years digging a well (by hand, I assume, considering the many years he had spent on the project). The well was obviously very deep, but it contained no water.
Nevertheless, French and German expeditions 100 years ago found the remnants of an Indo-European language in the sand-filled grottoes of this desert. The language, which was wrongly labelled ‘Tocharian’, after an Iranian tribe mentioned by the ancient Greeks, turned out to represent a branch of its own on the large Indo-European tree. In recent years, research has revealed new and interesting knowledge of this mysterious people, how they lived, where they came from, and what their language looked like.
During the first millennium CE, the Tocharian civilization flourished along the Silk Road. By that time, Tocharian had split into two languages, which for the sake of simplicity are labelled Tocharian A and Tocharian B. In important respects, the Tocharian culture was not very different from other early Eastern medieval civilizations: it had a warrior class, a nobility, royals, farmers, and a religious class of monks, who lived on alms given by the working population. The Tocharians were Buddhists and learned to write from Buddhist missionaries from India, and the system they used to write their language was an adaptation of the Indic Brahmi script. Accordingly, most texts, which date from between 300 and 1100 CE, are of Buddhist content. A large part of the literary sources are Tocharian adaptations of the Indian Buddhist canon, although parallels in Sanskrit cannot always be found. After the Islamic conquest of Central Asia and the closing of the Silk Road, the Tocharian kingdoms collapsed, the Tocharian language died out, the area was depopulated, and the desert sand quickly buried all traces of the Tocharian people and their language.
Even though our texts in Tocharian are of a relatively late date, at least compared to the ancient civilizations of the Mediterranean or the Fertile Crescent, archaeology, archaeogenetics and, most of all, language give us rich information about the prehistory of the Tocharians.
It is evident that the Tocharians left the Indo-European homeland very early and migrated towards the East. Even though Tocharian is a centum language and actually has more similarities with western than with eastern Indo-European languages, it clearly forms its own branch on the Indo-European tree. The early separation from the Indo-European proto-language, together with a long period of isolation from other Indo-European languages, has resulted in two languages with highly unusual and complex structures. The languages have many case forms, like Uralic and Caucasian languages, and they have double causatives, like Turkic languages. But even though the Tocharian categories clearly show non-Indo-European impact on the typological structure, the inflectional forms themselves are all of Indo-European descent: the verbal system easily matches Greek or Sanskrit in its complexity and variety of forms. Most forms and categories reconstructed for Indo-European are there, but often in a reorganized structure and with changed use and meaning.
Even though most preserved texts are of Buddhist content, the language and the specific Tocharian version of Buddhism show many traces of a pre-Buddhist, pagan faith, not very different from what we assume was present in early Indo-European society. We find a sun-god and a moon-god, as well as remnants of the so-called heroic myths and the concept of ‘eternal glory’, which is well represented in epic tales such as the Iliad, the Odyssey, or the Mahabharata.
The Tocharians borrowed words from the Turkic Uighur language, from Chinese, and from Sanskrit; from the latter in large amounts: almost half of the Tocharian lexicon has its source in Sanskrit. Uighur also borrowed from Tocharian. If we move further back in time, Tocharian also borrowed a substantial amount of vocabulary, often administrative terms, from Iranian. From about 500 BCE onwards, Tocharian seems to have been mainly a recipient language, which indicates that during this period it was a less important regional language than, for instance, Chinese (in the East) or various Iranian languages (in the West). If we look earlier than that, we find striking language contacts. Early Tocharian forms are found as loanwords in Uralic languages, and very likely a pre-form of Tocharian is responsible for the Indo-European borrowings into Early Chinese. We may therefore assume that the Tocharians had a more important cultural role in the archaic period than in the antique period, when they were mainly on the receiving end of borrowing.

The archaeological record in the Tocharian-speaking area is astonishingly rich: most famous are the well-preserved mummies, which look like Celts with their pointy hats, tattoos and red braids. Studies of their DNA indicate several origins: in the earlier layers mainly Western European haplogroups, in later layers predominantly Central Asian or Eastern haplogroups. The patrilineal DNA is mainly R1a1, a haplogroup associated with the Proto-Indo-European migration out of Eastern Europe.
However, many enigmas still await a solution. One of the most complex issues is the large number of obscure lexemes in Tocharian. Even though the core vocabulary of Tocharian is completely Indo-European, most words of the lexicon (except for the many Sanskrit borrowings, of course) have either no etymology or a very uncertain one. It is possible that the Tocharians borrowed words from a long-lost substrate language, but which language would that be? There are few traces of significantly different cultures preceding the Tocharians in the area. Alternatively, the Tocharians picked up words from several extinct, unrelated languages of Eurasia on their way from Eastern Europe to the Takla Makan desert. Very few reliable etymologies in Tocharian can be traced to any of the living language families of Asia.
