The proto-language hypotheses

According to etymology's state of the art, words may be classified into four categories. First, words whose etymology is explicitly unknown or missing. Second, words for which a couple of folk or pseudo-scientific speculations have been made, frequently by anonymous thinkers; such etymologies, amusing though they may be, are otherwise useless. Third, words for which a list of cognates from existing and/or extinct languages has been prepared. Fourth, words for which a hypothetical root in some unattested proto-language has been worked out (reconstructed) by comparative linguistics. The most valuable etymologies are those of the third category because they waive the need to search for cognates. Those etymologies also provide a range of meanings within which our query word may fall and lead to an approximate understanding, provided that we understand the meaning of the cognates. The fourth category does not add any value. On the contrary, it may discourage further research on the false premise that the etymology has already been worked out.

Apart from the interest that one may find in phonetic research, a hypothetical root in a proto-language does not improve our understanding of the word or of the social and historical circumstances in which the word was created. In the best case, projecting cognates onto a proto-language only adds one more hypothetical cognate, the hypothetical root. Still, it does not guarantee that this root was the first, de novo form of a word. The question of origin remains. For example, why would the Proto-Indo-Europeans (PIEs) call water *wod-or, from the root *wed- (wet)? Why not the other way round (*wed- from *wodor)? Why were there two words for wet and water? Where did *wed- come from? If it was created from scratch, why did the PIEs choose those three sounds, in that combination, to express the notion of water? And why did other populations choose a different combination? Why did Proto-Semitic populations choose to call water *maʔ-/*may-, or something like that? Is there any relation between *wed- and *may-? Suppose both descend from a proto-proto-language, and we were to reconstruct the hypothetical common ancestor of the proto-roots. In that case, we would logically end up with a combination of fewer than three phonemes. Would that be an acceptable solution? If not, we would be forced to admit that words, if not languages, arise spontaneously in different parts of the world and have no common ancestor. But such a conclusion would defeat the purpose of searching for common roots and for proto-languages. Phonetics may explain the evolution of words once they are created, but it does not explain the origin of the words.

My main objection concerns the implicit, and sometimes explicit, assumption behind the proto-language hypothesis and the associated comparative reconstruction methods: that language is a genetic trait. The theory appeared at roughly the same time as Darwin's evolutionary theory and developed in parallel with genetic theory. There is, however, a fundamental difference between language and genetic traits. The latter are inherited vertically from parents to children and cannot be transmitted horizontally, say from teacher to pupils (except for some viruses). Language is not a genetic trait. We are born with the ability to hear and reproduce a range of sounds but without any words. Language is transmitted horizontally only. This is an absolute principle without exceptions. Children do not inherit a language from their parents in the biological sense; they learn it. Thus, language is an acquired trait, like every other bit of knowledge and skill. Even when the first teachers are the parents, the transmission is still horizontal. Obviously, the laws of inheritance and genetic segregation do not apply in linguistics. As a unit of language, a word can be created de novo by anybody, anywhere, like a piece of pottery. If a word is sufficiently fit and given some chance, it can spread throughout the globe within a single generation, like a virus. Take the word Google or any modern scientific term as typical examples. With the assumption that horizontal transmission is the only mechanism of language propagation, which sounds obvious to me, we may expect that languages evolve independently of genetics. They can cross genetic barriers between populations, particularly between populations that are geographically close and constantly interacting. Genetics would be a reasonably reliable predictor only in culturally isolated populations.

One implication of the horizontal transfer principle is that language does not reflect a population's genetic composition and origin. If, for example, Modern Greek consists of x% Ancient Greek words, y% Turkish words, and z% Germanic words, one cannot expect to find similar proportions of DNA polymorphisms of corresponding origins in the Greek population. The figures would instead reflect the relative strength and duration of the Greeks' social interactions with the corresponding nations over the course of history. The Greek language has had an impact on humanity vastly disproportionate to that of Greek genotypes.

Another implication concerns the estimates of how far in the past languages split. Rather than measuring the time since divergence, these distances would represent the intensity of interaction between neighboring populations, or the lack of it. This is not to claim that a proto-language hypothesis is useless. Language does look like a genetic trait overall because it is transmitted from parents to children most of the time. However, horizontal transfer between interacting populations may strongly bias the search for the roots of individual words. 'Genetic' roots may only be assumed once horizontal transfer has been excluded. In other words, comparative methods should not be applied to a particular group of languages in isolation but should include all the potentially interacting languages simultaneously.

The ichnography theory waives the assumptions about the transmission mechanism of the words. The same author may create different words for the same thing and may arbitrarily attach different meanings to pre-existing words. Greek words may look and sound Semitic simply because, at the origin, Greeks used a Semitic alphabet. At the end of the day, this text is less about the authors' identity and more about the successful transition of words. Most important is what the words tell us about their authors and the societies in which they lived.

Poetics - Epilogue

18 January 2022
