Genes, language, and how we study human pre-history


The Wall Street Journal has a story out today about a study published in the journal Science that claims all modern languages evolved out of the same proto-language—the linguistic version of Out-of-Africa. Meanwhile, the BBC is reporting on a paper published in Nature which suggests features that are shared between languages actually evolved independently, rather than being concepts coded into our brains by biology.

I'm not sure whether these two sets of results can be easily compared to one another. The studies were aimed at answering very different questions, so you can't just line one up against the other. Depending on your point of view, these results may be contradictory ... but that's not necessarily the case. What I do think is interesting about these two studies is the fact that both are based on research methodologies and theories that were born in the fields of evolutionary biology and genetic anthropology. For instance, the Wall Street Journal article says:

His research is based on phonemes, distinct units of sound such as vowels, consonants and tones, and an idea borrowed from population genetics known as "the founder effect." That principle holds that when a very small number of individuals break off from a larger population, there is a gradual loss of genetic variation and complexity in the breakaway group. Dr. Atkinson figured that if a similar founder effect could be discerned in phonemes, it would support the idea that modern verbal communication originated on that continent and only then expanded elsewhere.

And in the BBC story:

Lead author Michael Dunn, an evolutionary linguist at the Max Planck Institute for Psycholinguistics in the Netherlands, said the approach is akin to the study of pea plants by Gregor Mendel, which ultimately led to the idea of heritability of traits. Modern phylogenetics studies look at variations in animals that are known to be related, and from those can work out when specific structures evolved. For their [linguistic] studies, the team studied the characteristics of word order in four language families: Indo-European, Uto-Aztec, Bantu and Austronesian.

I'm curious how widespread this interdisciplinary approach is within linguistics, and whether most linguists think it's a reasonable way to study language evolution. I would think, at the very least, that you have to make some adjustments. After all, as the Wall Street Journal article points out, the forces that shape biological evolution work differently from those that shape cultural evolution.

Dr. Atkinson's approach has its limits. Genes change slowly, over many generations, while the diversity of phonemes amid a population group can change rapidly as language evolves. While distance from Africa can explain as much as 85% of the genetic diversity of populations, a similar distance measurement can explain only 19% of the variation in phonemic diversity. Dr. Atkinson said the measure is still statistically significant.

Wall Street Journal: The Mother of All Languages
BBC: Language Universality Idea Tested with Biology Method

Image: Some rights reserved by bruce-asher


  1. Language “genealogists” have been doing this since at least the 16th century (typically Jesuits running about trying to gain global influence by first learning the language – which was damn smart). So now we push it back one more step from Indo-European and nouns to Africa and phonemes. (Next up: some Bonobo and vowel sounds)

    1. If ever I wished for a “Like” button on BoingBoing, it’s for comments like this. Brilliant finish. I really should have gone into Linguistics in University. It looked like a more satisfying outlet of a person’s Creative Writing impulses. Going back to the Cro-Magnon/Neanderthal split, what would be their equivalent of “The Great Vowel Shift”?

    1. Yup, I was looking at that photo of the lips on the fingers, too; whoever that is really does need to get at least a nail brush … yuck, indeed.

  2. I’m not entirely sure the approach has as much merit as those researchers would like, especially the one from Science, which looks only at phonemes. Without looking comparatively at grammar and at the evolution of phonemes within each individual language (of which no mention is made), I’d say the results are far from solid. Off the top of my head I can think of a dozen languages that have discarded or added phonemes within the last thousand years, usually without clear evidence of outside pressure.

    It seems to me they already had a paradigm in mind (the Out of Africa theory) when they did the study, and that it influenced their methodology. I really don’t see how a study of phonemes alone could get them to a universal language hypothesis. I am curious whether there is a greater connection here, but I want to see some historical data on known changes within each language before I draw a conclusion.

    1. Language change wouldn’t necessarily mimic human development. Language is quirky that way – look at Icelandic, for example. It shows a great number of earlier features than modern Norwegian or Swedish, but is a product of movement FROM Norway. Meaning that a more conservative form of the language is located far away from the homeland area.

      So, with the Africa theory, what makes anyone think that the conservative ‘proto-language’ people would be anywhere near Africa these days?

      Until they can construct a typology that adequately covers families like Caucasian, Algonquian, and the various Australian language families, they really have nothing interesting to say to me.

      1. Your Icelandic observation is a perfect example of what the Science article claims: Norwegian has changed more in the time since Icelandic pioneers left the mainland than the Icelandic language has — that is, modern Icelandic is closer to the Old Norse that the original Icelandic colonizers from Norway brought with them than modern Norwegian is. In other words, a mother tongue tends to change more than a colonial language.

        In the same way, the Science study found that African languages in the Bantu family have the most complex phonemic systems, suggesting that they have changed the most. Since the other languages in the study are less complicated phonemically, they’re probably the “colonial” languages to the African “mother tongue.”

        Your other claim about the dubiousness of studying phonemes and word order since they don’t accurately describe all world languages is irrelevant. The authors chose these language families because they’re large and well studied; they looked at phonemes and word order because those are standard methods in linguistics for comparatively studying languages.

        From the Nature article:

        “We selected four large language families for which quantitative phylogenies are available: Austronesian (with about 1,268 languages and a time depth of about 5,200 years), Indo-European (about 449 languages, time depth of about 8,700 years), Bantu (about 668 or 522 for Narrow Bantu, time depth about 4,000 years) and Uto-Aztecan (about 61 languages, time-depth about 5,000 years). Between them these language families encompass well over a third of the world’s approximately 7,000 languages. We focused our analyses on the ‘word-order universals’ because these are the most frequently cited exemplary candidates for strongly correlated linguistic features, with plausible motivations for interdependencies rooted in prominent formal and functional theories of grammar.”

    1. Yes, but that would have led to a completely weird collection of comments. But then it could get interesting…

      My thoughts then turn to the film, The In-Laws. The original, with Peter Falk, not the remake.

  3. I have to say, the variables they always track in these studies are really dubious. ‘Phonemes’ and ‘Word order’ look like pretty silly choices to someone like me who works with a language family that has no clear evidence for either. Word order does not indicate grammatical role (i.e. ‘subject’ ‘object’) in this language family (this is the core assumption of ‘word order’ studies, which track subject and object with respect to the verb), and there are no minimal pairs (like ‘bat’ and ‘pat’), which means you can’t pick out phonemes in any straightforward way.

    Even in languages that use word order like they expect, there are still quite a few decisions that have to be made before you can categorize a language as ‘SVO’ or ‘SOV’. Turkish, for example, has plenty of evidence for SVO and even OVS in natural speech, but the syntacticians posit ‘SOV’ as the ‘basic’ order and then derive the other ones by movement. Meaning that ‘SOV’ is a theoretical construct, not some ‘basic’ thing that you can easily test across languages.

    Of course, my language family is never used for these studies. ;) They always pick ones that conform more easily to the test. Bantu is pretty easy for phoneme-type treatments, for example. Indo-European behaves well for word-order. What a coincidence!

  4. Thanks, I love this type of thing. Did you catch Oliver Sacks’s “Man of Letters” article in a recent New Yorker?

    Basically he talks about how the area of our brain that is good at language (spoken and written) evolved EONS before the invention of anything close to language.

    Sooo.. what did it evolve to accomplish? Maybe memorizing and recognizing which plants are safe to eat.. that sort of thing.

    Here is the link, although you need a subscription for the whole text:

  5. Everyone would like more data, but there is one awful problem: we can’t go back in time. It is very possible that at one point all the world’s languages were one and the same; we can’t altogether dismiss it. But unless the people who spoke this language thirty thousand years ago wrote it down, we will never know for sure.
    Until then, such hypotheses will wow the masses while hundreds of modern languages are dying peacefully and you won’t find their study published in Science magazine.

  6. Fascinating. I’m not familiar with the subject matter but two things stick out to me. One, at some point in history we absolutely had a ‘common language’, silence, or rather the common absence of language. However, as we see with animals today, this does not mean we didn’t communicate. The second thing is, based on our shared physical make up, mouth, larynx, etc, we all have the same set of sounds we are capable of making. Does this somehow represent a ‘proto language’ of a set of sounds we are all capable of making?

    1. >One, at some point in history we absolutely had a ‘common language’, silence, or rather the common absence of language.

      I doubt there was a point when humans were silent. I mean, the first thing a baby does is yell, and even if that’s not for communicative purposes, that doesn’t mean it doesn’t signify anything. Gestures count as language as well; hitting someone in the face is very meaningful, probably much more so than a hand gesture. Animals have languages also, so what would make us so different?

      >The second thing is, based on our shared physical make up, mouth, larynx, etc, we all have the same set of sounds we are capable of making. Does this somehow represent a ‘proto language’ of a set of sounds we are all capable of making?

      No, because cultural practice barges in. We do not all have the same vocal capabilities, and we do not all use these vocal capabilities the same way. Communication goes through signs, which may or may not be arbitrary depending on who you ask, but two people will never interpret the same sign exactly the same way. We all have different histories, different brains.

  7. Indo-European is old by our standards, but written languages – including classical Greek which was known when Indo-European was discovered, and Hittite which has been discovered since then – cover a substantial fraction of that history.

    A hypothetical common proto-language would be at least an order of magnitude older. It’s not the same kind of research at all. It’s intriguing, but it’s going to take other studies to give it any kind of context.

  8. Genes change slowly, over many generations, while the diversity of phonemes amid a population group can change rapidly as language evolves.

    Genes and memes both change on the same schedule: with each reproduction. While the amount of change per reproduction can vary wildly, that’s true for both genes and memes. The important distinction is that memes reproduce much more frequently than genes.

    I’d put forward the argument that genes and memes aren’t really all that different, and that genes are, in fact, a special case of meme, like phonemes are a special case of meme. In both cases, we’re talking about a unit of information that reproduces.

    1. Digital and analog information are both information, but the difference is still important.

  9. For my money, the phoneme-loss-of-diversity theory, pointing towards one language originally in Africa, makes more sense than the word-order theory, pointing towards parallel language development.

    This is because word order is far less important to meaning than phonemes. Consider these sentences:

    I got milk.

    Got milk I.

    Milk I got.

    Got I milk.

    They all say the same thing, don’t they? Now change the words.

    E fet Pogt

    Pogt fet E.

    …doesn’t make a lick of sense, does it?

    Order is less important than phonemes, because phonemes contain the meaning and our brains derive the meaning from them.

  10. Suggested reading: linguist Prof. Richard Sproat’s comments/critique of the article: “_Science_ does it again” –

    “Atkinson’s thesis is striking, but as I said above such striking conclusions require striking support, and I believe that the paper in its current form does not provide enough support.”

  11. Actually, philologists were doing these kinds of “genetic” analyses *before* biologists. The metaphor transfer was originally in the opposite direction.

    Whether it’s more appropriate to language or biology is another question. Language is more Lamarckian (acquired traits can be passed on) and multilingualism presents some interesting challenges to genetic metaphors. However, biologists are now finding out that some microbes are able to swap genes without hereditary ties. That makes those genes more like language, something that you can acquire from the environment.

  12. “I’m curious how widespread this interdisciplinary approach is within linguistics, and whether most linguists think it’s a reasonable way to study language evolution.”

    Answer: no.
