The bit rate of human languages
Language scientists think they have determined that the universal bit rate for transmitting information across multiple languages is about 39 bits per second.
Scientists started with written texts from 17 languages, including English, Italian, Japanese, and Vietnamese. They calculated the information density of each language in bits—the same unit that describes how quickly your cellphone, laptop, or computer modem transmits information. They found that Japanese, which has only 643 syllables, had an information density of about 5 bits per syllable, whereas English, with its 6949 syllables, had a density of just over 7 bits per syllable. Vietnamese, with its complex system of six tones (each of which can further differentiate a syllable), topped the charts at 8 bits per syllable.
Next, the researchers spent 3 years recruiting and recording 10 speakers—five men and five women—from 14 of their 17 languages. (They used previous recordings for the other three languages.) Each participant read aloud 15 identical passages that had been translated into their mother tongue. After noting how long the speakers took to get through their readings, the researchers calculated an average speech rate per language, measured in syllables/second.
Some languages were clearly faster than others: no surprise there. But when the researchers took their final step—multiplying this rate by the bit rate to find out how much information moved per second—they were shocked by the consistency of their results. No matter how fast or slow, how simple or complex, each language gravitated toward an average rate of 39.15 bits per second, they report today in Science Advances. In comparison, the world’s first computer modem (which came out in 1959) had a transfer rate of 110 bits per second, and the average home internet connection today has a transfer rate of 100 megabits per second (or 100 million bits). [emphasis mine]
When I went to Russia the first time in 1995 I used to joke with my caving friends there that the real reason the U.S. won the Cold War was that English words routinely used one syllable for the three required by Russian, and thus we could get things done in one third the time. I would start listing comparable words (for example “good” vs “khah-rah-shoh” and “please” vs “pa-ZHAL-sta”) and challenge them to come up with any example where the Russian word had fewer syllables. It drove them crazy because they couldn’t do it.
I was joking, of course. It makes sense that the information rates should actually be pretty much the same, as this study suggests. However, the highlighted words suggest that the subtle differences should not be ignored either.
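The multiplication described in the quoted article (syllables per second times bits per syllable) can be sketched in a few lines of Python. This is only an illustration, not the researchers' code. The densities are the rounded figures quoted above, with 7.0 standing in for English's "just over 7", and the function names are made up for this example. Since the article does not give the measured speech rates, the sketch works backward and asks what rate each language would need to land exactly on the 39.15 bits per second average.

```python
# Bits per syllable as quoted in the article (rounded figures).
# English is given only as "just over 7"; 7.0 is an assumed stand-in.
BITS_PER_SYLLABLE = {"Japanese": 5.0, "English": 7.0, "Vietnamese": 8.0}

AVERAGE_RATE = 39.15  # bits per second, the cross-language average reported


def information_rate(syllables_per_second, bits_per_syllable):
    """Information rate in bits per second: speech rate times density."""
    return syllables_per_second * bits_per_syllable


def implied_speech_rate(bits_per_syllable, target_rate=AVERAGE_RATE):
    """Syllables per second needed to reach target_rate at this density."""
    return target_rate / bits_per_syllable


if __name__ == "__main__":
    for language, density in BITS_PER_SYLLABLE.items():
        rate = implied_speech_rate(density)
        print(f"{language}: {rate:.2f} syllables/second")
```

Under those assumptions a Japanese speaker would need roughly 7.8 syllables per second and a Vietnamese speaker roughly 4.9 to move the same amount of information, which is the trade-off the study describes: the less each syllable carries, the faster the language is spoken.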
The support of my readers through the years has given me the freedom and ability to analyze objectively the ongoing renaissance in space, as well as the cultural changes -- for good or ill -- that are happening across America. Four years ago, just before the 2020 election, I wrote that Joe Biden's mental health was suspect. Only this year has the propaganda mainstream media decided to recognize that basic fact.
Fourteen years ago I wrote that SLS and Orion were bad ideas, a waste of money, would be years behind schedule, and better replaced by commercial private enterprise. Even today NASA and Congress refuse to recognize this reality.
In 2020 when the world panicked over COVID I wrote that the panic was unnecessary, that the virus was apparently simply a variation of the flu, that masks were not simply pointless but if worn incorrectly were a health threat, that the lockdowns were a disaster and did nothing to stop the spread of COVID. Only in the past year have some of our so-called experts in the health field begun to recognize these facts.
Your help allows me to do this kind of intelligent analysis. I take no advertising or sponsors, so my reporting isn't influenced by donations by established space or drug companies. Instead, I rely entirely on donations and subscriptions from my readers, which gives me the freedom to write what I think, unencumbered by outside influences.
Please consider supporting my work here at Behind the Black.
You can support me either by giving a one-time contribution or a regular subscription. There are five ways of doing so:
1. Zelle: This is the only internet method that charges no fees. All you have to do is use the Zelle link at your internet bank and give my name and email address (zimmerman at nasw dot org). What you donate is what I get.
2. Patreon: Go to my website there and pick one of five monthly subscription amounts, or make a one-time donation.
3. A Paypal Donation:
5. Donate by check, payable to Robert Zimmerman and mailed to
Behind The Black
c/o Robert Zimmerman
P.O.Box 1262
Cortaro, AZ 85652
You can also support me by buying one of my books, as noted in the boxes interspersed throughout the webpage or shown in the menu above. And if you buy the books through the ebookit links, I get a larger cut and I get it sooner.
This part
Japanese, which has only 643 syllables
confuses me. There are roughly 50 syllables in Japanese, each represented by one hiragana character (used if the word is of Japanese origin) and one katakana character (used if the word is of foreign origin). They must mean something different than what is conventionally considered a syllable.
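One way to put the syllable counts in perspective: an inventory of N distinct syllables can carry at most log2(N) bits per syllable, and it reaches that ceiling only if every syllable is equally likely. The sketch below computes that ceiling for the two inventory sizes quoted in the article and for the roughly 50 suggested in this comment. It is a back-of-the-envelope check, not the study's method (the article does not say how the densities were calculated), and the function name is invented for the example.

```python
import math


def max_bits_per_syllable(inventory_size):
    """Upper bound on bits per syllable for a given number of distinct syllables.

    The bound is log2(N) and is reached only when all N syllables are
    used equally often; real usage is uneven, so real densities are lower.
    """
    return math.log2(inventory_size)


if __name__ == "__main__":
    inventories = [
        ("Japanese, article's count", 643),
        ("English, article's count", 6949),
        ("Japanese, commenter's count", 50),
    ]
    for label, size in inventories:
        print(f"{label} ({size}): at most {max_bits_per_syllable(size):.2f} bits/syllable")
```

The ceilings come out to about 9.3 bits for 643 syllables and about 12.8 bits for 6949, both well above the 5 and 7 bits the article reports, as expected when some syllables are far more common than others. This does not settle which definition of "syllable" the researchers used.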
Imagine a new language that got concepts (e.g. words) across using the fewest syllables. By increasing the number of bits per syllable and reducing the number of syllables per word, more information could be transmitted in the shortest period of time.
It’s probably an impractical concept because learning a new language is not easy and a new language has few others that speak it to make it worth learning.
I seem to recall — I don’t speak the language actually — that Russian leaves out the definite (“the”) and indefinite (“a”, “an”) articles which are fixtures in most English speech. Presumably context supplies non-speech equivalences. “I’ll have big red apple” works as well as “I’ll have that big red apple.” So that would increase the efficiency of Russian.
mike shupp: You are correct. Russian does not have articles like “a,” “an,” and “the.” However, there is a big difference between asking for “an apple” or “the apple.” Lacking the article means you have to use a lot more words to refer to that specific apple. Thus, the efficiency you refer to doesn’t really help much.
I often wonder if the reason for the stereotype of Russian bluntness comes from this lack. When they learn other languages, they are not used to using articles, which means it is not rare for them to routinely leave them out. And to an English ear, speaking without articles always appears blunt and brutish.
Russian also lacks some forms of “to be”, especially in the present tense, so Hamlet’s quandary, “to be, or not to be?”, became an existential question: to exist or not to exist. The “Hamlet Question” became an ethical question among minor Russian nobility stuck in the army, pondering whether one had an obligation to commit suicide. Russia is certainly a fascinating culture.
Semi-related, I read Clockwork Orange a few years ago, after seeing the movie when it first came out. The thug gangs have a lot of slang that is nearly impenetrable, unless you’ve studied a bit of Russian and realize that virtually all the slang is simple substitution of Russian words for English. One fascinating exception is a bilingual rhyming slang, transforming “khorosho” (good) into “horror show” (also meaning good).
If the bit rate is similar for most or all languages, it seems to me that there must be an optimal rate for comprehension by the listener. Has anyone else heard Ben Shapiro speak? How about those rapid-fire disclaimers on radio?
Previously noted: best form on the ‘Net.
And the thread earned a Diane E Wilson comment.
Edward —
Nice points. Thinking about it, I’d suggest that an awful lot of rapid-fire speech isn’t intended for effectively conveying data. The disclaimers that come at the end of pharmaceutical commercials is mostly boiler plate; people skim over this unless they catch a word or phrase that reverberates with them. A lot of political speech or similar commentary is designed to get the listener’s acquiescence rather than communicate details — if I’m trying to persuade an audience that Germany must expand to the East, the last thing I want is some pedant in the back arguing that my interpretation of medieval Hungarian land ownership patterns is incorrect.
Thinking further, I’ve been watching a lot of television recently, after years of generally ignoring it, and one of the changes I notice is that a lot of newscasters speak really quickly these days.
mike shupp wrote: “The disclaimers that come at the end of pharmaceutical commercials is mostly boiler plate; people skim over this unless they catch a word or phrase that reverberates with them.”
This reminds me of the time I was listening to the radio (but not very carefully to the ad that was playing), and when the disclaimer came, I heard the phrase, “including death.” That got my attention, but I wasn’t quite sure what the drug’s name was, and I never heard the ad again. Since then, I have been avoiding all medications that begin with or contain the syllables “vita-” because that is what I (mis)remember hearing from the ad.
“the last thing I want is some pedant in the back arguing that my interpretation of medieval Hungarian land ownership patterns is incorrect.”
Wait, mike shupp, you think that your interpretation of medieval Hungarian land ownership patterns is not incorrect? Or did I miss your point, and I should have read that so fast that I didn’t catch that part to get all pedantic about it?
“I’ve been watching a lot of television recently, after years of generally ignoring it”
I’ve been generally ignoring television recently, after years of watching a lot of it. Could it be that the newscasters speak really quickly these days because they are just excited about the prospects of overthrowing, er, de-electing Trump (e.g. impeachment, Twenty-fifth Amendment, Mueller’s report, Strzok’s insurance policy, Comey’s coup, whatever-this-week’s-scandal-is, etc.)?