Friday, April 14, 2017

Languages in 2117

Charlie Stross, a Scottish science fiction writer, recently posted some speculative predictions for 2117 that touch rather heavily on the domain of linguistics. Linguists who like science fiction may want to consider commenting over there; he's got some good ideas, but some elements are clearly off. The basic conclusions are:
[B]y 2117, [t]here's [g]oing to be a decline in the number of languages spoken: the main world languages will be down to English, Mandarin, Spanish, and some dialect of Arabic (Arabic is highly fragmented), plus surviving secondary languages with large bodies of adherents (over a hundred million each: for example German, Russian, Japanese).

We're also going to see the widespread deployment of deep learning driven machine translation and, most importantly, near-real-time interpretation. There'll be less reason for a native speaker of an apex language to learn other tongues [...]

And the apex languages will have changed considerably [...]

I suspect that over the next century (assuming we don't lose our technological infrastructure) current mechanisms for writing will be supplanted by newer ones--e.g. the replacement of discrete mechanical keys on keyboards with multitouch keyboards and then with gestural/swipe interfaces, where each dictionary word is replaced by a directional ideogram swiped across a QWERTY keymap, until eventually the ideogram replaces the alphabetic word or is auto-replaced by a corresponding emoji.

So: gradual obsolescence of some grammatical forms, appearance of entire new writing systems, unforeseen changes due to the vagaries of machine translation, assimilation of loan words from other cultures, and the 2117 equivalent of "don't drone me, bro" (new shorthand to describe stuff that has become the new normal).

What am I overlooking?

My immediate thoughts would be:
  • Actually, a lot of languages with fewer than 100 million speakers each will still be around 100 years from now. Even if the Netherlands decided overnight to stop teaching, broadcasting, or providing government services in Dutch - and it won't, quite the opposite - it would take more than 100 years for the language to die out. If anything, the fragmentation of mass media into social media already makes it easier to maintain small languages, and to the extent that e-learning becomes a thing, it will have similar effects. On the other hand, only a handful of Native North American or Australian Aboriginal languages seem likely to make it as far as 2117: right now most of them are already down to elderly speakers only, because of grossly coercive educational policies inflicted on their speakers decades earlier, and revitalization efforts are not likely to succeed without a really drastic rethinking of the school system. Chinese educational policy has become significantly less tolerant of minority languages over the past few years, and if that trend continues, I suspect many currently viable languages of China are likely to be in a similar situation by 2117: not yet extinct, but reduced to the point that they seem doomed. More broadly, what to predict about language survival worldwide 100 years from now depends fundamentally on two factors: how compulsory education changes, and how much of the population ends up in big cities. The former, at least, is more than anything else a matter of political decisions.
  • Adequate machine translation does seem likely - not good enough for contexts where precision counts, but easily sufficient for casual conversation or listening to speeches. I wouldn't expect this to have any really major effects on languages, but it might allow literal translations of new idiomatic expressions to spread faster between languages.
  • Emoji are basically discourse markers: they won't become ideograms, they'll become punctuation. If they really catch on, our descendants may be as puzzled by how we get by with just half a dozen punctuation marks as we are by how people used to read with no punctuation at all.
Finally, a line that's calculated to get a lot of linguists up in arms: "[L]anguages are vanishing, and to the extent that we can only reason about things we have words for, this may be a subtle but far-reaching loss." Obviously we can reason about things we don't have words for, and equally obviously not having words for them makes it more cumbersome to talk about them. But more to the point, even where languages are in rude health, words for certain things are vanishing at a rapid pace in them. Algerian Arabic isn't going anywhere, but the vocabulary it used to have for wild plants, for traditional farming technologies, for family relationships that are only relevant in a three-generation household? I don't even think most people my age know them, much less their grandchildren in 2117. Large written languages with sufficiently developed institutions can maintain such vocabulary precariously at the margins by having specialists use it - botanists, agricultural experts, historians, etc. Most languages can't.

9 comments:

PhoeniX said...

Interesting stuff. Only thing I found funny is that he mentions:

We're also going to see the widespread deployment of deep learning driven machine translation and, most importantly, near-real-time interpretation. There'll be less reason for a native speaker of an apex language to learn other tongues [...]

And at the same time he predicts massive loss of languages with under 100 million speakers. I think it's obvious that any language with a large enough written corpus for deep learning to be effective (to which my native Dutch obviously belongs) will perhaps even be strengthened by these developments, as it becomes less and less important to learn to speak one of the 'big five' fluently to be an effective user of the internet, or even to take part in public life in foreign countries.

So yeah, I doubt many European national languages are going to become obsolete anytime soon. Chances might even increase that it becomes less important for its speakers to become bilingual.

Jim said...

"But more to the point, even where languages are in rude health, words for certain things are vanishing at a rapid pace in them. Algerian Arabic isn't going anywhere, but the vocabulary it used to have for wild plants, for traditional farming technologies, for family relationships that are only relevant in a three-generation household? I don't even think most people my age know them, much less their grandchildren in 2117. "

Over at Languagehat in some other connection someone said that their Tlingit in-laws had only about one coat of paint on English but they knew every bit of salmon terminology in English, most of which no L1 speaker knew.

David Marjanović said...

Predictions are very difficult, especially about the future.

how we get by with just half a dozen punctuation marks

We "get by" in the sense that we quietly suffer the consequences as if we somehow had to. It's really quite cumbersome. Even just introducing the Chinese inverted comma for lists would make a lot of sentences a lot more legible.

Lameen Souag الأمين سواق said...

Phoenix: That's a good point, but it becomes even weightier when you take into account one of this author's favourite themes: universal surveillance, and how easy and inevitable it becomes when everyone is carrying around mobile phones all the time. By 2117, unless a major resource crunch forces technological regression, chances are that phone companies (or at least security agencies) will have at their disposal massive corpora of every language that's still spoken at all. Annotating them may be harder, but by that time, I would bet that they can feed a print dictionary and a basic morphology into an automatic annotator and get back something good enough to at least massively speed up the process.

Jim: Nice example; that's the difference between language shift and culture shift.

David: Perhaps people working on intonation should take up the cause. Imagine a scientifically justified inventory of punctuation we need to be able to do in writing what we do in speech.

David Marjanović said...

^ That would be fascinating!

Jim said...

Geoff Pullum's take on machine translation: http://languagelog.ldc.upenn.edu/nll/?p=32233

"David: Perhaps people working on intonation should take up the cause."

Considering how crucial intonation is in English, basically a stand-in for final particles/discourse markers, you would expect that some more adequate system would have evolved to represent it.

David Marjanović said...

Many pages on europa.eu are machine-translated by something other than Google. You can still tell, but the translations are much better than I expected.

Anonymous said...

Definitely off-topic, but do you happen to know the etymology of "naharda" in Egyptian Arabic? I've been searching but cannot find it.

Lameen Souag الأمين سواق said...

Simple: nahar is day, da is this.