Showing posts with label translation. Show all posts

Tuesday, December 26, 2023

"The Sound of Music" across three languages

You may well be familiar with The Sound of Music, an American musical from the 1950s loosely based on the von Trapp family's memoirs. It features a neat little song for teaching musical notes, "Do, a Deer", which has been translated into a number of languages. Let's contrast three versions - English, Japanese, and Arabic - and see what they suggest.

English / Japanese / Arabic

Do, a deer, a female deer,
ドはドーナツのド
Do is for "donut" (dōnatsu),
دو دروب ومعاني
Do is "paths" (durūb) and meanings,

Re, a drop of golden sun;
レはレモンのレ
Re is for "lemon" (remon);
ري ربيع الأغنيات
Re is a "spring" (rabīʕ) of songs;

Mi, a name I call myself,
ミはみんなのミ
Mi is for "everyone" (minna);
مي مـوسيقى وأغاني
Mi, "music" (mūsīqā) and songs;

Fa, a long long way to run;
ファはファイトのファ
Fa is for "fight" (faito);
فا فـجر الذكريات
Fa, a "dawn" (fajr) of memories;

So, a needle pulling thread;
ソは青い空
So is blue "sky" (sora);
صوتنا ملء الفضاء
Our "sound" (ṣawt) is a filling up of space;

La, a note to follow So;
ラはラッパのラ
Ra is for "trumpet" (rappa);
لم يزل فينا الوفاء
In us is "still" (lam tazal) loyalty;

Ti, a drink with jam and bread;
シは幸せよ
Si is "happiness" (shiawase)
سوف تبقى يا غناء
You, O song, "shall" (sawfa) remain;

That will bring us back to Do!
さぁ歌いましょう
So let us sing!
لنغنّي نغنّي.. لحن الحياة
Let us sing, sing... the tune of life!

As should be obvious, the Arabic version is derived from the Japanese one (via a popular anime of the 1990s) rather than directly from the English one. However, it contrasts sharply with both in the choice of note-mnemonics. In English, each note name (well, except "la") is mapped directly to a near-homophonous monosyllabic word, taking advantage of English's relatively short minimal word length; most of these are widely familiar, high-frequency items. In Japanese, the word choices are necessarily longer and perhaps more obscure (the syllable fa is found only in relatively recent loanwords anyway), but in each case the note is mapped perfectly to the first syllable of a single word, usually referring to something readily visualisable. In Arabic, the note is again mapped (increasingly approximately) to the first syllable, not of a word, but of a 2-4 word phrase; not a single one of these phrases refers to anything concrete enough to visualise. High-flown slogans replace the original's homely whimsy.

I have no way of proving it, but I believe this is symptomatic - certainly of the Arabic dubbing in the cartoons I used to watch in the early 1990s, and plausibly of Modern Standard Arabic discourse in general: an imagination based on recitation rather than visualization, preferring stirring abstractions to concrete details. After all, concrete details travel poorly in this diglossic context.

Thursday, December 08, 2016

How Tunisia ruined its PISA performance

PISA 2015 is an OECD-run survey intended to evaluate education systems worldwide by giving the same test to (almost) all students of the same grade across a large number of countries and comparing the results. This year's results have gotten a lot of coverage, notably for the dismal performance of all the Arabic-speaking countries participating. The UAE did least badly in terms of combined scores, managing 48th place out of 70; it was trailed by Qatar (59th), Jordan (61st), Lebanon (65th), Tunisia (66th), and, most ignominiously, Algeria at 69th place, barely beating the Dominican Republic.

Laudably, PISA have made their science tests publicly available online in many languages, including four Arabic versions labelled Israel, Qatar, Tunisia, and the UAE - don't ask me what happened to Algeria, Jordan, and Lebanon. Browsing through these, one immediately notices that the Tunisian translation (unlike the Gulf ones) has a remarkable number of grammatical errors, typos, and phrasings so awkward as to be barely comprehensible. For instance:

  • Bird Migration 1: "يستعملون العدّ الذي يقوم به المتطوّعين" - wrong case: should be المتطوّعون
  • Bird Migration 1: extremely awkward phrasing: "هجرة الطيور هي حركة موسمية كبيرة، يتنقل أثناءها الطيور نحو أماكن تكاثرها أو هي تعود منها." ("Bird migration is a great seasonal movement, during which birds move to the places of their reproduction or they come back from them.") Contrast the clearer phrasing in the Qatar version: "هجرة الطيور الموسمية هي انتقال واسع النطاق للطيور من وإلى مناطق تكاثرها. وفي كل عام يتولى متطوعون إحصاء عدد الطيور المهاجرة في مواقع محددة."
  • Bird Migration 3: the bird's name is "الزقزوق الذهبي" in the text, but in the question it turns into "الزقزاق الذهبي".
  • Running in Hot Weather 1: Garden path title: anyone looking at "العدو في الطقس الحار" is going to read it as "the enemy in hot weather", at least until the context is established. Contrast the Qatari translation "الجري في الجو الحار", using a better known, graphically unambiguous term for "running".
  • Running in Hot Weather 1: Grammatical error in "يدل على ذلك {كمية العرق | ضياع الماء | درجة حرارة الجسم} العداء بعد ساعة من السباق": for the sentence to make sense (even in dialectal Arabic!), none of the alternatives should contain the definite article, since they form part of an idafa genitive. Contrast the Qatari version, which avoids the problem by putting "للعداء".
  • Running in Hot Weather 2: Garden path sentence: "شرب الماء خلال السباق يمكن أن يكون له تأثير على حصول تجفّف وضربة حرارة بالنسبة إلى العداء. أيّهما؟ " Anyone reading this will start by reading the first word as šariba "he drank", giving "he drank water during the race, it can have an effect..." and only after the fifth word will they be in a position to read it, as intended, as "Drinking water during the race can have an effect on the occurrence of dehydration and heatstroke for the runner. Which of the two?" Having gotten that far, they'll still be given pause by the need to decide the intended referents of "Which of the two?" Contrast, yet again, the much easier to read Qatari version: " ماهو تأثير شرب المياه خلال الجري على تعرض العداء للجفاف وضربة الشمس ؟ " ("What is the effect of drinking water during the race on the runner's exposure to dehydration and heatstroke?")

I could keep going, and no doubt more fluent Arabic speakers can find problems I haven't even noticed, but the pattern is clear: Compared to Qatari students, to say nothing of Western ones, Tunisian students were systematically disadvantaged in the PISA 2015 science tests by bad translation.

Whose fault is this? Clearly there was a failure at the level of PISA's international verification, which should have eliminated such problems. But the translations themselves are carried out at the national level (PISA 2012 Technical Report, Ch. 5). In other words, this mess was produced by Tunisian translators under the direction of the Tunisian government.

How is that possible? Simple: in Tunisia, appallingly enough, science is taught in French from the start of secondary school onwards. Science teachers have little need to keep up their Standard Arabic proficiency. Which raises the question of why this test, targeted at 15-year-olds, was administered in Arabic there to begin with.

Sunday, February 28, 2016

Translating a pseudo-Welsh accent into French (or, over-explaining a joke)

Recently I came across Accros du Roc, a French translation by Patrick Couton of Terry Pratchett's comic fantasy Soul Music. In the original, Imp y Celyn ("Bud of the Holly" in Welsh) is a young musician from Llamedos, a small country full of druids and stone circles and harps where it rains all the time. He has a conspicuous Llamedos accent, which seems to consist mainly of doubling all his l's: "Not ellvish at allll, honestlly". For British readers, it's fairly obvious what's going on here: Welsh makes extensive use of the letter combination ll (transcribing a lateral fricative not found in English), so doubling the l's gives it a vaguely Welsh look without actually attempting the difficult task of representing a Welsh accent using an orthography as phonetically inexact as English's. Not a terribly funny joke, really, but it plays some small part in establishing our expectations for this character. But how could it be translated into French, or for that matter any other language?

Conveniently enough, France does have a sort of equivalent to Wales, a rainy, mountainous, coastal region with its own Celtic language and a lot of stone circles: Brittany. Breton does not make much use of the combination ll, but it does have a few characteristics that appear equally exotic to French speakers - in particular, the combination c'h (transcribing the velar fricative /x/) and the frequent use of the letter k (for /k/, reasonably enough). So Kreskenn Kelenn (one guess as to the name's meaning in Breton) talks like this:

Je vois un homme ki tient une hac'he de jet !

So in this case, it works out quite well - though I imagine the joke is lost on readers from, say, Quebec.

I gather that Soul Music has been translated into quite a few languages, but I don't think Arabic is one of them. What on earth would a translator do in this case? It would be kind of tempting to go for equating Celts with Berbers - there are a few stone circles in North Africa - and have Imp substitute ث ذ for ت د. But I don't think any Arab reader east of Algeria would get the allusion, and I doubt that the Middle East contains any ethnic group that can be satisfactorily thought of as playing the role for the Arabs that the Welsh do for the English. Then again, if I were an Arabic translator asked to take on Soul Music, I would give up immediately - any of the few Arabic speakers capable of getting enough of the rock music history allusions to be entertained by the book would be more comfortable reading it in English or French anyway. But that objection is not insuperable: after all, The Waste Land and Finnegans Wake have been translated into Arabic (for some reason). Perhaps some day a genius will come along sufficiently reckless to give it a try...

Sunday, August 25, 2013

Why having "no word for X" can matter

The nice thing about French, from an English speaker's perspective, is that its lexical structure is so much like that of English that you can often translate a sentence without having to think much about what it means. Let's try this sentence, for example:

"Process and Reality presents a system of speculative philosophy which is based on a categorical scheme of investigation designed to explain how concrete aspects of human experience can provide a foundation for our understanding of reality."

Without seriously contemplating whatever it is that the author of this sentence is trying to say, I can render this in French as:

"Procès et Réalité présente un système de philosophie spéculative qui s'appuie sur un plan catégorique d'investigation qui vise à expliquer comment des aspects concrets de l'expérience humaine peuvent fournir une base pour notre compréhension de la réalité."

No doubt there are some issues with this translation – my French has a long way to go. (fixed) But producing it was a relatively easy, almost mechanical task. Translating it into Standard Arabic I have to think a good deal more about the sense of each word (and also have less confidence in the results since I don't own a philosophy-focused dictionary) but I can still readily make it nearly word-for-word:

"كتاب السيْر والواقع يقدم نظام فلسفة نظرية مبني على مشروع فحص تصنيفي معمول ليفسر كيف يمكن لبعض الجوانب الملموسة لتجربة الإنسان أن تعطينا أساسا لفهم الواقع."
("kitābu s-sayri wa-l-wāqiʕ yuqaddimu niđ̣āma falsafatin nađ̣ariyyatin mabniyyun ʕalā mašrūʕi faħṣin taṣnīfiyyin li-yufassira kayfa yumkinu li-baʕđ̣i l-jawānibi l-malmūsati li-tajribati l-'insāni 'an taʕṭiyanā 'asāsan li-fahmi l-wāqiʕi.")

Now suppose I want to translate this into Algerian Arabic. What am I going to do about words like "process", "reality", "speculative", "concrete"? Plenty of Algerians have studied such notions, but they've done so in French or in Standard Arabic. What I would normally do in such cases is simply substitute a Standard Arabic word wherever I can't think of one that would count as Algerian Arabic, yielding something like this:

"كتاب السير والواقع يقدّم واحد النظام تاع الفلسفة النظرية اللي مبنية على مشروع تصنيفي تاع الفحص، خدمُه باش يفسّر كيفاش الجوانب الملموسة نتاع تجربة الإنسان تقدر تعطيلنا أساس باش نفّهمو الواقع."
("ktab əs-sayr w-əl-wāqiʕ yqəddəm waħəd ən-niđ̣am taʕ əl-fəlsafa n-nađ̣aṛiyya lli məbniyya ʕla məšṛuʕ təṣnifi taʕ əl-fəḥṣ, xədmu baš yfəssər kifaš əl-jawanib əl-məlmusa ntaʕ təjribt-əl-'insan təqdər təʕṭi-lna 'asas baš nəffəhmu əl-wāqiʕ.")

On the other hand, what a lot of other educated Algerians would do is something more like this, filling in all the gaps from French:

"كتاب بروسي إي رياليتي يقدّم واحد السيستام تاع لا فيلوزوفي تيوريك اللي مبنية على أن پلان كاتيڤوريك دانفيستيڤاسيون خدمُه باش يفسّر كيفاش ليزاسپي كونكري نتاع ليكسبيريانس إيمان يقدرو يعطولنا إين باز باش نفّهمو لا رياليتي."
("ktab pRose e Reạlite yqəddəm waħəd əs-sistam taʕ lạ-filozofi teoRik əlli məbniyya ʕla ãn plõ kạtegoRik d-ãvestigasyõ xədmu baš yfəssər kifaš lizạspe konkRe ntaʕ l-ekspeRyõs üman yəqqədru yəʕṭu-lna ün bạz baš nəffəhmu lạ-Reạlite.")

Neither of these rather macaronic passages would be comprehensible to any monolingual speaker of Algerian Arabic; they're essentially parasitic on the speaker's knowledge of Standard Arabic or French. Granted, probably most Algerian Arabic speakers are not really monolingual; but even then, there is no guarantee that a speaker who understands one version will understand the other. If I really wanted to produce a consensus-friendly Algerian Arabic version, one that a monolingual speaker would understand, then I would basically need to rephrase the whole sentence completely so as to explain these notions in advance. And before I can do that, I need a clearer notion of what the writer means by things like "concrete aspects of human experience". My job has morphed into something that's not so much translation as total rewriting, and frankly, for a sentence like this I'm not even willing to try it.

Now suppose you're dealing with a language none of whose speakers have ever studied academic philosophy, or for that matter gotten into high school. You can no longer expect to get away with the dodge of code-switching at appropriate moments. How much effort do you think it would take to translate this sentence, compared with the amount of effort it takes to translate it into French? What effect do you think this would have in practice on the cross-cultural transmission of such ideas?

That's one reason why having "no word for X" can matter. The absence of the word – or more precisely, of a fixed expression for it – impedes translation, and hence impedes the transmission of foreign ideas to monolingual speakers. And fixing the problem isn't just a matter of inventing or borrowing a word; to be able to do either, you need to have formulated the corresponding concept, and, in the case of abstract words like these, that presupposes putting a lot of speakers into an originally foreign system of education, with a lot of associated time and expense and all-round hassle.


(Chain of thought prompted by How would you say that in Derja?).

Saturday, October 04, 2008

Translating from linguists' English to normal English

Machine translation between languages is hard, obviously. There are all sorts of reasons why just looking words up, constructing syntactic trees, and changing orders appropriately isn't enough to produce a good output - mainly, the fact that resolving ambiguities often requires real world knowledge, and that different vocabularies are not always organised in the same way. How much that matters is really brought home by thinking about a slightly different problem: translation from a technical vocabulary to a non-technical one within the same language.

Take the following sentences, pulled at random from a grammar on my shelf (Stroomer's Grammar of Boraana Oromo):
"Nouns ending in -ni (mostly -aani) have ultimate or penultimate stress in free variation."

"Verbs with the verb extension -ad'd'-, -at- have an AFF.IMPER.sg: -ád'd'i, -ád'd'u and a NEG.IMPER.sg: -atín(n)i, see 10.10." (p. 72)

If you are, say, a foreign worker about to be posted to northern Kenya, or a second-generation emigrant Oromo planning to go back and visit, you may well want to try and learn some Oromo from this book. But the odds are you will not know what either of these English sentences means, and that applies to quite a lot of the book.

How could you translate these sentences into terms a wider audience would understand? If you can assume a certain amount of basic knowledge (traditional parts of speech, consonants and vowels) then that makes things easier:
"Nouns ending in -ni (mostly -aani) get stressed on the last or second-to-last vowel, it doesn't matter which."

"Verbs with -ad'd'-, -at- added at the end have an imperative singular: -ád'd'i, -ád'd'u and a negative imperative singular: -atín(n)i, see 10.10."

Realistically, you can't assume that level of knowledge, certainly not in Britain at any rate. (I still can't believe that what little grammar gets taught in schools here only ever seems to get taught in foreign language classes, not in English ones; that no doubt explains part of the country's comparatively low foreign language skills.) So what does that leave you with? Something like:
"When you say a word that refers to a person, place, or thing* and ends in -ni (mostly -aani), you put the emphasis at the end or just before the end, it doesn't matter which."

"If you have a word that means doing something* that has -ad'd'-, -at- added at the end, then to order one person to do that you add -ád'd'i, -ád'd'u, and to order them not to do that you add -atín(n)i, see 10.10."

(*Yes, I know that syntactic tests like whether they can be the object of a preposition yield more accurate definitions, but in practice these are a good first approximation, and the former does work even on gerunds: "Killing is a bad thing", so "killing" is a noun, but *"Kill is a bad thing", so "kill" isn't.)

Could this be done algorithmically? A simple substitution table would certainly not be enough. Just try it with any set of definitions you can think of:
"Words referring to a person, place, or thing ending in -ni (mostly -aani) have final or pre-final emphasis such that it doesn't matter which."

"Words that mean doing something with the words that mean doing something extension -ad'd'-, -at- have an agreeing order-giving one-entity: -ád'd'i, -ád'd'u and a denying order-giving one-entity: -atín(n)i, see 10.10." (p. 72)

Not terribly helpful, I think you'll agree... To come up with something a little more helpful (and I'm sure my renditions could be improved on) we had to change the whole structure of the sentence. Even then, at some point it's probably going to be more effective to just teach the person the grammatical notions and let them go forward from there than to keep giving brief explanations of the same notion over and over again.
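
For anyone who wants to try the substitution-table experiment themselves, here is a minimal sketch in Python. The glossary entries and the function name naive_substitute are my own illustrative assumptions, not anything drawn from Stroomer's grammar or from an existing tool; the point is only to show how mechanically such a table operates.

```python
import re

# Hypothetical glossary: each technical term maps to a plain-English paraphrase.
# These entries are illustrative assumptions, not a real terminology resource.
GLOSSARY = {
    "nouns": "words referring to a person, place, or thing",
    "verbs": "words that mean doing something",
    "verb": "words that mean doing something",
    "penultimate": "pre-final",
    "ultimate": "final",
    "stress": "emphasis",
    "in free variation": "such that it doesn't matter which",
}


def naive_substitute(sentence, glossary=GLOSSARY):
    """Replace every glossary term in the sentence with its paraphrase."""
    # Try longer terms first so multi-word entries win over shorter ones.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
        re.IGNORECASE,
    )

    def swap(match):
        found = match.group(0)
        replacement = glossary[found.lower()]
        # Keep a capital letter when the replaced term had one.
        if found[0].isupper():
            replacement = replacement[0].upper() + replacement[1:]
        return replacement

    # A single pass, so text that has already been substituted is not rescanned.
    return pattern.sub(swap, sentence)


if __name__ == "__main__":
    print(naive_substitute(
        "Nouns ending in -ni (mostly -aani) have ultimate or penultimate "
        "stress in free variation."
    ))
```

The table swaps each term in place without ever touching word order or sentence structure, which is exactly why the result reads so stiffly: the restructuring in the hand-made renditions is the part it cannot supply.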

The problem is certainly not unique to linguistics. Medicine, law, ecology - most fields have technical vocabularies that pose an obstacle to non-specialists, who will often have good reason to be interested in trying to make sense of them. Is there any role for algorithms in this (apart from obvious things like hyperlinking technical terms to dictionary entries)? It's well outside my usual field, but it would be interesting to hear of any attempts.

Monday, October 01, 2007

Berber Qur'an translations

Thanks to the efforts of various press agencies, there has been a story floating around the Internet this year about the "first Tamazight Quran". In reality, it's more like the last first Tamazight Quran. I'll try to describe the situation to date as best I can; if any readers know of relevant material I have omitted, please tell me!

You will find occasional reports online that the medieval Berghouata kingdom put together a Berber Qur'an translation; these are misunderstandings. If you look at what al-Bakri (the oldest source I can think of offhand for this) actually says about the Berghouata, he says their second king Salih ibn Tarif claimed to have received a revelation in Berber in 80 chapters which he called a Qur'an, but whose contents (some of which al-Bakri gives translated into Arabic) had nothing to do with the Qur'an. In fact, a later Berghouata king massacred thousands of Muslims in his kingdom for refusing to convert from Islam to the Berghouata religion. It would not surprise me at all to learn of a medieval Berber translation of the Qur'an; I know of such works for Turkish, Spanish, Persian, and Kanuri. However, discounting occasional ill-sourced reports of a no longer extant Almohad one, the earliest reference to such a translation that I have come across is a fatwa by the Moroccan shaykh Al-Ḥasan bin Mas`ūd al-Yūsī in 1102 AH (1691 AD) judging translation of the Qur'ān into Tamazight to be permissible, mentioned in the foreword to Jouhadi Lhocine's translation; such a fatwa implies sporadic translation, but, as far as I am aware, no full written translation from the period has turned up.

Oral translations may be another matter. In Mali, there is reportedly a longstanding tradition of oral translation of the Qur'an into Tamasheq, the Berber language of the Tuareg; this was recorded in a series of 44 cassettes in 1989 by the Ahmed Baba Historical Documentation and Research Centre. Similar cases may well have existed elsewhere.

Serious published efforts at Qur'an translation seem to begin in the 1990s. The earliest partial translation to be printed seems to be Kamal Nait Zerrad's 1998 Lexique religieux berbère et néologie : un essai de traduction partielle du Coran. This work is primarily an effort to design a "purist" Berber religious vocabulary, one drawing on native lexical resources rather than Arabic borrowings, with a translation of a selection of suras added essentially as a proof of feasibility (the book's author, a well-known Berber linguist, does not in fact appear to be particularly strongly committed to Islam.) While the translation is basically into the author's native Kabyle, neologisms and words from other Berber varieties are so frequent as to make the translation rather difficult for native speakers of Kabyle to follow. This work uses the Latin orthography that has become more or less standard in Kabyle usage.

In 2003, with the Moroccan government's decision to raise the position of Tamazight and bring it into the school system, the first complete Berber Qur'an translation (strictly speaking, translation of the meanings of the Qur'an), Jouhadi Lhocine Baamrani's Tarjamat ma`ānī lqur'ān billuġati l'amāzīġiyyah: nūrun `alā nūr / tifawt f tifawt, many years in the making, finally appeared. This complete Moroccan translation (described years earlier, along with the political controversy surrounding it, by The Economist) has priorities more in accordance with one's expectations of such a work: the author's preface concerns itself primarily with reassuring the reader of the work's interpretative accuracy (the author uses the Warsh reading, and, in cases of difficulty, relies on examination of relevant hadith and well-known commentaries), and of the work's religious justification. However, conservative readers have expressed unease at his relative lack of religious training. The work is written in the Tashelhiyt of southern Morocco, a considerably less Arabic-influenced dialect than Kabyle; nonetheless, like Nait-Zerrad although not to the same extent, the author often chooses to use pure Berber vocabulary even when obscure in preference to Arabic loanwords, explicitly drawing an analogy to Fusha Arabic. "Some may say: I do not understand much of the Tamazight in which he has written, and I am Amazigh! I reply that not everyone who speaks Arabic, for example, understands the Qur'an which came down in faultless Arabic. Do not forget, dear reader, that a child spends much effort in gradually learning his native language, so why should you expect to know literary/pure (faṣīħ) Tamazight in a single go?" 
Apart from some Tifinagh on the cover, the author uses Arabic characters, regularly used by Tashelhiyt authors to write in their native language since the sixteenth century, although he substitutes a variant of Chafik's new orthography (writing all vowels as long instead of short, and using zay with three dots for the emphatic ẓ) which has grown in popularity. He has also published a translation of an-Nawawi's Forty Hadith, as well as some poetry.

Also in 2003, correlating to the Algerian government's gradual expansion of the role of Berber in efforts to conciliate opposition in Kabylie, a Kabyle translation of six hizbs, by Si Muḥend Muḥend Ṭayeb of the Ministry for Religious Affairs (with help from Said Bouziri, Djafar Oulefki, and Mohamed Tahar Ait Aldjet), was published by the King Fahd Complex for the Printing of the Holy Qur'an. This translation uses Arabic characters, but not in the systematic way of Jouhadi Lhocine's translation; rather than establishing a fixed phonemic orthography, it gives the impression of trying to fit Kabyle into Arabic characters in much the way that many people try to fit it into French ones, without any consideration for the phonemic rules of the language. For example, strictly phonetic assimilations across word boundaries, like n+r > rr, are written with shadda, and phonetically short a and ə are both written in the same way, with fatha. It was criticised by activists for its extensive use of Arabic vocabulary - although I rather suspect this makes it more readable to the average Kabyle speaker than the strict purism of other editions. A complete translation by the same people is to appear shortly; it is this which has been carelessly reported as "the first Tamazight Qur'an".

However, when it does appear, it's not even going to be the first complete Kabyle translation. In late 2006, the poet and chemist Remḍan At Menṣur beat the Ministry to it; I saw copies of his complete translation in shop windows in Algiers and Paris, but have not yet got one. This work uses the Latin and Neo-Tifinagh orthographies on facing pages, and comes with an audio CD. The more extreme anti-Islamic wings of the Kabyle autonomy movement criticised the very fact of his translating this as promoting "Arabisation and Islamisation" (huh, who would have thought that translating the Qur'an might be construed as promoting Islam?) A more conservative reader, while praising the work, suggested that it would have been better off using Arabic script, and that the difficult task of translating with an eye to the correct interpretation required the efforts of a whole committee rather than a single man.

More translations are no doubt to be expected, and their quality - both interpretative and linguistic - will hopefully improve. But this cannot take place in isolation; the form Berber translations of the Qur'an end up taking will inevitably be heavily influenced by the form of the language that ends up being taught in the schools and used in other publications, and politics will continue to affect both whether and how the text is translated. It will be interesting to see how the situation develops.