Thursday, February 27, 2014

Korandjé music video (Algeria's other language)

As regular readers will know, for some years I've been working on the only language of Algeria that's neither Arabic nor Berber – Korandjé, spoken in a tiny oasis between Bechar and Tindouf called Tabelbala. A little while ago, we saw a brief video of one of its closest relatives, the Tagdal language of Niger. Today, for the benefit of anyone who may have wondered what Korandjé sounds like, I'd like to present the first music video in Korandjé to reach YouTube – a nostalgic song in a Middle Eastern style by Abdou Makhloufi:

Musically and poetically, it's rather derivative (and not derivative of Tabelbala's traditions either). But I salute the author's efforts anyway; it's not easy to go against the flow, and the trend in Tabelbala is very much to leave music (and most other domains of life) to Arabic. Here's an attempt at a transcript, minus most of the repetition (almost every line is repeated at least twice):

عباعمير كُارا عباعمير
ʕ-baʕam-yər kʷạrạ, ʕ-baʕam-yər
I-wanna-return town, I-wanna-return
I wanna go back home, I wanna go back,

تكُّاري ندا ادرا ن لهوا ابيسحر
tsəkkʷạrəy ndz’ adṛạ n ləhwa a-b-yisħər
sand and mountain ’s air it-IMPF-enchant
The sand and the mountain air are enchanting,

عباعمير كُارا عباعمير
ʕ-baʕam-yər kʷạrạ, ʕ-baʕam-yər
I-wanna-return town, I-wanna-return
I wanna go back home, I wanna go back.

ومّوغيسي، عباعميرنيسي
wə-ṃṃə̣w-ɣəy-si, ʕ-baʕam-t-ndzi-si
y’all!-listen-me-to, I-wanna-say-y’all-to
Y'all listen up, I wanna tell you all,

اوغ اكّس ان كُارا، توغا اڤُّاسي
uɣ əkkəs an kʷạrạ, tsuɣạ ggwạ-a-si
who abandon his town, what remains-him-to
Someone who abandons his hometown, what's left for him?

عباعمير كُارا عباعمير
ʕ-baʕam-yər kʷạrạ, ʕ-baʕam-yər
I-wanna-return town, I-wanna-return
I wanna go back home, I wanna go back.

الله الله، ڤايو زّينيو بايو
əḷḷạh əḷḷạh, gạ-yu zzin-yu gạ-yu
God God, house-s old-s house-s
O Lord O Lord, the old houses,

ڤايو، بايو ندا لغاديايو
gạ-yu, bạ-yu ndza lɣadya-yu
house-s, person-s and ???-s
The houses, the people and the ???s,

ڤُند عفّكّر كُارا، عاهيو
gundz ʕa-f-fəkkəṛ kʷạṛạ, ʕa-hyu
when I-ed-remember town, I-cry
When I remembered the hometown, I cried.

عباعمير كُارا عباعمير
ʕ-baʕam-yər kʷạrạ, ʕ-baʕam-yər
I-wanna-return town, I-wanna-return
I wanna go back home, I wanna go back.

تاميسا عباعمدغنني، تامسّخ ما كُنّاني
tsamis a ʕ-baʕam-dɣən-ni, tsaməssəx ma kunna ni!
how FOC I-gonna-forget-you, how what find you!
How could I forget you, how – what's wrong with you!

ڤُا بايباهنڤاني، آ نمبغسي واراني
gwạ bạ-i-ba-hanga-ni; a nən bə̣nɣ-si wara ni
stay person-s-have-follow-you, ah your head-to even you
Stay, people are following you; ah, (stay) for yourself too!

تاميسا عباعمدغنني، تامسّخ ما كُنّاني
tsamis a ʕ-baʕam-dɣən-ni, tsaməssəx ma kunna ni!
how FOC I-gonna-forget-you, how what find you!
How could I forget you, how – what's wrong with you!

اقّا عقّوم عمزوني، عمزوني
əgga ʕa-ggum ʕa-m-zəw-ni, ʕa-m-zəw-ni – əgga ʕa-ggum
PAST I-swear I-'d-take-you, I-'d-take-you – PAST I-swear
I had sworn to marry you, to marry you

نزّو افيط نكّسغي
nə-zzəw a-fyəṭ nə-kkəs-ɣi
you-take an-other you-abandon-me;
You married another and left me;

تامسّخ ما كُنّاني
tsaməssəx ma kunna ni!
how what find you!
How – how could you!

نن لقبيلت اسبغغي، اسبغغي - نن لقبيلت
nn ləqbilət a-s-bəɣ-ɣəy, a-s-bəɣ-ɣəy – nn ləqbilət
your tribe it-not-like-me, it-not-like-me – your tribe
Your tribe doesn't like me, doesn't like me – your tribe;

إدرامن اسباغيسي
idṛạmən əs-bạ-ɣəy-si
money not-be-me-to
I don't have money

آغي عمبين اكُّاري، نبّي مسّخ من بكري
aɣəy ʕan bin ək-kwạrəy, nə-b-bəy məssəx mən bəkri
Me my heart is-white, you-PF-know thus from long_ago
But my heart is clear, you've always known that

تامسّخ ما كُنّا ني
tsaməssəx ma kunna ni!
how what find you!
How – how could you!

Wednesday, February 26, 2014

18th century Zenaga poetry and language change

By far the most distant Berber variety from the rest – a separate language by even the most generous standards, as the lines quoted below will probably convince you – is Zenaga, the Berber of Mauritania. In an old article by Harry Norris (1969), "Znaga Islam during the 17th and 18th centuries", I recently came across a passage in a photograph of a page from a 20th-century Mauritanian manuscript called Dhāt alwāḥ wa-dusur, discussing a poem written in Zenaga by Wālid bin Khālunā al-Daymānī (d. 1797), and containing words already obsolete by the commenter's time. The article says this was to be published by James Bynon, but that doesn't appear to have happened. While I can make out much of it, especially with the help of two partial translations into Arabic quoted in the article, I cannot fully parse the few lines given there – perhaps some commenters will join in the fun of decipherment. The author also throws in some unexpectedly insightful observations on language change...

وأما الثانية فيعسر ضبطها جدا لأن الفاظها كلها عجمية ومع ذلك فتلك الالفاظ قد اندرست اليوم وعدم من يعرفها لأن اللغات تتبدّل فكل سنة تنسى كلمات ويوتى بآخر غير معهودة ولولا محافظة الناس على اللغة العربية في الدهر الذي نزل فيه الوحي تبدلت بالكلية حتى لا يوجد من يعرفها ويدلّ على ذلك ان العرب الاقاح في هذا الدهر الذي نحن فيه قد تغيرت السنتهم حتى لا يتكادون يفهمون العربية الاصلية الا ان يتعلموها وتسمى هذه الثانية بالمزروف ومطلعها:

اترگ نئك اراكلئذ * ايشذ ننتا شد اذچان
ايش اتؤچش اذ تنجگفئذ يسگذان اشرن يستغان

قوله اكلئد اي السلطان
وقوله اتؤجش اي وجوده
وقوله تنجگفئذ اي القدم
وقوله نِ اي انا اي القائم بنفسه

"As for the second [poem], it is very difficult to determine it, because its words are all non-Arabic, yet those words have become rare today and no one knows them any more – since languages change. Every year some words are forgotten, and others, little-known, are brought forth. If people had not preserved the Arabic language at the time when the revelation came down, it would have changed completely, to the point that no one would know it. This is shown by the fact that the tongues of the Arabs of our time have changed, until they can barely understand original Arabic unless they have studied it. This second [poem] is called "al-Mazrūfa", and it opens with:

əttäräg niʔk är ägälliʔḏ – äyš äḏ nəttä šd äḏžān
äyš ätuʔž-əš äḏ tənd'əgfiʔḏ – yässəgḏān āš ni yəstəġān

("I ask of the Sultan * He who is my owner
Whose existence is eternity without beginning * who is rich, who needs nothing")
  • His saying ägälliʔḏ means "Sultan".
  • His saying ätuʔž-əš means "his existence".
  • His saying tənd'ägfiʔḏ means "eternity without beginning".
  • His saying ni means "I" ie "the independent"."

From Taine-Cheikh (2007), we find that ättər is "ask", and əttär-äg therefore perfective "I ask"; niʔk is "I" (note the carefully written glottal stop!); and är is "from". Perhaps unsurprisingly given this passage, ägälliʔḏ has not made it into the modern era, so the vocalisation is conjectural, but it is obviously cognate with Tashelhiyt agllid "Sultan". äyš is a relative complementiser ("that") normally combined with a resumptive pronoun; äḏ is the copula ("is"); nəttä "he" is presumably the expected resumptive pronoun (the text actually clearly has two n's, but I'm assuming one of them is a typo). The rest of the line is a bit of a mystery; my best guess is that it involves the perfective participle of the verb "own", äyi(ʔ) in Taine-Cheikh (note that her y is often ž in other Zenaga varieties, from original *l), but then I would expect a glottal stop to be written. äyš "that" we have already seen, and -əš is "his/her/its". ätuʔž, explained as "existence", must be derived from the verb y-uʔy "exist", but the t is surprising. äḏ "is" we have already seen. We are given the meaning of tənd'ägfiʔḏ, but even its vocalisation is conjectural, and I can't find an appropriate root to relate it to. yässəgḏān (vocalisation conjectural again) must be a participle of the verb corresponding to Ould Hamidoun's eʔssəgḏīh "richesse", quoted by Taine-Cheikh (note that vowel length, phonemic in Zenaga, is transcribed accurately!). The rest is another blur, except that yəstəġān (?) may be from Arabic istaġnā "not to need".

If this isn't enough of a challenge, there are several other lines of Zenaga poetry quoted in that article...

Saturday, February 22, 2014

The Arabic Script in Africa

An article of mine that's been in the pipeline for almost four years has finally come out: "Writing 'Shelha' in new media: Emergent non-Arabic literacy in Southwestern Algeria". I discuss the usage of non-Arabic languages (Berber and Korandjé) in Southwestern Algeria in digital media, looking at the orthographic solutions adopted and the purposes of those writing it. The results suggest that, under appropriate circumstances, a high degree of orthographic uniformity is possible without any formal training in writing the language in question – but that the existing sociolinguistic marginalisation of these languages in speech is taken even further in writing.

I received a copy of the book recently, and found the rest of it very interesting. Maarten Kossmann and Ramada Elghamis discuss the traditional Arabic orthography of Tuareg, which shows several unexpected features. Two articles discuss the writing of Afrikaans in Arabic script, which – hard as it may seem to believe – predates its writing in Latin script. Nikolai Dobronravine discusses the use of Arabic to write African languages (as well as the Arabic language) in the Americas – the archives of Brazil, for example, contain a surprising number of letters confiscated from slaves. Other articles examine Fulani, Kanembu, Manding, and Swahili, as well as the history of Arabic writing in general and its distribution in Africa.

On a related note, if you're interested in Libyan Berber, it turns out there's a surprisingly large number of people writing even some of the least well-known varieties on Facebook, often in Arabic script; see my recent post on Awjili negation for Awjila, or Awal n ɛdeməs for Ghadames.

Monday, February 03, 2014

Aljazeera video of mixed Tuareg-Songhay language, Tagdal

Aljazeera's documentary Orphans of the Sahara is worth watching for anyone interested in Tuareg language, as well as Tuareg politics – the producers took the very commendable decision to do most of the interviewing in Tuareg, giving a much more representative picture of Tuareg opinions than if they had stuck to interviewing French speakers as many other journalists do. Other languages of the region, apart from Arabic (and other ethnic groups' opinions), are rather less well represented, but about ten minutes into the first video, my ears perked up as I realised that I wasn't hearing Tuareg any more. As the camera follows Mohammed Igdali's first meeting in many years with his grandmother, somewhere outside Agades in Niger, you hear them speaking in a language that sounds oddly like Tuareg yet has a completely different grammar (from about 10min13s to 10min52s). In fact, this language is Tagdal, the language of the Igdalen tribe – a close relative of Korandjé, the Algerian language I studied for my doctorate. Most of its vocabulary is from Tuareg (or sometimes other Berber varieties), but its grammar and a few hundred of the commonest words are from Songhay, a language family spoken mainly further south along the Niger River. I can make out "Maxámmad Xásan, nənn áahay. – ɣann áahay ah?" (Mohammed Hassan, your grandchild. – My grandchild?), in which "grandchild" (áahay) is Tuareg and the possessive pronouns "your" (nən) and "my" (ɣan) are Songhay, as well as "nən bárar ɣo ggóra nə́n moo ka" (your child who is sitting in front of you), in which only bárar "child" is Tuareg, while the rest is Songhay. This is the first recording of Tagdal I've ever heard.

Tagdal is extremely inadequately documented – there are only three published resources on it that I know of (see my Northern Songhay bibliography), none of which provides even a sketch grammar (although a sketch grammar by the missionary linguist Carlos Benítez-Torres should be coming out in a couple of years, in The Oxford Handbook of Language Contact). It would be a rather interesting language to study, both as a case study in extremely intense language contact and for what it indicates about regional history. (Unlike most Tuareg tribes, the Igdalen are thought to have come from the west, and they seem to have played a prominent role in early medieval history; their original language, like that of the Idaksahak, was quite likely not Tuareg.) Unfortunately, the political situation described in that documentary makes fieldwork rather difficult to undertake for the moment.

Sunday, January 12, 2014

Yennayer and the influence of writing on oral tradition

Aseggas ameggaz! Today (12 January), many Algerians are celebrating Yennayer, the start of the Berber New Year. Nowadays this seems like a quintessential example of the stubborn maintenance of authentic popular tradition in the mountains, defying pressure from religious scholars and governments. The reality, however, is more complex. It's true that many North African Islamic scholars have been condemning New Year's celebrations since at least the 11th century. But this should not obscure the fact that the Berber calendar itself has been maintained, for most of its history, by literate North African scholars writing mainly in Arabic, not just by oral tradition – and even the orally transmitted month names derive partly from written tradition.

This calendar, originally Julian, is of obvious Latin origins (the Kabyle forms of the month names are Yennayer, Furar, Meɣres, Yebrir, Yunyu, Yulyu, ɣuct, Ctember, (K)tuber, Nu(ne)mber, Duǧember/Buǧember). But it is fairly widely found in medieval Arabic writing, starting as early as the 10th-century Andalusi book al-`Iqd al-Farīd, which quotes the doctor Isḥāq ibn `Imrān of Qayrawān giving health advice for each month:

So in Yennayer drink strong drink each morning; in Fubrayr do not eat chards [...] in Nubambar do not enter the bathhouse; in Dujambar do not eat rabbit.
فيفي شهر يناير تشرب شراباً شديدياً كل غداة. وفي شهر فبرير لا تأكل السلق. وفي مارس لا تأكل الحلواء كلها وتشرب الأفسنتين في الحلاوة. وفي أبريل لا تأكل شيئاً من الأصول التي تنبت في الأرض ولا الفجل. وفي مايه لا تأكل رأس شيء من الحيوان. وفي يونيه تشرب الماء البارد بعد ما تطبخه وتبرده، على الريق. وفي يوليه تجنب الوطء. وفي أغشت لا تأكل الحيتان. وفي سبتمبر تشرب اللبن البقري. وفي أكتوبر لا تأكل الكراث نيئاً ولا مطبوخاً. وفي نبنبر لا تدخل الحمام. وفي دجنبر لا تأكل الأرنب. (source)
Note that he uses the month names 'Aghusht (August) and Dujambar (December), aligning him firmly with the North African month name tradition rather than those of other Arabic-speaking regions. In such works, the calendar is usually called `ajami, as in the 12th-century history of Ibn Ṣāḥib al-Ṣalāt:
...he returned to Marrakech on the morning of Saturday 11 Rabī` II (sic, should be Rabī` I), corresponding to the `Ajamī date 15 Yennayer, of the year 561.
فكان وروده حضرة مراكش ضحوة يوم السبت الحادي عشر من ربيع الآخر الموافق للخامس عشر من يناير العجمي من عام واحد وستين وخمس مائة. (source)
It's a lot easier to keep a calendar if you know how to read and write, and Coon's account of traditional Riffian Berber society makes it clear that, there at least, the local religious scholar was charged with this duty:
Agriculture in the Rif and in the Senhaja country is conducted on the basis of the old Roman calendar, the names of the months surviving in a form very little altered from its original character. I was unable to transcribe these names exactly since I could find no one who both knew them and was willing to reveal them. Knowledge of them is confined to the fḳih, the preceptor or religious head of each group of villages, and to his students. The fḳih, while delivering sermons at the mosque on Fridays, reveals the agricultural program for the following week and tells the farmers just what activities the season merits. To reveal this calendrical system and the agricultural annotations that go with it would be to relinquish a part of the awe in which the religious leader is held. (Coon 1931:49)
Writing not only helped preserve this calendar, but affected its form. In parts of Morocco, June and July are not called Yunyu and Yulyu, as elsewhere, but Yunyuh and Yulyuz. The distinction is impossible to explain from Latin, where they are simply Iunius and Iulius. The first academic article to explain it seems to be Van den Boogert (2002), who shows that the new names are based on a neat mathematical trick.

If you know the date and the day of the week for 1 January, how would you go about figuring out the day of the week of any given date in the year? Well, 1 February is 31 days later; take away four weeks (28 days), and you get 3, so it must start on a day of the week three days later than 1 January. Repeat that procedure for each month in turn, and you get how many days of the week later than 1 January each of them starts. But that's tedious work even with a calculator, much less without it, and if you plan to do it at all often, you'd better just memorise the figures – February starts 3 weekdays later than January, April starts 6 days later, etc. But how can you make it easier to associate 12 numerals with 12 different months?

In Arabic, letters of the alphabet can be used as numerals, following the same old order used in Hebrew and Aramaic: a = 1, b = 2, j = 3, d = 4, h = 5, w = 6, z = 7, ḥ = 8... So what some unsung calendricalist did was tack the appropriate numbers onto each month name, setting the weekday of 1 January = a, as in this medieval Moroccan example: innayr-a, fubrayr-ad, marṣ-ad, abrīl-az, mayyu-b, yunyu-h, yulyu-z, ghusht-aj, shutambir-aw, aktubar-aḥ, nuwambir-ad, dujambir-aw. Memorise this list, and you're sorted. And, as a bonus, it gives you a way to better distinguish the dangerously similar-sounding month names Yunyu and Yulyu.
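
The arithmetic behind the mnemonic can be checked with a short sketch. It assumes a non-leap year, takes only the final letter of each suffix in the list above as the numeral (treating the preceding "a" as a linking vowel, which is my reading rather than something the source states), and compares values modulo 7, since ḥ = 8 names the same weekday as a = 1. The function names are hypothetical.

```python
# Days in each month of a non-leap year (assumption: leap years ignored).
MONTH_LENGTHS = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]

# Letter values as listed in the text: a = 1 ... ḥ = 8.
ABJAD = {"a": 1, "b": 2, "j": 3, "d": 4, "h": 5, "w": 6, "z": 7, "ḥ": 8}

# Final letter of each month's suffix in the quoted list, January to December.
MNEMONIC = ["a", "d", "d", "z", "b", "h", "z", "j", "w", "ḥ", "d", "w"]


def weekday_offsets(month_lengths=MONTH_LENGTHS):
    """Weekdays by which the 1st of each month falls after 1 January (0-6)."""
    offsets = []
    days_elapsed = 0
    for length in month_lengths:
        offsets.append(days_elapsed % 7)  # drop whole weeks
        days_elapsed += length
    return offsets


def weekday_numbers():
    """Offsets shifted so that the weekday of 1 January counts as 1 (a)."""
    return [offset + 1 for offset in weekday_offsets()]


def mnemonic_matches():
    """True if every quoted letter agrees with the computed number modulo 7."""
    return all(
        ABJAD[letter] % 7 == number % 7
        for letter, number in zip(MNEMONIC, weekday_numbers())
    )


if __name__ == "__main__":
    print(weekday_offsets())   # February is 3 and April is 6, as in the text
    print(mnemonic_matches())
```

Under these assumptions every month in the quoted list agrees with the computed weekday, including October, where ḥ (8) stands in for a (1).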

Thursday, December 26, 2013

Does Arabic have the most words? Don't believe the hype.

For some time, I've been hearing rumours (from Arabs, of course) that Arabic has the largest number of words of any language. Recently I found one vector for this rumour: Comparison of the Number of Words in Languages of the World, a poster put together by Azzam Aldakhil which has the merit of at least giving the sources for its figures, namely Muʕjam ʕAjā'ib al-Lughah by Shawqī Ḥamādah, 2000. (In a follow-up comment he gives the page numbers, 83-84.) This poster claims that "Arabic has 25 times as many words as English".

Unfortunately for this claim, if you go to the book cited, what you actually find is a calculation of the number of possible roots in Arabic, without regard to whether or not the root actually has a meaning. Such a count includes huge numbers of unused roots such as بزح bzḥ or قذب qḏb, while at the same time lumping together all words derived from the same root; كتاب book, كاتب writer, and مكتب office are three words, but only one root. The result of such a calculation might tell us something about the potential for expanding Arabic, but absolutely nothing about the state of the Arabic language. And since in practice both Arabic and the languages it is being compared to on that poster allow arbitrarily long words without real roots, if only in loanwords, it doesn't even tell us much about its potential.
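
To give a feel for how quickly a count of possible roots outruns the attested ones, here is a rough sketch. It is not the book's method: it assumes a 28-letter alphabet and counts only ordered sequences of distinct consonants, and the helper name is hypothetical.

```python
from math import perm

ARABIC_LETTERS = 28  # assumption: the standard 28-letter alphabet


def possible_roots(length, letters=ARABIC_LETTERS):
    """Count ordered sequences of `length` distinct letters (illustrative only)."""
    return perm(letters, length)


if __name__ == "__main__":
    # Three-consonant combinations alone: 28 * 27 * 26.
    print(possible_roots(3))
    # Longer roots multiply the count further.
    print(possible_roots(4))
```

Even this simplified three-consonant count comes to nearly twice the roughly 10,000 meaningful roots that the classical dictionaries are estimated in this post to contain, and four-consonant combinations add far more.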

Both the number of Classical Arabic roots with actual meanings and the number of words can be estimated from the classic dictionaries: according to Sakhr's statistics, there seem to be around 10,000 roots, and up to 200,000 distinct words. Roots don't play such a major role in the lexicography of most non-Semitic languages, so it's difficult to compare the number of roots cross-linguistically. But in terms of words, that would be slightly fewer than English (250,000 in the OED, although the poster cites 600,000) and slightly more than French (over 100,000 excluding proper nouns, according to the Académie Française).

However, such comparisons can hardly fail to be misleading. For one thing, English is much more hospitable towards dialectal and colloquial usages than Arabic is – the OED is full of words marked as Scottish or Northern or slang or whatnot, the equivalents of which would never be accepted by an Arabic dictionary. For another thing, the whole enterprise of counting words across languages runs into apparently insuperable problems, especially when it comes to compounds, which Arabic dictionaries do not normally treat as words. If you include compounds, then compound-friendly languages like German or Turkish or Inuktitut are automatically going to beat all the rest – and all the available statistics that I've seen for, say, English happen to include compounds.

So the best answer is that we don't really know, and that word count, even if we could measure it better, is not a very good measure of a language's expressive power anyway. Some missing words make a genuine difference, as I've discussed here before. But is English really missing out by not having distinct words for male camels (جمل) vs. female camels (ناقة)? Is Arabic really missing out by not having a special word for cornpone, or for scones?

Wednesday, December 11, 2013

Tadaksahak

Tadaksahak, a heavily Berber-influenced Northern Songhay language spoken in northern Mali and Niger and closely related to Korandjé, is a remarkable example of how far language mixture can go. While the core grammar remains Songhay, causatives and passives can only be formed using Berber morphology attached to Berber stems, so every non-Berber verb in the language has a suppletive causative and passive (there are only a couple of hundred of those left, though, so it's not that impossible to learn). I have finally finished a review of Regula Christiansen-Bolli's Grammar of Tadaksahak (you can read the review here). For various reasons, I ended up taking the opportunity to write an overview of the general problem of how the language came into being. I don't have a final answer, but I did find that it was even more complicated than it looks.

You see, Tadaksahak speakers are currently mostly bilingual in Tuareg, and well integrated into Tuareg culture. Most of the Berber loanwords in Tadaksahak are from one or another Tuareg variety. But quite a few – including some of those irregular causatives and most numerals up to 20 – are demonstrably not from Tuareg, but from some other Berber language, closely related to Tetserrét (Niger). Today, Tetserrét is nearly extinct, and nobody speaks it as a second language; obviously things must have been different in the past. It looks like most Tadaksahak speakers are visibly of Berber descent, so probably they shifted from Tetserrét to Northern Songhay and then came under Tuareg influence. But why would anyone want to adopt Northern Songhay, currently barely hanging on in one or two remote towns of northern Niger, as a first language? Again, obviously things must have been different, but it's not easy to see how. My best guess for the moment is that they did so in order to reinforce their identity as religious specialists (ineslemen, "marabouts"), since Songhay was the language of the urban centres where advanced religious studies could be pursued, but there are a lot of question marks over that. To confuse matters further, their neighbours like to claim that Tadaksahak speakers are of Jewish descent – probably just to undermine their religious specialist status, but possibly reflecting some more complex history.

Oral tradition isn't much help; there is no firm consensus within the group on their history, and such genealogies as have been circulated, by themselves or by their neighbours, look very much like efforts to push self-serving agendas. About the only common theme across them is that they came from the west. Genetic testing might give firmer data, but the results could be politically sensitive. More lexical data, both for Tadaksahak and for other minority languages of the region, would certainly help, but the problem is ultimately cross-disciplinary – historians, archeologists, anthropologists, etc. take note! Any ideas?

Monday, December 09, 2013

wləd/wlid- "boy, son": An irregular development

There's a curious feature I recently noticed about the Arabic of Dellys in Algeria (I can't imagine what took me so long, since it's in my own idiolect as well!). In Morocco and western Algeria "boy" and "son" are both ولْد wəld, corresponding regularly to Classical Arabic وَلَد walad. In Dellys, "boy" is ولد wləd, again corresponding regularly (in Morocco, CaLaC and CaLC, where C is any consonant and L is a sonorant, both end up as CəLC; in central Algeria, the former becomes CLəC, the latter CəLC). But with a possessor – ie, in the sense of "son" – the form is not wləd, but وليد wlid. You can say وليد خويا wlid xu-ya "my brother's son" or وليدك wlid-ək "your son", but not *wləd xuya or *wəld-ək. It's not obvious how to explain this historically; on the face of it, it looks like a completely irregular development. There are a few other nouns derived from the pattern CaCaC – for instance حنش ħnəš "snake", حبق ħbəq "basil" – but I can't think of any cases offhand which frequently occur in the construct state (that is, with a possessor directly following them). It might be compared to the diminutive, but in present-day Dellys Arabic anyway, the diminutive is وليّد wliyyəd, not wlid.

Has anyone come across a similar phenomenon in any other Arabic variety?

Saturday, December 07, 2013

19th c. Songhay sources from Tanzania and the US

A while ago, I posted about the earliest European source for Songhay. Shuichiro Nakao, who's been doing some interesting work on the 19th-century development of Arabic-based creoles, recently sent me a link to an early record of Songhay from an even more surprising source: the journal Tanganyika Notes and Records. The article in question is a summary autobiography of Adrien Atiman, who spent most of his life working as a Catholic missionary in central Africa. Apparently, as a child he was taken (sold or kidnapped? he would never know which) as a slave from Tindirma (modern Mali) and brought north to Metlili, where he was "ransomed" by a Catholic priest in 1876, converted, trained for priesthood, and finally sent off to a completely different part of Africa to be a missionary. He gives a few words, the only ones he could still remember of his native language after so many years: ""Coro" meaning lion, "Boro" man, "Elham" meat, "Bri" bone, and "Kunduhari" beer." These are easily recognisable as the Koyra Chiini forms (after Heath 2005): kooro hyena, boro person, ham meat (crossed here with Algerian Arabic lħəm "meat"), biri bone, and kundu "bourgou grass" + hari "water" (a syrup is traditionally made from bourgou grass). But it is striking that, even for these last holdouts, the meanings are not remembered exactly. Your first language is not necessarily the language you are most fluent in!

As it happens, another Songhay-speaking slave also left us his biography, from slightly earlier in the nineteenth century (1854): Mahommah Gardo Baquaqua. A native of Djougou (modern Benin), he was taken prisoner while visiting a different town and sold south to the coast, ending up as a slave in Brazil, but eventually managed to escape while passing through New York, which had already abolished slavery. He gives the numbers from one to a thousand in Dendi, as well as a few vocabulary items scattered throughout the book. (Not all the latter are Dendi – some are Hausa, eg "cofa" (properly ƙofa) for "gate".)

I've managed to trace a few Songhay loanwords in North Africa, but as far as I know no one has ever reported a Songhay loanword in the Americas. That is probably to be expected, since most slaves there would have come from regions closer to the sea – but it would be interesting to look more closely...

Friday, December 06, 2013

Propaganda and grammatical gender

I try my best to avoid reading products of the propaganda wars currently raging in the Middle East, but today I found that they had managed to leak into the usually apolitical world of linguistics blogging. In a recent post about the way grammatical gender affects how we imagine anthropomorphic characters, Asya Pereltsvaig alludes to a fatwa supposedly arguing that "the word for ‘sea’ is grammatically masculine in Arabic, and so when a woman goes swimming and “the water touches the woman’s private parts, she becomes an ‘adulteress’ and should be punished”." This is sourced to an article in India Today, based on Al Masry Al Youm, which in turn cites a report by Dr. Sayyid Zayed of Al-Azhar titled "The Errant Fatwas of the Muslim Brotherhood and the Salafis" (الفتوى الضالة عن الإخوان والسلفيين). This report is not online, and none of the links identify the author of the fatwa in question, but Google provides an answer - an article from 17 September 2012 gives a screenshot of a Tweet allegedly posted on 11/5/2011 by @AliAlirabieeii saying "It is one of the greatest sins for a woman to go down into the sea, even covered, since the sea is masculine, and when the water goes into her private parts she thus becomes an adulterer and liable to the stipulated punishment." There is in fact a Dr. Ali Al-Rabiei, a vocal Saudi imam, and he does have a Twitter account - @DrAliAlrabieei. On this Twitter account, he tweeted on 28 May 2012 that "The Shia are counterfeiting a sixth fake account in my name - @AliAlirabieeii - to display smears and fakery; we call upon you to inform about it and get it closed."

In some ways, this brief odyssey through the sad world of Twitter warfare was superfluous. The slightest knowledge of Middle Eastern politics should be enough to tell anyone that a story run by Al Masry Al Youm, or a report by Al Azhar, published not long after the ouster of Morsy and explaining how the Brotherhood are completely crazy, might need to be taken with a pinch of salt. In the current political battles of the Middle East, attributing horrifying fake quotes to leaders from the other side has become a rather popular tactic. I don't know what the background is for the Iraqi fatwa cited later in the same post (a slightly different account is sourced by the Daily Telegraph to the observations of a Sunni leader from Anbar), but common sense tells us it's more likely to be hostile propaganda than to be anybody's actual belief, no matter how crazy. Salafis are known for being especially strict about the need to separate men and women; whoever was behind these stories must have decided that the idea of extending this to separating grammatically masculine things from feminine things would be just plausible enough to fool ordinary people while at the same time ridiculous enough to horrify them. Apparently, he was right.

[Addendum: Looking at this post again, it occurs to me that it's missing the human dimension; you can probably reconstruct it from the facts, but just in case, here are the basics. The Twitter accounts were very likely intended as satire, notwithstanding Alrabieei's furious response – and he may well have deserved satire, if his positions on the Shia are as extremist as they seem to be. The fact that a number of sketchy Arabic news sources picked it up as if it were real might be an honest mistake, but much more likely was simply because they were looking for any opportunity, honest or dishonest, to embarrass someone on the opposite side of the current culture wars. The Egyptian media then picked it up because what they wanted to do was paint opponents of the current government as insane fanatics, but left out his name and identity because he's Saudi, and the Saudi government is strongly on the side of the current Egyptian government. That's dishonesty any way you spin it.]

Tuesday, November 05, 2013

APiCS online, ASJP

Any readers interested in pidgins, creoles, or mixed languages (one of those things is not like the others!) will want to know that the data for the Atlas of Pidgin and Creole Languages, APiCS, is finally online and publicly browsable. Think of it as WALS for pidgins and creoles, basically – lots of pretty maps, with the nice bonus that language-internal variation in features like word order can be represented proportionally by a pie graph instead of having to choose a single value per language.

Also released lately is the data underlying the ASJP (Automated Similarity Judgement Program). The program's results themselves remain thoroughly unreliable as a guide to classification – as of the latest version, it auto-classifies Songhay with Masa (Chadic), Berber with East Chadic, Kanuri with various Biu-Mandara (Chadic) languages (and not with Teda-Daza), Turkic with some New Guinea language named Kuot, and Hebrew with Tigre and Tigrinya against the rest of Semitic. For low-level subgroupings they aren't always too bad, though – their Berber tree has become surprisingly plausible. In any event, having the data, you can analyse it yourself, or try running your own algorithms if you feel up to it...

Saturday, November 02, 2013

Lingua franca / Sabir

Mi star trovato un bonu libro sopra il sabir: Dictionnaire de la langue franque ou petit mauresque. Avanti l'attaca del Fransis, l'Algerino parlar con il Rumi ne in esbagnol ne in italiano ne in fransis, ma in questa lingua, una miscolantza dell'italiano e dell'esbagnol, muchu facile anche per un muchachu. Il mariniero parlar il sabir non solo in Algieri ma in tutto portao straniero. Ma doppo 1830, il genti star imparato fransis, presto scordato il sabir. Ellu star lasciato giusto qualche parola in l'arab del mariniero, come in Dellys "timpu" (il tempo bello). Per ancora imparar, andar a A Glossary of Lingua Franca, di Alan Corré.

Sunday, October 27, 2013

CORVAM, Ghomara recording

I was happy to learn of a new, if still rather small, corpus of audio files for North Africa: CORVAM. There is a good deal of Moroccan Arabic and a little Tunisian and Libyan Arabic, but the most exciting recording from my perspective is a short one of Ghomara Berber (a variety spoken in northern Morocco, very interesting both for Berber historical linguistics and for general language contact, previously discussed here: Berber words in Roman times, and Ghomara Berber material). It makes a nice complement to the much older SemArch, for Semitic languages.

Of course, these days you can find a surprising range of recordings just from YouTube. For example, several interviews in the Berber variety of the Blida Atlas south of Algiers; a rap song in Tunisian Berber; an interview in Libyan Berber (Yefren). But those don't come with transcriptions, much less translations...

Sunday, October 20, 2013

Language policy and Islam: what should have been said?

Following up on my last post, what should a chapter on "Language policy and Islam" have looked like? It's not exactly my field, but here are a few basic notes – a more complete version would have to cite specific rulings from the major madhhabs, and discuss more extensively the realisation of these ideas in everyday practice, but this should give a general idea.

First of all, insofar as we can speak of Islam as having a formal language policy at all, that policy would be defined by the extensive body of jurisprudence on which languages may or must be used in particular religious contexts. Ṣalāt, ritual prayer, has to be in Arabic (Mawdudi 1957 notes a few arguable exceptions to this). Duʕā', asking favours of God, may be in any language. The adhān, the call to prayer, has to be in Arabic according to most scholars, although Atatürk briefly forced Turkish mosques to make it in Turkish (Atalay 2012). For the khuṭbah, the Friday sermon, scholars' opinions differ – to keep on the safe side, it's common for the imam to deliver a sermon in the congregation's language followed by a much shorter sermon in Arabic. The Qur'ān may be translated, and since early times frequently has been, but no translation of it can be considered authoritative, or substituted for the original in ritual contexts; in fact, such translations are viewed more as commentaries than as versions of the original. Everyday religious formulae – bismillah (in the name of God), alhamdulillah (thank God), inshallah (if God wills), etc – are ordinarily in Arabic, though I don't know what the jurists have to say about that.

As a result, the ordinary believer is commonly exposed to Arabic in religious contexts, and is individually required to memorise a certain number of formulae and chapters of the Qur'ān in Arabic. Quite frequently, the latter in particular are learnt by heart early with only cursory explanation of their meaning, since reciting them verbatim is a precondition for proper prayer, but understanding them is only really vital at a more advanced stage. What does need to be understood immediately – the basic religious obligations, creed, etc – is explained in a language the student understands. However, the further a student advances, the more important it becomes to have direct access to the original source texts; thus learning Classical Arabic is a basic prerequisite for becoming a serious religious scholar, although the vast majority of Muslims never get that far, and indeed a majority of Muslims do not speak Arabic. Regionally, other languages may also come to assume a secondary position in religious education – for example, Urdu in Pakistan, even though most students there have a different first language. A remarkable example of this is to be found in northeastern Nigeria, where advanced religious education requires mastering not just Classical Arabic but also Classical Kanembu, an extremely archaic variety of Kanembu currently used only for explaining Classical Arabic texts (Bondarev & Tijani 2013).

Interpreting the notion of "language policy" more broadly, one might also talk about the influence of Islam on attitudes to language. In this connection, the obvious point to discuss would be the (very weakly supported) claim commonly heard that "Arabic is the language of Paradise", and the even more obviously fabricated claim sometimes heard east of Iraq that "Arabic and Persian are the languages of Paradise". Yet the weakness of the religious evidence for both assertions is a strong indication that the causality is the other way around: religious positions on language, in Islam as elsewhere, have often been influenced by extra-religious prejudices. The universal consensus that some Islamic rituals must be performed in Arabic makes it difficult for any Islamic society to assert strongly negative attitudes to Arabic, but beyond that minimum, language attitudes are determined more by social and political factors than by Islam specifically.

Friday, October 18, 2013

How not to write about "Islam and Qur'anic Arabic"

(Attention conservation notice: This post is probably only of interest if you're reading The Cambridge Handbook of Language Policy.)

A title like The Cambridge Handbook of Language Policy carries a reassuring message of solid reliability. The first chapter I happened to open it to, however, rather belies this reputation: "Language policy and religion", by Christina Bratt Paulston and Jonathan M. Watt. I'm sure its authors have plenty of expertise in, respectively, sociolinguistics and Biblical Studies. Unfortunately, they decided to pick a case study to which their expertise very clearly does not extend: "Islam and Qur'anic Arabic". This produced some rather serious misapprehensions, of which I'll explain the worst here for the benefit of any readers of the article.

"Presumably the existence of Allah and Jehovah are considered mutually exclusive by their believers" (p. 340) is self-evidently absurd. Muslims necessarily believe that they worship the God worshipped by Abraham and Moses, and that there is no other God. The Qur'an instructs Muslims to tell Christians that "our God and your God is one", and Arabic-language Bibles or Torahs call "Jehovah" Allah. (Malaysia's bizarre and unjust recent court decision to ban non-Muslims there from calling God "Allah" might suggest otherwise, but as far as I can tell, no one involved is claiming that Jehovah is a different entity from Allah; rather – as far as I can reconstruct their tortured reasoning from the brief sound-bites in the news – they're claiming that, at least in Malay, the word "Allah" ought to be exclusive to Muslims.)

"The insistence that the sacred book was transmitted from heaven in this language, and none other, appears never to have been challenged from within this religion" (p. 341). Obviously, the Qur'ān got here in the language that it's written in (unless you subscribe to the philologically untenable fantasies of Luxenberg). But the Qur'ān is not the only book which Islam acknowledges as a divine revelation - just the last, and the only one considered to have been preserved in its original form up to the present. And the Qur'an is rather explicit regarding the language of previous prophets: "We have not sent any Messenger except with the language of his people so he can make things clear to them". As the great 11th-century jurist and writer Ibn Ḥazm put it: "This means that Allah’s words and revelations were sent down in every language. He sent the Torah, the Gospel, and the Psalms. He spoke to Moses in Hebrew. He sent the Scrolls to Abraham in Syriac. Therefore, languages are equal in this regard."

"[The Qur'an] is an unflinching sequence of pronouncements, blessings, commendations, condemnations and exhortations: absent are narrative tales, devotional songs and meandering reflections" (p. 346). I can't see how this sentence could have been written by anyone who had actually read the Qur'ān, which is full of narrative tales and includes a good deal of reflection. (Not singing, of course, but anyone who has heard the Qur'ān recited will understand how it might take the place of "devotional songs".)

"The very name of Islam's book means 'that which is recited' or 'the collected things', and, as Cooper (1985: 55) notes, it shows a preference for the Qurayish (sic) tribal dialect" (p. 342): Qur'an could be rendered as "recitation", but has nothing to do with "the collected things", much less with the dialect of Quraysh.

"formal public readers of the Qur'an are clerics, never laymen" (p. 343): actually, in Islam there's no hard and fast dividing line between "clerics" and "laymen" in the first place. Any Muslim can and often does lead public prayers (which include the recitation of parts of the Qur'ān). Admittedly, the more public the setting, the stronger the preference for people who have memorised the whole book and studied its meaning and pronunciation in detail – I suppose you might call them "clerics", if you want to ignore the fact that they don't necessarily have any formal role at the mosque, and as likely as not have day jobs.

The presence of such errors, and more pervasively of strange gaps and perspective problems, becomes more understandable when you take a look at the references. In the whole section, only six works are cited on Islam and Qur'anic Arabic, apart from translations of the Qur'ān: Abdalati 1975 (Islam in Focus, an introduction to Islam for the general reader); Cooper 1985 (Ishmael My Brother, an elementary introduction to Islam for Christians); W. M. Watt 1968 (What is Islam?, an academic introduction to Islam); Ibn Warraq 1995 (Why I Am Not A Muslim); Rippin and Knappert 1990 (Textual Sources for the Study of Islam); and Speight 1989 (God is One, another introduction to Islam for Christians). That makes four beginners' introductions, one polemic, and one scholarly sourcebook. This is a reference list fit for a first-year undergrad's essay, not a published academic article.

If you haven't read this article, you're not missing anything. But if you find it on a reading list, consider forwarding this to whoever assigned it to you.

Monday, October 14, 2013

A little mystery: an unidentified Indic language in the Genizah collection

In 1896, Cambridge bought a huge archive of documents from a synagogue in Cairo, some dating back as far as the 11th century: the Genizah collection. Most of them are in Arabic in the Hebrew script - or just in Hebrew - but the rest cover a wide variety of languages. One of them should be an interesting puzzle for any readers familiar with South Asian languages: the fragment below is obviously in Devanagari or some derivative, but so far no one has been able to determine what language it is written in or what it says. Given the trade connections revealed by the letters, it would probably have come from Kerala, or maybe later on Bombay, but there are no guarantees...

The image is from T-S AS 159.248, T-S AS 159.247: an unidentified Indian language; see there for two other similar fragments.

Any ideas?

Friday, September 27, 2013

Minkaohan - a Chinese word Algerians need

I read a fascinating and depressing article recently (The Strangers) - in which a linguist plays a lead role - about the worsening situation in Xinjiang. The author makes comparisons to Algeria at one point, but not for the following, which will surely strike a chord in anyone familiar with North African educational policy:
"Among the Uighur, however, the policy has created two distinct groups: the minkaohan, minorities educated in Mandarin, and the minkaomin, educated in their own language. Minkaomin education is not taken seriously by non-Uighur employers, and not speaking Mandarin shuts minkaomin graduates out of jobs. In turn, they often resent minkaohan students as opportunistic and unfaithful to their own heritage."
There seems to be a fair amount of scholarship on this issue, judging from a quick skim. The minkaohan have been analysed as a "hybrid identity", sometimes feeling a "sense of shame regarding their ethnic background" and often seen by their minkaomin peers as irreligious or potentially disloyal - but, of course, ambitious parents who want their children to be middle-class often see minkaohan education as the only way forward. Chinese is required for university, although 82% of Uyghur adults can't read Chinese, and students often have difficulty adjusting to the Chinese-speaking world of the university.

Sounds like a remarkably effective way to exacerbate social tensions, right? The irony is that, in North Africa, both governments and employers expanded or even created a very similar system after independence!

It hasn't escaped the Chinese government's notice that this is problematic, so they're addressing the problem by cutting way down on Uyghur teaching, in the hope of eventually making everyone "minkaohan": "'bilingual' classes in many areas have already developed from using Mandarin to teach math, physics, and chemistry to the new model of using Mandarin for all classes except for mother-tongue [language arts] classes." North Africa hasn't quite reached that second stage for Arabic, I'm glad to say - although that's actually the best it's ever managed for Berber - but that "solution" does have some proponents.

It's often been noted that Chinese has contributed surprisingly few loans to English. I think I'd nominate "minkaomin" and "minkaohan" for borrowing: they have no commonly used English equivalent, and are relevant to describing post-colonial situations in many countries.

Sunday, September 22, 2013

Adposition borrowing at SLE 2013

The Societas Linguistica Europaea's annual conference finished today. The plethora of parallel sessions forced me to miss a lot of potentially interesting talks, but here are some highlights from the workshop I was participating in: adposition borrowing. This workshop was organised around a generalisation proposed by Edith Moravcsik 25 years ago, which has held up remarkably well (better than probably any other structure-based generalisation proposed about language contact):
"A lexical item that is of the 'grammatical' type (which type includes at least conjunctions and adpositions) cannot be included in the set of properties borrowed from a language unless the rule that determines its linear order with respect to its head is also borrowed." (source)
Eitan Grossman presented a number of apparent counterexamples – in fact, he reported that fully one-third of his sample of languages with borrowed adpositions displayed counterexamples. His effort to systematically test the hypothesis is laudable. However, the results cannot be taken at face value. Many examples, on closer examination, turn out to be amenable to one of three alternative explanations:
  1. The adposition was originally borrowed as a preposition, and turned into a postposition in the course of a more general typological realignment of the language. (This applies to Sri Lanka Malay dative nang, ultimately from a Javanese preposition; Authier et al. presented a new example, an apudlocative preposition possibly borrowed from Tatic into Georgian: Tatic (b)-tan N > old Georgian tan-a N > modern Georgian N-tan.)
  2. The source language order is not necessarily as postulated. (Thus the Khorasan Turkish postposition is assumed to be from Persian, in which it is a preposition, but could also derive from neighbouring Mazanderani, in which it is a postposition.)
  3. The "adposition" is also used without a complement in the source language (eg as a noun or adverb), and hence was not necessarily borrowed as an adposition. (This applies, for instance, to the Brahui postposition savā "without", connected at some remove to Persian سوا sevā "separate, other", or to the Manambu postposition wantaim "with", from Tok Pisin wantaim "together (adv.) / with (prep.)". In some cases the adverb is unambiguously the source, for instance Turkish raǧmen "despite", from the Arabic adverb raghman رغما rather than the preposition raghma رغم.)
1 and 2 merely illustrate the need for in-depth historical linguistic investigation of each case, which should go without saying. 3, however, is more interesting in principle. If an adposition can readily occur as a noun or adverb, are we justified in classing it as "of the 'grammatical' type"? The answer I gave in my presentation, before discussing the borrowing of adpositions in Northern Songhay, was: no. Not all adpositions are functional, as various authors have been pointing out since at least 1990, and we should not expect the generalisation to apply to lexical adpositions. In fact, we need at least a three-way distinction (cp. Littlefield 2005): purely functional adpositions such as of, in, to; purely lexical items used in complex adpositions such as front, back, middle (Svenonius's (2010) "axial parts"); and mixed items which simultaneously express the meanings of both a nominal/adverbial stem and a functional adposition governing it, such as beside (by the side of), inside (on the inside of). Functional adpositions should be subject to Moravcsik's generalisation; mixed items should be able to go both ways; and lexical items should be subject to the recipient language grammar alone. This proposal appears to eliminate all the few genuine exceptions to Moravcsik's generalisation so far proposed; however, it remains to be seen whether this criterion can be defined unambiguously for all adpositions in all languages.

Petros Karatsareas gave a nice summary of the situation in Cappadocian Greek (cf. Dawkins 1916), which has taken advantage of Greek's word order flexibility to move a long way towards developing postpositions; relational nouns which in Medieval Greek normally preceded their complement came to obligatorily follow it, yielding circumpositions (governing the genitive) whose prepositional component then became optional. This strategy was in turn used for borrowing Turkish adpositions.

Riho Grünthal pointed out the striking rarity of borrowed prepositions in Finno-Ugric, even in languages such as Finnish or Saami that (as a result of contact) have developed prepositions. This seems to confirm a point that I had also made in regard to Northern Songhay: that it's much easier to borrow adpositions when they have the same syntax in the source and target languages. He did find one or two cases, though, notably Livonian pa, from Latvian. Brigitte Pakendorf showed that Even borrows a fair number of Yakut postpositions (with varying degrees of acceptance among speakers), but no Russian prepositions, which at first sight seems to confirm the role of congruence even more. However, it's also true that Yakut has influenced Even much longer than Russian has.

Edith Moravcsik herself finally gave a summing-up address, in the form of an outline of relevant factors that need to be considered in the typology of adposition/case marker borrowing, with allusions to the talks given; she didn't focus particularly on her original generalisation, and she gave the impression of seeing it as being only statistically true in light of the proposed counterexamples.

I won't go into detail on the contributions that did not directly address Moravcsik's generalisation here, since this is already getting too long for a blog post, but some were also very interesting. Notably, Bakker and Hekking revealed that, whereas Quechua and Guarani make little use of Spanish adpositions, Otomí has massively adopted them – probably because Otomí, unlike the other two, had no morphemes serving such a function before contact, leaving it to context.

Much work remains to be done on the topic. Do you know any prepositions that have been borrowed as postpositions, or vice versa?

Friday, September 13, 2013

Anachronistic Arabic in Algeria

In general, I tend to think that conflating Modern Standard Arabic with Classical Arabic is fairly harmless, since they differ far less from each other than from any spoken dialect. However, occasionally that conflation can lead people really badly astray. The following sentence, which I was shocked to read in "The Language Planning Situation in Algeria" (Benrabah, 2007, in Language Planning and Policy in Africa), is a perfect example:
"For example, [in Algerian Arabic] common Arabic words such as mekteb ("office"), tawila ("table"), mistara ("ruler"), and siyara ("car") were replaced by their French counterpart pronounced [biro], [tabla], [rigla], [tomobil] respectively." (p. 49)
The automobile was invented in 1886, 56 years after the French conquered Algiers - and the word sayyārah سيارة wasn't proposed to describe it until 1892, by the Egyptian Ahmad Zaki Pasha. There was no pre-existing Arabic word in Algeria for ṭumubil to replace. A quick look at a dictionary of Algerian Arabic from 1838 reveals that the word ṭabla طابلة was already being used for (tall) tables then, so there's no reason to assume it came from French rather than some other Romance language (it's attested in Andalusi Arabic as ṭablah طبلة "table"). More to the point, Standard Arabic ṭāwilah طاولة is not to be found in pre-modern Arabic dictionaries, and in fact is a later borrowing into Egyptian Arabic of Italian tavolo. There is no reason to suppose that it ever existed in the Arabic of Algeria. Only the other two are real cases of replacement, and not precisely from the Modern Standard Arabic forms either: the 1838 dictionary gives "m'sèteur" مسطر for "ruler", and "makhzenn" مخزن for "office".

Algerians often assume a dialectal word is non-Arabic when in reality it's easily found in the classical dictionaries, simply because it's fallen into disuse in Modern Standard Arabic (for an egregious example, see my post Les Algériens qui ont oublié les dictionnaires de leurs ancêtres). Cases like this one illustrate that the converse is also true: we tend to assume that at some ill-defined point in the past Algerians were speaking to each other in the Arabic we learned at school, and forget that Modern Standard Arabic includes many words and expressions that were invented within the past century.

Friday, September 06, 2013

Y-chromosomes and language shift in North Africa

The other day I finally came across an easy-to-follow comparative presentation of North African genetic data, on Wikipedia of all things: Y-DNA haplogroups by populations of North Africa. I'm no geneticist, and welcome input from better-informed readers, but here's what that data looks like at first glance to a historical linguist.

As you might know, a man gets his Y-chromosome exclusively from his father (his mother doesn't have one). In North Africa, your ethnic/tribal/familial/etc identity – an important predictor of your language – is likewise traditionally supposed to be inherited from your father, not your mother. So it's illuminating to compare them.

A haplotype called E-M81 (or E1b1b, E3b) is frequent in Northwest Africa, and is held by large majorities of the Berber-speaking populations examined in Morocco or in the western/central Sahara; it is much less frequent in the Middle East. It seems reasonable to associate this haplotype with the spread of Berber. By contrast, haplotype J1 is very frequent in the Arabian Peninsula, but gets rarer and rarer as you go west; it seems reasonable to associate this haplotype with the Arab expansion. (Neither Berbers nor Arabs were ever completely homogeneous, so other, less frequent haplotypes may also be associated with one or the other of these events.)

The table gives four Algerian populations: Oran, Algiers, Tizi-Ouzou (Kabyle), and Mozabites. Mozabites, as might be expected, have a really high frequency of E-M81 (87%) and a really low frequency of J1 (1.5%). The other three, however, all have about 45% E-M81 (45%, 43%, 47% respectively) – in terms of the frequency of this presumably Berber marker, there is almost no difference between the Arabic speakers of Algiers and Oran and the Berber speakers of Tizi-Ouzou. In terms of the frequency of the originally Arab J1, the difference is hardly greater – 23% in Oran and Algiers vs. 16% in Tizi-Ouzou. Since we aren't sure about the historical interpretation of the rest of the haplotypes found, it may be more useful to consider the ratios of "Berber" E-M81 to "Arab" J1: roughly 2:1 for Oran and Algiers vs. 3:1 for Tizi-Ouzou (and 58:1 for Mozabites).
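For anyone who wants to check the arithmetic, the ratios are just one percentage divided by the other. Here is a minimal sketch in Python; the names `frequencies` and `berber_to_arab_ratio` are purely illustrative, and the inputs are only the rounded percentages quoted in the paragraph above, not the underlying sample counts.

```python
# Rounded haplotype frequencies (in %) as quoted in the text above.
# The dictionary layout and all names are illustrative assumptions.
frequencies = {
    "Oran": {"E-M81": 45.0, "J1": 23.0},
    "Algiers": {"E-M81": 43.0, "J1": 23.0},
    "Tizi-Ouzou": {"E-M81": 47.0, "J1": 16.0},
    "Mozabites": {"E-M81": 87.0, "J1": 1.5},
}


def berber_to_arab_ratio(population):
    """Return the E-M81 frequency divided by the J1 frequency."""
    return population["E-M81"] / population["J1"]


# Compute the ratio for each population and print it as "x:1".
ratios = {name: berber_to_arab_ratio(freqs) for name, freqs in frequencies.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}:1")
```

A J1 frequency as low as 1.5% makes the Mozabite ratio very sensitive to small changes in the denominator, so that figure is best read as "very high" rather than as a precise number.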

What this tentatively tells us, in brief, is that:

  • In Algeria, plenty of Berber fathers adopted Arabic; if you are an Arabic speaker, you're very likely patrilineally Berber. (No surprise there!)
  • In Kabylie, a fair number of Arab fathers adopted Berber; if you are a Kabyle speaker, you may well nonetheless be patrilineally Arab. (Many readers will be surprised by this, but they shouldn't be: read about the history of the Sebaou valley in and after the Turkish period sometime, for example, let alone the more controversial example of the maraboutic families.)
  • Arabic was more likely to be adopted where more Arabs had come in, even though genetically, Arabs remained a minority. (In other words, Arabisation wasn't just about language shift.)
  • It's really rare for an outsider man to become Mozabite. (No surprise there either.)
A slightly different language shift situation is indicated by the comparison of Arab and Berber groups on Djerba (southern Tunisia). They do indeed differ on the frequency of J1 – the "Arabs" have it at 8.7%, while the Berbers have none at all. The Arabic speakers of Djerba appear to be genetically less Arab than the Kabyle speakers of Tizi-Ouzou! But, more importantly, we have what looks like a classic case of elite-led language shift: in this case, unlike Kabylie, the groups that incorporated Arab men simply ended up considering themselves Arab, while the ones that didn't stayed Berber. (I almost said kept speaking Berber, but actually, many Berber speakers of Djerba have been shifting to Arabic.)

Finally, one Berber-speaking population stands out radically in this table: Siwa. There is no significant presence of E-M81 there, and not much J1 either. The haplotypes best represented there are R1b – usually associated with Western Europe and, for some reason, with Chadic speakers – and B2a1a, usually associated with central and eastern sub-Saharan Africa. R1b has a reasonable frequency in Kabylie and Niger Tuareg, and to a lesser extent in Egypt, so we might suppose that it reflects the oasis' Berber roots, or that it reflects immigration from the east; we'd need non-Tuareg Libyan Berber genetic data to test that hypothesis. B, however, isn't common anywhere else in North Africa; does it derive from the slave trade, or from some older population of the region? Again, I think more data from Libya will be needed to make sense of this.