Friday, September 06, 2013

Y-chromosomes and language shift in North Africa

The other day I finally came across an easy-to-follow comparative presentation of North African genetic data, on Wikipedia of all things: Y-DNA haplogroups by populations of North Africa. I'm no geneticist, and welcome input from better-informed readers, but here's what that data looks like at first glance to a historical linguist.

As you might know, a man gets his Y-chromosome exclusively from his father (his mother doesn't have one). In North Africa, your ethnic/tribal/familial/etc identity – an important predictor of your language – is likewise traditionally supposed to be inherited from your father, not your mother. So it's illuminating to compare them.

A haplotype called E-M81 (or E1b1b, E3b) is frequent in Northwest Africa, and is held by large majorities of the Berber-speaking populations examined in Morocco or in the western/central Sahara; it is much less frequent in the Middle East. It seems reasonable to associate this haplotype with the spread of Berber. By contrast, haplotype J1 is very frequent in the Arabian Peninsula, but gets rarer and rarer as you go west; it seems reasonable to associate this haplotype with the Arab expansion. (Neither Berbers nor Arabs were ever completely homogeneous, so other, less frequent haplotypes may also be associated with one or the other of these events.)

The table gives four Algerian populations: Oran, Algiers, Tizi-Ouzou (Kabyle), and Mozabites. Mozabites, as might be expected, have a really high frequency of E-M81 (87%) and a really low frequency of J1 (1.5%). The other three, however, all have about 45% E-M81 (45%, 43%, 47% respectively) – in terms of the frequency of this presumably Berber marker, there is almost no difference between the Arabic speakers of Algiers and Oran and the Berber speakers of Tizi-Ouzou. In terms of the frequency of the originally Arab J1, the difference is hardly greater – 23% in Oran and Algiers vs. 16% in Tizi-Ouzou. Since we aren't sure about the historical interpretation of the rest of the haplotypes found, it may be more useful to consider the ratios of "Berber" E-M81 to "Arab" J1: 2:1 for Oran and Algiers vs. 3:1 for Tizi-Ouzou (and roughly 58:1 for Mozabites).
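For anyone who wants to check the arithmetic, here is a minimal sketch in Python that recomputes the E-M81 to J1 ratios from the percentages quoted above. The dictionary layout and the function name `berber_arab_ratio` are just illustrative choices of mine, not anything taken from the Wikipedia table; with the figures as quoted, Oran and Algiers come out near 2:1, Tizi-Ouzou near 3:1, and the Mozabites (87% against 1.5%) at about 58:1.

```python
# Percentages as quoted in the post; the structure and names here are
# illustrative assumptions, not part of the source table.
FREQUENCIES = {
    "Oran": {"E-M81": 45.0, "J1": 23.0},
    "Algiers": {"E-M81": 43.0, "J1": 23.0},
    "Tizi-Ouzou": {"E-M81": 47.0, "J1": 16.0},
    "Mozabites": {"E-M81": 87.0, "J1": 1.5},
}


def berber_arab_ratio(freqs):
    """Return the ratio of the E-M81 frequency to the J1 frequency."""
    return freqs["E-M81"] / freqs["J1"]


# Print each population's ratio in "x:1" form, to one decimal place.
for name, freqs in FREQUENCIES.items():
    print(f"{name}: {berber_arab_ratio(freqs):.1f}:1")
```

A ratio like this only compares the two markers with each other; it says nothing about the other haplotypes in each sample, which is exactly why it is a rough guide rather than a measurement of ancestry.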

What this tentatively tells us, in brief, is that:

  • In Algeria, plenty of Berber fathers adopted Arabic; if you are an Arabic speaker, you're very likely patrilineally Berber. (No surprise there!)
  • In Kabylie, a fair number of Arab fathers adopted Berber; if you are a Kabyle speaker, you may well nonetheless be patrilineally Arab. (Many readers will be surprised by this, but they shouldn't be: read about the history of the Sebaou valley in and after the Turkish period sometime, for example, let alone the more controversial example of the maraboutic families.)
  • Arabic was more likely to be adopted where more Arabs had come in, even though genetically, Arabs remained a minority. (In other words, Arabisation wasn't just about language shift.)
  • It's really rare for an outsider man to become Mozabite. (No surprise there either.)

A slightly different language shift situation is indicated by the comparison of Arab and Berber groups on Djerba (southern Tunisia). They do indeed differ on the frequency of J1 – the "Arabs" have it at 8.7%, while the Berbers have none at all. The Arabic speakers of Djerba appear to be genetically less Arab than the Kabyle speakers of Tizi-Ouzou! But, more importantly, we have what looks like a classic case of elite-led language shift: in this case, unlike Kabylie, the groups that incorporated Arab men simply ended up considering themselves Arab, while the ones that didn't stayed Berber. (I almost said kept speaking Berber, but actually, many Berber speakers of Djerba have been shifting to Arabic.)

Finally, one Berber-speaking population stands out radically in this table: Siwa. There is no significant presence of E-M81 there, and not much J1 either. The haplotypes best represented there are R1b – usually associated with Western Europe and, for some reason, with Chadic speakers – and B2a1a, usually associated with central and eastern sub-Saharan Africa. R1b has a reasonable frequency in Kabylie and Niger Tuareg, and to a lesser extent in Egypt, so we might suppose that it reflects the oasis' Berber roots, or that it reflects immigration from the east; we'd need non-Tuareg Libyan Berber genetic data to test that hypothesis. B, however, isn't common anywhere else in North Africa; does it derive from the slave trade, or from some older population of the region? Again, I think more data from Libya will be needed to make sense of this.


John Cowan said...

Might the R1b be from English and German soldiers during the World Wars? Siwa was apparently occupied by both.

Lameen Souag الأمين سواق said...

A little of it maybe, but it's very unlikely that that can account for a frequency of 28% – particularly since contemporary reports indicate that, then as now, Siwi women were extremely secluded. Southern Egypt and Kabylie both have R1b frequencies of about 15%, and the very small medieval population of Siwa meant a particularly successful man at that time could easily make a disproportionate contribution to the gene pool.

Firmus said...

The R1b from Siwa and the R1b from Tizi Wezzu are not the same. The first is characteristic of Chadic speakers, and the second is typical of Iberians (might it be a remnant of the Iberomaurusians?). Also, the sample from Tizi Wezzu lacks J2, which is a typical Arabian lineage (it goes up to 85% in Yemen and is consistently high among Bedouins, i.e. "pure" Arabs).
For the record, J1 was also found (16.7%) among Guanche remains (mummies), which never knew any Arabic influence.

However, we cannot yet draw definitive conclusions about genetics in North Africa, as there are still too few samples to build a comprehensive picture.
But I very much doubt the existence of Berberized Arabs outside of some local exceptions; it is always Berbers who adopt Arabic rather than the opposite. Actually, Arabizing a whole Berber region can take as little as three or four generations. Look at Setif, Oum El-Bouaghi and even Batna, or the Moroccan far east.

Lameen Souag الأمين سواق said...

You may be right about R1b; I couldn't find any information on the specific variant found in Siwa. But as for J, actually, it's J1 rather than J2 that is characteristic of Arabs; J2 is most frequent in the Fertile Crescent and the Caucasus, whereas J1 peaks in Arabia proper. Still, if J1 was found among Guanches, then that does complicate the picture - and makes it odd that J1 should be so rare among Moroccan Berbers...

Even today, in towns like Tizi-Ouzou and Bejaia, you can find Arabic speakers who have adopted Berber. Doutté and Gauthier too noted such cases among the Amraoua back in the early 20th century. If that can still happen occasionally even in the modern era, with Berber in retreat overall, then it must have been all the more likely in earlier periods – and that's exactly what oral tradition tells us.

Lameen Souag الأمين سواق said...

One obvious explanation for the presence of J1 among Guanches would be a Phoenician contribution – the ages of the mummies are consistent with that. Of course, that still complicates the interpretation of J1 in North Africa itself, but it would fit well with its rarity in Morocco.

S.Haouchine said...

Hello Lameen, it's still Firmus.

You're right, I can testify to it, as I am from Tizi :D. The Amraouas were indeed "Kabylized" in the 20th century, but only east of Tizi Wezzu. Towns (Tawerga, Sidi Naamane or even Redjwana) are Kabylizing nowadays, but the immediate surroundings of Tizi Wezzu (like Bukhalfa, Tala 3atman, most of DBK) kept Arabic because they have traditional links with the Deshra, as we call it. On the other hand, the lower side of Tizi Wezzu (Nouvelle-Ville, Buaziz, etc.) is predominantly Kabyle-speaking, while the inhabitants of downtown and a bit of uptown are truly bilingual. I am from one of these "natively" bilingual families.

If you want more linguistic information about the town/region, feel free to ask further questions.

Anonymous said...

The impression I get, very much from being an outsider, is that J1 is associated with Semitic expansion into the Maghreb, rather than Arabic specifically (I myself am J1, and of non-Arabic (Jewish) Middle Eastern ancestry); so might it not be the case that some of the J1 gene lines come from the Phoenician expansion rather than the Arabic?

Lyamin Benshrif said...

There is a more meaningful genetic study of the region than Wikipedia (although I can't seem to retrieve it). It statistically analysed genetic data and came to the conclusion that the spread of "non-North African" haplogroups (in particular J1/J2) in the region happened during the Neolithic. This seems to lend credit to the theory that some Berbers are of Middle Eastern origin (likely the Zenata), whilst others may have come from the Horn of Africa (the Senhaja).
The study also came to the conclusion that North Africa was populated by a group of fewer than 1,000, whose arrival corresponds mostly to the Neolithic period.

As far as I know, Zenata and Senhaja may have settled in the region in that period, whilst the Masmuda (found only in Morocco) may have come a little earlier, likely from Iberia. The 3 groups then homogenised through thousands of years of cohabitation, to form a common culture and language which we know today as Berber.

In all, the spread of Arabs has little to do with the presence of J1, IMO, since they were so outnumbered by the Berbers. Ibn Khaldun reports the numbers of Hilalians/Banu Sulaim as less than 3,000, most of whom remained in Libya and southern Tunisia. The Maqil were less than 300. The original Muslim armies that conquered the region in the 8th century had only a few hundred Arab knights leading an army of non-Arab Mawalis, most of whom went back to the ME. Some Persian Mawalis, fleeing persecution in Baghdad/Damascus, would form an aristocracy in cities like Fes/Ceuta and in al-Andalus, and most likely pretended to be Arab.

My theory is that the region was actually arabized from the North (not the East) following the cultural influence of the Andalus, a process which increased after the fall of the most populous part of this region (al-Gharb, Sevilla, Cordoba) in 1248. The resulting numerous refugees gradually arabized the region especially Morocco & the coastal parts.

Anonymous said...

Could the Siwa datum be merely reflecting a founder effect? In other words, could it be that it was founded by a small population (with maybe just a couple of men) which happened to have the R1b marker? I would expect other small isolated communities to exhibit more genetic diversity for the same reason.

Regarding the Guanche, what's the current thought on the linguistic position of the language within Berber?

Lameen Souag الأمين سواق said...

Firmus: Thanks for the details! I didn't realise they used to speak Arabic in Tawerga.

Lethargic-man: Quite possibly; in fact, a little of it might even be Jewish, given the region's history. I wonder if there's a way to unravel that?

Benshrif: If you have a reference for that paper I'd be glad to see it, but my understanding is that dating methods in genetics are so far a long way from being reliable. In any case, J1 is practically absent from all the Zenata samples in that table based on the results available, so if Middle Eastern ancestry is to be associated with any subgroup on the basis of that haplotype, it would be the Senhaja.

As for the small numbers of Arabs, I don't find such figures very credible; it should take more than 3,000 people to defeat a state with the resources of all Tunisia at its disposal. Andalusi refugees played a role in Arabization, especially in northern Morocco, but comparing Maltese to modern Tunisian Arabic makes it easy to tell that they didn't Arabize Tunisia, much less Libya. And that just pushes the question back a step: if there were practically no Arabs for them to learn Arabic from, how did the Andalusis end up speaking Arabic? In fact, the only Andalusi sample in the table – from Zaghouan in Tunisia – has the highest percentage of J1 (44%) of all the North African samples, suggesting that Arab immigration wasn't so negligible after all.

Anon: Yes, a founder effect is probably at work. As for Guanche, as Galand reminded us recently, we don't even know for sure that it is Berber; all we know is that it contains a number of obviously Berber words, alongside a number of very basic words that look nothing like Berber. That could reflect common ancestry, or it could reflect later contact.

David Marjanović said...

a number of very basic words that look nothing like Berber

Now I'm intrigued. Reference, please? :-)

Lyamin Benshrif said...

AFAIK dating with statistical DNA analysis is adequately accurate (see the dating of Y-chromosomal Adam). I'll let you know as soon as I can retrieve that paper.

What is the state of linguistic research on Andalusi Arabic and its similarities with different accents in the region?

Based on the available Andalusi zajal, it seems that the Ghomara Arabic of northern Morocco is actually further from Andalusi than Fes/Marrakech Arabic is.
I know Tetuan and Chefchaouen were cities founded by refugees from Granada, but the most important wave of immigration happened after the fall of Cordoba; it often gets overlooked relative to the fall of Granada, which demographically wasn't as big a deal. Additionally, there were other waves of immigration during periods of famine and war.

I wouldn't give much weight to demographics in language shifts, as opposed to political/economic might and cultural influence. Just look at how dominant French remains in the Maghreb, decades (and soon centuries) after the French disengaged militarily. How many of those who speak it so spontaneously today in press/schools/media/business have French ancestry? Perhaps fewer than 1 in a thousand...

Lameen Souag الأمين سواق said...

David: try Galand's "Berberisch - Der Schlüssel zum Altkanarischen?" for some discussion of the problem.

Looking through Corriente's grammar of Andalusi Arabic, you'll find a number of peculiar features absent from the Maghreb: for example, phonemic stress, and a negator las < laysa. To expand on the point I made earlier, Malta was conquered by the Normans in 1127 - decades before the fall of Cordoba, and only a short time after the arrival of the Banu Hilal - yet what they speak is far closer to modern Tunisian Arabic than the Andalusi Arabic of Ibn Quzman or Pedro de Alcala was. Likewise, the 12th-century memoirs of al-Baydaq - also predating the fall of Cordoba - show that Moroccan Arabic already had many of the characteristics it has now. There was already an Arabic dialect spoken in North Africa proper before the fall of Cordoba and the arrival of Banu Hilal, although in most regions it was probably restricted to large towns.

Lyamin Benshrif said...

The Spanish studies are rather disappointing. They always fall victim to their nationalistic view of Hispanic Arabic. A particular example I find odd in the paper you linked is the author making Ceuta part of al-Andalus in the 12th century, when in fact the whole of al-Andalus was ruled from Marrakech at the time.

Their whole reasoning is flawed, starting with the assumption that Andalusi Arabic was necessarily different from Moroccan Arabic. What if the lack of Moroccan Arabic sources in that period is attributable to the fact that most of Morocco and western Algeria weren't Arabic-speaking at the time?
Additionally, differences inside al-Andalus were completely omitted (culturally Valencia =/= Cordoba =/= al-Gharb), as was the continuous reciprocal influence: the two shores were never actually disconnected, with tens of waves of immigration in both directions.

The linguistic examples are also unconvincing. Al-Baydaq, who like Ibn Tumart studied in Cordoba, probably learned Arabic there; what they see as spelling errors are merely different spelling conventions in comparison to Modern Standard Arabic. Replacement of hamza (ء) with ya' (ي) in the plural is found in Ibn Khaldun's writings and as far away as Egypt, and is by no means a Moroccan specificity. "las < lays" is also found in Moroccan zajal (Malhoun).

It is certain that Arabic was spoken in North Africa before the fall of Cordoba, although limited to the very few cities of the time, but Arabic spread quickly after the 13th century. If it wasn't the sudden influx of Andalusis, what triggered this sudden acceleration of the process?

Lameen Souag الأمين سواق said...

North Africa is rather bigger than Morocco. For Morocco and a few Algerian ports, Andalus no doubt played an important - though not exclusive - role in encouraging Arabisation. For Ifriqiya (Tunisia and northeastern Algeria), Arabic was already widespread before any Andalusi refugees arrived. And in most of Algeria (including all of western Algeria except Tlemcen and Nedroma), and all of Libya and Mauritania, people speak with a g, like Bedouins, instead of with a q, like both Andalus and Ifriqiya.

Incidentally, Corriente is a vocal critic of the nationalistic view of Spanish Arabic.

Anonymous said...

Genetic Structure of Algerian Populations
The paper is in English

Anonymous said...

Algerian population genetics

The link is here:

Algerian population genetics

Lameen Souag الأمين سواق said...

Thank you - that looks like a much more extensive study! Unfortunately it doesn't look at the Y-chromosome in particular, so its results aren't directly comparable.

Stellaritic said...

I like both population genetics and linguistics.
As far as I am concerned, the genetic genesis of Algerians is somewhat complicated; the presence of haplogroup J in Kabylia isn't necessarily due to the Arab presence, IMO.
I have participated in SMGF and had both my autosomal and Y-chromosome DNA tested by ftDNA.
As a Central Algerian from Laghouat whose father led a nomadic lifestyle for quite some time, I was expecting to be either an E-M81 or a J1 carrier; to my surprise, I turned out to be an E-m84 carrier, which is also found at moderate frequencies among Kabyles.
This has led me to think that my forefathers were of Berber ancestry rather than Southwest Asian (Bedouin).
Autosomally, I am no different from other Algerians, except that I am slightly Southwest Asian-shifted, presumably due to the recent Arab expansion.
It also implies that males in my lineage mated with the newcomers from the Arabian Peninsula, most likely due to proximity and similar (nomadic) lifestyles.
I drafted a spreadsheet based on Polako's K36 calculator:

E-m84 D cluster

Lameen Souag الأمين سواق said...

Interesting. There's still a lot to learn!

andrew said...

The R1b found in Chadic peoples like the Hausa is R1b-V88, which branches off from other kinds of R1b very close to the root of the R1b family tree. The R1b found in Western Europe (R1b-M269) is a highly derived variant of R1b, with many subsequent mutations relative to the R1b-V88 version that are shared by Western Europeans but not Africans.

R1b-V88 probably arrived at the time of Chadic ethnogenesis around 5700 BCE (which correlates with an archaeological culture around Lake Chad). The Siwi almost surely represent a language shift of a Chadic-speaking tribe to Berber, for reasons unknown.

Lameen Souag الأمين سواق said...

You're right about the R1b variants, but note that Niger Tuareg, Egypt, and the Mzab all have the "Chadic" variant rather than the European one found in Tizi-Ouzou. A direct Chadic input is thus not necessarily the only plausible explanation. We really do need more Libyan data to make sense of this.