Thursday, August 21, 2014

Survey for Algerian Arabic/French bilinguals

Although I generally write in English, I am sure this blog has some Algerian readers who are bilingual in Arabic and French. If you belong to this category, and have a few minutes to help an Algerian doctoral student at the University of Florida with her linguistic research, you can take this survey. I attach the letter I received.


We are conducting a study on Algerian Arabic/French bilinguals. If you would like to participate, please follow the link below. Rest assured that your answers will be anonymous.

If the link does not open, copy and paste it into your browser.

For those of you who would like to complete the survey in two sittings, do not submit your answers; simply leave the survey by closing your browser. Once you log in again, you can continue where you left off (the sentences may appear in a different order).

The survey has several lists of sentences. After completing and submitting the survey, you may (if you wish) follow the link again and complete another list. We would like to remind you that you must not complete the same list twice. If you are given the same list, please leave the survey.

You may forward this message to other bilingual Algerians, but please do not post it on Facebook.

Please try to complete the survey as soon as possible, before it closes.

Thank you for sharing your time and your ideas.


University of Florida

Monday, August 18, 2014

A South Arabian loan into Libyan Berber?

From Morocco to Oman, there is a long tradition of imagining that the Berbers of North Africa and the Mehris of South Arabia speak the same language. This is by no means confined to pan-Arab nationalists - Siwis have told me more than once that some friend of a friend had met non-Arabic-speaking Yemenis and understood their language, and I'm told many Mehris have the same belief. I've previously discussed some possible reasons for this belief, as well as the more obviously propagandistic claim that Berber descends from Arabic; both are false.

Nevertheless, it is true that significant numbers of Yemenis participated in the Arab migrations to North Africa during the Islamic era, and it's not inherently implausible that some should have brought their languages with them. In fact, I just came across what looks very much like a South Arabian loan into the northwestern Libyan Berber variety of Zuwara (At Willul).

In Zuwara, the usual word for "father" is baba, as in many other Berber varieties, but in a few collocations such as əg tíddart n ḥíbi-s "in her father's house", a different term ḥibi is substituted (Mitchell 2009:303, 341). This word is unlikely to be proto-Berber, since proto-Berber did not have a phoneme /ḥ/ and since the word is quite unusual within Berber. And as far as I know, it is not used anywhere in Arabic (although Libyan dialects are not that well documented). One could try to link it to ḥabīb-ī "my beloved", but that would be phonetically irregular and semantically unlikely, since this term is normally used in the context of romantic love or by parents addressing a child.

However, the normal word for "father" in Mehri is ḥīb - ḥayb-ī "my father", ḥīb-as "his father" (Watson 2012:149). In fact, the initial ḥ- is a prefix that Mehri adds to a number of kinship terms: ḥāmē "mother", ḥabrē "son", ḥabrīt "daughter" (ibid), as well as a number of other common nouns. Its function is to mark definiteness (ibid:64). But no such definite article has ever existed in Arabic or in Berber, so the only possible explanations for the similarity of Zuwara ḥibi to Mehri ḥīb are pure coincidence or borrowing from Mehri into Berber (perhaps via an Arabic dialect?). It will be interesting to see if other cases turn up.

And as long as I'm talking about Libyan Berber, I really ought to mention Marijn van Putten's new book A Grammar of Awjila Berber (see his announcement at Oriental Berber). This careful analysis of all the unfortunately limited data available on the very unusual Berber variety of Awjila, in the far east of Libya, is an important resource for Berber historical linguistics. I hope that things settle down in Libya soon enough to make a fuller description possible, but for the moment, this work appears unlikely to be superseded.

Saturday, August 09, 2014

Some minority languages of the Mosul Plain

For most of the past decade, while first the rest of Iraq and then Syria (150,000 dead, 2.5 million refugees) have burned, Northern Iraq has seemed like a relative oasis of calm. That has changed rather suddenly: with ISIS' religious persecution, and now American airstrikes, Northern Iraq and its minorities are suddenly prominent in the headlines. The headlines throw into sharp relief the region's status as perhaps the most religiously diverse place in the Middle East - but what they may not show is that this region is also a small-scale "residual zone" preserving rather more linguistic diversity than is typical for such a small area in the modern Fertile Crescent (not just Arabic and Kurdish!).

The most endangered language of the region is certainly Northeastern Neo-Aramaic (NENA), or Sûreth (ܣܘܪܝܬ). Once, Aramaic was the lingua franca of the Middle East, spoken in various dialects from Gaza to Basra, and written as far afield as China and India. By the early 20th century, it was restricted to a few hundred far-flung mountain villages; the largest dialect group, NENA, was centered on the Christian (Assyrian and Chaldean) villages of the Mosul Plain, such as Tel Kef (Telkepe) and Qaraqosh, and across the border in Iran and Turkey; a detailed map is available at Cambridge's NENA Database. Today, those who have stayed behind in ever harder conditions are substantially outnumbered by their diaspora in cities such as Detroit or Sydney, whose children increasingly just speak English - and, as of the past couple of days, media accounts suggest that fleeing refugees have left the Mosul Plain villages practically empty. Their exodus is rather reminiscent of what happened about a century ago: during the Armenian/Assyrian Genocide, the NENA-speaking Assyrians of Hakkari fled from Turkey never to return, taking refuge in Iraq and finally in Syria. It remains to be seen whether this exile will be as lasting as the previous one. If you're wondering how the language sounds, the NENA Database site has a number of recordings, some transcribed, such as The Story of the Cobbler; others can be heard at Semitisches Tonarchiv.

While Kurds prefer to consider Kurdish as one language, the two main Kurdish varieties of northern Iraq - Sorani and Kurmanji - are strikingly different from one another, and are usually considered as separate languages by academics. The smaller Gurani language (see DOBES), spoken in northwestern Iraq and also commonly labelled Kurdish, doesn't even belong to the same branch of Iranian as Sorani and Kurmanji. Many of its speakers belong to loosely Shia-affiliated minority religions, such as the Ahl-i Haqq and the Shabak, considered by ISIS as beyond the pale.

The other minority group unfortunate enough to have been pitched into the headlines, Yezidis, do not have a language of their own; they speak Kurmanji Kurdish. However, the Yezidis are associated with a unique writing system. In the early 20th century, manuscripts summarising Yezidi beliefs written in a unique alphabet (such as the Meshefa Resh "Black Scripture") came into the possession of Western researchers, and the alphabet in question duly found its way into compendia such as Diringer (1968). Later research, though, suggests that both these manuscripts and the alphabet they were written in were created for Western consumption, likely by a non-Yezidi bookseller, rather than representing a Yezidi tradition (Kreyenbroek and Rashow 2005, EI).

The region's Turkmen, many of whom have also apparently been persecuted by ISIS for their Shiism, speak a Turkic variety close to Turkish and Azeri. From what little information I've seen, it seems unlikely to qualify as a separate language, but does not seem to have attracted much research.

The Arabic dialects of northern Iraq - the so-called qeltu dialects, for their unique pronunciation of the word "I said" - are also quite interesting in their own right; the spoken Arabic dialect of Abbasid Baghdad seems likely to have belonged to this group. However, that is another story for another day...

Monday, July 14, 2014

Northern Songhay comparative wordlists

Linguistically, the northern and southern shores of the Sahara have remained surprisingly distinct, and most Saharan groups are easily identifiable as outposts of one or the other. Occasionally, however, a greater degree of language mixture is found. Nowhere is trans-Saharan language mixture more prominent than in Northern Songhay, a group of languages spoken in Niger, Mali, and Algeria that combine a Songhay base with an enormous Berber superstratum. The group includes Korandjé, a southwestern Algerian language I've been working on for a few years now.

Following an inquiry I recently received, I've been comparing Korandjé data to the Northern Songhay comparative wordlist in Rueck and Christiansen (1999). In the spirit of open data, you can view the wordlist (with a few remaining gaps to be filled) here: Korandjé 380-word list for Northern Songhay lexical comparison. Draft version, 14 July 2014. The results should be treated as provisional, since the Tasawaq part of this wordlist in particular appears a bit unreliable and since a few gaps remain in the Korandjé and even Tadaksahak lists, but are nevertheless interesting.

Counting cognates makes it very clear that Korandjé is the outlier, as might be expected based on geography:


The other three Northern Songhay varieties (treating Tagdal+Tabarog as one variety) form a linkage, which, following Wolff and Alidou's suggestion, we might label Azawagh Songhay - from west to east: Tadaksahak, Tagdal+Tabarog, then Tasawaq. On this wordlist Korandjé is clearly closest to Tasawaq, but that's only because Korandjé and Tasawaq have both kept more Songhay vocabulary, a fact irrelevant for subgrouping. The only innovation in vocabulary that Korandjé and Tasawaq share to the exclusion of the rest is the borrowing of numerals from 5 up from Arabic, and if you look at the sound correspondences it's clear that Tasawaq and Korandjé each borrowed their current numerals separately from different dialects of Arabic. Tadaksahak, Tagdal, and Tabarog all show almost the same number of items shared with Korandjé due to common borrowing from Berber, and most of that is due to shared borrowings of widespread Berber words that could easily have happened independently. The use of a Berber form originally meaning "weaver" for "spider" in Korandjé and Tadaksahak alone is striking, but very likely coincidental.

Another way to look at this is to note that 188 of the 332 items are shared across all of Azawagh Songhay, whereas only 108 are shared across all of Azawagh Songhay plus Korandjé. Of the latter, only 9 are Berber or Arabic loans, while 99 are Songhay retentions:

eye, ear, mouth, head, hair, neck, milk, belly, foot, hand, skin, blood, urine, liver, person, man, woman, owner, name, dog, cow, donkey, (venomous) snake, louse, meat, fat, stick, grass, rope, salt, pot, pit (hole), iron, fire, smoke, ashes, night, sun, day, yesterday, wind, water, stone, one, two, hot, cold, long, old, lots, red, black, white, dry, full, what, where, near, far, and, sit down, stand up, lie down, sleep, bite, eat, drink, suck, laugh, cry, see, hear, know, love, give, steal, hide, give birth, die, kill, walk, run, fall, wash, pierce, hit, tie, do, sew, bury, sandals, horse, truth, falsehood, finish, dig, stand, find.

This list is dominated by basic, rarely loaned words: nearly half of it overlaps with the Leipzig-Jakarta list. However, more culturally specific shared retentions such as "iron", "owner", "cow", "donkey", "horse", "pot", "sew", and "sandals" remind us that the split of Northern Songhay is after all rather recent (much more so, in fact, than these words alone might suggest).

These pan-Northern retentions, however, by no means exhaust the Songhay lexicon of Northern Songhay. Korandjé alone retains some 183 list items of Songhay origin, at least 135 of them shared with Tasawaq, while for many words (eg "four", "green"), only Tasawaq has kept Songhay forms. Well over 227 items have Songhay equivalents in at least one Azawagh Songhay variety, and more than 241 have equivalents either in the Azawagh or in Korandjé. If the even more conservative (but extinct) Emghedesie variety were added to the list, that number would no doubt be even larger. Proto-Northern Songhay certainly had a significantly larger Songhay lexicon than any of its descendants does.
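For readers who want to try this kind of counting on their own data, here is a minimal Python sketch of the two tallies used above: items shared between each pair of varieties, and items shared across a whole set of varieties. It is not the procedure actually used for the figures in this post. The glosses, the cognate-set labels ("S" for a Songhay retention, "B" for a Berber loan, "A" for an Arabic loan) and the function names are all invented for illustration.

```python
from itertools import combinations

# Hypothetical toy data: each gloss maps every variety to a cognate-set label.
# "S" = Songhay retention, "B" = Berber loan, "A" = Arabic loan.
# Glosses and labels are placeholders, NOT taken from the actual wordlist.
TOY_WORDLIST = {
    "gloss_1": {"Korandjé": "S1", "Tadaksahak": "S1", "Tagdal": "S1", "Tasawaq": "S1"},
    "gloss_2": {"Korandjé": "S2", "Tadaksahak": "B1", "Tagdal": "B1", "Tasawaq": "S2"},
    "gloss_3": {"Korandjé": "A1", "Tadaksahak": "B2", "Tagdal": "B2", "Tasawaq": "B2"},
    "gloss_4": {"Korandjé": "B3", "Tadaksahak": "B3", "Tagdal": "B4", "Tasawaq": "S3"},
}

AZAWAGH = ["Tadaksahak", "Tagdal", "Tasawaq"]


def pairwise_shared(wordlist):
    """Count, for every pair of varieties, the glosses with the same label."""
    varieties = sorted({v for row in wordlist.values() for v in row})
    return {
        (a, b): sum(
            1
            for row in wordlist.values()
            if a in row and b in row and row[a] == row[b]
        )
        for a, b in combinations(varieties, 2)
    }


def shared_by_all(wordlist, varieties):
    """List the glosses for which all the given varieties have the same label."""
    shared = []
    for gloss, row in wordlist.items():
        labels = {row.get(v) for v in varieties}
        # A missing entry (None) means the gloss cannot count as shared.
        if len(labels) == 1 and None not in labels:
            shared.append(gloss)
    return shared


if __name__ == "__main__":
    for pair, count in pairwise_shared(TOY_WORDLIST).items():
        print(pair, count)
    print("Azawagh:", shared_by_all(TOY_WORDLIST, AZAWAGH))
    print("Azawagh + Korandjé:", shared_by_all(TOY_WORDLIST, AZAWAGH + ["Korandjé"]))
```

Note that a raw pairwise count treats shared retentions and shared innovations alike, which is exactly why such counts need interpreting before being used for subgrouping.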

[Later addendum]: Removing all words with Arabic-derived Korandjé forms from the list makes no difference to the classification; the table ends up like this:


Saturday, June 28, 2014

Grammatically analysing "Sahha Ramdankoum!"

Sahha Ramdankoum صحّة رمضانكم! This Darja phrase, which might be rendered as "happy Ramadan!", is familiar to any Algerian. It groups with a few others - notably Sahha Ftourkoum صحة فطوركم "happy fast-breaking dinner!" and Sahha Eidkoum صحة عيدكم "happy Eid!" - as an example of a not very productive template "Sahha X+2nd person possessive" expressing good wishes on the occasion of X. But what is "sahha" doing in such forms?

In many contexts, "sahha" is a noun meaning "health"; we can be sure it is a noun, since it can be the object of a preposition and take personal possessive endings, as in b-sahht-ek بصحتك "good for you" (with your health). But there is also a defective verb, taking 2nd person perfective endings: sahhit صحيت (to a man), sahhiti صحيتي (to a woman), sahhitou صحيتو (to a group) "thanks / well done" (a little stronger than sahha "thanks"). The expected 3rd person masculine singular form of this verb would be sahh صح or sahha صحى; sahh actually is attested as an impersonal verb (ysahh-lek يصحلك "it is appropriate for you"), but its meaning is sufficiently distant that it's not necessarily part of the same paradigm. So in principle, "sahha" in "Sahha Ramdanek" could be interpreted as a noun, or a verb. Is there any way to decide which?

If it's a noun, then the phrase's syntax is bizarre - the literal interpretation would then be "Health is your Ramadan", whereas to make it fit the actual meaning we want at least something like "Your Ramadan is health", which would be the opposite order (?Ramdanek Sahha رمضانك صحة). If it's a verb, on the other hand, the syntax is fine - subjects in Algerian Arabic routinely follow the verb, and perfective verbs are routinely used to express states, so we could interpret it as something like "Healthy is your Ramadan!" or even, if we allow the perfective to be optative as in Classical Arabic, "May your Ramadan be healthy!"

On the other hand, if it's a verb, then it should agree in gender and number with what follows it, with feminine "sahhat" صحات and plural "sahhaw" صحاو. This can't actually be tested directly: in all such expressions that I can think of, the noun happens to be masculine and singular, and this expression cannot normally be extended to congratulate people on other occasions. But if we imagine using this formula to congratulate someone on their happiness, I for one would much sooner say "Sahha Farhatkoum" صحة فرحتكم than "Sahhat Farhatkoum" صحات فرحتكم, which suggests that my mind, at least, is not analysing it as a verb.

Perhaps it's neither noun nor verb, then? There are a few words in Algerian Arabic that form predicates and come at the start of the clause, but do not take verbal morphology - for instance, makash ماكاش "there is no" or oulah ولاه "no need (for)". Putting it in this class would take care of the problem, but just leads us to a different one: can this class of non-verbal predicators be given a coherent positive definition, or is it just whatever happens to be left over from defining the major word classes?

Be that as it may, best wishes to all readers for this coming month, and, for those fasting it, Sahha Ramdankoum!

Tuesday, June 24, 2014

From Figuig to Igli: Berber in the Algeria-Morocco borderland

The number of good Berber descriptive dictionaries has been slowly but steadily increasing in recent years, but Hassane Benamara's new Dictionnaire amazigh-français : Parler de Figuig et ses régions (Rabat: IRCAM, 2013), which I was lucky enough to be lent a copy of lately, is surely one of the best. Apart from being quite unusually large (800 pages), it incorporates examples, multiple senses, pictures of items difficult to describe, and an appendix with encyclopedic information on culturally specific words such as festivals and children's games. It incorporates a few neologisms useful for schooling, but takes a fairly inclusive attitude towards Arabic loanwords. There are barely 15,000 people in Figuig, but, astonishingly enough, this is actually the second dictionary of Figuig Berber published by a native speaker; the first, Ali Sahli's معجم أمازيغي-عربي (خاص بلهجة أهالي فجيج) (Oujda: Al Anwar Al Maghribia, 2008), was a good effort, but is substantially shorter and used a less accurate transcription. (There's even another linguist from Figuig, Mohamed Yeou, threatening to make a third dictionary – if he goes ahead with the project, he'll have a high hurdle to clear.)

Across the border in Algeria, the situation is rather different. A number of towns across a wide area around Bechar and Ain Sefra speak Berber varieties closely related to that of Figuig, collectively and imprecisely termed "Shelha". Some of them seem to be shifting to Arabic (on my latest trip, I was told that in Lahmar they had stopped speaking Berber with their children, and for Igli I had heard the same much earlier). But little effort – and no official effort, as far as I know – is being made to document them. The only (very) partial exceptions of which I am aware are Igli and Boussemghoun.

For Igli (population 7000), I have already described the local Scouts' efforts to put together an online dictionary. More recently, however, I came across a laudable local attempt at approaching the problem academically: Fatima Mouili's The Berber Speech of Igli, Language towards Extinction. After a very brief summary of Igli grammar and phonology, unfortunately made frequently illegible by font problems, the author discusses the reasons for language shift. Corresponding to my impressions for the region, including Tabelbala, she cites emigration and the desire to ensure educational success as important drivers; others are more surprising, including the immigration of refugees expelled by the French from a nearby village during the Algerian War of Independence. Apparently, her thesis discusses similar issues, for those with 59€ to spare...

For Boussemghoun (population 4000), a few articles and a book by Mohamed Benali may be cited, all focusing – as far as I can see – exclusively on the sociolinguistic situation of Berber in the town. A local Berber-language poet billed as "the Ait Menguellet of Boussemghoun", Bashir Oulhaj, has a considerable presence on YouTube, eg here; he's even been interviewed by Figuig News. The town seems to be treated as the centre for Amazigh identity in the region; the HCA has even organised a symposium there. Nevertheless, little if any descriptive work has been published on its variety of Berber.

Taken together, there are probably more speakers of Berber in southwestern Algeria than in and around Figuig. Why the difference, then? Is it because linguistics is better represented in Moroccan universities than in Algerian ones? (Notwithstanding some interesting work coming out of Algeria, I think that is fair – it would be hard to think of any linguist working in Algeria with a profile comparable to Abdelkader Fassi Fehri, for example.) Or is it because the Amazigh movement in Morocco is less closely associated with one side in the "culture war"? (Benali observes that, while most Semghounis wanted Berber to be taught in schools, they rejected the installation of an HCA office because they distrusted its politics.) Or are there more specific, purely local factors explaining the difference? That would be worth a study in itself – though perhaps not as much so as the Berber varieties in question!

Tuesday, June 17, 2014

Why Yiddish is not Slavic, and language families are not families

Recently I came across a popular article, Where Did Yiddish Come From?, discussing Paul Wexler's eccentric claim that Yiddish is a "relexified" Slavic language (and Modern Hebrew, in turn, "relexified" Yiddish). To make any sense of this claim, we have to stop and consider what historical linguists mean when they talk about language origins.

If you want to learn a language perfectly, the best way to start is to pick it up as a child from your family and the community they're part of. That way, you and your generation end up speaking the same language as your parents and their generation, modulo a few little innovations you threw in just to annoy them. As those little innovations pile up, generation on generation, sooner or later you end up speaking something that the first generation wouldn't have been able to understand. In such a scenario, everyone agrees, the latest generation's language – let's call it B – is descended from the first generation's (A). If some of the children of that first generation moved far away early on and went through the same process of gradual change, their descendants speak another language, C, which speakers of B can't understand, but which is also descended from A. So we say that B and C belong to the same language family, just as their speakers belong at some remove to the same extended family.

If you're reading this, it's probably too late to learn a language that way. (Sorry.) You can still learn another language, say B, but the odds are that, at best, you'll always speak it with a bit of a foreign accent, and keep using expressions that make sense in English but sound weird to native speakers. If you're just an individual migrant learning it to fit in, that won't matter in the long run – your kids will learn the language in the playground and come back speaking it better than you do. But what if it's not just you that's learning it, but also your spouse, and your brothers, and almost everyone you know? What if your whole community is starting to prefer to speak this language with their kids, instead of the one they grew up with? In that case, the kids will still end up speaking it – but instead of speaking it like natives, they'll probably end up speaking it with your foreign accent and all those expressions of yours that native speakers laugh at. In that scenario, does the kids' language (let's call it D) belong to the same language family as B and C, or not? That's the ambiguity that Wexler is playing with.

The obvious answer – and the one most linguists would give – is yes*. For one thing, assuming you did a half-decent job of learning B, it's the same language – speakers of D can understand speakers of B, and vice versa, even if they laugh at each other's crazy accents. The influence of Gaelic may pervade Irish English, but Irish English is still English, not some Celtic language. It's the vocabulary and the morphology that really make English understandable – a weird accent or a funny way of putting things is just not that big an obstacle on its own. Wexler proposes exactly the opposite criterion: "Yiddish – in contrast to its massive German vocabulary – has a native Slavic syntax and sound system – and thus must be classified as a Slavic language" (1993:5). The origins of Yiddish syntax and phonology I can't comment on, but there's a good reason why historical linguists normally prioritise the vocabulary and the morphology over the syntax and phonology, even apart from the one just given. Vocabulary and morphology are eminently reconstructible, using the comparative method. Phonology, on the other hand, can only be reconstructed from vocabulary, and syntax is notoriously hard to reconstruct at all. If language families were to be defined based on phonology and syntax, it would hardly be possible to define them, much less reconstruct them or state regular correspondences between them.

In short, saying that Yiddish (much less Modern Hebrew) belongs to the Slavic language family is just a word game – in the sense that historical linguists normally use the concept of "language family", it doesn't, and wouldn't even if every last Yiddish speaker happened to be of Slavic ancestry and to speak Yiddish with a heavy Slavic accent. But such word games do not vitiate Wexler's work. After a large enough community has shifted to a different language, it is usually possible to find traces of their former language – although identifying them as such, rather than as later borrowings, may be hard. That's what Wexler is trying to do for Yiddish, and that's how he supports his claim that Yiddish speakers' ancestors used to speak a Slavic language.

* However, the question can easily be made more controversial. Suppose you and your community didn't learn it that well to start with, and aren't trying to imitate native speakers anyway? In that case, the kids will end up speaking something that sounds utterly ridiculous to native speakers; the basic words are recognisable, but the way they're put together seems all wrong. Whatever Tok Pisin is, most people would agree that it's not English. A few people would defend the claim that Tok Pisin belongs to the same family as English, on the basis that that's where the vocabulary comes from, but most would say that it doesn't belong to a language family. The language family model presupposes that the language is being passed on reasonably well as a whole, including not just vocabulary but also some amount of grammar; if all that's learned is a bunch of words, the model breaks down. The border must be drawn somewhere between the extremes of Irish English and Tok Pisin, but linguists can and do disagree on where exactly to draw it.