Jabal al-Lughat

Saturday, July 03, 2010

The unreliability of Afroasiatic etymologies

The fact that Semitic, Egyptian, Berber, Cushitic, and Chadic all belong to a single family - Afroasiatic - is fairly secure, based on striking correspondences in basic morphology. However, it is often not appreciated just how difficult it is to find reliable lexical comparisons between these families, and just how primitive the current state of AA reconstruction is. The easiest source of AA etymologies online is Militarev's database on Starling, so I'm going to pick on it for this post (Orel & Stolbova and Ehret reveal similar issues, but the latter doesn't even include Berber, and I'm focusing mainly on Berber entries here for convenience.)

Suspiciously many entries are listed as having a cognate in only one Berber language (eg earth, hide, skin, run away); given the general closeness of different Berber varieties, you would expect valid proto-Berber terms to be reflected in more than one place. However, these could always be right. Other issues are more serious.

In several cases, a single proto-Berber root is split across several AA ones, due to mistaken sound correspondences. For example:

Proto-Berber *i-qăs "bone, (fruit) pit" is split between PAA *ʔayš/ʔawš- "ripened grain, corn" with Zenaga iʔssi (quoted without the glottal stop) "os; grain, graine, baie; comprimé, pilule, cachet, pastille; perle" (Taine-Cheikh), and *ḳ(ʷ)as "bone", with all other reflexes of *iqăs, even though Berber γ (<*q) commonly corresponds to Zenaga ʔ.
Proto-Berber *ta-Hăli (> *ti-Həli) "sheep" is split between pAA *ʔayl "ram" and *bawil "ram", although Ghadames-Awjila v corresponds regularly to Tuareg h and other Berber Ø. (A couple of forms, like Figuig tili mistakenly glossed as "ram", have even somehow found their way into a third etymon, "proto-Berber" *laH!) The issue is alluded to in a cryptic comment under the Berber section of PAA *waʔil "wild goat/ram; antelope": "Pr. H No. 220 (and Kössm. 193): Ghdm., Audj. Hgr etc. te-hele < *tiHeli, which, on the contrary, is to be connected with *ʔayl- 'ram' 3061 (together with Brb. forms of the t-ili type), as *ʔ > h in Hgr, while *ʕ > Hgr 0".
Most reflexes of pan-Berber ikərri / akrar "ram" are assigned to PAA *kar(w)- "ram, goat; lamb; kid". (The Semitic parallels listed for this word are rather interesting.) But Zenaga ǝgrǝrh, pl. gurănh 'bélier' (Nic. 156), on its own, is given a supposed proto-Berber form *gur- "ram", corresponding to an AA form *(ʔa-)gʷar "kind of antelope; ram; goat". In fact, however, there is a common correspondence of Zenaga g followed by a sonorant to proto-Berber k (eg ägärgur "chest" = Siwi ikərkər, əməgyih "dine" = Kabyle iməkli etc), and this word is obviously related to the other Berber forms.

Another case is listed as doubtful, eg:

Most reflexes of Proto-Berber *a-lăqŭm "camel" are under PAA *ʕalVḳ/g- ˜ *lVḳ/gum- ˜ *ḳalVm- "camel"; but the Zenaga one äyiʔm, with regular *l > y (in his source's transcription ǯ) and common *γ > ʔ as seen previously, ends up as PAA *gam-al- (?).
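The correspondences invoked in these examples can be checked mechanically. The following is a minimal Python sketch, not a tool used for this post: the table `ZENAGA_TO_PROTO` and the helper names are illustrative assumptions, and the rules are applied without regard to context (a simplification, since the g : k correspondence is described above as holding before a sonorant). It reduces each form to its consonant skeleton and asks whether the Zenaga form lines up with the other Berber form once the correspondences are applied.

```python
import unicodedata

# Zenaga segment -> expected proto-Berber segment, taken from the examples
# above. Context-free on purpose: a simplification for illustration only.
ZENAGA_TO_PROTO = {"g": "k", "ʔ": "q", "y": "l"}

# Plain vowels plus the two schwa code points (U+0259, U+01DD) used in
# the transcriptions; vowel diacritics are stripped separately below.
VOWELS = set("aeiou\u0259\u01dd")


def skeleton(form: str) -> list[str]:
    """Return the consonants of a transcribed form, ignoring vowels,
    diacritics, hyphens and the reconstruction asterisk."""
    decomposed = unicodedata.normalize("NFD", form.lower())
    return [
        ch
        for ch in decomposed
        if ch.isalpha()
        and not unicodedata.combining(ch)
        and ch not in VOWELS
    ]


def corresponds(zenaga: str, other: str) -> bool:
    """True if the Zenaga skeleton matches the other form's skeleton
    after applying the toy correspondence table."""
    mapped = [ZENAGA_TO_PROTO.get(c, c) for c in skeleton(zenaga)]
    return mapped == skeleton(other)


print(corresponds("ägärgur", "ikərkər"))   # "chest": Zenaga vs Siwi
print(corresponds("äyiʔm", "*a-lăqŭm"))    # "camel": Zenaga vs proto-Berber
```

A skeleton match of this kind is only a sanity check on whether a proposed correspondence is consistent across forms; it says nothing about whether the correspondence itself is well established, which still has to be argued from many word pairs.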

Similarly, unrelated forms may be grouped together due to accidental similarity, eg:

Under PAA *kʷay(-t)- "hen; partridge; dove; chick" is listed a "proto-Berber" form *i-kaHi; but the Ahaggar form listed corresponds regularly to Niger Tuareg tekažit, Mali Tuareg tekazzit, Awjila təkažit "hen" (see Kossmann 2005:60), and as such is unrelated to the Ayr and Tawllemmet forms takəyya quoted.

Another problem is undetected loans; this applies especially in sub-Saharan Africa, where little work has been done on their impact. PAA *ʔa/iw / *waʔ "bull, cow" is supported by Tawellemmet hawu "cow", isolated in Berber and obviously borrowed from Songhay, cp. Zarma haw, Tadaksahak hawú; removing this from the etymology leaves only pan-Tuareg iwan "cows", with no evidence for the desired *H. PAA *bar "cereal, corn" is supported by Zenaga būru "bread"; but this word is isolated in Berber and widespread in West Africa (eg Wolof mbuuru, Soninke buuru, Bambara nbuuru, Peul mbuuru, Zarma buuru), and is more likely a loan from Wolof or Pulaar.

Interestingly, most of the problem cases I've noticed in this quick skim are related to agricultural terminology. I wonder if that has anything to do with the particular interest of such terms for archeologists motivating a more intense search for cognates.

Thursday, June 24, 2010

Why they thought the Berbers came from Yemen

A long-standing tradition in North Africa, convincingly rejected by Ibn Khaldūn but perpetuated by poets and curricula alike, claims that some major Berber tribes descend from Yemeni Arabs through semi-mythical pre-Islamic kings and their wholly mythical vast conquests. This idea has little to support it, and probably became popular because it allowed these tribes to claim prestigious connections in the context of a high culture dominated by Arab ideas; but why should the connection be specifically Yemeni, rather than, say, North Arabian or perhaps Persian? Linguistics suggests a possible answer.

In southern Arabia live several groups, most famously the Mehri tribe, whose languages, though Semitic, are only distantly related to Arabic, and quite incomprehensible to other Arabs. (You can hear recordings of Mehri at SemArch.) I recently borrowed a copy of the newly published Mehri Language of Oman, by Aaron Rubin; looking through it, I could see several points where Mehri resembles Berber but not Arabic that a traveller might seize on, notably:

-s ـس "her", -sən ـسن "their (f.)"; compare Siwi -nn-əs ـنّس "his/her", -n-sən ـنسن "their (m/f)". A 3rd person in -s was found in proto-Semitic, as shown by Akkadian, but was replaced in Arabic.
əl ال "not" (preverbal first element of negative); compare Tumzabt ul أُل. Again, this is found in Akkadian and hence must be proto-Semitic.
-ət ـت feminine singular; compare Siwi -ət ـت (feminine singular in Arabic borrowings.) Again, the connection is real, but dates back to proto-Semitic rather than indicating any special relationship between the two.
-tən ـتن feminine plural; compare Berber -tən ـتن (plural of some masculine nouns)
a- أَ used as a definite article for some nouns; compare Berber a- أَ (masculine singular noun prefix). A striking case is Mehri a-məsge:d أَمسجيد vs. Siwi a-məzdəg أمزدج "the mosque". However, in Mehri this indicates definiteness, and does not depend on gender; this is probably a coincidence.
tə-...-əm تـ...ـم second person plural imperfective, eg təkə́tbəm تكتبم "you (pl.) write"; compare Berber t-...-m تـ...ـم. The t- is cognate; not sure about the history of the -m offhand.
'ār آر "except, but"; compare Tuareg ar.
ā آ "oh" (vocative); compare pan-Berber a أ. (This is actually found in Classical Arabic as well, أ, but is not widely used.)

None of these similarities in fact imply any close relationship between Berber and Mehri, of course; some are coincidental, while others can be traced back to proto-Semitic, and hence constitute evidence connecting Berber with Semitic, not specifically with Mehri. However, a medieval traveller between Yemen and North Africa would not have known that, and could easily have observed similarities like these and leapt to the seemingly plausible conclusion that Berber was connected to the language of these Yemeni tribes, who, like many Berbers, seemed to live just like Arabs yet speak totally differently.

Tuesday, June 15, 2010

The Berber language of Sokna (Libya)

Thank you SOAS library - I finally got a copy of Il dialetto berbero di Sokna! Sokna (they even have a Facebook group) is a small oasis south of Sirt in Libya, whose dialect of Berber, along with that of nearby El-Fogaha, is Siwi's closest relative. There were several surprises inside, including unusual vocabulary like amerru "mountain" or imeγri "Dhuhr (the midday prayer)", and some striking features shared with Siwi; one of the main ones is an unexpected bit of allomorphy. Across Berber, the second person plural ("you guys") is expressed on the verb with t-...-m, except in the imperative; Sokna does the same, so for example "you have" is t-la-m. In the imperative, you have a suffix -t; Sokna again does the same, eg sag-it-ten iyi-leḥbes "(you guys,) take them to prison!" But if you add an indirect object pronoun ("to him" etc.) to the imperative, you replace this t with an m, like the m in the second half of the non-imperative forms: eḍbeḥ-im-as a-na-dd y-used "(you guys) tell him to come to us!" The same thing happens in Siwi, except that in Siwi the prefixed t- of the non-imperative forms has disappeared. I'm doing a paper on the development of indirect object agreement in Siwi for the Berberologie conference in July, and this is a useful pointer to its history. Amazigh readers - have you come across anything like this?

Sadly, Berber is probably no longer spoken in Sokna. When this article was written in 1911, the shaykh of the oasis reported that only 4 or 5 Isuknan could still speak it, although many more could understand a bit. I don't know whether the people of Sokna today regret the loss of their language or are glad of it - but its disappearance destroys a key not just to Sokna's history but to that of Libya, Egypt, and the whole of North Africa, leaving only this article's fairly short wordlist (and a few even shorter older sources) as evidence for migrations between central Libya and Siwa and early contact with vanished pre-Sulaymi Arabic dialects.

Wednesday, June 09, 2010

Religious origins of the "Welsh Not"?

A well-known weapon in the arsenal deployed by educational systems the world over against local languages was what in the UK used to be called the Welsh Not - a piece of wood hung around the neck of a student caught speaking their own language, and passed on through the day to anyone that student heard speaking their language, so that whoever was wearing it at the end of the day would be punished. At a talk yesterday I heard that the same idea was implemented in Japan (against Ryukyuan languages) and Sudan (against Nubian.) Coincidentally, I just came across an account that gives interesting insight into the origins of this oppressive practice:

"With a general consent of all our company, it was ordained that there should be a palmer or ferula which should be in the keeping of him who was taken with an oath; and that he who had the palmer should give to every one that he took swearing, a palmada with it and the ferula; and whosoever at the time of evening or morning prayer was found to have the palmer, should have three blows given him by the captain or the master; and that he should still be bound to free himself by taking another, or else to run in danger of continuing the penalty, which, being executed a few days, reformed the vice, so that in three days together was not one oath heard to be sworn."

(The Observations of Sir Richard Hawkins, Knt in his voyage into the South Sea in the year 1593)

Hard to imagine a ship full of sailors submitting to such a practice! But was this the original purpose of the Welsh Not? It would be interesting to find out. If anyone has an older citation to compare, I'd love to see it.

Thursday, May 20, 2010

Endangered languages on Aljazeera

Aljazeera English is doing an interesting series on language endangerment and revitalisation:
* Language on the brink, talking with the last speaker of Wichita.
* Saving the language of the Cherokee, in Tahlequah
* French region aims to save language, on Breton
* Turkey's fading linguistic heritage and Saving Turkey's Laz language, on Laz (a close relative of Georgian, not "an ancient tongue that bears no resemblance to any other language in the region".)
* Circassians in bid to save language in Jordan - at a talk this week by Enam al-Wer I heard that, at the start of the twentieth century, the only permanent population in Amman was Circassian.

Thursday, April 29, 2010

Manatees and bilingual compounds

In Djenné Chiini, the Western Songhay dialect of Djenné in Mali, the word for "manatee" is ayuumaa. This is clearly a compound of two elements: ayuu, the word for manatee throughout the rest of Songhay (as well as in Hausa), and maa from Bozo máa, which also means "manatee" (Bozo being the original language of the Djenné region.) It's as if the American English word for an elk were "elk-moose". I can't think of any other examples of this kind of half-borrowing, where a native word is "expanded" by adding on its translation into another language; can you?

(Sources: Daget 1953, La langue bozo; Heath 1998, Dictionnaire songhay-anglais-français, tome II: Djenné chiini.)

Monday, April 05, 2010

More on the WOLD Kanuri entry

The World Loanword Database is a great resource, and the Hausa/Kanuri team deserve congratulations for undertaking the Herculean labour of putting together two sets of etymologies. However, there are some issues with the Arabic etymologies in the Kanuri entry. The transcription is inconsistent and sometimes incorrect; more seriously, a few entries give incorrect meanings or impossible etymologies, as in the following cases:

3.592 àkú parrot: the quoted Arabic form is almost impossible as a Classical Arabic noun (and not in the Lisan al-Arab; the Arabic word is babγā’), and parrots are known in the Arab world only as an exotic import. Assuming the form exists in some Arabic dialect, it must be a loan from a sub-Saharan African language, not vice versa.
9.24 mágàsù scissors: the g and the u both suggest that this word entered directly from (Bedouin) Arabic, not via Hausa.
11.12 hàláltə́ own: if this is correctly transcribed, surely it comes from Arabic ħalāl “licit; one’s lawful property”. Arabic halak means “perish”.
11.79 ríwà dìò to earn: “ribā” means usury, and is strongly condemned in Islam; it is unlikely that this would be adopted as a neutral word “earn”. The more plausible source for both the Kanuri and the Hausa is Arabic ribħ “profit, gain”.
11.78 àlwúsùr wages: Perhaps < Arabic al-`ušr "tithe (< one-tenth)"; surely not from ma`āš.
14.451/6 kàjílí evening: “kajir” is not a possible native Classical Arabic word, and is not attested in Classical Arabic. If it’s in Shuwa, it must come from Kanuri, not vice versa.
16.34 tə́wə́rítə́ regret: Hausa tuubaa does come from Arabic, but clearly from Arabic tūb “repent”; it has nothing to do with Arabic ta’assaf (not *tāssaf) “regret”.
16.69 gàfə̀rtə́ forgive: the connection to Arabic γafar- is obviously correct, but Arabic yaʕfū is equally obviously not relevant; even if ʕ were normally reflected as g in Kanuri, it would leave the r unexplained.
18.33 kàsàttə́/àrdìtə́ admit: the Arabic form “kasat” does not exist. yarḍā means “may He hope/ approve” (as noted), not “admit”, making the connection rather tenuous.
18.45 áwúlò dìò boast: there is no Classical Arabic word “awulo”.
19.47 àmàrtə́ permit: Arabic ʔamar- means “he ordered”, not “permission”.
20.31 súlwé armor: Arabic silāħ means “weapons”, not “armor”.
21.24 àlàptà swear < ħalaf "swear" (not < allāh "the god")
21.37 àzáwù punishment: from Arabic ʕađāb “punishment, torment” rather than jazā’.
21.47 perjury: by what chain of semantic changes could “perjury” derive from “lawful”? And why would l > k?

Probable Arabic loanwords not listed as such include:
11.54 bàyîl stingy: from Arabic baxīl.
4.89 sûm poison: surely from Arabic samm?
4.93 sə̀lé bald: surely from Arabic ‘aṣla`?
5.26 kóló pot: perhaps cp. Arabic qullah (or onomatopoeic?)
7.58 kábbì arch: surely from Arabic qubbah?
14.25 bàdìtə́ begin: surely from Arabic bada’?
11.29 lòrùtə́ damage: from Arabic ḍarr (impf. -ḍurr-). Cp. “judge” for ḍ > l.
24.02 wàltà become: perhaps from Maghrebi Arabic wəlli “become, return”.

In some cases, looking more widely allows the etymologies to be improved:
3.11 lə̀mân animal: < al-māl- "livestock, money", rather than al-mann "favor, benefit". For the dissimilation, compare the common Maghrebi Arabic change of n...n to n...l, eg badənjal < bāđinjān, fənjal < finjān.
2.34 lòrúsà wedding: probably from al-`arūs “bride” (Maghrebi Arabic l-aʕṛuṣa), rather than direct from ʕurs. Cp. Siwi aʕṛus “wedding”, with the same semantic shift.

There are also a few cases, many probably originally formatting issues, where the correct form is given in comments, but contradicted elsewhere:

3.25 sheep: the source cited, Kossmann 2005 (67), points out that the form quoted by Skinner, *adaman, is unattested. The correct form, adəmman, is found in Arabic as well as Berber, and refers to a type of sheep said to come from sub-Saharan Africa. Given that it refers to a specifically sub-Saharan sheep breed, 5 would seem a better classification than 4, though 4 is understandable.
3.78 camel: Kossmann 2005, cited, makes it rather clear that an Arabic origin for this word is very improbable. Moreover, there is no such Arabic word as “ləγəmal”; only the form jamal is correct.
4.87 physician: If Shuwa Arabic or some such variety has a term liktaay, there can be little doubt that it is a loan into Shuwa, not from Shuwa. As the comment indicates, this comes from English, not from Arabic.
7.422 blanket: The comments indicate a Berber form abroγ, but the field gives abrok. The Arabic etymology is less implausible than it appears, since the semantic shift to “full body covering” is well-attested, as in English “burka” from the same source.
12.081 above: here it is called areal and probably not Arabic, but under “sky” and “heaven” the same word is listed as “clearly borrowed”. One of these statements must be wrong.
13 zero: the Hausa form is transcribed correctly in comments, but wrongly under “Source words”.
18.51 write: rubuta is Hausa, not Berber, as the sources quoted make clear. The proto-Berber form had no suffix -t (as Kossmann indicates), and neither do any of the equivalent modern Berber verbs.
19.62/20.11 quarrel: If it’s related to “alhilaafu”, the Arabic form is al-xilāf. If it’s related to “judge”, that form is irrelevant. In either case, there is no Arabic word “alwalaʔ” with appropriate meaning.

Monday, March 01, 2010

Identify the language of this manuscript

A scan of much of the manuscript MS Leiden Or. 14.052 is available online. The main text of this manuscript is in a rather poor Arabic. The marginal and interlinear notes, however, are "in one or more West African languages", as yet unidentified. My best guess is that they're in Mandinka, based on the orthography's use of tanwīn and on the frequent word-initial a/i (suggestive of Mande's 3rd person subject pronouns), but I'm not sure; I haven't been able to decipher any phrases. Anyone else feel like having a look?

Tuesday, February 16, 2010

Subjacency: The judgements

Thank you very much for your responses, everybody! (If you haven't answered yet and want to, please do it before reading the rest of this post.)

Chomsky's intuitions were as follows (* marks ungrammaticality as usual):

* That's the boy who they intercepted John's message to.
* That's the boy who he believed the claim that John tricked.
* That was a lecture that for him to understand was difficult.
* Which book did John wonder why Bill had read?
√ Which book did John think that Bill had read?
√ What would you approve of John's drinking?
* What would you approve of John's excessive drinking of?

Mine were that 1, 4, 5, 7, and (only after some thought) 6 were good, while 2 and 3 were wrong - but I exclude those judgements here, since I was reading the book and might have been swayed by my reactions to the arguments. My sister found 1, 2, and 4 wrong, 3 "weird but comprehensible", and 5-7 good - so even within a single family judgements vary significantly. Your 11 collective judgements (plus some friends and family, and excluding non-native speakers) add up as follows (grading "uncertain" as 0.5):

The discrepancy, and the level of individual variation, are striking - not a single reader agrees with all of Chomsky's judgements, and the only consistent judgements are 2 (always wrong) and 5 (always right.) Most of Chomsky's judgements also happen to be predicted by his theories (and those of others in the generative tradition); your judgements therefore often pose problems for those theories. According to Chomsky, 1 and 2 should both be ungrammatical for the same reason - they involve movement past more than one "barrier" (boundary of a 'noun phrase' (DP) or clause excluding the complementiser (IP)) at a time. Yet more than half the people here (including me) accept 1, while nobody accepts 2; one could argue that 2 should be less acceptable than 1 because it crosses three barriers rather than two, but why should 1 be acceptable at all? 4 should be ungrammatical because "why" is occupying a position that "which book" should have to move through - but about half of you (including me) think it's fine. And most readers of this blog find 7 to be better than 6 - the opposite of Chomsky's judgements and of the predictions of the "A-over-A" principle he was working with then (although the latter is obsolete.)

Chomsky (1963:51) said of sentences like these: "In some unknown way, the speaker of English devises the principles of [wh-movement etc.] on the basis of data available to him; still more mysterious, however, is the fact that he knows under what formal conditions these principles are applicable... The sentences of [1-3] are as 'unfamiliar' as the vast majority of those that we encounter in daily life, yet we know intuitively, without instruction or awareness, how they are to be treated by the system of grammatical rules which we have mastered." This seems to be false; individually we often find it difficult to decide the grammaticality of sentences like these, and collectively we routinely disagree on them. Certainly it cannot be construed as belonging to that part of the "knowledge of language" that is, in the words of Chomsky (1963:64), "independent of intelligence and of wide variations in individual experience".

If it did, then that would be rather interesting: it has been claimed that the principles of Subjacency must be innate, because children aren't exposed to enough evidence to deduce them otherwise. But given the level of variation actually observed, it is tempting to reverse the reasoning: children don't deduce most of the principles of Subjacency, so they must neither be exposed to enough evidence for them nor have innate knowledge of them. Rather than postulating arbitrary rules hard-wired into the brain and specific to the language faculty, a more promising way to explain Subjacency phenomena might be to try to derive them from processing difficulties, as suggested by Sag et al.

Subjacency intuitions

I've been reading an old Chomsky book, Language and Mind, lately. As usual, the moment he starts discussing what would eventually be called subjacency I find my intuitions are systematically different from his, and I'm curious: how common is this? By way of testing, here's a few sentences in English: which ones would you consider ungrammatical/unacceptable as phrased?

That's the boy who they intercepted John's message to.
That's the boy who he believed the claim that John tricked.
That was a lecture that for him to understand was difficult.
Which book did John wonder why Bill had read?
Which book did John think that Bill had read?
What would you approve of John's drinking?
What would you approve of John's excessive drinking of?

Chomsky's grammaticality judgements will be provided later - they're on pp. 50-54 of the book.

Thursday, February 11, 2010

Berber manuscripts in Arabic script online

A major collection of early Tashelhiyt manuscripts from the 16th century onwards has gone online: Manuscrits arabes et berbères du Fonds Roux. It includes a copy of al-Hilali's Berber-Arabic lexicon. The Lmuhub Ulaḥbib library of Bejaia has also put a number of works online, including an 18th/19th century manuscript on theology in Kabyle: العقيدة السنوسية. Both collections are also of interest for their many Arabic books, but the Berber ones are particularly significant due to the serious paucity of materials for the study of precolonial Berber writing traditions.

Friday, February 05, 2010

World Loanword Database

I shouldn't really be blogging at this stage of my thesis-writing, but this I had to share: the World Loanword Database has come online. Vocabularies likely to be of particular interest include Tarifiyt, Hausa, Kanuri, Iraqw, but there are plenty more, all carefully analysed for loanwords... Have fun, and feel free to discuss any mistakes you think you spot in it here :)

(Via Glossographia.)

Wednesday, January 27, 2010

Language endangerment: thoughts from Igli

I recently found a forum for the town of Igli, about 150 km north of Tabelbala as the crow flies. Igli's traditional language is a Berber variety called "Tabeldit", or in Arabic "Shelha" شلحة, reasonably close to the better-documented dialect of Figuig across the border but with significant differences (such as the first person singular in -ɛ rather than -γ.) In Igli, it is at least as endangered as Kwarandzyey, and is likely to disappear in another couple of generations - although I was told that it is doing better in the small neighbouring town of Mazzer. I think the reason, as in Tabelbala, is that parents started speaking only Arabic to their kids in the hope of giving them a head start in school, but all I know about Igli I heard from Glaouis in other towns. In situations like this, speakers inevitably see their language's disappearance with mixed feelings, and the following pair of posts forms a microcosm of the global language preservation debate:

The "Xiṭ Azugar" Project (posted by Shayma)

"Tabeldit Shelha is part of the fragrance of the Saoura region... a treasure inherited from our ancestors. Shall we preserve it, or let it disappear before our eyes?.... A secret weapon that saved some of us from death. How long will we remain with our hands tied as our language disappears before our eyes? Until when, until when?

I hope that these words have awakened your sleeping hearts and moved your sentiments. Therefore I present to you today this project, consisting of the establishment of an "Arabic-Shelha" dictionary to preserve our language. Therefore I ask the director and administrators and even the members to study this project; if you accept the idea, then let's start to lay down precise plans to overcome difficulties... and if you don't accept the suggestion, then we will do our ancestors an injustice... I urge you to take the matter seriously. To the administration, and all the members, let us put hand in hand. No more lamentation over Shelha, that doesn't help. What helps is effective work.

Forgive me for my harsh words, and I hope you accept the idea. The project is called "Xiṭ azugar" for historical reasons, because these words have saved a person from certain death."

This suggestion was acclaimed and adopted, and there is now a small Arabic-Shelha Dictionary forum. However, there was also some scepticism - the following post started a vigorous debate:

What would we lose if Shelha becomes extinct? (posted by igliab)

Following the increased concern with the local dialect "Shelha" from the brother members, for which thanks are due, I decided to pose the following question: What would we lose if this dialect became extinct?

It's not a language of civilisation, nor a language of science. And supposing we are able to make an "Arabic-Shelha" dictionary and lay down the rules for this language, will our sons agree to learn it? What would the motive be? It's not used at home, nor in public places. Or do we want to put it in museums and say we have "saved" it?

Moreover, by my reckoning those who speak it today are:
90% old men - 8% middle-aged men - 1.5% youths - 0.5% children. Admittedly I haven't made a study to come up with these figures but it could be worse than I anticipate, so it can be said that Shelha has no future in Igli.

I also told myself that if everyone thought the way I think then they would put down their pens and wait for the demise of Shelha, the way an ill man who has despaired of his state waits for death. But I rethought the issue, this time positively, and realised the need to put together a plan for its preservation. But what is the point of solutions if there is no logical, powerful reason, so the first question we have to answer is: why should we preserve Shelha? I urge the brothers to think deeply about this issue and put sentiments aside.

What would your thoughts be? Have you had a parallel experience?

Monday, January 11, 2010

Ajami in Boston

The Boston Globe has an article today about Ajami, the tradition of transcribing African languages in the Arabic script. It focuses particularly on the efforts of Fallou Ngom, whose work has been mainly on Wolof Ajami in Senegal, the subject of one of my first posts here. In the article he emphasises the potential historical significance of such work in opening up neglected sources on African history. While most African manuscripts are in Arabic, some historically rather interesting Ajami sources are known; for Mandinka, published historical manuscripts include the Pakao Book and the Bijini manuscript, the latter outlining regional history over the past 500 years. There are undoubtedly more out there that have gone uninvestigated simply for lack of enough historians who can read them. My work on Ajami has focused more on issues of orthography, however: most African languages have rather different sound systems to Arabic, and it's quite interesting to see what kind of devices they developed to make the alphabet fit better.

Saturday, January 09, 2010

Earliest Kwarandzyey source online (also Tarifit of Arzew)

It turns out that the earliest and most extensive published source on Kwarandzyey (Korandje), the language of Tabelbala in southwestern Algeria which I am studying, is downloadable online:

* Cancel, Lt. 1908. "Etude sur le dialecte de Tabelbala". Revue Africaine 52.

Readers may also be interested in Biarnay's study of the probably extinct Tarifit dialect that was then spoken at Arzew, in volumes 54 and 55 of the same publication.

Saturday, January 02, 2010

Siwi Scarborough Fair

Over the dinner mentioned in the last post I was also shown a Siwi poem sent as a text message - it's a rather below-average example of the genre, but interesting as a representative illustration of Siwis' orthographic preferences.

كان تازمرت تجبد تيني
كان تفكت تعمار تازيري
كان اتغت تيرو اغي
كان امان نلبحورا يسقلبن اخي
كان الغم ينسخط ايزي
بردو شك غوري (غالي)

Or in Latin Berber orthography:
Kan tazemmurt tejbed tayni
Kan tfukt teɛmaṛ taziri
Kan tγatt tiṛew aγi,
Kan aman n lebḥuṛa yesqelben axi,
Kan alγem yensxeṭ izi,
Beṛdu cek γuṛi "γali".

So I decided to render it into English, taking a few liberties to reproduce the rhyme (for added faithfulness, change "flea" to "fly", and eliminate "someday" and "or three"):

If dates can come from an olive tree,
If the sun someday a moon shall be,
If a goat gives birth to a calf or three,
If milk fills the waters of every sea,
If a camel can turn itself into a flea -
Then only will you be dear to me.

Thursday, December 31, 2009

Siwi and Kabyle: same language family, but not same language

Just back from a nice evening with the Siwi community of Qatar. A Kabyle friend came along (hello if you're reading this!), giving me a chance to see first-hand to what extent Siwi and Kabyle are mutually comprehensible. The answer is: very little indeed. Looking through basic vocabulary it's not hard to find cognates; but when it comes to even short sentences, mystified expressions on both sides were the order of the day. The Berber languages of Algeria and Morocco may shade into one another to some extent, even across sub-family boundaries - there seem to be dialects for which it is difficult to decide whether they should be called Kabyle or Chaoui, for example. But by the time you get to Siwa, it's quite clear that you're dealing with a different language, even by Arabic speakers' rather generous standards. Further confirmation, if any was needed, that Berber is a language family, not a language.

Saturday, November 21, 2009

Songhay and Nilo-Saharan

Following up on the preceding post, I've been looking at Greenberg's (1966) Nilo-Saharan comparisons - specifically, the 29 involving Songhay that have reflexes in Kwarandzyey, the Songhay language least likely to be involved in recent contact with Nilo-Saharan. Of these, 20 have comparanda in Saharan (Kanuri/Kanembu + Teda/Daza + Berti + Beria/Zaghawa), 17 in Eastern Sudanic (Nubian, Nilotic, Surmic, etc.), vs. a maximum of 13 for any other branch. (At least 7 also have plausible Mande comparisons.) Now, Saharan only consists of about 4 languages (9 by Ethnologue standards.) For Eastern Sudanic, excluding Kuliak, the Ethnologue counts 103 languages, and a huge amount of internal diversity. If Songhay were equally distant from the whole of Nilo-Saharan, you would expect far more cognates with Eastern Sudanic than with Saharan; the figures suggest that the link (whatever its nature) is primarily with Saharan, and only secondarily, if at all, with the rest of the languages he classified as Nilo-Saharan.
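To make the imbalance concrete, here is a rough sketch in Python that divides each branch's count of comparisons by its number of languages. The counts are the ones given above; the per-language ratio and the function name are my own illustrative assumptions, a crude normalisation rather than a statistical test, since it ignores how well each language is documented.

```python
# Counts of Songhay comparisons with Kwarandzyey reflexes, as given in the post.
comparisons = {"Saharan": 20, "Eastern Sudanic": 17}

# Rough language totals from the post: "about 4" for Saharan
# (Ethnologue would say 9), and 103 for Eastern Sudanic excluding Kuliak.
languages = {"Saharan": 4, "Eastern Sudanic": 103}


def matches_per_language(branch):
    """Hypothetical helper: comparisons found per language in the branch."""
    return comparisons[branch] / languages[branch]


for branch in comparisons:
    ratio = matches_per_language(branch)
    print(f"{branch}: {ratio:.2f} comparisons per language")
```

Even using the Ethnologue figure of 9 Saharan languages instead of 4, the Saharan ratio stays far above the Eastern Sudanic one.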

The grammatical comparisons that Greenberg offers are interesting but not compelling; there are only 10 of them (only 4 with Kwarandzyey reflexes), and they often incorporate misrepresentations (as Lacroix noted, for example, -ma forms verbal nouns, not relatives/adjectives, and 1sg ay < *agay, reducing the similarity to forms like Zaghawa ai.) Some of the lexical ones, however, are rather good; similarities such as Koyraboro Senni kokoši “scale (of fish)” = Manga Kanuri kàskàsí “scale (of fish)” cry out for explanation, and, though quite rare, look sufficiently numerous that chance seems unlikely. But whether they should be explained by contact or by common ancestry remains unclear. Either scenario would be historically interesting, since at present rather a large expanse of Tuareg and Hausa-speaking land separates Songhay from even Kanuri, and Saharan originated closer to modern-day Darfur than to Lake Chad.

Sunday, October 18, 2009

Arabic loanwords in "proto-Nilo-Saharan"

Ehret 2001 (or see Nostratic.ru) looks at first sight like an astonishingly detailed reconstruction of Nilo-Saharan, with nice binary splits and loads of technology-related words for archeologists and anthropologists to sink their teeth into. Why shouldn't specialists take advantage of this amazing opportunity to correlate historical developments to linguistic ones?

I just found a handy answer to that question. Bender (1997:175ff) gives the 15 cognate sets in Ehret 2001 that are represented in the most sub-families of Nilo-Saharan. 3 of the 15 look distinctly like Arabic loans.

1387 *wàs “to grow large”: Fur wassiye “wide” and Songhay wásà “to be wide” are both from Arabic wāsi`- واسع. The other items cited – Ik “stand”, Kanuri “yawn”, Kunama “increase, augment”, and Uduk “to tassel, of corn” – are scarcely obvious candidates for being related to one another in the first place.

1297 *là:l “to call out (to someone)”: Kanuri làn “to abuse, curse” and Songhay láalí “to curse” are obviously from Arabic la`an- لعن; Kunama lal- “to denigrate” might be from the same source. That only leaves Uduk “to persuade, incite to do something” and Proto-Central-Sudanic “to call out”.

718 *t̪íwm “to finish, complete”: almost certainly Songhay tímmè “to be finished”, very likely Uduk t̪ím “to finish”, Ocolo t̪um “to finish”, and maybe even Fur time “total”, are from Arabic tamm- تمّ (impf. -timm-), as Bender (ibid:177) considers probable. That leaves Proto-Central-Sudanic, Kunama, and Maba “all”, Kanuri “ideophone of dying animal” (!), and Proto-Kuliak “buttocks”. The “all” set looks rather promising – the whole etymology, not so much.

There are plenty of other Arabic loanwords in Ehret's “Proto-Nilo-Saharan” – a particularly egregious example is Kanuri zàmzàmíyɑ̀ “leather bottle-shaped water vessel for journeys” (#1223 *zɛ̀m “to become damp, moist”), and other especially clear-cut cases include #1173 < sawṭ, #1185 < šamm – but the fact that they include a significant proportion of the best cognate sets is what really strikes me. If a reconstruction attempt can't distinguish a widely distributed recent loan from a cognate set that split more than eleven thousand years ago, any information it gives about readily diffused items like technologies is completely unreliable. For another review from a similar perspective, try Blench 2000 (not sure why it appeared a year before the book's nominal publication date...)

The more I read about Nilo-Saharan, the less convinced I am that it exists (much less that Songhay belongs to it.) That means the classification of the languages of quite a lot of Africa is basically up for grabs. It would be great to have a reexamination of the area.

Wednesday, September 30, 2009

Why would "qaswarah" be claimed to be Ethiopic?

In the Qur'ān, 74:51, an interesting word occurs:

{ كَأَنَّهُمْ حُمُرٌ مُّسْتَنفِرَةٌ } * { فَرَّتْ مِن قَسْوَرَةٍ }
ka'annahum ħumurun mustanfirah * farrat min qaswarah
As if they were wild donkeys. Fleeing from a Qaswarah.

This tends to be rendered as "lion" in English, but the early commentators indicate that that is only one of several possible meanings of the word. al-Ṭabari (d. 310 AH) gives four (all supported by chains of transmitters whose reliability I am not competent to judge): الرماة archers, القُنَّاص hunters, جماعة الرجال a group of men, الأسد a lion. The point of interest here is that two of these explanations are supported by allusions to Ethiopic:

حدثنا هناد بن السريّ، قال: ثنا أبو الأحوص، عن سِماك، عن عكرِمة، في قوله: { فَرَّتْ مِنْ قَسْوَرَةٍ } قال: القسورة: الرماة، فقال رجل لعكرِمة: هو الأسد بلسان الحبشة، فقال عكرِمة: اسم الأسد بلسان الحبشة عنبسة.

...[`Ikrimah] said: "al-qaswarah is archers." Then a man told `Ikrimah: "It is 'lion' in the language of the Ḥabashah (Ethiopians)." `Ikrimah said: "The name of the lion in the language of the Ḥabashah is `anbasah."

حدثني محمد بن خالد بن خداش، قال ثني سلم بن قتيبة، قال: ثنا حماد بن سلمة، عن عليّ بن زيد، عن يوسف بن مهران عن ابن عباس أنه سُئل عن قوله: { فَرَّتْ مِنْ قَسْوَرَةٍ } قال: هو بالعربية: الأسد، وبالفارسية: شار، وبالنبطية: أريا، وبالحبشية: قسورة.

...[Ibn `Abbās] said: It is 'asad (lion) in Arabic, and in Persian šēr (شير), and in Nabataean 'aryā (ܐܪܝܐ), and in Ethiopic: qaswarah.

The thing is, it looks like `Ikrimah was right: in Ethiopic, "lion" is indeed `anbasā (ዐንበሳ), and no Ethiopic word qaswarah has been found. Qaswarah is most likely an originally Arabic word. But these were intelligent people, and the saying attributed to Ibn `Abbās above is obviously right about Persian and Nabataean; why would they say that qaswarah was the Ethiopic word for "lion" if it wasn't? One obvious possibility is that they were referring to another language of the Ethiopia region. This cannot be ruled out, since many languages of the area have no doubt gone extinct without documentation since then; but it looks as though the words for "lion" in Somali, Oromo, Beja, Agaw, Sidamo, Nubian, Nara, and Kunama are rather different. One might momentarily be tempted to think of Berber, cp. Nafusi war, but that's certainly not long enough.

Could the idea that qaswarah is "lion" in Ethiopic have derived from a misreading of `anbasa at some point? That certainly wouldn't be plausible in Arabic. It doesn't look all that plausible in Ethiopic either: ዐንበሳ doesn't look all that similar to ቀስወራ. But there is another alphabet that might conceivably have been involved: the musnad, the Old South Arabian letters that continued to be used in Yemen into the Islamic period. In this alphabet, ` ع is quite similar to q ق, and n to s. The other two letters are rather less similar, but I can imagine b plus the right side of s being miscopied as w, and the remainder of s being reinterpreted as r. Here's roughly how the two words (qswr on the left, `nbs on the right) would have looked (ignoring the possibility of a final feminine -t):

Suppose this is right. Why then would someone at the time have learned an Ethiopic word from a text written in the musnad, rather than by asking an Ethiopian? Histories and travelogues are both genres attested in the Middle East of the time, and might have found occasion to mention in passing the Ethiopian word for "lion", given its cultural importance (it is a common theme in Aksumite art, and in later Ethiopia was adopted as a royal title.) Some Yemeni scholar who's never been to Ethiopia reads a miscopied version of such a history, thinks: ah, this must be the same word as in the Qur'ān, and goes on to tell everyone he knows, including (if the attribution is correct) Ibn `Abbās.

But there's a difficulty here: all that's ever been discovered in the musnad is stone inscriptions and occasional letters. No books have survived at all, much less histories or travelogues. And if there were books, you would think they would be written in the cursive script used in the letters, rather than the monumental script of the inscriptions - which reduces the similarity of the two words even more (see the table on p. 13 of History of the Arabic Script for cursive forms.)

On the other hand - anyone have a better idea?