Jabal al-Lughat

Wednesday, June 04, 2025

Eastern Sudanic subgroup reconstructions

This is basically a note to myself, and may be updated.

Eastern Sudanic is generally taken to embrace most of the languages of Sudan, including the following families:

Nubian
Nara
Taman
Nyima
Jebel
Daju
Surmic
Nilotic
Temeinic

Its existence, however, remains debatable (cf. Güldemann 2022). A reconstruction of Eastern Sudanic (much less anything above it, such as Nilo-Saharan) remains out of reach. If it is possible at all, it will most likely need to be based on prior reconstructions of each of these subgroups. It is therefore useful to outline what has been done in terms of reconstruction.

Rilly's (2010) monograph identifies a clearer family consisting of Nubian, Nara, Taman, and Nyima (along with the extinct Meroitic), which he labels North Eastern Sudanic ("soudanique oriental du nord"), and for which he proposes some 200 lexical reconstructions. In the process, he also offers 200-word reconstructions of proto-Nubian and proto-Taman, finding it necessary for the former to amend Bechhaus-Gerst's reconstruction of 97 items significantly, and drawing for the latter primarily on Edgar (1991).

Nara is a single language, whose dialectal diversity is not sufficiently well documented to make even internal reconstruction feasible.

Nyima consists of two languages, both poorly documented; Rilly gives provisional reconstructions.

For (Eastern) Jebel, Bender (1998) proposes an extremely provisional reconstruction of 100 items, outlining major sound correspondences.

Proto-Daju is reconstructed in the Ph.D. thesis of Thelwall (1981), who provides more than 300 lexical reconstructions along with the principal sound correspondences, but keeps discussion of morphology and syntax to a minimum.

Proto-Surmic has yet to be reconstructed; Yigezu (2001), however, reconstructs 200-300 words for each of two of its three subgroups, Southwest and Southeast. (The third is a single language, Majang.)

For Proto-Nilotic, Dimmendaal (1988) provides a "first reconnaissance", giving 204 items and ignoring tone; the work of Hall et al. (1975) and Hieda (2006) also deserves notice. Much more elaborate monograph-length reconstructions are available for Eastern Nilotic (Vossen 1982) and Southern Nilotic (Rottland 1982); each of these provides about 200 items for the relevant proto-language along with quite a few more for lower-level subgroups. Western Nilotic has not been reconstructed, but one sub-subgroup, Southern Luo, has been reconstructed in Heusing (1983).

Temein, with three poorly documented members, has not been reconstructed.

In brief: out of nine primary Eastern Sudanic families, none has yet been reconstructed in detail. Where reconstructions at this level exist, they cover a limited number of sound correspondences (usually segmental, ignoring tone), and a couple of hundred basic words; discussion of morphology is limited to a few prominent affixes.

Wednesday, December 11, 2024

More Mabaan pharyngeals

Thomas Anour has posted a number of Bible extracts: Mark 10:13-18, John 1:1-13, and James 4:1-3. Comparing these to a published translation from 2002 (from which he sometimes diverges slightly) and to the anonymous dictionary linked in the previous post makes it possible for a beginner to parse much of the text. No more examples of /ħ/ were heard; but another pharyngeal, /ʕ/, was. This phoneme is absent from the online audio version of this Bible translation, but can be heard clearly in Thomas Anour's pronunciation of at least three frequent words, despite occasional variation, and seems to contrast with the glottal stop /ʔ/, as illustrated by the last few lines of the following table. While one of the words with /ʕ/ is an Arabic loan, the rest clearly are not.

Unfortunately, I don't know yet where it's coming from. I have yet to find any useful cognates to the words with the pharyngeal in the rest of Nilotic, or even in the meager Jumjum dictionary. "We" corresponds to Nuer <kɔn> and (probably?) Dinka /wɔ̂ɔk/.

English	Mabaan (Anour)	Mabaan (anon)	Mabaan (Andersen)
and	[ʕɔ́sì]	ɔci	ʔɔ́cé
so that	[ʕáŋkàː]	aŋ-ka	ʔáŋkà
because (< Ar.)	[ʕásàan]	acaan
where	[ʔáŋɛ̀]	aŋɛ
quotative particle	[ʔàgɪ́]	agi	ʔàgē
we	[ʔɔ̂ːn]	ɔɔn	ʔɔ̆ɔn

Tuesday, December 10, 2024

Mabaan pharyngeals

The least well documented subgroup of West Nilotic is the Burun group, spoken around the borders between Sudan, South Sudan, and Ethiopia. The largest language in this subgroup is Mabaan, spoken in South Sudan, for which there exists at least one dictionary (available without bibliographic information on Roger Blench's site), and several very interesting articles by Torben Andersen. But we are no longer in the era where a non-field linguist could be content to look at printed sources alone; there is a fair amount of Mabaan content on YouTube, including a channel by a BA-trained linguist and first language speaker of Mabaan, Thomas Anour: Learn Maban, African Language with Thomas Anour. (Like and subscribe, or whatever it is you're supposed to do on YouTube to encourage creators.) Between these, that makes enough material to observe an interesting phonological difference.

In Mabaan as described by Torben Andersen and in the aforementioned anonymous dictionary, /h/ seems to show up only in interjections or loans, and /ħ/ is not mentioned at all. The variety spoken by Thomas Anour, however, features a number of words with initial [ħ] (occasionally varying with [h]). A single cognate in a North Burun language, Mayak, suggests that this is the reflex in his variety of *r, which otherwise becomes a semivowel in Mabaan; more would be desirable.

English	Mabaan (Anour)	Mabaan (anon)	Mabaan (Andersen)	Jumjum (Fadul et al.)	Mayak (Andersen)
sorghum field (?)	<hill> [ħîl]	<yielo> "field for dura grain"	-	<yiil> "field, farm"	-
rat	<heeñ> [ħéːɲ]	<yyeño> "rat"	yiiêɲ-ʌ̀ "~, mouse"	<yiiñ>	rii-nit̪
sausage tree	<heeṭṭa> [ħétà]	<wyeṭṭa> "pod of ~"	-	-	-
desert	<hong> [ħʌ̂ːŋ]	<wɔɔŋ> "wilderness, desert"	-	-	-
salmon (sic)	<hitta> [ħítàː]	-	-	-	-
excuse (Ar. izin)	<honda> [ħʌ̀ndá]	-	-	-	-
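For anyone who wants to tabulate this rather than eyeball it, here is a minimal sketch in Python that counts which initial the anonymous dictionary shows for each word Anour pronounces with [ħ]. The forms are copied from the table above. The names `ROWS` and `initial_correspondences` are just illustrative, and taking the first character as "the initial segment" is a simplifying assumption that happens to work for these four rows. It would need real segmentation for anything larger.

```python
from collections import Counter

# (gloss, Anour's phonetic form, anonymous-dictionary spelling), copied from
# the table above; rows without a dictionary entry are left out.
ROWS = [
    ("sorghum field", "ħîl", "yielo"),
    ("rat", "ħéːɲ", "yyeño"),
    ("sausage tree", "ħétà", "wyeṭṭa"),
    ("desert", "ħʌ̂ːŋ", "wɔɔŋ"),
]


def initial_correspondences(rows):
    """Count (Anour initial, dictionary initial) pairs.

    Assumption: the first character of each string stands for the first
    segment, which holds for the rows above but not in general.
    """
    return Counter((anour[0], anon[0]) for _, anour, anon in rows)


counts = initial_correspondences(ROWS)
for (anour_initial, anon_initial), n in sorted(counts.items()):
    print(f"{anour_initial} : {anon_initial}  x{n}")
```

On these four rows, [ħ] lines up with a dictionary semivowel every time, twice with y and twice with w, which is at least compatible with the idea that both continue a single older consonant.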

Edit (12/12/2024): The Elenchus comparativus (von Hurter, 1800) records, s.v. "souris" (mouse), <hén> for "Abugonos Burun" vs. <rine> for "J. Kurmuk". This is the only word in the list transcribed with initial h - and the only word on the list corresponding to any of the ones above - but seems sufficient to suggest that this pronunciation is indeed old. Among words with *r, one notes Abugonos <yonga> "meat" and <ímaghi> "blood" (Kurmuk <rin>), which do not support the hypothesis of *r > ħ, but, given the imprecise transcription, do not disprove it either. My thanks to Shuichiro Nakao for sending me a link to this exceptionally early source.

Thursday, September 26, 2024

Tlemcen: medieval folk etymologies and their implications

In the mid-14th century work Bughyat al-ruwwād fī dhikr il-mulūk min banī ʕAbd al-Wād, Yaḥyā Ibn Khaldūn (brother of the more famous Ibn Khaldūn) ventures two possible etymologies for the name of Tlemcen (Standard Arabic Tilimsān, dialectal Arabic Tləmsān):

تسمى بلغة البربر تلمسنين كلمة مركبة من تلم ومعناه تجمع وسين ومعناه اثنان اي الصحراء والتل فيما ذكر شيخنا العلامة ابو عبد الله الابلي رحمه الله وكان حافظا بلسان القوم ويقال ايضا تلشان وهو ايضا مركب من تل ومعناه لها وشان اي لها شان
In the Berber language it is called "T.l.msīn", a word composed of t.l.m, meaning "she/it gathers", and sīn, meaning "two" - i.e. the Sahara and the Tell - according to our shaykh the most learned Abū ʕAbd Allāh al-Ābilī, may God have mercy on him, who was well-versed in the people's tongue. It is also said "T.l.šān", which is also a compound, of t.l., meaning "she/it has", and šān, i.e. "it has status".

Both etymologies are easy enough to interpret in the light of comparative Berber data. In the nearest (barely) surviving Berber variety - Beni Snous (Aṯ Snus), some 40 km west of the town - "Tlemcen" is indeed Tləmsin, not Tləmsan (cf. Destaing's Etude, pp. 368, 370, 371, etc.). This variety, however, does not use the word sin for "two" - it uses ṯnayən, like the Rif to its west (cf. Destaing, Dictionnaire, p. 98). The closest varieties to preserve a Berber word for "two" - geographically and genetically - use sən, in common with the rest of the Zenati subgroup to which Beni Snous belongs. The nearest varieties using the form sin are Kabyle, far to the east, and Middle Atlas Tamazight and Tashlḥiyt, far to the west. For the verb, one might consider t-əlləm "she/it spun", but the gloss given better matches a widespread dialectal Arabic word that could well have been borrowed into Berber: t-ləmm "she/it gathers". The second is obviously a compound of Arabic ša'n "affair, rank, status" and the Berber verb t-la "she/it has". Today this verb survives in Beni Snous, as in Kabyle, only residually, in the construction wi-h y-il-ən "who does it belong to?" (Destaing, Grammaire, p. 88). But it may have been more productive at that time, as it still is in Middle Atlas Tamazight.

Obviously, the first of these etymologies is implausible, while the second is a self-aggrandising play on words rather than an attempt to explain the name. But the fact that the first one could seriously be suggested is strong evidence that the meaning of Tlemcen was no more transparent to 14th century Berber speakers than it is to 21st century ones - as is not unusual for placenames. A better etymology can be proposed by taking into account comparative data - one that also allows us to explain the cross-linguistic differences in the final vowel - but I'll leave that for another day.

Sunday, September 15, 2024

"Berber" language in early Arabic texts

Searching Shamela, I recently realised that the earliest references to a language of (al-)Barbar in Arabic go back further than I had assumed, to the second century AH. While these are unlikely to shed much light on actual linguistic practices, they are worth a look.

Two occur in the Qur'ānic commentary (tafsīr) of Muqātil ibn Sulaymān (d. 150 AH = 767 AD); one in his discussion of sūrat al-Isrā':

فقال عبد الله بن الزبعري السهمي: إن الزقوم بلسان بربر التمر والزبد. قال أبو الجهل: يا جارية ابغنا تمرا فجاءته. فقال لقريش وهم حوله تزقموا من هذا الزقوم الذي يخوفكم به محمد.
And ʕAbd Allāh ibn al-Zabʕarī al-Sahmī said: Zaqqūm means 'date' and 'butter' in the tongue of Barbar. Abū Jahl said "Slave-girl, find us some dates", and she came to him, and he said to Quraysh as they were about him: "Have some of this zaqqūm that Muhammad is scaring you with".

And one, almost identical, in his discussion of sūrat al-Dukhān:

بلسان بربر وأفريقية الزقوم يعنون التمر والزبد، زعم ذلك عبد الله بن الزبعري السهمي، وذلك أن أبا جهل قال لهم: إن محمدا يزعم أن النار تنبت الشجر وإنما النار تأكل الشجر، فما الزقوم عندكم؟ فقال عبد الله بن الزبعري: التمر والزبد. فقال أبو جهل بن هشام: يا جارية، ابغنا تمرا وزبدا. فقال: تزقموا.
In the tongue of Barbar and Ifrīqiyah, by zaqqūm they mean 'date' and 'butter'. So claimed ʕAbd Allāh ibn al-Zabʕarī al-Sahmī, on the basis that Abū Jahl said to them "Muhammad claims that the Fire grows trees, yet fire consumes trees! So what is zaqqūm according to you?" And ʕAbd Allāh ibn al-Zabʕarī said: "Dates and butter". And Abū Jahl ibn Hishām said: "Slave-girl, find us some dates and butter!" And he said: "Have some zaqqūm."

One is vaguely reminded of Siwi a-zəggar "a (single) date"; but if indeed this were a word of "Barbar and Ifrīqiyah", one would hardly expect it to trip from the lips of Abū Jahl, and still less to be familiar to his audience. (For those unfamiliar with the context: Abū Jahl was the foremost enemy of early Islam, and ʕAbd Allāh ibn al-Zabʕarī was an anti-Islamic poet; their assertion that in some foreign language zaqqūm means "dates and butter" was almost certainly intended as mockery of the Qur'ān, not as serious lexicography.) The juxtaposition of "Barbar and Ifrīqiyah", however, seems to corroborate that the intended reference is indeed to the Berbers rather than to any East African groups.

Another is to be found in the earliest Arabic dictionary - the Kitāb al-ʕAyn, by al-Khalīl ibn Aḥmad:

والقَيطونُ: المخدع في لغة البربر ومصر.
qayṭūn: chamber, in the language of the Barbar and Egypt.

The ultimate origin of this word, unlike that of zaqqūm, is perfectly clear: it comes from Greek κοιτών, probably via Aramaic. The term is used in Coptic too: ⲕⲟⲓⲧⲱⲛ. In modern Berber varieties its form (e.g. aqiḍun) has a q, suggesting a more recent borrowing from Arabic, but one may reasonably suspect that Berbers in Cyrenaica and the Western Desert (most of whom have since switched to Arabic) would have been familiar with some version of the Greek term before Arabic influence.

A number of references to the "Barbar" are to be found in the works of Imām Mālik ibn Anas. None of these refer to the language, but one, in al-Mudawwanah, makes a clear reference to skin colour in the context of what counts as a legally punishable insult, confirming (unless this was a later scribe's addition) that the term's reference was indeed to North rather than (as Rouighi's hypothesis might suggest) East Africa:

بَلَغَنِي أَنَّ مَالِكًا قَالَ فِي الْمَوَالِي كُلِّهِمْ: مَنْ قَالَ لِبَرْبَرِيٍّ يَا فَارِسِيُّ أَوْ يَا رُومِيُّ أَوْ يَا نَبَطِيُّ أَوْ دَعَاهُ بِغَيْرِ جِنْسِهِ مِنْ الْبِيضِ كُلِّهِمْ فَلَا حَدَّ عَلَيْهِ فِيهِ، أَوْ قَالَ يَا بَرْبَرِيُّ وَهُوَ حَبَشِيٌّ فَلَا حَدَّ عَلَيْهِ وَهُوَ قَوْلُ مَالِكٍ.
It has reached me that Mālik said of mawlās in general: "Whoever says to a Berber 'Hey, Persian!' or 'Hey, Roman!' or 'Hey, Nabaṭī!', or calls him by any other nation of the whites - he is not to be punished for it. Or if he says 'Hey, Berber!' when he is really Ethiopian, he is not to be punished for it." That is Mālik's statement.

Monday, August 19, 2024

Miscellaneous Darja notes, 2024

When I wrote my paper on the Arabic dialect of Dellys a couple of decades ago, I described it as using the participle "going (to)" - ṛayəħ رايح (m.) / ṛayħa رايحة (f.) / ṛayħin رايحين (pl.), depending on the gender/number of the subject - to form the future. In more recent years, I've started to notice women speakers reducing feminine ṛayħa and plural ṛayħin to ħa حا and ħin حين respectively. On this trip, I heard a young woman say waš ħa-yəqṛa? واش حايقرا؟ "what will he study?", with ħa unmistakeably generalised even to the masculine, as in Egyptian.

Another probable innovation in progress is the spread of -ti- as a possessive linker between CCaC nouns and pronominal suffixes. I was already familiar with forms like ṣbaʕ-ti-k صباعتيك "your fingers", but ṛwaħ-ti-na رواحتينا for "ourselves" and even šɣal-ti-hŭm شغالتيهُم "their tasks" (alongside šɣalat شغالات "tasks"!) are harder to motivate.

I alluded in my last post to a sort of possessive perfect construction; the reference was to ʕənd- ma..., meaning "have Xed plenty". It can quantify the event, as in ʕənd-u ma lbəs-hŭm عندهُ ما لبسهُم "he's worn them plenty", or the object, as in ʕənd-u ma lbəs عندهُ ما لبس "he's worn plenty", or ʕənd-u ma šaf عندهُ ما شاف "he's seen a lot".

The usual "whatchamacallit" filler word in this region is laxŭṛ لاخُر "the other (thing)". I heard one example showing that it can follow the passive prefix, and therefore substitute for verb roots as well as full stems: yə-t-laxŭṛ - yə-t-rigla يتلاخُر - يتريڨلا "get whatsited - fixed".

As in Standard Arabic, the past imperfective is regularly formed with kan "be" plus the imperfective. It's worth noting, however, that either of the two verbs can be negated: yəmma kanət ma tŭxrəj-ši يمّا كانت ماتُخرجشي "Mom used to not go out".

Codeswitches always raise the question of complement selection. In siṛaṛ win nərgŭd سيرار وين نرڨُد "rarely do I fall asleep", the loan c'est rare would take the complementiser que "that" in French, but shows up here with win "where" instead, corresponding not to French but to the construction normally used with the corresponding Arabic word, qlil قليل "few".

From an elderly aunt, I heard a double-object form that intuitively seems completely impossible in Darja to me: ila ma tḏəkkəṛnihaš إيلا ما تذكّرنيهاش "if you don't remind me of it". I think it's a religious classicism (unusually, she's literate despite having grown up before the Revolution), but noting it here in case more examples turn up.

There's a lot to be said about feminine -a in plurals. Usually it corresponds to final nisba -i in the singular, but it consistently shows up in the plural of family names (e.g. ṣwawga صواوڨة "Souags"), and I noticed it on a non-nisba profession noun: šifuṛ شيفور "driver", pl. šwafṛa شوافرة, where one might otherwise have expected *šwafəṛ. I don't recall hearing any of these words in the construct state, but for "brothers" (xiwa خيوة or xawa خاوة), forms like xiwətna خيوتنا "our brothers" seem to confirm that this really is morphologically identical to the feminine singular ending, rather than just being a different morpheme with the same vowel.

Conflicting evidence on the phonological representation of the French loanword ṣak صاك "purse, bag": "my bag" is ṣakki, with a geminate, but "bags" is (or can be) ṣikạn صيكان, with no geminate, and with an unexplained emphatic [ɑ] in the second syllable.

Secondary gemination in central Algerian Arabic is too large a topic to cover here - I have a draft paper on it I really ought to publish - but I was amused to hear it applied to the English loanword bəznəs "do business (esp. shady)": 3pl. impf. ibəzzənsu يبزّنسوا "they do business".

A couple of idioms: ma tkəssəṛ-š ṛaṣ-ək ما تكسّرش راصك "don't bother yourself, don't go to the trouble" (lit. "don't break your head"); ma fiha walu ما فيها والو "no problem, it's not an issue" (lit. "there's nothing in it"). "Easier than easy" isn't an idiom as such, but a construction worth attention: əlqur'an sahəl fuq əsshuliyya u waʕər fuq əlwʕuriyya القرآن ساهل فوق السهولية وواعر فوق الوعورية "the Qur'an is easier than easy and harder than hard."

Some proverbs: duga duga təbbəʕ əṭṭṛig əlməgduda دوڨا دوڨا تبّع الطريڨ المڨدودة "little by little, follow the level path"; kŭll ṯqil fəlmizan xfif كُلّ ثقيل في الميزان خفيف "any load is light in the balance"; ərrʷgad ṣəlṭan الرّڨاد صلطان "sleep is a despot".

A few Darja words I've learned this summer - etymologically obscure, except the last:

gŭrgab ڨُرڨاب - fishing-ground, spot in the sea with lots of fish.
ẓəṛtiṭa زرطيطة - grape stalk (pedicel), subset of ʕənqud عنقود "cluster"; also refers to the last grapes of the season
mčəʕčək متشعتشك - (of hair) tangled, messy
ṭəṛṛaħ طرّاح - mattress maker

Wednesday, August 14, 2024

Spengler and morphology

While Spengler is better known for his efforts (in Decline of the West) to establish a historical morphology of cultures, he also briefly branches out there into linguistic historical morphology:

Instead of sum, Gothic im, we say ich bin, I am, je suis; instead of fecisti, we say tu habes factum, tu as fait, du habes gitân, and again, daz wîp, un homme, man hat. This has hitherto been a riddle because families of languages were considered as beings, but the mystery is solved when we discover in the idiom the reflection of a soul. The Faustian soul is here beginning to remould for its own use grammatical material of the most varied provenance. The coming of this specific “I” is the first dawning of that personality-idea which was so much later to create the sacrament of Contrition and personal absolution. This “ego habeo factum,” the insertion of the auxiliaries “have” and “be” between a doer and a deed, in lieu of the "feci" which expresses activated body, replaces the world of bodies by one of functions between centres of force, the static syntax by a dynamic. And this “I” and “Thou” is the key to Gothic portraiture. A Hellenistic portrait is the type of an attitude — a confession it is not, either to the creator of it or to the understanding spectator. But our portraits depict something sui generis, once occurring and never recurring, a life-history expressed in a moment, a world-centre for which everything else is world-around, exactly as the grammatical subject “I” becomes the centre of force in the Faustian sentence. (Atkinson translation, 1926, pp. 262-3)

A linguist - or a scientist - is not particularly well-positioned to judge airy intuitions about "the dawn of a new life-feeling" or the emergence of a "Faustian soul"; how would one even go about testing such claims rigorously? But the emergence of forms like these is rather better studied. On a world scale, there is nothing unusual about fecisti - plenty of languages collapse the subject pronoun, tense/aspect/mood, and the verb into a single word. In fact, a good 70% of languages in Siewierska's WALS survey mark subject agreement on the verb one way or another. Nor is their completely analytic separation all that rare (22% of the same sample). However, Siewierska's work reveals that there really is something unusual about a form like ich bin, where the independent pronoun is obligatory yet the verb still agrees with it:

My cross-linguistic investigations of verbal person markers reveal that person markers which require the presence of accompanying independent nominals or pronominals are very rare. In a sample of 272 languages I found only two such markers, in Dutch and Vanimo, a New Guinea language of the Sko family. The only other languages that I have come across which display such markers are: English, German, Icelandic, Faroese, some Rhaeto-Romance dialects, Standard French and perhaps Labu, an Austronesian language of New Guinea, and Anejom a language of Vanuatu. (Siewierska 2001:219)

Perfect marking based on "have" also turns out to be a European feature with almost no parallels elsewhere, as shown by Dahl and Velupillai's WALS chapter on the perfect; such perfects emerged and spread there only in the Middle Ages as an areal innovation. (Bridget Drinka has worked on this question in more detail.)

For Spenglerians, then, these two would at first sight seem to be very promising features to focus on - two globally very rare features, known to have emerged in Europe only after the fall of the Roman Empire, equally innovative in Romance and Germanic and prominent in both. However, there are naturally a couple of hitches.

Person marking requiring independent pronominals is essentially a North Sea feature; though found in French, it never made it far enough south to be shared by Italian and Spanish. Given the prominent role of Italy in Spengler's account of the emergence of Faustian culture, a Spenglerian would presumably be forced to dismiss this feature or to find some workaround. If he retained it, the obvious next task would be interesting: to examine the scattering of South Pacific languages which share this feature and see if their speakers' attitudes to the ego (?) show any relevant parallels to Western European ones.

Perfect marking with "have" shows a better match with the hypothesised Faustian culture-area - but extends a little beyond it, to Albania and to some extent Greece. (Indeed, even Algerian Arabic shows a rather marginal possessive perfect construction.) This presumably reflects Romance influence on these languages in the context of Western Europe's rise to power in the Mediterranean, though one wonders why nothing similar happened outside the Mediterranean. Not a problem for a historical linguist; but would a Spenglerian be forced to take this as evidence for a change in ego-conceptions in those regions, and seek corroboration for it in literature and painting?