Wednesday, August 06, 2025

Darja miscellaneous notes 2025

Every time I go to Algeria, I come back with some linguistic observations that are new to me (if not necessarily to anyone else). Here are this year's.

Many collective nouns take plural agreement: sqit əššjəṛ əttəħtaniyyin “I irrigated the lower trees”, kanu sjəṛ “there were trees”, ənnməl haðu “these ants”. Not all do, though, or at least not all the time: nnamus bəkri kʊnna nšufuh nəqqʊtluh “mosquitoes, in the old days, if we saw them (lit. it) we’d kill them (lit. it).” A topic worth looking at in more detail.

“Have”-based expressions for “ago” are familiar from Romance languages; in Darja, however, they agree with the notional possessor, e.g. dərtu ma-ʕəndi-š bəzzaf ‘I did it not long ago’ (lit. “I did it I don’t have much”). Along similar lines, the subject of ʕla bal-i “I know” (originally “on my awareness”) was originally the theme, the fact known. Synchronically, however, utterances like ma-kʊnt-š ʕlabal-i “I didn’t know” (lit. “I was not I know”) suggest this is no longer the case.

Another example of næ̃mpoṛt (discussed here previously): u xəllih yakʊl næ̃mpoṛt ħaja ‘and let him eat anything’.

The construct state has undergone some interesting developments. Most masculine nouns have no distinct construct state, and most feminine nouns form a construct state by replacing -a with -ət. If we factor out, for the present, the stem-internal effects of schwa-zero alternations and compensatory gemination, then, for most nouns, we can speak of a single construct state used for head nouns followed by possessor NPs or by suffixed possessor pronouns alike. However, a few nouns show a different distribution. Several kinship terms in -a take the suffixes directly: yəmma-k ‘your mother’, baba-k ‘your father’, jədda-k ‘your grandmother’, even ṭaṭa-k ‘your auntie’. (These nouns have zero-marked 1Sg possession: yəmma u yəmma-k “my and your mother”.) Such nouns usually take clitic-doubled possessives (yəmma-ha ntaʕ Baya ‘Baya’s mother’, lit. ‘her mother of Baya’); however, if used in the regular synthetic possessive (“iḍāfah”) construction, they take a suffix -t, e.g. yəmma-t yəmma-k “your mother’s mother”. For these nouns, it seems tempting to postulate two construct states rather than one.
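As a rough sketch of the distribution just described, the two patterns could be modelled as below. The function name, the 'pronoun'/'np' labels, and the use of ṣfiħa as a regular feminine example are illustrative assumptions, and stem-internal schwa alternations are ignored, as in the text.

```python
# Kinship terms in -a cited in the post; the set is not claimed to be exhaustive.
KIN_TERMS = {"yəmma", "baba", "jədda", "ṭaṭa"}


def construct_state(noun: str, possessor: str) -> str:
    """Return the (simplified) construct form of a noun.

    possessor is 'pronoun' for a suffixed possessor pronoun,
    or 'np' for a following possessor noun phrase.
    """
    if possessor not in ("pronoun", "np"):
        raise ValueError("possessor must be 'pronoun' or 'np'")
    if noun in KIN_TERMS:
        # Two construct states: bare before pronouns, -t before a possessor NP.
        return noun if possessor == "pronoun" else noun + "-t"
    if noun.endswith("a"):
        # Simplification: treats every other noun in -a as a regular feminine
        # (-a replaced by -ət), ignoring schwa-zero alternations.
        return noun[:-1] + "ət"
    # Most masculine nouns: no distinct construct state.
    return noun


print(construct_state("yəmma", "pronoun") + "-k")  # pronoun suffix attaches directly
print(construct_state("yəmma", "np") + " yəmma-k")  # possessor NP triggers -t
```

The point of the sketch is only that kinship terms need the `possessor` argument at all, whereas for other nouns a single form serves both contexts.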

The noun pattern CəCCayC is not particularly productive, but I heard a new example: tərtayqat “firecrackers” (cf. tərtəq “pop”). Other examples include ħərrayqa “jellyfish” (ħrəq “burn”), xʊṭṭayəf “swallow (bird)” (xṭəf “snatch”), bu-zəllayəq “blenny (fish)” (zləq “slip”).

Feminine nouns without overt feminine marking form diminutives with overt feminine marking: yədd ‘hand’ > ydida ‘little hand’. Very few masculine nouns have apparent feminine marking, but x(a)lifa ‘caliph’ is one such; məskin əlxliyyəf haðak “poor little caliph!” shows that the converse is also true, i.e. that masculine nouns with apparent feminine marking form diminutives without it.

The verbal template CəCCəC is in general semantically and syntactically distinct from its corresponding passive/middle tCəCCəC. However, the distinction is neutralised in the participles: mwəð̣ð̣i “washed for prayer” from twəð̣ð̣a, mkəṛməṣ “dried (of figs)” from tkəṛməṣ “dry (of figs, intr.)”. Some speakers, however, do say mətwəð̣ð̣i.

Passives in n usually involve a simple coda n, but I heard clear gemination in li baš yənnəqsəm ‘for it to be divided’. The question of gemination in triliteral passives would deserve a closer look.

Weak-final triliteral verbs tend to add -an- in verbal nouns: tənħaniyya “removal” from nəħħi “remove”.

A few emotional idioms: bərrəd qəlb-u “he cooled his heart”, i.e. he satisfied his heart’s desire; ṭəyyəṛhali “he made it fly for me”, i.e. he made me lose my temper; ṭəḷḷəʕlu lgaz “he raised the gas for him”, i.e. he made him angry. A proverb: triq əlʕafya tənẓaṛ yalukan tkun bʕida “the road of safety gets visited even if it’s far away.”

The usual ‘whatchamacallit’-word in Dellys and elsewhere in Algeria is laxʊṛ, originally “the other one”, used to substitute for verbs as well as nouns. However, from a relative about 90 years old, I heard a different construction based on haðak “that”: ma-yhaðak-š “he doesn’t whatsit”. This is paralleled in Malta and Morocco, so presumably it used to be more widely used.

The usual word for “knife” in Dellys is mus, but xʊdmi (usual in Bechar) is also in use. However, I hadn’t previously heard xʊdmiša. The curious final š can perhaps be explained as a borrowing from Berber, in some varieties of which ṯaxʷəḏmiyṯ would regularly yield ṯaxʷəḏmišṯ.

French cinquante is often heard as sikõnt “fifty”. The vowel is difficult to explain – influence from another Romance language?

Some words new to me: gərziz “empty gum, empty tooth socket”; ma-ksan-š “he’d rather not”; ṣfiħa “horseshoe”.

The ʕ in the verb ‘give’ is often elided: aṭini “give me” for regular aʕṭini.

I don’t think triliteral verbs ever end in w, but quadriliterals may: yqəwqəw ‘(a chicken) cackles’ (usually yqaqi in Dellys), yčəwčwu ‘they chatter’.

Wednesday, July 30, 2025

HEAD = GOURD in Algeria

The metaphorical identification of heads with gourds is probably obvious enough to arise spontaneously anywhere that gourds are in regular use (even English has expressions like "stoned out of his gourd"). In Algeria, it is historically reflected in some varieties' lexicon. Kabyle has in most contexts replaced pan-Berber ixf with novel a-qəṛṛu, whose form betrays its loanword origin. The immediate source seems to be dialectal Arabic qəṛṛuʕ, attested in the meaning "head" around Jijel, but originally "big gourd", imposing the augmentative template CaCCūC on the noun qarʕ (dialectal qəṛʕa) "gourd, squash". (One might also consider a role for Classical ʔaqraʕ "mangy, bald", dialectal gəṛʕa "bald".)

The thing about metaphors, though, is that they appear across multiple domains, not just in language. I recently learned of a traditional Algerian treatment for migraines (reported to be very effective) that involves cutting a fragment of gourd, writing various symbols on it, and pressing it against the appropriate place on the head of the affected person. The same metaphor that produced lexical change in Kabyle has evidently inspired curative practices next door. Perhaps a wider cultural survey would yield examples in other domains as well?

Wednesday, June 04, 2025

Eastern Sudanic subgroup reconstructions

This is basically a note to myself, and may be updated.

Eastern Sudanic is generally taken to embrace most of the languages of Sudan, including the following families:

  • Nubian
  • Nara
  • Taman
  • Nyima
  • Jebel
  • Daju
  • Surmic
  • Nilotic
  • Temeinic

Its existence, however, remains debatable (cf. Güldemann 2022). A reconstruction of Eastern Sudanic (much less anything above it, such as Nilo-Saharan) remains out of reach. If it is possible at all, it will most likely need to be based on prior reconstructions of each of these subgroups. It is therefore useful to outline what has been done in terms of reconstruction.

Rilly's (2010) monograph identifies a clearer family consisting of Nubian, Nara, Taman, and Nyimang (along with the extinct Meroitic), which he labels North Eastern Sudanic ("soudanique oriental du nord"), and for which he proposes some 200 lexical reconstructions. In the process, he also offers 200-word reconstructions of proto-Nubian and proto-Taman, finding it necessary for the former to amend Bechhaus-Gerst's reconstruction of 97 items significantly, and drawing for the latter primarily on Edgar (1991).

Nara is a single language, whose dialectal diversity is not sufficiently well documented to make even internal reconstruction feasible.

Nyima consists of two languages, both poorly documented; Rilly gives provisional reconstructions.

For (Eastern) Jebel, Bender (1998) proposes an extremely provisional reconstruction of 100 items, outlining major sound correspondences.

Proto-Daju is reconstructed in the Ph.D. thesis of Thelwall (1981), who provides more than 300 lexical reconstructions along with the principal sound correspondences, but keeps discussion of morphology and syntax to a minimum.

Proto-Surmic has yet to be reconstructed; Yigezu (2001), however, reconstructs 200-300 words for each of two of its three subgroups, Southwest and Southeast. (The third is a single language, Majang.)

For Proto-Nilotic, Dimmendaal (1988) provides a "first reconnaissance", giving 204 items and ignoring tone; the work of Hall et al. (1975) and Hieda (2006) also deserves notice. Much more elaborate monograph-length reconstructions are available for Eastern Nilotic (Vossen 1982) and Southern Nilotic (Rottland 1982); each of these provides about 200 items for the relevant proto-language along with quite a few more for lower-level subgroups. Western Nilotic has not been reconstructed, but one sub-subgroup, Southern Luo, has been reconstructed in Heusing (1983).

Temein, with three poorly documented members, has not been reconstructed.

In brief: out of nine primary Eastern Sudanic families, none has yet been reconstructed in detail. Where reconstructions at this level exist, they cover a limited number of sound correspondences (usually segmental, ignoring tone), and a couple of hundred basic words; discussion of morphology is limited to a few prominent affixes.
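For quick reference, the survey above can be tabulated as a small data structure. This is only a sketch: the variable names are my own, the item counts are the approximate figures quoted above, and "more than 300" is stored as a lower bound.

```python
# family -> (source, approximate number of reconstructed items), as reported above.
# None as the whole value means no proto-level reconstruction is mentioned;
# None as the item count means the notes give no figure.
SURVEY = {
    "Nubian": ("Rilly 2010", 200),
    "Nara": None,
    "Taman": ("Rilly 2010", 200),
    "Nyima": ("Rilly 2010, provisional", None),
    "Jebel": ("Bender 1998", 100),
    "Daju": ("Thelwall 1981", 300),  # "more than 300": a lower bound
    "Surmic": None,  # only two subgroups reconstructed (Yigezu 2001)
    "Nilotic": ("Dimmendaal 1988", 204),
    "Temeinic": None,
}

# Families with no proto-level lexical reconstruction at all.
unreconstructed = sorted(f for f, entry in SURVEY.items() if entry is None)
print(len(SURVEY), "families;", "none at proto level:", ", ".join(unreconstructed))
```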

Wednesday, December 11, 2024

More Mabaan pharyngeals

Thomas Anour has posted a number of Bible extracts: Mark 10:13-18, John 1:1-13, and James 4:1-3. Comparing these to a published translation from 2002 (from which he sometimes diverges slightly) and to the anonymous dictionary linked in the previous post makes it possible for a beginner to parse much of the text. No more examples of /ħ/ were heard; but another pharyngeal, /ʕ/, was. This phoneme is absent from the online audio version of this Bible translation, but can be heard clearly in Thomas Anour's pronunciation of at least three frequent words, despite occasional variation, and seems to contrast with the glottal stop /ʔ/, as illustrated by the last few lines of the following table. While one of the words with /ʕ/ is an Arabic loan, the rest clearly are not.

Unfortunately, I don't know yet where it's coming from. I have yet to find any useful cognates to the words with the pharyngeal in the rest of Nilotic, or even in the meager Jumjum dictionary. "We" corresponds to Nuer <kɔn> and (probably?) Dinka /wɔ̂ɔk/.

English | Mabaan (Anour) | Mabaan (anon) | Mabaan (Andersen)
and | [ʕɔ́sì] | ɔci | ʔɔ́cé
so that | [ʕáŋkàː] | aŋ-ka | ʔáŋkà
because (< Ar.) | [ʕásàan] | acaan | -
where | [ʔáŋɛ̀] | aŋɛ | -
quotative particle | [ʔàgɪ́] | agi | ʔàgē
we | [ʔɔ̂ːn] | ɔɔn | ʔɔ̆ɔn

Tuesday, December 10, 2024

Mabaan pharyngeals

The least well documented subgroup of West Nilotic is the Burun group, spoken around the borders between Sudan, South Sudan, and Ethiopia. The largest language in this subgroup is Mabaan, spoken in South Sudan, for which there exists at least one dictionary (available without bibliographic information on Roger Blench's site), and several very interesting articles by Torben Andersen. But we are no longer in the era where a non-field linguist could be content to look at printed sources alone; there is a fair amount of Mabaan content on YouTube, including a channel by a BA-trained linguist and first language speaker of Mabaan, Thomas Anour: Learn Maban, African Language with Thomas Anour. (Like and subscribe, or whatever it is you're supposed to do on YouTube to encourage creators.) Between these, that makes enough material to observe an interesting phonological difference.

In Mabaan as described by Torben Andersen and in the aforementioned anonymous dictionary, /h/ seems to show up only in interjections or loans, and /ħ/ is not mentioned at all. The variety spoken by Thomas Anour, however, features a number of words with initial [ħ] (occasionally varying with [h]). A single cognate in a North Burun language, Mayak, suggests that this is the reflex in his variety of *r, which otherwise becomes a semivowel in Mabaan; more would be desirable.

English | Mabaan (Anour) | Mabaan (anon) | Mabaan (Andersen) | Jumjum (Fadul et al.) | Mayak (Andersen)
sorghum field (?) | <hill> [ħîl] | <yielo> "field for dura grain" | - | <yiil> "field, farm" | -
rat | <heeñ> [ħéːɲ] | <yyeño> "rat" | yiiêɲ-ʌ̀ "~, mouse" | <yiiñ> | rii-nit̪
sausage tree | <heeṭṭa> [ħétà] | <wyeṭṭa> "pod of ~" | - | - | -
desert | <hong> [ħʌ̂ːŋ] | <wɔɔŋ> "wilderness, desert" | - | - | -
salmon (sic) | <hitta> [ħítàː] | - | - | - | -
excuse (Ar. izin) | <honda> [ħʌ̀ndá] | - | - | - | -
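The initial-consonant correspondence in the table can be tallied mechanically. The sketch below uses only the four rows where both Anour's spelling and the anonymous dictionary give a form; the variable names are my own, and comparing first letters of the orthographic forms is a deliberate simplification.

```python
from collections import Counter

# (Anour's spelling, anonymous dictionary spelling) for the rows attested in both.
PAIRS = [
    ("hill", "yielo"),
    ("heeñ", "yyeño"),
    ("heeṭṭa", "wyeṭṭa"),
    ("hong", "wɔɔŋ"),
]

# Count which initial letter in the dictionary answers to Anour's initial <h>.
correspondences = Counter((anour[0], anon[0]) for anour, anon in PAIRS)
for (a, d), n in sorted(correspondences.items()):
    print(f"Anour <{a}> : anon <{d}>  x{n}")
```

In every attested pair, Anour's initial <h> answers to a semivowel (<y> or <w>) in the dictionary, which is at least compatible with the *r hypothesis above.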

Edit (12/12/2024): The Elenchus comparativus (von Hurter, 1800) records, s.v. "souris" (mouse), <hén> for "Abugonos Burun" vs. <rine> for "J. Kurmuk". This is the only word in the list transcribed with initial h - and the only word on the list corresponding to any of the ones above - but seems sufficient to suggest that this pronunciation is indeed old. Among words with *r, one notes Abugonos <yonga> "meat" and <ímaghi> "blood" (Kurmuk <rin>), which do not support the hypothesis of *r > ħ, but, given the imprecise transcription, do not disprove it either. My thanks to Shuichiro Nakao for sending me a link to this exceptionally early source.

Thursday, September 26, 2024

Tlemcen: medieval folk etymologies and their implications

In the mid-14th century work Bughyat al-ruwwād fī dhikr il-mulūk min banī ʕAbd al-Wād, Yaḥyā Ibn Khaldūn (brother of the more famous Ibn Khaldūn) ventures two possible etymologies for the name of Tlemcen (Standard Arabic Tilimsān, dialectal Arabic Tləmsān):

تسمى بلغة البربر تلمسنين كلمة مركبة من تلم ومعناه تجمع وسين ومعناه اثنان اي الصحراء والتل فيما ذكر شيخنا العلامة ابو عبد الله الابلي رحمه الله وكان حافظا بلسان القوم ويقال ايضا تلشان وهو ايضا مركب من تل ومعناه لها وشان اي لها شان
In the Berber language it is called "T.l.msīn", a word composed of t.l.m, meaning "she/it gathers", and sīn, meaning "two" - i.e. the Sahara and the Tell - according to our shaykh the most learned Abū ʕAbd Allāh al-Ābilī, may God have mercy on him, who was well-versed in the people's tongue. It is also said "T.l.šān", which is also a compound, of t.l., meaning "she/it has", and šān, i.e. "it has status".

Both etymologies are easy enough to interpret in the light of comparative Berber data. In the nearest (barely) surviving Berber variety - Beni Snous (Aṯ Snus), some 40 km west of the town - "Tlemcen" is indeed Tləmsin, not Tləmsan (cf. Destaing's Etude, pp. 368, 370, 371, etc.) This variety, however, does not use the word sin for "two" - it uses ṯnayən, like the Rif to its west (cf. Destaing, Dictionnaire, p. 98). The closest varieties to preserve a Berber word for "two" - geographically and genetically - use sən, in common with the rest of the Zenati subgroup to which Beni Snous belongs. The nearest varieties using the form sin are Kabyle, far to the east, and Middle Atlas Tamazight and Tashlḥiyt, far to the west. For the verb, one might consider t-əlləm "she/it spun", but the gloss given better matches a widespread dialectal Arabic word that could well have been borrowed into Berber: t-ləmm "she/it gathers". The second is obviously a compound of Arabic ša'n "affair, rank, status" and the Berber verb t-la "she/it has". Today this verb survives in Beni Snous, as in Kabyle, only residually, in the construction wi-h y-il-ən "who does it belong to?" (Destaing, Grammaire, p. 88). But it may have been more productive at that time, as it still is in Middle Atlas Tamazight.

Obviously, the first of these etymologies is implausible, while the second is a self-aggrandising play on words rather than an attempt to explain the name. But the fact that the first one could seriously be suggested is strong evidence that the meaning of Tlemcen was no more transparent to 14th century Berber speakers than it is to 21st century ones - as is not unusual for placenames. A better etymology, one that also allows us to explain the cross-linguistic differences in the final vowel, can be proposed by taking comparative data into account - but I'll leave that for another day.

Sunday, September 15, 2024

"Berber" language in early Arabic texts

Searching Shamela, I recently realised that the earliest references to a language of (al-)Barbar in Arabic go back further than I had assumed, to the second century AH. While these are unlikely to shed much light on actual linguistic practices, they are worth a look.

Two occur in the Qur'ānic commentary (tafsīr) of Muqātil ibn Sulaymān (d. 150 AH = 767 AD); one in his discussion of sūrat al-Isrā':

فقال عبد الله بن الزبعري السهمي: إن الزقوم بلسان بربر التمر والزبد. قال أبو الجهل: يا جارية ابغنا تمرا فجاءته. فقال لقريش وهم حوله تزقموا من هذا الزقوم الذي يخوفكم به محمد.
And ʕAbd Allāh ibn al-Zabʕarī al-Sahmī said: Zaqqūm means 'date' and 'butter' in the tongue of Barbar. Abū Jahl said "Slave-girl, find us some dates", and she came to him, and he said to Quraysh as they were about him: "Have some of this zaqqūm that Muhammad is scaring you with".

And one, almost identical, in his discussion of sūrat al-Dukhān:

بلسان بربر وأفريقية الزقوم يعنون التمر والزبد، زعم ذلك عبد الله بن الزبعري السهمي، وذلك أن أبا جهل قال لهم: إن محمدا يزعم أن النار تنبت الشجر وإنما النار تأكل الشجر، فما الزقوم عندكم؟ فقال عبد الله بن الزبعري: التمر والزبد. فقال أبو جهل بن هشام: يا جارية، ابغنا تمرا وزبدا. فقال: تزقموا.
In the tongue of Barbar and Ifrīqiyah, by zaqqūm they mean 'date' and 'butter'. So claimed ʕAbd Allāh ibn al-Zabʕarī al-Sahmī, on the basis that Abū Jahl said to them "Muhammad claims that the Fire grows trees, yet fire consumes trees! So what is zaqqūm according to you?" And ʕAbd Allāh ibn al-Zabʕarī said: "Dates and butter". And Abū Jahl ibn Hishām said: "Slave-girl, find us some dates and butter!" And he said: "Have some zaqqūm."

One is vaguely reminded of Siwi a-zəggar "a (single) date"; but if indeed this were a word of "Barbar and Ifrīqiyah", one would hardly expect it to trip from the lips of Abū Jahl, and still less to be familiar to his audience. (For those unfamiliar with the context: Abū Jahl was the foremost enemy of early Islam, and ʕAbd Allāh ibn al-Zabʕarī was an anti-Islamic poet; their assertion that in some foreign language zaqqūm means "dates and butter" was almost certainly intended as mockery of the Qur'ān, not as serious lexicography.) The juxtaposition of "Barbar and Ifrīqiyah", however, seems to corroborate that the intended reference is indeed to the Berbers rather than to any East African groups.

Another is to be found in the earliest Arabic dictionary - the Kitāb al-ʕAyn, by al-Khalīl ibn Aḥmad:

والقَيطونُ: المخدع في لغة البربر ومصر.
qayṭūn: chamber, in the language of the Barbar and Egypt.

The ultimate origin of this word, unlike that of zaqqūm, is perfectly clear: it comes from Greek κοιτών, probably via Aramaic. The term is used in Coptic too: ⲕⲟⲓⲧⲱⲛ. In modern Berber varieties its form (e.g. aqiḍun) has a q, suggesting a more recent borrowing from Arabic, but one may reasonably suspect that Berbers in Cyrenaica and the Western Desert (most of whom have since switched to Arabic) would have been familiar with some version of the Greek term before Arabic influence.

A number of references to the "Barbar" are to be found in the works of Imām Mālik ibn Anas. None of these refer to the language, but one, in al-Mudawwanah, makes a clear reference to skin colour in the context of what counts as a legally punishable insult, confirming (unless this was a later scribe's addition) that the term's reference was indeed to North rather than (as Rouighi's hypothesis might suggest) East Africa:

بَلَغَنِي أَنَّ مَالِكًا قَالَ فِي الْمَوَالِي كُلِّهِمْ: مَنْ قَالَ لِبَرْبَرِيٍّ يَا فَارِسِيُّ أَوْ يَا رُومِيُّ أَوْ يَا نَبَطِيُّ أَوْ دَعَاهُ بِغَيْرِ جِنْسِهِ مِنْ الْبِيضِ كُلِّهِمْ فَلَا حَدَّ عَلَيْهِ فِيهِ، أَوْ قَالَ يَا بَرْبَرِيُّ وَهُوَ حَبَشِيٌّ فَلَا حَدَّ عَلَيْهِ وَهُوَ قَوْلُ مَالِكٍ.
It has reached me that Mālik said of mawlās in general: "Whoever says to a Berber 'Hey, Persian!' or 'Hey Roman!' or 'Hey Nabaṭī!', or calls him by any other nation of the whites - he is not to be punished for it. Or if he says 'Hey Berber!' when he is really Ethiopian, he is not to be punished for it." That is Mālik's statement.
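For readers who want to check the AH dates mentioned in this post against the Common Era, here is a minimal sketch of the usual approximation. The function name and constants are my own choices: it assumes a mean lunar year of about 354.367 days and an epoch of 16 July 622, and it returns only the CE year in which a given AH year begins, so it is not a calendar-accurate conversion.

```python
HIJRI_YEAR_DAYS = 354.367      # mean length of a lunar (Hijri) year, approximate
SOLAR_YEAR_DAYS = 365.2425     # mean length of a Gregorian year
EPOCH_CE = 622 + 197 / 365     # 16 July 622 expressed as a fractional year


def ah_to_ce(ah_year: int) -> int:
    """Approximate CE year in which the given AH year begins."""
    elapsed_solar_years = (ah_year - 1) * HIJRI_YEAR_DAYS / SOLAR_YEAR_DAYS
    return int(EPOCH_CE + elapsed_solar_years)


print(ah_to_ce(150))                 # the death date cited above
print(ah_to_ce(101), ah_to_ce(200))  # rough bounds of the second century AH
```

Since an AH year straddles two CE years, any single-year equivalent is good only to within a year.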