Sunday, June 10, 2018

fatta: a loan from Chadic into Songhay?

The Proto-Chadic word for "go out" was reconstructed by Newman and Ma (1966) as *p-t-, with attested reflexes in all primary subgroups of the family; the best known of these is of course (West Chadic A.1) Hausa fìtā.  The vowels vary across languages, and there is often no final vowel.  Only one subgroup, as far as I can see on a quick check, shows the consistent vocalisation *patā: the Bole languages (West Chadic A.2), spoken in Nigeria's Yobe State along the boundary between Hausa and Kanuri.  Thus Bole pàtā, Ngamo hàtâ, Karekare fàtā.

Most Songhay varieties have reflexes of two near-synonyms for "go out": *hùnú and *fáttá.  Usually, the distinction seems to be roughly "leave (a place or event)" vs. "go out of (an enclosed or concealed space)".  In Northern Songhay - the subgroup most isolated from the rest for longest, spoken in the Sahara - only reflexes of *hùnú seem to be attested, covering both senses (eg Korandje hnu).  This could be interpreted as reflecting Northern Songhay's general tendency to reduce its inherited vocabulary by widening the usage of generic terms.  In light of the Chadic data, however, it is tempting to interpret it the other way around: did Northern Songhay preserve the original situation, while a West Chadic borrowing spread throughout the rest of the family via the Niger River?

Saturday, June 09, 2018

Songhay glosses in Djenne manuscripts

Djenne, in central Mali, is one of the oldest cities in West Africa; it also happens to be the westernmost Songhay-speaking town, isolated in a predominantly Bozo area.  As an old regional centre of Islamic learning, it has rather a lot of manuscripts, most still in the hands of local families rather than taken over by official heritage-keepers.  56 family collections of manuscripts in Djenne have recently been digitised and made available online, at the Djenne Manuscript Library Collection.  Searching through this amazing resource is a bit of an adventure, since a lot got lost in the translation of the metadata (for instance, this manuscript labelled as Intercession is actually a list of tribe names).  But doing so has potential rewards for the historical linguist as well as for the historian: scattered through the manuscripts are very occasional marginalia in local languages.

The first examples I've managed to find come from a late 19th or early 20th century manuscript of 8 pages, belonging to the family of Alphamoye Baber Djenepo, to which the cataloguers gave the title مكتوب في اللغة "writing on language" (which, after passing through a layer or two of translation, ended up in English as "Philology").  It's an obviously incomplete part of an alphabetical poem (unknown to Google) recounting the life of the Prophet, which gives for each letter of the Arabic alphabet in order a section rhyming in that letter.  The language is somewhat obscure, and is copiously annotated - mainly in Arabic, but every so often in Songhay.

On p. 8, for instance, we see the Arabic word تَعَوُّذِ "seeking God's protection" glossed with the Songhay word sumburku "holy formula, spell":

On p. 9 of the same, we see Arabic نَادِ "caller" glossed with Songhay kaati "call, shout":

This particular example is too recent to contribute much to Songhay philology, but it at least proves that Songhay was used to gloss manuscripts in Djenne, and suggests that it would be worth looking through the collection for other examples.

(Added after posting): On p. 5, we find Arabic تمساح "crocodile" glossed with Songhay kaarey "small crocodile sp.":

(PPS): And in this undated fragment of Maqamat al-Hariri, p. 4, we find another identifiable Songhay gloss (or at least a word found in Djenne Chiini): tangara for قضيب "rod, staff", followed by عجم "non-Arab" to make its status clearer:

Friday, June 01, 2018

Drawing water in Songhay and Zenaga

Almost every attested Songhay variety (Tasawaq is perhaps the only exception) has a reflex of the proto-Songhay word *gúrú "draw water" (from the river, from a pond, from a well, etc.).  To express this concept, most Berber varieties (including Tashelhiyt, Kabyle, Tumzabt, Ghadames, Awjila, Tamajeq...) use reflexes of a verb *āgum "draw water", which is thus equally securely reconstructible for proto-Berber.  Zenaga, however, has a rather different verb: ägur "puiser l'eau d'un puits, remonter le delou, tirer la corde du seau; faire parvenir qqc (à qqn)" and "se lever (astre)", with an irregular corresponding noun tgäʔrih "eau tirée du puits".  It seems to be distinct from äggur "pull".

The only Berber cognates Taine-Cheikh suggests for ägur are reflexes of a verb that may be reconstructed as *agir "throw; rise (of sun)" (eg Tashelhiyt gr, Kabyle gər, Chaoui gər).  Presumably the semantic shift of "throw" to "draw water" would be explained via the idea of throwing the bucket down the well.  If the comparison is accepted, then the verb shows an innovative semantic shift specific to Zenaga.  (It would be interesting to see if Tetserrét shares this, but unfortunately the relevant term doesn't seem to have been recorded.)

If the Zenaga word is indeed cognate to the suggested Berber forms, then it seems reasonable to draw the conclusion that proto-Songhay borrowed *gúrú "draw water" from an early relative of Zenaga.  This would fit well with the evidence for a Western Berber language having played an important role in the history of at least northern Mali.  If not, then it would become tempting to draw a conclusion much harder to fit with what is known of the region's history: that Zenaga borrowed the word from proto-Songhay.

Tuesday, May 29, 2018

Zenaga dialectal reflexes of ʔ, :

For the purposes of Berber historical linguistics, arguably the most important thing about Zenaga is its thoroughgoing retention of the glottal stop. Some Zenaga glottal stops derive from *q, corresponding to ɣ elsewhere in Berber, but many derive from *ʔ, lost without trace in most Berber varieties. When a rather carefully transcribed new source of dialectal Zenaga data comes to light, it thus seems logical to start by seeing how the glottal stop is reflected there. For convenience, I restrict this first pass to two of Ahmadou Ismail's wordlists: body parts, and herding vocabulary. The results are fairly clear.

In general, Taine-Cheikh's Vʔ corresponds regularly to Ismail's V:, with the length clearly marked, as distinct from Taine-Cheikh's short V, which Ismail consistently transcribes short. Thus:

Gloss             Ismail     Taine-Cheikh
young camel       awāra      äwaʔräh
waterbag          āga        äʔgäh
moustache         āya        aʔyäh
donkey m.         ājji       aʔž(ž)iy
donkey f.         tājil      taʔž(ž)əL
beard             tāmmart    taʔmmärt
camels            īyman      iʔymän
cows              tiššīđan   ətšiʔđaʔn / ətšiʔđän
lamb              hīmmar     iẕ̌iʔmär
donkey foal       īgiyu      iʔgiyi
shoulder(blade)   tūṛiḍ      toʔṛuḌ
donkeys           ūjjayan    uʔž(ž)äyän
shoulder(blade)s  tūrdin     tuʔṛäđän
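
The correspondence can be checked mechanically. What follows is only a minimal sketch in Python, not the procedure actually used for this post: it takes the forms from the table above (using the first variant given for "cows"), and tests that each Taine-Cheikh form contains a glottal stop while the matching Ismail form has a long vowel and no glottal stop. The helper names (`long_vowel_count`, `glottal_count`, `check`) are made up for illustration, and treating a macron as the only marker of length is an assumption about the transcription.

```python
import unicodedata

# (gloss, Ismail, Taine-Cheikh) triples copied from the table above.
PAIRS = [
    ("young camel", "awāra", "äwaʔräh"),
    ("waterbag", "āga", "äʔgäh"),
    ("moustache", "āya", "aʔyäh"),
    ("donkey m.", "ājji", "aʔž(ž)iy"),
    ("donkey f.", "tājil", "taʔž(ž)əL"),
    ("beard", "tāmmart", "taʔmmärt"),
    ("camels", "īyman", "iʔymän"),
    ("cows", "tiššīđan", "ətšiʔđaʔn"),
    ("lamb", "hīmmar", "iẕ̌iʔmär"),
    ("donkey foal", "īgiyu", "iʔgiyi"),
    ("shoulder(blade)", "tūṛiḍ", "toʔṛuḌ"),
    ("donkeys", "ūjjayan", "uʔž(ž)äyän"),
    ("shoulder(blade)s", "tūrdin", "tuʔṛäđän"),
]

GLOTTAL = "ʔ"
MACRON = "\u0304"  # combining macron; assumed here to be the only length mark


def long_vowel_count(form):
    """Count macron-marked vowels, decomposing precomposed letters first."""
    return unicodedata.normalize("NFD", form).count(MACRON)


def glottal_count(form):
    """Count glottal stops in a transcribed form."""
    return form.count(GLOTTAL)


def check(pairs):
    """Return (gloss, ok) for each pair; ok means Vʔ : V̄ holds at least once."""
    rows = []
    for gloss, ismail, taine_cheikh in pairs:
        ok = (
            glottal_count(taine_cheikh) >= 1
            and long_vowel_count(ismail) >= 1
            and glottal_count(ismail) == 0
        )
        rows.append((gloss, ok))
    return rows


RESULTS = check(PAIRS)
for gloss, ok in RESULTS:
    print(gloss, "regular" if ok else "needs a closer look")
```

This only tests for the presence of the two segments, not their position in the word, so it would miss a long vowel that happened to sit in the wrong syllable; a stricter version would align the forms segment by segment.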

There are only two contexts where this correspondence does not hold.  In the context / _C#, if C is a stop or fricative, Ismail retains the glottal stop; if C is a sonorant, it disappears without affecting vowel length.  (More examples of this context would be useful to confirm the exact conditioning.)

Gloss     Ismail    Taine-Cheikh
spring    taniʔđ    täniʔḏ
cow       taššiʔđ   täšši
head      iʔf       iʔf
camel     ayyim     äyiʔm
camel f.  tayyimt   täyi(ʔ)mt

Word-finally, the variety Taine-Cheikh describes has no overtly realised glottal stops (*ʔ > Ø / _#); the contrast, however, is maintained, since all originally vowel-final words now end in h (*V > Vh / _#). In Ismail's dialect, the latter change never happened:

Gloss        Ismail  Taine-Cheikh
waterbag     āga     äʔgäh
moustache    āya     aʔyäh
young camel  awāra   äwaʔräh
stomach      taxṣa   taḫs(s)äh
goat         tikši   təkših
ewe          tīyyi   tīyih

Nevertheless, the two classes have not completely merged; final *i remains i, but final *iʔ becomes u:

Gloss        Ismail  Taine-Cheikh
billy-goat   ahayu   äẕ̌äyi
mouth        immu    əmmi
tooth        awkšu   äwkši
tongue       itšu    ətši
donkey foal  īgiyu   iʔgiyi
calf         īrku    īrki

In the variety Taine-Cheikh describes, long vowels derive not from *Vʔ but from *Vh (ultimately *Vβ). Given that vowel length can be a reflex of a former glottal stop in Ismail's dialect, the next thing we need to check is what happens to *Vh there; it turns out that there too it yields long vowels:

Gloss         Ismail    Taine-Cheikh
small cattle  tākšin    tākšən
calf          īrku      īrki
ewe           tīyyi     tīyih
nostril       tīnhart   tīnẕ̌ärt
nose          tīnharin  tīnẕ̌ärän

The regularity of these correspondences is a testimony to the accuracy of both parties' work, and confirms the value of Zenaga as a data source for Berber historical phonology.

Monday, May 28, 2018

A "crazy rule" in Zenaga

As part of what seems to be a solo documentation effort, Ahmadou Ismail has been posting some very interesting tidbits on Zenaga (in Arabic). The dialect reflected differs in some ways from the one reflected in Catherine Taine-Cheikh's publications. One of the more conspicuous differences is in the fate of proto-Berber *z. For Taine-Cheikh, *z > ẕ̌ in general (a slightly lowered ž), but *zt > Z (a tautosyllabic geminate zz). In Ahmadou Ismail's dialect, *zt > zz as with Taine-Cheikh, but otherwise *z > h, eg tihigrarin "tarawih prayers" vs. Taine-Cheikh's təẕ̌əgrärən, hīmmar "lamb" vs. Taine-Cheikh's iẕ̌iʔmär, awahiđ̣ "rooster" vs. äwäẕ̌uđ̣, yahinha "he sold" vs. yäžžənẕ̌äh. This leads to systematic alternations between h and zz; synchronically, Ismail's dialect of Zenaga has the "crazy rule" ht > zz. This is nicely illustrated by "he knew" (Taine-Cheikh: yuʔgäẕ̌) plus the direct object personal pronoun clitics:
  • "he knew me": yūgah-i
  • "he knew you m.": yūgah-ku
  • "he knew you f.": yūgah-kam
  • "he knew him": yūgaz-zu
  • "he knew her": yūgaz-zað
  • "he knew us": yūgah-ānag
  • "he knew you m. pl.": yūgah-kūn
  • "he knew you f. pl.": yūgah-kimmið
  • "he knew them m.": yūgaz-zin
  • "he knew them f.": yūgaz-zincað (maybe; not quite sure how چَّٰ is supposed to be read)
For forms without assimilation, compare, as posted by someone else on the same group (Omar Sidi Mohamed), "he was owned by" (Taine-Cheikh yənšäg):
  • "he was owned by me": yiššag-i
  • "he was owned by you m.": yiššak-ku
  • "he was owned by you f.": yiššak-kam
  • "he was owned by him": yiššak-tu
  • "he was owned by her": yiššak-tað
  • "he was owned by us": yiššag-ānag
  • "he was owned by you m. pl.": yiššak-kūn
  • "he was owned by you f. pl.": yiššak-kamað
  • "he was owned by them m.": yiššak-tan
  • "he was owned by them f.": yiššak-tinyað
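
The synchronic rule can be stated as a tiny string rewrite. The sketch below is only an illustration in Python under stated assumptions: the underlying third-person clitics are taken to be t-initial (-tu, -tað), as in the yiššak- forms just above, and the function name `attach` is hypothetical. It leaves aside the vowel differences in the plural clitics and the g/k alternation in the second paradigm.

```python
def attach(stem, clitic):
    """Join a verb stem and an object clitic, applying ht > zz at the boundary.

    Assumes the clitic is given in its underlying (unassimilated) shape.
    """
    if stem.endswith("h") and clitic.startswith("t"):
        # stem-final h and clitic-initial t both surface as z
        return stem[:-1] + "z-z" + clitic[1:]
    return stem + "-" + clitic


# "he knew" plus a few of the clitics listed above
FORMS = {clitic: attach("yūgah", clitic) for clitic in ["i", "ku", "kam", "tu", "tað"]}

for clitic, form in FORMS.items():
    print(clitic, "->", form)
```

Run on the stem yūgah, this reproduces yūgah-i, yūgah-ku and yūgah-kam unchanged, and turns the t-initial clitics into yūgaz-zu and yūgaz-zað, matching the forms in the first list.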

Tuesday, May 22, 2018


Ever since she got interviewed on TV ten days ago, the 19-year-old president of the student union at Université Paris-Sorbonne, Maryam Pougetoux, has been making headlines - not for anything she said, but simply for wearing a hijab while she said it. In the name of defending freedom and feminism, the Minister of the Interior himself had the gall to criticise this brave young Frenchwoman as "marking her difference from French society". But as a historical linguist watching all this, I found myself wondering: where does the name "Pougetoux" come from? It turns out it can be traced several thousand years back:

In the course of this long history, no fewer than three different diminutive suffixes have been accreted onto the original root (although I'm not quite sure about the identity of that -oux). I wonder whether that generalizes; do words meaning "hill" tend to accrete more and more diminutive suffixes as they develop over time?

Tuesday, May 08, 2018

Songhay viewed through PCA

Playing around a bit more with PCA, I decided to apply the method* to a dataset I've worked with more extensively: Songhay, a compact language family spoken mainly in Niger and Mali. On a hundred-word list (Swadesh with a few changes), randomly choosing one form in cases of synonymy and including borrowings, I get the following table of lexical cognate percentages:

           Tabelbala  Tadaksahak Tagdal     In-Gall    Timbuktu   Djenne     Kikara     Hombori    Zarma      Djougou
Tabelbala  1          0.678      0.67       0.687      0.636      0.667      0.625      0.622      0.616      0.602
Tadaksahak 0.678      1          0.857      0.8        0.63       0.635      0.567      0.576      0.58       0.586
Tagdal     0.67       0.857      1          0.857      0.632      0.649      0.579      0.588      0.582      0.588
In-Gall    0.687      0.8        0.857      1          0.65       0.667      0.598      0.606      0.6        0.606
Timbuktu   0.636      0.63       0.632      0.65       1          0.979      0.773      0.808      0.79       0.778
Djenne     0.667      0.635      0.649      0.667      0.979      1          0.753      0.789      0.771      0.768
Kikara     0.625      0.567      0.579      0.598      0.773      0.753      1          0.835      0.814      0.823
Hombori    0.622      0.576      0.588      0.606      0.808      0.789      0.835      1          0.838      0.867
Zarma      0.616      0.58       0.582      0.6        0.79       0.771      0.814      0.838      1          0.808
Djougou    0.602      0.586      0.588      0.606      0.778      0.768      0.823      0.867      0.808      1

Running this through R again to get its eigenvectors, the first two principal components are easily interpretable:
  • PC1 (eigenvalue=7.3) separates Songhay into three low-level subgroups - Western, Eastern, and Northern, in that order - with an obvious longitude effect: it traces a line eastward all the way down the Niger river, jumps further east to In-Gall, and then proceeds back westward through the Sahara.
  • PC2 (eigenvalue=1.1) measures the level of Berber/Tuareg influence.
All the other eigenvectors have eigenvalues lower than 0.4, and are thus much less significant.
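
For anyone who wants to retrace this step without R, here is a rough sketch in plain Python. It is not the script used for this post: it swaps R's eigendecomposition for power iteration with deflation, which is enough to recover the two leading eigenvalues and eigenvectors of the similarity table above. The function names (`power_iteration`, `deflate`) and the starting vectors are arbitrary choices made for illustration.

```python
NAMES = ["Tabelbala", "Tadaksahak", "Tagdal", "In-Gall", "Timbuktu",
         "Djenne", "Kikara", "Hombori", "Zarma", "Djougou"]

# Lexical cognate proportions, copied from the table above.
SIM = [
    [1, 0.678, 0.67, 0.687, 0.636, 0.667, 0.625, 0.622, 0.616, 0.602],
    [0.678, 1, 0.857, 0.8, 0.63, 0.635, 0.567, 0.576, 0.58, 0.586],
    [0.67, 0.857, 1, 0.857, 0.632, 0.649, 0.579, 0.588, 0.582, 0.588],
    [0.687, 0.8, 0.857, 1, 0.65, 0.667, 0.598, 0.606, 0.6, 0.606],
    [0.636, 0.63, 0.632, 0.65, 1, 0.979, 0.773, 0.808, 0.79, 0.778],
    [0.667, 0.635, 0.649, 0.667, 0.979, 1, 0.753, 0.789, 0.771, 0.768],
    [0.625, 0.567, 0.579, 0.598, 0.773, 0.753, 1, 0.835, 0.814, 0.823],
    [0.622, 0.576, 0.588, 0.606, 0.808, 0.789, 0.835, 1, 0.838, 0.867],
    [0.616, 0.58, 0.582, 0.6, 0.79, 0.771, 0.814, 0.838, 1, 0.808],
    [0.602, 0.586, 0.588, 0.606, 0.778, 0.768, 0.823, 0.867, 0.808, 1],
]


def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def norm(v):
    """Euclidean length of a vector."""
    return sum(x * x for x in v) ** 0.5


def power_iteration(m, start, steps=500):
    """Approximate the dominant eigenvalue and unit eigenvector of m."""
    v = [x / norm(start) for x in start]
    for _ in range(steps):
        w = mat_vec(m, v)
        n = norm(w)
        v = [x / n for x in w]
    lam = sum(a * b for a, b in zip(v, mat_vec(m, v)))  # Rayleigh quotient
    return lam, v


def deflate(m, lam, v):
    """Remove an eigenpair from a symmetric matrix: m - lam * v v^T."""
    size = len(m)
    return [[m[i][j] - lam * v[i] * v[j] for j in range(size)] for i in range(size)]


lam1, pc1 = power_iteration(SIM, [1.0] * len(SIM))
lam2, pc2 = power_iteration(deflate(SIM, lam1, pc1),
                            [float(i + 1) for i in range(len(SIM))])

print("eigenvalue 1:", round(lam1, 2))
print("eigenvalue 2:", round(lam2, 2))
for name, a, b in zip(NAMES, pc1, pc2):
    print(name, round(a, 3), round(b, 3))
```

The sign of each eigenvector is arbitrary, so the ordering of varieties along a component may come out reversed relative to another tool's output; only the relative positions matter.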

The resulting cluster patterns have a strikingly shallow time depth; as in the Arabic example in my last post, this method's results correspond well to criteria of synchronic mutual intelligibility (Western Songhay is much easier for Eastern Songhay speakers to understand than Northern is), but the method completely fails to pick up on the deeper historical tie between Northern Songhay and Western Songhay (they demonstrably form a subgroup as against Eastern). It's nice how the strongest contact influence shows up as a PC, though; it would be worth exploring how good this method is at identifying contact more generally.

* Strictly speaking, this may not quite count as PCA - I'm starting from a similarity matrix generated non-numerically, rather than turning the lexical data into binary numeric data and letting that produce a similarity matrix.

Update, following Whygh's comment below: here's what SplitsTree gives based on the same table: