Sunday, October 18, 2009

Arabic loanwords in "proto-Nilo-Saharan"

Ehret 2001 (or see Nostratic.ru) looks at first sight like an astonishingly detailed reconstruction of Nilo-Saharan, with nice binary splits and loads of technology-related words for archeologists and anthropologists to sink their teeth into. Why shouldn't specialists take advantage of this amazing opportunity to correlate historical developments to linguistic ones?

I just found a handy answer to that question. Bender (1997:175ff) gives the 15 cognate sets in Ehret 2001 that are represented in the most sub-families of Nilo-Saharan. Three of the 15 look distinctly like Arabic loans.

1387 *wàs “to grow large”: Fur wassiye “wide” and Songhay wásà “to be wide” are both from Arabic wāsi`- واسع. The other items cited – Ik “stand”, Kanuri “yawn”, Kunama “increase, augment”, and Uduk “to tassel, of corn” – are scarcely obvious candidates for being related to one another in the first place.

1297 *là:l “to call out (to someone)”: Kanuri làn “to abuse, curse” and Songhay láalí “to curse” are obviously from Arabic la`an- لعن; Kunama lal- “to denigrate” might be from the same source. That only leaves Uduk “to persuade, incite to do something” and Proto-Central-Sudanic “to call out”.

718 *t̪íwm “to finish, complete”: almost certainly Songhay tímmè “to be finished”, very likely Uduk t̪ím “to finish”, Ocolo t̪um “to finish”, and maybe even Fur time “total”, are from Arabic tamm- تمّ (impf. -timm-), as Bender (ibid:177) considers probable. That leaves Proto-Central-Sudanic, Kunama, and Maba “all”, Kanuri “ideophone of dying animal” (!), and Proto-Kuliak “buttocks”. The “all” set looks rather promising – the whole etymology, not so much.

There are plenty of other Arabic loanwords in Ehret's “Proto-Nilo-Saharan” – a particularly egregious example is Kanuri zàmzàmíyɑ̀ “leather bottle-shaped water vessel for journeys” (#1223 *zɛ̀m “to become damp, moist”), and other especially clear-cut cases include #1173 < sawṭ, #1185 < šamm – but the fact that they include a significant proportion of the best cognate sets is what really strikes me. If a reconstruction attempt can't distinguish a widely distributed recent loan from a cognate set that split more than eleven thousand years ago, any information it gives about readily diffused items like technologies is completely unreliable. For another review from a similar perspective, try Blench 2000 (not sure why it appeared a year before the book's nominal publication date...)

The more I read about Nilo-Saharan, the less convinced I am that it exists (much less that Songhay belongs to it). That means the classification of the languages of quite a lot of Africa is basically up for grabs. It would be great to have a reexamination of the area.

11 comments:

John Cowan said...

Indeed it would be. Consider: what did Greenberg actually do with his professional life? Let's look at the record, as Al Smith said.

In the New World, Greenberg identified Eskimo-Aleut (duh), Na-Dene (duh, barring Haida which is now out of the picture since last year's Yeniseian breakthrough), and Other. In Africa, he identified Afroasiatic (mostly duh, though including Chadic and excluding other peripheral groups was probably right), Niger-Congo (okay, some serious work there, I guess -- I'm totally ignorant of that whole family outside Bantu), and Other. The rest of his work is worthless, and his mass-comparison methodology is @#$# and every historical linguist knows it.

Time to let all of Greenberg's works perish and be forgotten.

Lameen Souag الأمين سواق said...

Greenberg's works ought to be taken in the right spirit: as very speculative proposals, usually based on data that has since become obsolete, that sometimes point the way towards reality, but should never be taken as the last word on a subject. The trouble is that no one seems willing to put in the hard work of coming up with a better classification, knowing that much of the potential audience mistakenly assumes the problem has already been solved.

David Marjanović said...

It's very simple: mass comparison is phenetics rather than phylogenetics. It can give the right answer for phylogenetics, but it can at least as easily be misled by shared retentions, convergence/loans, and so on.

It's great for generating phylogenetic hypotheses, I'd say, but it's completely incapable of testing them.

BTW, does anything bar Haida from being the sister-group of Dené-Yeniseic?

Joseph B. said...

Trashing Greenberg seems to be a safe sport these days, but before taking it at face value, read his last book, Genetic Linguistics, where he responds to the critics on method as well as on specific cases including Haida and Nilo-Saharan.

Lameen Souag الأمين سواق said...

I'm primarily criticising Ehret here - I recognise the heuristic value of Greenberg's work. But since you mention it: the only Songhai data in the book you cite is from the comparison Maba eri, Songhai kuri, Daza gere "blood"; but the proto-Songhai form is *kwidi, with a d. The comparison is still conceivable (all three languages could independently have lenited the d, say), but I am always suspicious of comparisons that look less good when you reconstruct the proto-forms. Regular correspondences would make that more convincing, but they aren't available either (Ehret claims to have them, but go through his dictionary and judge its plausibility for yourself): for example, in The Languages of Africa, for "bird" ("kyiraw" < *kidaw) Greenberg compares Maba kebele(k) "wing", with an l rather than an r.

He doesn't present any new arguments for Nilo-Saharan in this book that I can see, unless you count his discussion of moveable-k (which does look like an argument for treating some subset of Nilo-Saharan as valid). He just says that it is "a grouping now universally accepted", and, looking at the scant and often misanalysed data he used to argue for it in the first place, it really should not be - and, in fact, it isn't. Songhay's membership seems to be especially questionable - in his statistical analysis, Mikkola (1999), though broadly supporting the notion of Nilo-Saharan, concluded that Niger-Congo has more in common with Nilo-Saharan than Songhay does!

Joseph B. said...

I was addressing John's complete dismissal and character attack.

Songhay was Greenberg's last addition to Nilo-Saharan, so presumably the weakest. He certainly did not claim his classifications were set in stone or that moving one group would invalidate the whole family. Greenberg's Nilo-Saharan had six coordinate branches and did not specify a complete branching tree as Bender's and Ehret's later work has.

The alternatives for which extant language family has closest genetic relation to Songhay are limited. There's Niger-Congo (which is viewed as coordinate to Nilo-Saharan by some, or simply a subgroup according to Blench), there's Afroasiatic, and all other families are much more distant with nobody proposing a relation.

I thought you had commented on Nicolai's Mande/Berber theory in the past, but can't find it. Mande is also the most divergent member of Niger-Congo. I can't find Mikkola's paper online; what does he think Songhay has in common with Niger-Congo, and how much of that is Mande vs. the rest?

Joseph B. said...

Oops, my last sentence was based on a misreading of your last sentence. Anyway, Mikkola's conclusion seems to be in agreement with Blench and not to particularly contradict Greenberg.

Jim said...

"BTW, does anything bar Haida from being the sister-group of Dené-Yeniseic?"

Geography maybe? If Haida is a sister language, that means that the Haida started out from Siberia a lot earlier than the Na-Dene speakers did. Then maybe the Na-Dene speakers swamped all the Haida speakers on the mainland.

It's kind of unlikely, but not impossible. In fact, that's basically the status of Welsh and Irish with regard to Germanic in Britain.

John Cowan said...

Joseph B.: I did not attack Greenberg's character. For example, I do not at all claim that his work was consciously fraudulent, or even that he cheated on his wife (if any). I only claim that he put forward a method suitable only for generating hypotheses (as David M. and indeed Edward Vajda point out) as a method of proving them, and that since in fact it is possible to frame any number of rational hypotheses consistent with a set of facts, his method even as a hypothesis generator is no better than experienced intuition and perhaps not as good.

As our good host implies, someone will eventually have to come up with the combination of chutzpah and hard work required to establish a scientifically supportable classification of African languages. Until then, we will continue to languish in the wilds of ignorance.

Lameen Souag الأمين سواق said...

"Bender (1976) revisited the question with a comparison of 24 Nilo-Saharan languages and one language from each of three other families. With a criterion described as “the ‘look-alike’ principle modified by the results of a painstaking search for regular phonological correspondences,” he found an average of 3.4% cognates between languages in different families and only 3.8% cognates between languages in different subgroups at the highest level within Nilo-Saharan." - http://email.eva.mpg.de/~wichmann/GlottoHeurUpload.pdf

Joseph B. said...

Wichmann (and apparently Bender's work that he quotes) uses pairwise comparisons between each possible pair of languages, and finds similarity no better than chance. However, this method loses any trends that would be visible when you look at a word across all languages simultaneously.

Dene-Yeniseian is the current poster boy for the superiority of pairwise comparison to mass comparison, but Wichmann finds the pairwise similarity between Na-Dene and Yeniseian to be less than the chance expectation! So pairwise raw comparison becomes worthless as we go back in time, while pairwise comparison taking into account reconstructed phonetic change can reach farther back. But this is not comparing mass comparison to anything; it's comparing two methods of pairwise comparison.

Greenberg explicitly disavowed glottochronology - in response to criticism of a paper by Greenberg and Swadesh, he responded that the glottochronology part was Swadesh's work inserted over Greenberg's misgivings, and listed some of his criticisms of glottochronology.