Monday, April 05, 2010

More on the WOLD Kanuri entry

The World Loanword Database is a great resource, and the Hausa/Kanuri team deserve congratulations for undertaking the Herculean labour of putting together two sets of etymologies. However, there are some issues with the Arabic etymologies in the Kanuri entry. The transcription is inconsistent and sometimes incorrect; more seriously, a few entries give incorrect meanings or impossible etymologies, as in the following cases:

3.592 àkú parrot: the quoted Arabic form is almost impossible as a Classical Arabic noun (and not in the Lisan al-Arab; the Arabic word is babγā’), and parrots are known in the Arab world only as an exotic import. Assuming the form exists in some Arabic dialect, it must be a loan from a sub-Saharan African language, not vice versa.
9.24 mágàsù scissors: the g and the u both suggest that this word entered directly from (Bedouin) Arabic, not via Hausa.
11.12 hàláltə́ own: if this is correctly transcribed, surely it comes from Arabic ħalāl “licit; one’s lawful property”. Arabic halak means “perish”.
11.79 ríwà dìò to earn: “ribā” means usury, and is strongly condemned in Islam; it is unlikely that this would be adopted as a neutral word “earn”. The more plausible source for both the Kanuri and the Hausa is Arabic ribħ “profit, gain”.
11.78 àlwúsùr wages: Perhaps < Arabic al-`ušr "tithe (< one-tenth)"; surely not from ma`āš.
14.451/6 kàjílí evening: “kajir” is not a possible native Classical Arabic word, and is not attested in Classical Arabic. If it’s in Shuwa, it must come from Kanuri, not vice versa.
16.34 tə́wə́rítə́ regret: Hausa tuubaa does come from Arabic, but clearly from Arabic tūb “repent”; it has nothing to do with Arabic ta’assaf (not *tāssaf) “regret”.
16.69 gàfə̀rtə́ forgive: the connection to Arabic γafar- is obviously correct, but Arabic yaʕfū is equally obviously not relevant; even if ʕ were normally reflected as g in Kanuri, it would leave the r unexplained.
18.33 kàsàttə́/àrdìtə́ admit: the Arabic form “kasat” does not exist. yarḍā means “may He hope/approve” (as noted), not “admit”, making the connection rather tenuous.
18.45 áwúlò dìò boast: there is no Classical Arabic word “awulo”.
19.47 àmàrtə́ permit: Arabic ʔamar- means “he ordered”, not “permission”.
20.31 súlwé armor: Arabic silāħ means “weapons”, not “armor”.
21.24 àlàptà swear < ħalaf "swear" (not < allāh "the god")
21.37 àzáwù punishment: from Arabic ʕađāb “punishment, torment” rather than jazā’.
21.47 perjury: by what chain of semantic changes could “perjury” derive from “lawful”? And why would l > k?

Probable Arabic loanwords not listed as such include:
11.54 bàyîl stingy: from Arabic baxīl.
4.89 sûm poison: surely from Arabic samm?
4.93 sə̀lé bald: surely from Arabic ‘aṣla`?
5.26 kóló pot: perhaps cp. Arabic qullah (or onomatopoeic?)
7.58 kábbì arch: surely from Arabic qubbah?
14.25 bàdìtə́ begin: surely from Arabic bada’?
11.29 lòrùtə́ damage: from Arabic ḍarr (impf. -ḍurr-). Cp. “judge” for ḍ > l.
24.02 wàltà become: perhaps from Maghrebi Arabic wəlli “become, return”.

In some cases, looking more widely allows the etymologies to be improved:
3.11 lə̀mân animal: < al-māl- "livestock, money", rather than al-mann "favor, benefit". For the dissimilation, compare the common Maghrebi Arabic change of n...n to n...l, e.g. badənjal < bāđinjān, fənjal < finjān.
2.34 lòrúsà wedding: probably from al-`arūs “bride” (Maghrebi Arabic l-aʕṛuṣa), rather than direct from ʕurs. Cp. Siwi aʕṛus “wedding”, with the same semantic shift.

There are also a few cases, many of them probably formatting issues in origin, where the correct form is given in the comments but contradicted elsewhere:

3.25 sheep: the source cited, Kossmann 2005 (67), points out that the form quoted by Skinner, *adaman, is unattested. The correct form, adəmman, is found in Arabic as well as Berber, and refers to a type of sheep said to come from sub-Saharan Africa. Given that it refers to a specifically sub-Saharan sheep breed, 5 would seem a better classification than 4, though 4 is understandable.
3.78 camel: Kossmann 2005, cited, makes it rather clear that an Arabic origin for this word is very improbable. Moreover, there is no such Arabic word as “ləγəmal”; only the form jamal is correct.
4.87 physician: If Shuwa Arabic or some such variety has a term liktaay, there can be little doubt that it is a loan into Shuwa, not from Shuwa. As the comment indicates, this comes from English, not from Arabic.
7.422 blanket: The comments indicate a Berber form abroγ, but the field gives abrok. The Arabic etymology is less implausible than it appears, since the semantic shift to “full body covering” is well-attested, as in English “burka” from the same source.
12.081 above: here it is called areal and probably not Arabic, but under “sky” and “heaven” the same word is listed as “clearly borrowed”. One of these statements must be wrong.
13 zero: the Hausa form is transcribed correctly in comments, but wrongly under “Source words”.
18.51 write: rubuta is Hausa, not Berber, as the sources quoted make clear. The proto-Berber form had no suffix -t (as Kossmann indicates), and neither do any of the equivalent modern Berber verbs.
19.62/20.11 quarrel: If it’s related to “alhilaafu”, the Arabic form is al-xilāf. If it’s related to “judge”, that form is irrelevant. In either case, there is no Arabic word “alwalaʔ” with appropriate meaning.


Kim said...

I was very excited by the WOLD project (and still am, in principle). But when I looked through the lists for the two languages in the set that I feel I have some background in, historical-linguistically, I was very taken aback by the oddness of some of the judgments of borrowed-ness. It made me a little more wary of the overall quality of the data... and your post here increases my uncertainty.
