Friday, February 05, 2010

World Loanword Database

I shouldn't really be blogging at this stage of my thesis-writing, but this I had to share: the World Loanword Database has come online. Vocabularies likely to be of particular interest include Tarifiyt, Hausa, Kanuri, Iraqw, but there are plenty more, all carefully analysed for loanwords... Have fun, and feel free to discuss any mistakes you think you spot in it here :)

(Via Glossographia.)

8 comments:

Lameen Souag الأمين سواق said...

Alright, actually I'll start that with some errors in the Kanuri one:

* lə̀mân "animal" < Ar. al-māl-, dialectally "livestock", not al-mann, surely?
* dímì "sheep", támà "lamb": "adaman" is a misspelling, and the dəmman sheep breed is universally known in the Sahara as coming from sub-Saharan Africa.
* Berber alγəm, cited correctly for kàlímò, doesn't come from Arabic jamal (much less from the nonexistent "Arabic" form * lǝɣǝmal - unless this is some Shuwa form, in which case they should say so.)
* "Arabic" liktaay "doctor"? If that is Shuwa or something, it certainly isn't of Arabic origin.
* bàyîl "stingy": from Arabic, certainly, but the form (baxīl) is omitted for some reason.
* "Arabic" adîn "the east"? Presumably miscoded.

And there are a lot of mistranscriptions in the Arabic forms quoted. But enough - back to work now...

Lameen Souag الأمين سواق said...

And the Japanese entry traces sekai "world" < Chinese shijie < Sanskrit, while the Chinese entry for shijie says "no evidence of borrowing." The number of loans in Chinese does look suspiciously low...

Anonymous said...

Thanks for the link to my blog! I'm still working through some of the data for numerals, but it looks like in a couple of cases, loanwords for 'ten' and 'five' are correctly identified but the word for 'fifteen', analyzable as 'ten' + 'five', is identified as 'no evidence of borrowing'. The result is that it falsely appears that 'fifteen' is less often borrowed than either of its components.

bulbul said...

There are a bunch of Amharic loans in Gawwada that I think could be ultimately traced to Arabic, e.g. kəbrit, hakim, suf, kis etc.
For Kildin Saami, some words are misidentified, e.g. 'kurva' is marked as '5. no evidence for borrowing' when it is clearly a Russian loan and in fact, the database says so right there. Maybe it's just my imperfect understanding of the methodology and terminology, but there are a few other entries that seem fishy, e.g. voafsxess, which is marked as '2. probably borrowed' (with Latin 'aurora' as the source, really?), but also as 'pre-Proto-Saami' and 'Present in pre-contact environment'.
The entry for 'mudta' = other (adj., I assume) gives the Finnish adverb 'muutoin' as the source. That's a bit too far to go for the etymology and where did the -n go? Off the top of my head, I would suggest the much more frequent 'muuta' = other-PART.
And btw, borrowed negative marker? Nice!

languagehat said...

This is fantastic -- I'm blogging it immediately!

Cemmust said...

This comment has been removed by the author.

Cemmust said...

Azul,

Thanks a lot, Lameen, for the link. I enjoyed the Tarifit part very much. Great work by the great Tarifit fan Dr. Maarten Kossmann!

Another topic... I would like to hear your opinion(s), if possible, on this:

As you know, Berber neology has accumulated a large number of words that are now accepted by many mainstream Berber linguists. These unified "pan-Berberized" neologisms are bringing the Berber speakers closer together, compensating for the vernacular divide and showing us a hint of a future standard written Berber.

Some of these popularized Berber neologisms are:
1- "Tudrsnt" (biology)
2- "Tarbasnt" (pediatrics)
3- "Talsasnt" (anthropology)
4- "Timttisnt" (sociology)

I never saw one single Berber linguist try to expand on these neologisms by deriving more words like: biological, biologist, pediatric, pediatrist, sociological, sociologist, etc.

My question is:
What would be the most correctly derived words from the above mentioned neologisms?

My suggestions (in masc. / fem.):

1 -Biological: "Udrsan / Tudrsant"
-Biologist: "Amusnudr / Tamusnudrt"

2 -Pediatric: "Arbasan / Tarbasant"
-Pediatrist: "Amusnarba / Tamusnarbat"

3 -Anthropological: "Alsasan / Talsasant"
-Anthropologist: "Amusnalsa / Tamusnalsat"

4 -Sociological: "Imttisan / Timttisant"
-Sociologist: "Amusnimtti / Tamusnimttit"

Lameen Souag الأمين سواق said...

Well, they sound good to me - they may not make much sense in Siwi though...