Wednesday, September 29, 2010

Small vocabularies, or lazy linguists?

In Guy Deutscher's new book The Language Glass (which I'll be reviewing on this blog sometime soon) he claims (p. 110) that "Linguists who have described languages of small illiterate societies estimate that the average size of their lexicons is between three thousand and five thousand words." This would be rather interesting, if verified - but this statement is not sourced at the back, and is in any case too vague (what counts as "small"?) to be relied on as it stands. Does anyone have any idea where he might have got this figure?

I haven't found his source, but Bonny Sands et al.'s paper "The Lexicon in Language Attrition: The Case of N|uu" gives a nice table of Khoisan dictionaries' sizes, ranging from 1,400 for N|uu to < 6,000 for Khwe and 24,500 for Khoekhoegowab. The paper prudently concludes "The correlation between linguist-hours in the field and lexicon size is so close that no conclusions about lexical attrition can be drawn" - the outlier, Khoekhoegowab, is not only the biggest language of the lot (with over 250,000 speakers), but had its dictionary written by a team including a native speaker over the course of twenty years. Given that "2,000 - 5,000 word forms (in English) may cover 90-97% of the vocabulary used in spoken discourse (Adolphs & Schmitt 2004)", it is not surprising that it should take disproportionately long to move beyond the 5,000 word range. However, the paper also points out that "Gravelle (2001) reports finding only 2,300 dictionary entries in Meyah (Papuan) after 16 years of study", suggesting that some languages may simply have unusually small vocabularies. Along similar lines, Gertrud Schneider-Blum's talk "Don’t waste words – some aspects of the Tima lexicon" suggested that the Tima language of Kordofan had an unusually small number of nouns due to extensive polysemy and use of idioms (I can't remember any figures, nor indeed whether she gave any).

I'd be interested to see other discussions of differences in lexicon size and explanations for them. My Kwarandzyey dictionary (in progress) so far stands at about 2,000 words - it would be encouraging to think that I might already have covered more than half the vocabulary, but I very much doubt it!

Wednesday, September 22, 2010


I finally got my hands on an article I had been seeking for a while about the "Kouriya" language of Gourara (around Timimoun, Algeria): Rachid Bouchemit, 1951. Le Kouriya du Gourara, Bulletin de Liaison Saharienne 5, pp. 46-47. While short, it's significantly more informative than the vague rumours to be found in other sources. "Kouriya", it turns out, was the general-purpose name given locally to any Black African language - "L'unité du terme cache la pluralité des idiomes: Haoussa, Bambra, Foullan, Mouchi, Songhai, Bornou, Boubou, Gouroungou, Minka, Sarnou, Nourma, Kanembou, Karkawi, etc..." (that is, the single term conceals a plurality of languages), in particular as spoken by ex-slaves in the region. Following the abolition of slavery, these languages, no longer reinforced by the arrival of new slaves, rapidly fell into disuse; the new generation learned Arabic and Taznatit instead. By 1951, the author could find only seven or eight speakers of a "Kouriya" in Timimoun, and only two of them spoke the same language, namely Bambara.

While the author leaves the etymology unexplained, I would add that the term "Kouriya", and the corresponding ethnonym kuri, probably derive from Songhay koyra "town, village", used to form the Songhays' own name for themselves, koyra-boro "townsman"; Songhay is, after all, the nearest major ethnic group in the Sahel to the Gourara region.

Monday, September 13, 2010

Arabic right-hemispheric WEIRDness

Recently Language Hat asked for informed reactions to a BBC report headlined "Reading Arabic 'hard for brain'". The papers under discussion are to be found at Eviatar's home page, in particular the 2009 paper "Language status and hemispheric involvement in reading: Evidence from trilingual Arabic speakers tested in Arabic, Hebrew, and English" but clearly also the 2004 paper "Orthography and the hemispheres: Visual and linguistic aspects of letter processing". Now I'm no psycholinguist, but obviously this story smells fishy, so I had a closer look.

At least one glaring mistake is clearly the BBC's fault: it wrongly claims "When the Arabic readers saw similar letters with their right hemispheres, they answered randomly - they could not tell them apart at all." In fact, this seems to conflate two different experiments. Telling letters apart was the first task in the 2004 paper, and the Arabic readers' error rates for similar letters were only 8% (Table 6) - worse than with the left hemisphere, but not nearly so bad as answering randomly. The claim that "there is a specific RH deficit in reading Arabic, because that is the only condition (with bilateral presentation), where these native Arabic speakers responded at chance" comes from the 2009 paper - but the task referred to there was substantially more complicated. The subjects were looking at words/nonwords, not letters; they were presented with two words, one for each hemisphere, one of which was underlined; and they had to decide whether the underlined "word" was a real word or not. Other issues are not so much wrong as stupid: talking as though students could choose which hemisphere to learn with, for example.

However, the BBC cannot be blamed for drawing excessively sweeping conclusions from this experiment. The authors themselves talk of their results as applicable to Arabic in general, which rather overstates the case. In both papers, the Arabic speakers were all also fluent speakers of Hebrew, which they had studied since second grade, and were living in a state where Hebrew is the dominant language. In the 2004 test, at least, they were also all undergraduates studying for degrees taught in Hebrew. Obviously, this is a rather unusual situation for Arabic speakers! In particular, it is one where pragmatic (and status-related) motivations to study Hebrew, and opportunities to familiarise oneself with it, are likely to be much greater than for Arabic (especially given the big difference between spoken and written Arabic). In some types of tests, these speakers' right hemispheres seem to read Hebrew more easily than Arabic. The authors take this to mean that there is a "specific difficulty of the RH with Arabic orthography". But, without further testing elsewhere, it can equally well be taken to reflect the sociolinguistic situation of Palestinian citizens of Israel. This is, in fact, a special case of a much wider problem: most psychology experiments focus on "WEIRD" populations (read the link - it's a concept very much worth remembering when you read the science news).

Friday, September 10, 2010

Doctorate done

Eid Mubarak everyone! I am now Dr. Souag. (As of a couple of weeks ago, actually, but I've been doing other stuff instead of being online.) You can read my thesis online, for the moment: Grammatical Contact in the Sahara. My examiners were Prof. Jeffrey Heath and Dr. Martin Orwin. Thanks once again to everyone in Tabelbala or Siwa who helped me learn their languages, and to my supervisors, teachers, friends, and family. I'm currently working out future plans, but rest assured that they include plenty more research.