In Guy Deutscher's new book The Language Glass (which I'll be reviewing on this blog sometime soon) he claims (p. 110) that "Linguists who have described languages of small illiterate societies estimate that the average size of their lexicons is between three thousand and five thousand words." This would be rather interesting, if verified - but this statement is not sourced at the back, and is in any case too vague (what counts as "small"?) to be relied on as it stands. Does anyone have any idea where he might have got this figure?
I haven't found his source, but Bonny Sands et al.'s paper "The Lexicon in Language Attrition: The Case of N|uu" gives a nice table of Khoisan dictionaries' sizes, ranging from 1,400 for N|uu to < 6,000 for Khwe and 24,500 for Khoekhoegowab. The authors prudently conclude "The correlation between linguist-hours in the field and lexicon size is so close that no conclusions about lexical attrition can be drawn" - the outlier, Khoekhoegowab, is not only the most widely spoken language of the lot (with over 250,000 speakers), but had its dictionary written by a team including a native speaker over the course of twenty years. Given that "2,000 - 5,000 word forms (in English) may cover 90-97% of the vocabulary used in spoken discourse (Adolphs & Schmitt 2004)", it is not surprising that it should take disproportionately long to move beyond the 5,000 word range. However, they also point out that "Gravelle (2001) reports finding only 2,300 dictionary entries in Meyah (Papuan) after 16 years of study", suggesting that some languages may simply have unusually small vocabularies. Along similar lines, Gertrud Schneider-Blum's talk "Don’t waste words – some aspects of the Tima lexicon" suggested that the Tima language of Kordofan had an unusually small number of nouns due to extensive polysemy and use of idioms (I can't remember any figures, nor indeed whether she gave any).
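To get a feel for why the words beyond the first few thousand are so slow to turn up, here is a rough sketch. It assumes word frequencies follow an idealized Zipf distribution (frequency proportional to 1/rank). That assumption is mine, not a claim made by Sands et al. or by Adolphs & Schmitt. The "true" lexicon size of 25,000 is likewise a made-up figure chosen purely for illustration.

```python
def zipf_coverage(top_n, vocab_size, exponent=1.0):
    """Share of running text covered by the top_n most frequent words,
    assuming the word of rank r has frequency proportional to 1 / r**exponent.
    This is an idealized model, not a measurement of any real language."""
    weights = [1.0 / rank ** exponent for rank in range(1, vocab_size + 1)]
    return sum(weights[:top_n]) / sum(weights)


# Hypothetical "true" lexicon size, for illustration only.
VOCAB_SIZE = 25_000

for top_n in (1_000, 2_000, 5_000, 10_000, 25_000):
    share = zipf_coverage(top_n, VOCAB_SIZE)
    print(f"top {top_n:>6} words cover {share:.1%} of tokens")
```

Under these assumptions the first 5,000 words account for the large majority of running text, so each remaining word is rarely heard, and a fieldworker needs a great deal more exposure to catch it. Real corpora will not match the idealized curve exactly.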
I'd be interested to see other discussions of the issue of differences in lexicon size and explanations for them. My Kwarandzyey dictionary (in progress) so far stands at about 2000 words - it would be encouraging to think that I might already have done more than half the vocabulary, but I very much doubt it!