Showing posts with label lexicography. Show all posts

Tuesday, February 20, 2024

"Punching up/down" in comedy: dating a lexical innovation in English

Any educated English speaker nowadays is likely to be familiar with the idea that comedy should punch up, not punch down: i.e., that it's okay to make fun of people more powerful than yourself, but not of people less powerful. But I remember being struck by the novelty of this expression when I first encountered it, well into adulthood. Notwithstanding the recency illusion, a bit of research suggests that my impression was correct. The earliest attestations I've been able to track down online go back to July 2012, in connection with a controversy about rape jokes made by some comedian named Daniel Tosh:

"Kilstein trots out the old trope that all comics are victims who have been bullied and that’s why we’re doing standup. Total bullshit, of course, but he uses the tired cliche to glorify himself and others– who are “punching up”– and characterizes Tosh and others as tyrants or bully comics who are now punching down." (Brian McKim & Traci Skene, Tosh.Opus, 16 July 2012)
"The answer is that in both cases, the comedians were “punching down.”
Punching down is a concept in which you’re assumed to have a measurable level of power and you’re looking for a fight. Now, you can either go after the big guy who might hurt you, or go after the little guy who has absolutely no shot. Either way, you’ve picked a fight, but one fight is remarkably more noble and worthwhile than the other. Going after the big guy, punching up, is an act of nobility. Going after the little guy, punching down, is an act of bullying." (the pseudonymous "Kaoru Negisa", Punching Up, 19 July 2012)

All three writers are, naturally, American, and at least two of them are standup comedians themselves. Presumably the expression would already have been in use in some circles - perhaps backstage in standup comedy - for some years before that. But internal evidence suggests that it was still not assumed to be familiar to a general audience; both sources feel the need to put it between quotation marks on first use, and one even provides a definition, treating it as a metaphorical extension of a meaning used in the context of fights rather than as a familiar term in the context of comedy. (As further evidence, one may point to its complete absence from this 2012 Jezebel article about the same controversy; had it been written a few years later, it would seem unthinkable not to use the term "punching down" in expressing these ideas.) The term's use on MSNBC (as mentioned in the first source) would have been a good first step towards making it familiar to a wider audience. By 2014, it was already appearing in The Atlantic ("'We like standing up for the little guy, we like punching up,' Bolton said."). On Google Books, however, the earliest hits in the relevant sense show up only in 2016, at which time the "'punching up' vs. 'punching down' dichotomy" could still be described as a way in which this tension has "recently been encoded" (Taboo Comedy). Before that date, the object of "punching down" mostly seems to have been bread dough.

Can anyone find an attestation predating July 2012? And does this new terminology represent a new concept of comedians' moral duties, or just relabel an older one? If the latter, what did earlier American comedians call it?


Via @sanddorn on Twitter and Matt Farthing, a 2011 attestation - once again by a stand-up comedian, but from England this time.

"And a lot of comedians do jokes that I think aren’t funny enough to justify what they are about, and there’s plenty of ways you can be offensive without ‘punching downwards’. When FB does jokes about Palestine or black people there’s much more of a point behind it really. But it’s difficult because that’s his job, that’s how he sees himself – as this comedian who’ll say anything and make jokes about anything." (Richard Herring, 18 January 2011)

And using this, I find that Ben Zimmer managed to discover an even earlier attestation, in a good discussion of this term's origins: a blogpost, also by Richard Herring, in December 2010. Note that, in these earliest attestations, it appears as part of a broader metaphor of likening satire to punching rather than as a preset cliché: "the weak punching the strong, rather than the strong bullying the weak", "Though there are no rules, comedy, I feel, should be siding with the weak and the oppressed and punching either inwards (at the comedian him or herself) or upwards (at the powerful or the oppressors)."

The metaphor derives, as Zimmer notes, from the world of boxing: "If you’re punching up, you’re taking on an opponent who might be taller or perhaps in a higher weight class, while punching down would be for an opponent who’s shorter or in a lower weight class." But its transfer to comedy doesn't appear to have been direct: the earliest relevant metaphorical uses found by Zimmer reflect power differentials in the contexts of British football (2002), then American politics (2006).

Sunday, August 27, 2023

An unusual polysemy in Algeria and its cultural background

Today I heard nsəhhlu? “Shall we head off?” The verb səhhəl expresses two rather different meanings: transitive “make easy” and intransitive “head off, leave”. The former is well-integrated into the lexicon: the verbal template BəCCəD regularly forms causatives from triliteral adjectives and verbs, and sahəl “easy” accordingly yields səhhəl, just as barəd “cold” yields bərrəd “make cool”. The latter is much less so: the root shl has no particular ties to motion. A colexification of “leave” with “make easy” is not cross-linguistically common (see CLICS), and a linguist encountering it in isolation in some wordlist would surely be at a loss to account for it.

It is not, however, arbitrary or accidental. The missing link can easily be found by going beyond the lexicon proper into the realm of politeness: a standard expression used by people staying behind to say goodbye to people leaving is ḷḷah ysəhhəl “may God make it [the trip] easy”. (Algerian Arabic etiquette is pretty much all about knowing which blessing to use when.) The intransitive meaning is therefore indirectly derived from the transitive one.

Knowing this, and knowing the extent of lexical-typological convergence in this region, one might predict that a similar colexification should be found in Kabyle. Sure enough, consulting Dallet (1982), one finds sahəl “leave on a trip; (God) make a trip easy”. He even records the corresponding blessing to a person departing on a trip: ad isahəl ṛəbbi, yəlli tibbura! “may God make it easy and open the doors!” Unfortunately, the verb is simply an Arabic borrowing rather than a calque properly speaking, although it’s based on a different verb template than the Dellys Arabic one.
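
As a footnote to the causative template mentioned at the start of this post, here is a minimal sketch of how it maps a triliteral root onto a causative stem by doubling the middle consonant. The function name and the representation of a root as a plain three-letter string are illustrative assumptions; roots containing digraphs or consonants written with combining diacritics would need a different representation.

```python
def causative(root: str) -> str:
    """Fit a three-consonant root into the causative template.

    The template inserts schwa after the first consonant, doubles the
    second consonant, and inserts schwa before the third.
    """
    if len(root) != 3:
        raise ValueError("expected a root of exactly three consonants")
    c1, c2, c3 = root
    return f"{c1}ə{c2}{c2}ə{c3}"


# The two examples given in the post.
print(causative("shl"))  # from sahəl "easy"
print(causative("brd"))  # from barəd "cold"
```

The template predicts only the form; as the post shows, the "head off" sense of səhhəl cannot be predicted from the root and has to be explained through the blessing formula.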

Wednesday, November 03, 2021

Instrument nouns between Dholuo and Arabic

In Dholuo (a West Nilotic language of Kenya), instrument nouns are formed using ra-...-i (the final -i is dropped after sonorants and semivowels), as in the table below (Tucker 1993:111-112, retranscribed). Both English and Arabic have comparable formations. In English, instrument nouns are occasionally formed with the -er suffix, like agent nouns. In Arabic, instrument nouns are more systematically formed, but with a variety of different patterns, starting with mi-..., or in modern colloquials with a feminine agent noun CaCCaaC-a.

However, taking a look at the cases listed by Tucker, we may note a striking cross-linguistic difference in distribution. In Arabic, all but three of the translated nouns use an instrument noun pattern of some sort, and two of the others use a more general verbal noun pattern; only "ladder" appears completely underived. In English, "peg", "billhook", "pestle", "tongs", "lid" all seem to be underived and simplex, and for several cases with zero-derivation (notably "hoe", "rake", "drill", "sign"), intuition suggests that the verb derives from the noun, the opposite of what we see in Arabic or Dholuo.

This suggests a typological difference in the structure of the lexicon: perhaps some languages "prefer" to mark instrument nouns as such and to form them from corresponding actions, while some prefer simple instrument nouns from which verbs may be formed indicating the corresponding actions. I wonder whether that holds up on a larger sample? What does your language tend to do, dear reader?

Verb (gloss, Dholuo, Arabic) | Instrument noun (gloss, Dholuo, Arabic)
cut toŋ-o قطع | billhook, cutter ra-tóŋ̂ منجل
slash bẹt-ọ مزّق | slasher rạ-bẹ́t-ị̂ منجل طويل
hoe pur-o عزق | hoe ra-púr̂ معزق
scratch gwạr-ọ خدش | forked rake rạ-gwạ́r̂ مدمّة
see ŋịy-ọ رأى | mirror rạ-ŋị́ị̂ مرآة
strain dhịŋ-ọ صفّى | strainer rạ-dhị́ŋ̂ مصفاة
pound yọk-ọ دق | pestle rạ-yọ́k-ị̂ مدقة
pierce cwọw-ọ ثقب | piercing instrument ra-cwọ́p-î مثقاب
hold mạk-o مسك | tongs rạ-mạ́k-ị̂ ممساك
plug up din-o سد | stopper ra-dín̂ سدّادة
hang ŋạw-ọ علّق | peg for hanging ra-ŋạ́ŵ علاّقة
cover um-o غطّى | lid, cover ra-úm̂ غطاء
show nyis-o أظهر | sign ra-nyís-î علامة
climb ịdh-ọ صعد | ladder rạ-ị́dh-ị̂ سلّم
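
The segmental part of the Dholuo rule described above (prefix ra-, suffix -i, with the -i dropped after sonorants and semivowels) can be sketched as follows. This is a rough illustration only: the function name and the list of stem-final consonants are assumptions, and the sketch ignores tone and the dotted-vowel alternation visible in the table.

```python
# Stem-final consonants after which the final -i is dropped.
# This inventory is an assumption; the digraph "ny" is covered because it ends in "y".
SONORANTS_AND_SEMIVOWELS = ("m", "n", "ŋ", "r", "l", "w", "y")


def instrument_noun(verb: str) -> str:
    """Form a segmented instrument noun from a verb cited as 'stem-o'."""
    # Strip the verbal ending "-o" if it is present.
    stem = verb[:-2] if verb.endswith("-o") else verb
    if stem.endswith(SONORANTS_AND_SEMIVOWELS):
        return f"ra-{stem}"
    return f"ra-{stem}-i"


for verb in ("toŋ-o", "pur-o", "din-o", "um-o", "nyis-o"):
    print(verb, "->", instrument_noun(verb))
```

Not every form in the table follows from this: "piercing instrument" shows a change in the stem-final consonant (w in the verb, p in the noun), which would need to be listed as an exception.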

Tuesday, October 22, 2019

Getting lost in the NW Sahara

Two languages of the northwestern Sahara, spoken reasonably close to each other, have basic motion verbs derived from a word that originally meant GET LOST. Let's see if we can figure out how that happened.

For COME, practically all Berber languages consistently use reflexes of the proto-Berber word *asəʔ. In the largest Berber variety, however - Tashelhiyt, in southern Morocco - this root has been lost, and a quite different verb is used: ašk (ⴰⵛⴽ). The original meaning of this verb can still be seen in other Berber languages, such as Tamasheq: GET LOST (a meaning which in Tashelhiyt has been replaced by what's probably a borrowing from Arabic جلا.) Presumably, GET LOST came to mean WANDER, and WANDER (over) came to mean COME.

In Songhay, GET LOST is *dere(y), preserved as such in most varieties. In Korandje in western Algeria, however - uniquely within the family - this root's reflex has undergone a very similar shift in meaning: dri now means GO. (Songhay speakers might assume this comes from dira WALK, but this word, from Proto-Songhay *dida, rather corresponds to Korandje zda WALK.) Meanwhile, Berber *aškəʔ GET LOST has itself been borrowed - probably from Tamasheq - as wuška GET LOST (the vowels reflect the Berber perfective form.)

In summary:

Language | COME | GET LOST | GO
Tashelhiyt | ašk | žlu | ddu
Tamasheq | as | ašk | ăkk
Gao Songhay | kaa | dere | koy
Korandje | ka | wuška | dri

Both changes can be summarized as GET LOST > BASIC-MOTION-VERB. Lexically, Korandje shows heavy influence from southern Moroccan Berber, much of which seems to match Tashelhiyt better than it does the Southern Tamazight varieties currently spoken closest to Tabelbala. That makes it rather tempting to seek a contact explanation. But if Korandje was copying a Tashelhiyt pattern, why would it replace GO rather than COME?

To make sense of what happened, I think we have to envision an intermediate earlier stage where WANDER (from GET LOST) was getting used as a generic verb of motion irrespective of direction in some (perhaps expressive) contexts. Both Tashelhiyt and Korandje require direction towards (and sometimes away from) the speaker to be expressed with a directional morpheme outside the verb root proper, so no ambiguity would necessarily result. From this situation, WANDER could end up replacing either COME or GO, while still maintaining the existing (seemingly superfluous) lexical distinction between the two by keeping the other root.

Now I think about it, British English offers a possible parallel for the initial stages of such a development, with particles substituting for the directionals of Berber and Songhay. In phrases like "he wandered over" ("he came over"), "he wandered off" ("he went away"), the original implication of aimlessness has faded away in informal usage to the point of being virtually absent. Should we expect some peripheral English dialect to replace "come" or "go" with "wander" altogether? Check back in a few centuries to find out...

Sunday, September 08, 2019

C. S. Lewis' criterion for prescriptivism

Prescriptivism - it's what linguists love to hate, and not without reason. So much of it is just a thin veil stretched over social prejudices. But could we have socially impartial, language-internal criteria for good and bad language change? C. S. Lewis, in Studies in Words (1960:6), proposes one:
"This implies that I have a good idea of what is good and bad language. I have. Language is an instrument for communication. The language which can with the greatest ease make the finest and most numerous distinctions of meaning is the best. It is better to have like and love than to have aimer for both."

In the book, he makes some effort to use this to judge various changes in English lexical semantics: he deplores the loss of the old senses of "liberal" and "conservative" caused by their adoption as party political labels replacing Whig and Tory, but regards the change of "wit" from "genius" to its modern meaning as having happily made it a useful word.

What would his reaction have been to some of the changes in English that have occurred since? Applying his criterion strictly, he should have welcomed words like "vape" or "twerk" - new forms expressing previously unlexicalized meanings. (His probable reaction to their referents is another story!) "Irregardless" should have left him unmoved - a new (actually not that new) word for a meaning already expressed by "regardless" has no impact on the ease of making "the finest and most numerous distinctions of meaning" (and may make it easier for poets to fit their thoughts to the metre). The use of "literally" as a general intensifier, on the other hand, should have driven him up the wall - he specifically complains about "verbicide" through inflation, citing the comparable case of "awfully". In brief, whatever the merits of this criterion, it cannot consistently be used as a general-purpose attack on novelties; it forces the prescriptivist to consider them on a case-by-case basis.

Assuming such a criterion is accepted, the next move is predictable: someone somewhere is going to want to compare the merits of different languages on its basis. The problems with that should be obvious. Suppose language A makes finer and more numerous distinctions of meaning in one semantic field than language B, but in another semantic field the reverse is true (as is usually the case). How do you weigh the importance of different semantic fields in an impartial way? To make matters worse, many of the relevant distinctions of meaning are only going to be familiar to a handful of domain-specific experts; can we really consider them as properties of the language as a whole (whatever that even means)? A criterion like this makes more sense as a standard for measuring individual changes than as a metric for comparing entire languages.

Wednesday, November 30, 2016

Siwi vocabulary for addressing animals

Probably every language has a certain number of forms used especially for addressing animals, especially domestic animals. In response to a recent query by Mark Dingemanse, I gathered together all the ones I happened to have recorded for Siwi - the list below is definitely not exhaustive, but should at least be suggestive. Note the sounds used - clicks do not usually form part of Siwi phonology!

To chicks:
didididididi: eat!

To cats:
ərrrr: come!
ǀǀǀǀǀ: come!
pss: move!

To dogs:
ʘʘʘʘʘʘʘ: follow me!

To goats:
əšš: go!
ħəww: go!
xətt: go!
kškškškškš: eat!

To donkeys:
ǁǁǁǁ: giddy-ap! (?)

The interesting question here is: to what extent are these arbitrary, reflecting an emergent cross-species convention just as most human lexemes do, versus to what extent do they reflect innate properties of animal perception and communication? How do they compare to those you've encountered, if any?

Tuesday, June 24, 2014

From Figuig to Igli: Berber in the Algerian-Morocco borderland

The number of good Berber descriptive dictionaries has been slowly but steadily increasing in recent years, but Hassane Benamara's new Dictionnaire amazigh-français : Parler de Figuig et ses régions (Rabat: IRCAM, 2013), which I was lucky enough to be lent a copy of lately, is surely one of the best. Apart from being quite unusually large (800 pages), it incorporates examples, multiple senses, pictures of items difficult to describe, and an appendix with encyclopedic information on culturally specific words such as festivals and children's games. It incorporates a few neologisms useful for schooling, but takes a fairly inclusive attitude towards Arabic loanwords. There are barely 15,000 people in Figuig, but, astonishingly enough, this is actually the second dictionary of Figuig Berber published by a native speaker; the first, Ali Sahli's معجم أمازيغي-عربي (خاص بلهجة أهالي فجيج) (Oujda: Al Anwar Al Maghribia, 2008), was a good effort, but is substantially shorter and uses a less accurate transcription. (There's even another linguist from Figuig, Mohamed Yeou, threatening to make a third dictionary – if he goes ahead with the project, he'll have a high hurdle to clear.)

Across the border in Algeria, the situation is rather different. A number of towns across a wide area around Bechar and Ain Sefra speak Berber varieties closely related to that of Figuig, collectively imprecisely termed "Shelha". Some of them seem to be shifting to Arabic (on my latest trip, I was told that in Lahmar they had stopped speaking Berber with their children, and for Igli I had heard the same much earlier.) But little effort – and no official effort, as far as I know – is being made to document them. The only (very) partial exceptions of which I am aware are Igli and Boussemghoun.

For Igli (population 7000), I have already described the local Scouts' efforts to put together an online dictionary. More recently, however, I came across a laudable local attempt at approaching the problem academically: Fatima Mouili's The Berber Speech of Igli, Language towards Extinction. After a very brief summary of Igli grammar and phonology, unfortunately made frequently illegible by font problems, the author discusses the reasons for language shift. Corresponding to my impressions for the region, including Tabelbala, she cites emigration and the desire to ensure educational success as important drivers; others are more surprising, including the immigration of refugees expelled by the French from a nearby village during the Algerian War of Independence. Apparently, her thesis discusses similar issues, for those with 59€ to spare...

For Boussemghoun (population 4000), a few articles and a book by Mohamed Benali may be cited, all focusing – as far as I can see – exclusively on the sociolinguistic situation of Berber in the town. A local Berber-language poet billed as "the Ait Menguellet of Boussemghoun", Bashir Oulhaj, has a considerable presence on YouTube, e.g. here; he's even been interviewed by Figuig News. The town seems to be treated as the centre of Amazigh identity in the region; the HCA has even organised a symposium there. Nevertheless, little if any descriptive work has been published on its variety of Berber.

Taken together, there are probably more speakers of Berber in southwestern Algeria than in and around Figuig. Why the difference, then? Is it because linguistics is better represented in Moroccan universities than in Algerian ones? (Notwithstanding some interesting work coming out of Algeria, I think that is fair – it would be hard to think of any linguist working in Algeria with a profile comparable to Abdelkader Fassi Fehri, for example.) Or is it because the Amazigh movement in Morocco is less closely associated with one side in the "culture war"? (Benali observes that, while most Semghounis wanted Berber to be taught in schools, they rejected the installation of an HCA office due to distrusting their politics.) Or are there more specific, purely local factors explaining the difference? That would be worth a study in itself – though perhaps not as much so as the Berber varieties in question!

Thursday, December 26, 2013

Does Arabic have the most words? Don't believe the hype.

For some time, I've been hearing rumours (from Arabs, of course) that Arabic has the largest number of words of any language. Recently I found one vector for this rumour: Comparison of the Number of Words in Languages of the World, a poster put together by Azzam Aldakhil which has the merit of at least giving the sources for its figures, namely Muʕjam ʕAjā'ib al-Lughah by Shawqī Ḥamādah, 2000. (In a follow-up comment he gives the page numbers, 83-84.) This poster claims that "Arabic has 25 times as many words as English".

Unfortunately for this claim, if you go to the book cited, what you actually find is a calculation of the number of possible roots in Arabic, without regard to whether or not the root actually has a meaning. Such a count includes huge numbers of unused roots such as بزح bzḥ or قذب qḏb, while at the same time lumping together all words derived from the same root; كتاب book, كاتب writer, and مكتب office are three words, but only one root. The result of such a calculation might tell us something about the potential for expanding Arabic, but absolutely nothing about the state of the Arabic language. And since in practice both Arabic and the languages it is being compared to on that poster allow arbitrarily long words without real roots, if only in loanwords, it doesn't even tell us much about its potential.
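
To make the difference between the two kinds of count concrete, here is a minimal sketch. It does not reproduce the book's actual procedure, which is not quoted here; it simply assumes that any sequence of the 28 letters of the Arabic alphabet counts as a "possible root", and the names used are illustrative.

```python
ARABIC_LETTERS = 28  # letters of the Arabic alphabet


def possible_roots(length: int, alphabet_size: int = ARABIC_LETTERS) -> int:
    """Count every letter sequence of the given length, meaningful or not."""
    return alphabet_size ** length


# Counting words versus counting roots, using the example from the text.
WORDS_BY_ROOT = {"ktb": ["كتاب", "كاتب", "مكتب"]}  # book, writer, office
word_count = sum(len(words) for words in WORDS_BY_ROOT.values())
root_count = len(WORDS_BY_ROOT)

print("possible triliteral roots:", possible_roots(3))    # 28**3 = 21952
print("possible quadriliteral roots:", possible_roots(4))  # 28**4 = 614656
print("words:", word_count, "roots:", root_count)
```

A purely combinatorial count like this grows with the alphabet size and the root length alone, so it says nothing about how many of those sequences are attested.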

Both the number of Classical Arabic roots with actual meanings and the number of words can be estimated from the classic dictionaries: according to Sakhr's statistics, there seem to be around 10,000 roots, and up to 200,000 distinct words. Roots don't play such a major role in the lexicography of most non-Semitic languages, so it's difficult to compare the number of roots cross-linguistically. But in terms of words, that would be slightly fewer than English (250,000 in the OED, although the poster cites 600,000) and slightly higher than French (over 100,000 excluding proper nouns, according to the Académie Française).

However, such comparisons can hardly fail to be misleading. For one thing, English is much more hospitable towards dialectal and colloquial usages than Arabic is – the OED is full of words marked as Scottish or Northern or slang or whatnot, the equivalents of which would never be accepted by an Arabic dictionary. For another thing, the whole enterprise of counting words across languages runs into apparently insuperable problems, especially when it comes to compounds, which Arabic dictionaries do not normally treat as words. If you include compounds, then compound-friendly languages like German or Turkish or Inuktitut are automatically going to beat all the rest – and all the available statistics that I've seen for, say, English happen to include compounds.

So the best answer is that we don't really know, and that word count, even if we could measure it better, is not a very good measure of a language's expressive power anyway. Some missing words make a genuine difference, as I've discussed here before. But is English really missing out by not having distinct words for male camels (جمل) vs. female camels (ناقة)? Is Arabic really missing out by not having a special word for cornpone, or for scones?

Sunday, February 10, 2013

Kabyle vocab 1: Verbs of motion

I've been taking advantage of being in Paris to attend some Kabyle classes. However, the classes are in French - as are all the textbooks - and I find that I memorise vocabulary more easily when English equivalents are presented. So I'm going to experiment with writing up vocabulary lists and posting them online periodically, on the theory that these might be useful to Anglophone learners other than myself, and that putting them together will be good for my memory. For today, the theme will be verbs of motion. I find that knowing facts about a word's wider connections makes me more likely to remember it, but that may just be me, so if you don't, feel free to ignore them...

Go: ṛuḥ "go!", yeţṛuḥ(u) "he goes", iṛuḥ "he went". This verb, obviously, is borrowed from colloquial Arabic ṛuḥ (like its Siwi counterpart ṛuḥ, iteṛṛaḥ, iṛaḥ); it is quite commonly used, but there is a more purist alternative:

Go: ddu "go!", iṯeddu "he goes", yedda "he went". This verb is also used with the same meaning in Tashelhiyt; it's probably related to Tamasheq idaw, itidaw, ǎddew "accompany, go with". Example: Tom yebɣa ad yeddu ɣer Japun.

Come: as "come!", yeţţas "he comes", yusa "he came". This nearly pan-Berber verb is usually combined with the particle -d "hither (towards here)"; in Siwi, that particle has fused with the stem, yielding héd, itased, yused. Example: Yusa-d ɣer Japun asmi ay yella d agrud.

Pass: ɛeddi "pass!", yeţɛeddi / yeţɛedday "he passes", iɛedda "he passed". This verb, widespread in both Berber and dialectal Arabic, is from Arabic عدا "he passed", as the generally un-Berber ɛ betrays. Siwi retains fel, iteffal, yefla "pass / depart"; the rarer cognate verb (fel, yeffal, ifel) in Kabyle means "go over". Example: ɛeddaɣ fell-as deg wezniq.

Arrive: aweḍ "arrive!", yeţţaweḍ "he arrives", yebbʷeḍ (yuweḍ) "he arrived". Siwi instead uses an Arabic loan mraq, imerraq, yemraq; but it retains a causative of the original root, siweṭ. Example: aql-ik tuwḍeḍ-d zik.

Go up: ali "go up!", yeţţali "he goes up", yuli "he went up". The similarity to Arabic على is probably just a coincidence, since the Tashelhiyt equivalent is eɣli. Siwi uses an equally Berber but unrelated form wen, itewwan, yuna, also found in Tashelhiyt (awen); Kabyle retains a causative of this root, ssiwen "go up (eg road)", and a commoner noun, asawen "(up) a rising slope". Example: La ttalyeɣ isunan.

Go down: aḏer "go down!", yeţţaḏer "he goes down", yuḏer "he went down". Siwi again uses an equally Berber but unrelated form ggez, iteggez, yeggez, also found in Tashelhiyt (ggʷez). Example: La ttadreɣ isunan.

Go in: ḵcem "go in!", iḵeččem "he goes in", yeḵcem "he went in". The same verb is used in Tashelhiyt; Siwi uses a cognate form kim, itekkam, ikim. Example: Ttxil-k, kcem-d.

Go out: ffeɣ "go out!", iṯeffeɣ "he goes out", yeffeɣ "he went out". The same verb is used in Tashelhiyt, and (with a trivial regular vowel change) in Siwi f̣f̣eɣ, itef̣f̣aɣ, yef̣f̣aɣ. Example: Zemreɣ ad ffɣeɣ ad urareɣ?

Or, in a form more suitable for quick self-testing:

go | ṛuḥ
go | ddu
come | as
pass | ɛeddi
arrive | aweḍ
go up | ali
go down | aḏer
go in | ḵcem
go out | ffeɣ

Comments and suggestions welcome, especially if you speak Kabyle!

Tuesday, October 25, 2011

Berber dictionary online

A link I've been meaning to post for a while: Amawal n Tiddukla Tadelsant Imedyazen. The guy behind it, Omar Mouffok, deserves credit for his efforts to document Kabyle dialects outside of the mainstream, like the one spoken near Blida; many entries indicate which regions the word is used in, though unfortunately a fairly impenetrable system of abbreviations is used. Translations into French, Spanish, and Arabic are given for some words, but many are only given definitions in Kabyle.

Monday, October 03, 2011

Songhay online

The Northern Songhay family is of some general interest, both for the study of language contact - all its members are astonishingly strongly influenced by Berber and/or Arabic, to the point that only a few hundred Songhay words survive and much of the grammar has been replaced - and for understanding the history of the Sahara (they suggest both that the spread of Songhay predates the Songhay Empire and that a Berber language different from Tuareg used to be spoken in much of Mali and Niger.) I've recently put together a sort of homepage for Northern Songhay linguistics: Northern Songhay. It includes a more or less complete bibliography.

Anyone interested in that will also be interested in a site I recently came across: Songhay.org, offering lexicographical data, lessons, software, and some references focused mainly on the Songhay of Gao (Koyraboro Senni.) I particularly appreciated the pictorial dictionaries under "Encyclopédie".

Wednesday, September 29, 2010

Small vocabularies, or lazy linguists?

In Guy Deutscher's new book The Language Glass (which I'll be reviewing on this blog sometime soon) he claims (p. 110) that "Linguists who have described languages of small illiterate societies estimate that the average size of their lexicons is between three thousand and five thousand words." This would be rather interesting, if verified - but this statement is not sourced at the back, and is in any case too vague (what counts as "small"?) to be relied on as it stands. Does anyone have any idea where he might have got this figure?

I haven't found his source, but Bonny Sands et al.'s paper "The Lexicon in Language Attrition: The Case of N|uu" gives a nice table of Khoisan dictionaries' sizes, ranging from 1,400 for N|uu to < 6,000 for Khwe and 24,500 for Khoekhoegowab. The paper prudently concludes "The correlation between linguist-hours in the field and lexicon size is so close that no conclusions about lexical attrition can be drawn" - the outlier, Khoekhoegowab, is not only the biggest of the lot (with over 250,000 speakers), but had its dictionary written by a team including a native speaker over the course of twenty years. Given that "2,000 - 5,000 word forms (in English) may cover 90-97% of the vocabulary used in spoken discourse (Adolphs & Schmitt 2004)", it is not surprising that it should take disproportionately long to move beyond the 5,000 word range. However, it also points out that "Gravelle (2001) reports finding only 2,300 dictionary entries in Meyah (Papuan) after 16 years of study", suggesting that some languages may simply have unusually small vocabularies. Along similar lines, Gertrud Schneider-Blum's talk Don’t waste words – some aspects of the Tima lexicon suggested that the Tima language of Kordofan had an unusually small number of nouns due to extensive polysemy and use of idioms (I can't remember any figures, nor indeed whether she gave any.)

I'd be interested to see other discussions of the issue of differences in lexicon size and explanations for them. My Kwarandzyey dictionary (in progress) so far stands at about 2000 words - it would be encouraging to think that I might already have done more than half the vocabulary, but I very much doubt it!