Sunday, December 10, 2017

Jerusalem's suppletive gentilic

Jerusalem stands out among Arab cities today not only culturally and religiously, but morphologically as well. In Modern Standard Arabic, the city of Jerusalem is al-Quds القدس, and the gentilic suffix is -ī (properly -iyy), but "Jerusalemite" is Maqdisī مقدسي rather than the expected *Qudsī (though the latter is attested as a personal name). As a general cross-linguistic rule of thumb, morphological irregularities are most likely with older, more basic words. Yet this type of irregularity is rather unusual, even among the region's oldest and most prominent cities: Dimashq (Damascus) yields Dimashqī (Damascene), Baghdād yields Baghdādī, Makkah (Mecca) yields Makkī... How did it arise?

It turns out that, in the early Muslim era, it was formed in a perfectly regular way. In his masterwork, the medieval geographer Al-Maqdisī (d. 991) calls his hometown Bayt al-Maqdis بيت المقدس ("house of holiness"), a title now largely supplanted by al-Quds ("the holy"). It survives to the present in certain religious contexts or as a poetic synonym, not only in Arabic but in Kabyle Berber as well: H. Genevois ("Croyances") notes a traditional popular belief that the souls of the dead gather in Bit Elmeqdes, corresponding exactly to Al-Maqdisī's boast that Jerusalem is "the site of the Day of Judgement, and from it is the Resurrection, and to it is the Gathering" (عرصة القيامة ومنها النشر وإليها الحشر).

A quick search of Alwaraq's heritage library suggests that the shorter name "al-Quds" became popular around the period of the Crusades, when Jerusalem was as much a subject of dispute as now. The earliest attestation I can spot on a cursory search (excluding a work falsely attributed to al-Wāqidī) is a mention by the Andalusi traveller Ibn Jubayr (1185), who notes that "between [Kerak] and al-Quds is a day's march or so, and it is the best location in Palestine" (بينه وبين القدس مسيرة يوم أو اشف قليلاً، وهو سرارة أرض فلسطين). Very likely a longer search would yield slightly older attestations. By the time of the next major Palestinian writer I notice in the collection - Al-Ṣafadī (d. 1363) - al-Quds had clearly become the unmarked term for the town; it recurs constantly in his work.

The name Bayt al-Maqdis was thus replaced in practice by the shorter and catchier name al-Quds a good 800 years ago, yet the corresponding gentilic continues to preserve the older name. Since 1967, the Israeli government has imposed a third name as its official term for the city in Arabic: Ūrshalīm, a transcription of the Syriac name used in Christian liturgical contexts, which provoked "furious ridicule" from residents (Segev 2007:492). Since this usage remains entirely unknown to most Arabic speakers, it is unlikely to have much impact on Arabic usage. Yet the timing of the shift from Bayt al-Maqdis to al-Quds reminds us that political upheaval affects placenames as well as people's lives.

Monday, December 04, 2017

Tifinagh and place of articulation

The order of the Latin alphabet we use is a matter of historical chance; if it ever made sense, the reasons behind it were lost millennia ago. Many other writing systems, however, have tried to order their letters in a less arbitrary fashion. The most prominent successes for this approach are found in and around India, where scripts are usually ordered by place of articulation - ie, by how far back in the mouth they are pronounced - as in Devanagari: a..., ka kha ga gha ŋa, ca cha ja jha ña, ṭa ṭha ḍa ḍha ṇa... (After a couple of sound changes, this order ultimately also yields that of the Japanese kana: a, ka, sa (< ca), ta na, ha (< pa) ma, ya ra wa n.) In Arabic, the normal order of letters reflects a partial reordering by shape rather than by sound (thus ب ت ث are all grouped together, whereas in the older order they were far apart from one another). However, for technical purposes such as traditional phonetics and Qur'an recitation, one occasionally also finds the place-of-articulation order: indeed, the earliest Arabic dictionary (Kitāb al-`Ayn) used it (ع ح هـ خ غ ق ك ج ش ض ص س ز ط ت د ظ ذ ث ر ل ن ف ب م و ي ا ء).
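
For readers who like to see such an ordering spelled out, here is a minimal sketch of sorting letters by place and manner of articulation. It covers only the three Devanagari rows cited above, in transliteration; the feature table and the function name `varga_order` are my own illustrative assumptions, not a standard resource.

```python
# Places listed from the back of the mouth forwards, as in the Indic order.
PLACES = ["velar", "palatal", "retroflex"]
# Order of manners within each row of the traditional chart.
MANNERS = ["voiceless", "voiceless aspirated", "voiced", "voiced aspirated", "nasal"]

# Illustrative feature table for the transliterated letters cited in the text.
FEATURES = {
    "ka": ("velar", "voiceless"),
    "kha": ("velar", "voiceless aspirated"),
    "ga": ("velar", "voiced"),
    "gha": ("velar", "voiced aspirated"),
    "ŋa": ("velar", "nasal"),
    "ca": ("palatal", "voiceless"),
    "cha": ("palatal", "voiceless aspirated"),
    "ja": ("palatal", "voiced"),
    "jha": ("palatal", "voiced aspirated"),
    "ña": ("palatal", "nasal"),
    "ṭa": ("retroflex", "voiceless"),
    "ṭha": ("retroflex", "voiceless aspirated"),
    "ḍa": ("retroflex", "voiced"),
    "ḍha": ("retroflex", "voiced aspirated"),
    "ṇa": ("retroflex", "nasal"),
}


def varga_order(letters):
    """Sort letters first by place of articulation, then by manner."""
    def key(letter):
        place, manner = FEATURES[letter]
        return (PLACES.index(place), MANNERS.index(manner))
    return sorted(letters, key=key)


# Start from an arbitrary (code-point) order and recover the phonetic one.
print(" ".join(varga_order(sorted(FEATURES))))
```

The same idea would apply to any script once each letter is tagged with a place of articulation; only the feature table changes.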

Tifinagh, the traditional script of the Tuareg people of the Sahara, seems not to have any established traditional ordering. However, if you organize its letters by place of articulation, an obvious pattern emerges:

This table represents Tifinagh as used at Imi-n-Taborăq in Mali, as recorded by Elghamis (2011:64-65). (Note that w is a labio-velar sound; for obvious reasons, I've chosen to place it in the velar column rather than the labial one. Also, the letter put in the laryngeal plosive slot actually just indicates the presence of a final vowel, although there are reasons to suspect that it once represented a glottal stop.) There is a lot of regional variation in Tifinagh, but one thing stands out: in every variety, everything on the right side of the thick line - ie, everything velar or further back - is consistently formed exclusively out of dots, except for g - and even that is often composed of a combination of dots and lines. Throughout much of Tuareg, original g tends to be palatalized to [ɟ], and some dialects - like this one - have lost the distinction altogether.

How this distribution emerged is unclear for the moment. It is noteworthy, however, that dot letters did not exist in Tifinagh's ancestor, Libyco-Berber as used in the pre-Roman and early Roman periods (with rare, doubtful exceptions). Two of the dot letters have clear Libyco-Berber origins; ⴾ (k, three dots in a triangle) was originally ⥤ (k, a rightwards open arrow), while : (w) was originally =. Based on these two alone, one might suppose a sort of regular form shift of = to :, in which case the development might simply be coincidental. ⵗ (ɣ) may derive from the rarely attested ÷, whose value (q?) is speculative, while ... (x) is simply a rotation of ɣ. :: (q) had no Libyco-Berber equivalent, and is perhaps historically a visual "ligature" of ɣ and + (t) - the word-final cluster *ɣt becomes qq in Tuareg. The final vowel sign · might derive from classical ☰, which had the same function; alternatively, one might derive it from or the dot occasionally used to separate words, and suppose that classical ☰ actually yielded ⵂ (h), in which case the extra dot needs to be explained.

It's not impossible that Tifinagh users at some stage made a conscious link between back consonants and dots. But even if the distribution is just a coincidence, it should still be useful for anyone seeking to memorise the script.

Sunday, October 29, 2017

Butterfly-collecting: the history of an insult

Chomsky's barb about butterfly-collecting has echoed in the ears of descriptive linguists for decades, and is sometimes blamed for the withering away of field linguistics over the late 20th century. The earliest published version I could track down via Google is:
"You can also collect butterflies and make many observations. If you like butterflies, that’s fine; but such work must not be confounded with research, which is concerned to discover explanatory principles of some depth and fails if it does not do so." (Chomsky 1979:57)
So I was surprised to find a similar statement attributed to the eminent early 20th century physicist Ernest Rutherford, quoted by Dyson (2006:179) as saying "Physics is the only real science; the rest are butterfly-collecting." How did this metaphor make its way into linguistics?

For a start, it appears that Dyson's version is somewhat inexact. The Rutherford quote appears to belong to the oral tradition of physics, rather than deriving from any publication of his; the earliest version that I can find on Google Books is from Baker (1942:96):

"These ideas are crystallized in the statement, attributed to Rutherford, that science consists of physics and stamp- collecting. This is an epigram intended to mean that particular objects are uninteresting : it is the extreme view-point of a general analytical scientist."
The shift from stamps to butterflies came decades later, first attested only in 1974. In fact, the derisive comparison to butterfly collecting seems likely to have seeped into linguistics not from physics but from, of all subjects, anthropology. Edmund Leach (1961:2) makes it the central metaphor of his assault on Radcliffe-Brown:
"Radcliffe-Brown maintained that the objective of social anthropology was the 'comparison of social structures'. [...] Comparison is a matter of butterfly collecting — of classification, of the arrangement of things according to their types and subtypes. The followers of Radcliffe-Brown are anthropological butterfly collectors and their approach to their data has certain consequences."
Anthropologists would reuse the metaphor in debates over the distinction between different types of comparison in linguistics itself, whether endorsing it like Lehman (1964:387) or rebutting the criticism like Sarana (1965:29). From there it seems to have been taken up by Chomskyan linguists as an argument against Bloomfield's "discovery procedures", if I am correctly interpreting the incomplete fragment of Ferber and Lynd (1971) that I can find on Google Books:
"These procedures, which are largely a matter of classification, have been uncharitably called "butterfly-collecting" in the manner of pre-Darwinian biology: they account for a detailed "external" description of each language (what Chomsky [...]"
Geoffrey Leech (1969:4) deploys the same metaphor against rhetoric:
"Connected to this is a second weakness of traditional rhetoric - what I am tempted to call its 'train-spotting' or 'butterfly-collecting' attitude to style. This is the frame of mind in which the identification, classification and labelling of specimens of given stylistic devices becomes an end in itself [...]"
The redeployment of this argument to belittle descriptive work in general, rather than particular approaches, seems to be attributable to David DeCamp (1971:158), criticizing sociolinguistics from a Chomskyan perspective:
"The weakest theory is a 'functional' model, which only relates outputs from the black box to inputs, e. g. a grammar which would generate all and only the sentences of a language; the goal of much scientific research is to replace such a functional model with a 'structural' model, one that makes the stronger claim of describing what is actually in the black box. Mendel's 'genes' were only a functional model of genetics; the research on the DNA and RNA molecules has yielded a model that is much more nearly structural. Thus one branch of biology has at last become a true science; general linguistics is approaching that status; sociolinguistics is still in the pre-theoretical, butterfly-collecting stage, with no theory of its own and uncertain whether it has any place in general linguistic theory."
He then clarifies (ibid:170) that:
"'Butterfly collecting' is simply the collection of a whole lot of information toward the day when somebody can produce a formal theory. Now this is valuable, this is useful. We need a lot of empirical data collection also. I certainly would not want to imply by this that in this I'm saying that there is not an importance to the kinds of things that the Urban Language Survey is doing at CAL, or Bill Labov's work in New York. This is immensely important. What I am saying is that although it is necessary, it is not sufficient. We've got enough data now; it is about time to guide further research by means of some sort of a theory."
So, if we have to blame one person for reducing descriptive linguistics to butterfly collecting, it looks like it would be David DeCamp, at least until someone tracks down an earlier citation. But that misses a broader point: the disparaging comparison of data gathering to butterfly collecting seems to have become rather pervasive across a variety of disciplines in the late 20th century - including biology itself, which may well be part of where DeCamp got it from. All the way back in 1964, Theodosius Dobzhansky - who had been an ardent butterfly collector before becoming a prominent evolutionary biologist - comments sarcastically that:
"The notion has gained some currency that the only worthwhile biology is molecular biology. All else is "bird watching" or "butterfly collecting." Bird watching and butterfly collecting are occupations manifestly unworthy of serious scientists!" (Dobzhansky 1964:443)
Had he lived to see molecular biology turn to such quintessentially descriptive, list-making pursuits as the Human Genome Project, he would surely have enjoyed having the last laugh.

(If you have any earlier citations bearing on the history of this metaphor in linguistics, please tell me below!)

Tuesday, October 24, 2017

Siwi on Wikipedia

I am not a big fan of Wikipedia, despite its usefulness. To contribute good material to it - and there is a lot of wonderful material there - is to make an article look reassuringly reliable. That appearance of reliability then makes the article prime prey for anybody with an ideological or even commercial agenda to push: one little edit, and their propaganda is integrated into the same text, gaining credibility from its context, and getting copied over and over and over. Nevertheless, the insistent niggling itch of knowing that "someone is wrong on the internet" eventually got to me, and last month I ended up massively expanding the article Siwi language - including a fairly extensive section on Siwi oral literature. Suggestions or comments are welcome, although I make no promises.

Thursday, October 12, 2017

Shoes in Songhay and West Chadic: towards an etymology

The proto-Songhay word for "(pair of) shoes, sandals" is *tàgmú (Zarma tà:mú, Kandi tà:mú, Gao taam-i, Hombori tà:mí, Kikara tă:m, Djenne taam, Tadaksahak taɣmú, Korandje tsaɣmmu). It is evidently related to a less widely attested verb *tàgmá "step on" (Zarma tà:mú, Gao taama, Hombori tà:mà, Djenne taam). (Velar stop codas are lost in all of Songhay except the Northern branch, leaving behind either compensatory lengthening or a w; see Souag 2012.)

In Hausa, the word for "shoe, boot, sandal" is tà:kàlmí: (borrowed directly into the Songhay (Dendi) variety of Djougou as tàkăm). Within Hausa, this likewise corresponds to a verb tá:kà: "step on". The two-way similarity is striking, but if there was borrowing, which way did it go? A cognate set in Schuh (2008) casts some light on the question.

Hausa belongs to the West Chadic family, in which the best comparison to Hausa "shoe" seems to be Bole tàkà(:), with no obvious cognates within its own subgroup, Bole-Tangale. (Ngamo tà:hò looks similar, but Ngamo h seems normally to correspond to Bole p, not k.) For "step on", however, Schuh points to a potential cognate set in a slightly more distantly related West Chadic subgroup, Bade. In this subgroup, we have Gashua Bade tà:gɗú, Western Bade tàgɗú, Ngizim tàkɗú, which Schuh analyses as *tàk- plus an unproductive verbal extension -ɗu supported by Bade-internal evidence, eg tə̀nkùku "press" vs. tə̀nkwàkùɗu "massage". Within Bole-Tangale, one might speculate that Gera tàndə̀- is cognate, but Gera seems to be known only from short wordlists, so that would be difficult to show.

So the comparative evidence provides some support for the idea that Hausa tá:kà: "step on" goes back to proto-West Chadic. If tà:kàlmí: "shoe" could be regularly derived from this verb within Chadic, then the answer would appear clear: Songhay borrowed it from Chadic. However, while Hausa frequently forms deverbal nouns with a suffix -i: (Newman 2000:157), there seems to be no plausible language-internal explanation for the -lm-. In Songhay, on the other hand, a suffix -mi forming nouns from verbs (sometimes -m-ey with a former plural suffix stuck on) is reasonably well-attested: Gao (Heath 1999:97) dey "buy" vs. dey-mi "purchase (n.)", key "weave" vs. key-mi "weaving", Kikara (Heath 2005:97-98) kà:rù "go up" vs. kàr-mɛ̂y "going up", húná "live" vs. hùnà-mɛ̀y "long life". A shift *-mi to *-mu seems natural enough, especially since a few Songhay varieties actually have reflexes of "shoe" with a final -i in any case; so the Songhay form looks kind of like it could be **tàg "step on" plus deverbal -mí. To top it off, deverbal noun-forming suffixes in -r- are widely attested in Songhay, and Zarma attests a combined suffix -àr-mì: zànjì "break" vs. zànjàrmì "shard", bágú "break" vs. bàgàrmì "piece of debris" (Tersis 1981:244). If we treat the Hausa form as a borrowing from Songhay, we can then analyse it as **tàg "step on" plus deverbal -àr-mí. But before we get carried away, we should note that within Songhay there's no motivation for analysing the -mu / -mi in "shoe" as a suffix; the verb and the noun differ (if at all) only in the final vowel.

So what to make of all this? So far, the scenario that suggests itself is something like the following:

  1. Songhay borrows a verb *tàk "step on" from West Chadic (or vice versa?).
  2. Songhay internally forms a deverbal noun *tàk-mí "shoe" (there is no reconstructible contrast between *k and *g in coda position in proto-Songhay), alongside a variant *tàk-àr-mí.
  3. Hausa borrows this as tà:kàlmí:.
  4. Songhay replaces *tàk with a denominal verb formed from "shoe" (which becomes internally unanalysable): *tàgm-á. This step has possible internal motivations: in most of Songhay, final velar stops disappeared leaving behind only compensatory lengthening on the preceding vowel, and the resulting form tà: would have been homophonous with the much commoner verb "receive, take".
  5. Djougou Dendi, a heavily Hausa-influenced, somewhat creolized Songhay variety spoken in Benin, borrows the Hausa form as tàkăm.
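
The sound change invoked in step 4 (loss of a velar stop in coda position, with compensatory lengthening of the preceding vowel) can be sketched as a simple rewrite rule. This is only a rough illustration under my own assumptions: the function name `lose_velar_coda` is made up, the vowel inventory is limited to the symbols appearing in the forms cited here, length is written with ":" as in the transcriptions above, and the alternative outcome with w is not modelled.

```python
import re
import unicodedata

# Plain and tone-marked vowels, restricted to those used in the cited forms.
VOWELS = "aeiouàáèéìíòóùú"

# A velar stop (g or k) right after a vowel and NOT followed by a vowel,
# i.e. standing in coda position (before a consonant or at the end of the word).
VELAR_CODA = re.compile(rf"([{VOWELS}])[gk](?![{VOWELS}])")


def lose_velar_coda(form):
    """Drop a coda velar stop and lengthen the preceding vowel."""
    # Normalise so that tone-marked vowels are single precomposed characters.
    form = unicodedata.normalize("NFC", form)
    return VELAR_CODA.sub(r"\1:", form)


for reconstructed in ["tàgmú", "tàk"]:
    print("*" + reconstructed, ">", lose_velar_coda(reconstructed))
```

Applied to the reconstructed noun, the rule gives the long-vowel shape seen in forms like Zarma tà:mú; applied to the bare verb *tàk, it gives the tà: that would have collided with "receive, take".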

Further Chadic comparative data may yet turn out to bear upon this etymology, but one thing seems clear: these two families have been affecting each other for a long time.

Friday, September 15, 2017

Berber and not so Berber words in Tunisian Arabic

Not too long ago I finished reading Lotfi Sayahi's Diglossia and Language Contact: Language Variation and Change in North Africa. The book is a valuable contribution to the study of synchronic language contact between Tunisian Arabic, Standard Arabic, and French in Tunisia, with some coverage of the rest of the region as well. Unfortunately, when it briefly looks at Berber lexical influence on Arabic (pp. 135, 187), reflecting joint work with Zouhir Gabsi, its conclusions are rather over-hasty. Since this book is likely to become a standard point of departure for English speakers studying language contact in North Africa, I think it's worth correcting the record here even at the risk of being pedantic:
  • fakru:n "turtle" and ferzazzu "wasp" really are Berber, though the -u:n suffix in the former was first added in dialectal Arabic (almost all Berber varieties have forms similar to Kabyle ifker/ikfer).
  • garžu:ma "throat" is a very difficult word to etymologize, but may ultimately be Berber (compare Tuareg a-gurzăy), although it does bring to mind Romance forms such as French gorge.
  • karmu:s "fig" is clearly derived from karm-a "fig tree", which is definitely not Berber, and seems to come from a narrowing of the meaning of Classical Arabic كرم karm "orchard" (see the brief discussion in Behnstedt & Woidich 2011:491). The suffix -u:s might theoretically be Berber, I suppose, but probably not; it's not widely attested across Berber, and it fits well with the widespread dialectal Arabic pattern of augmentatives in -u:-.
  • sebsi: "pipe" is from Turkish sipsi.
  • bu-telli:s "monster/nightmare" ("sleep paralysis", to be precise) is a compound involving bu- "possessor of" (originally "father of") plus telli:s (a kind of rug). The latter is well-attested within Arabic in the Middle East as well as in North Africa; its etymology is controversial, but it may derive from Latin trilicium "triple-twilled fabric".
  • ḍabbu:ṭ "axilla" (ie "armpit") is evidently an expressive formation from Arabic إبط 'ibṭ. The widespread Berber word for this is rather taddeɣt (from which we get Maghrebi Arabic dəɣdəɣ "tickle").
  • dagdag "to shatter" is a reduplicated form from Arabic دقّ daqqa "pulverize".

I don't have the time to check the rest of the reduplicated verbs he cites (tartar "to mutter", dardar "to muddy", maxmax "to nibble", maṣmaṣ "to rinse", sɛksɛk "to flow", tɛftɛf "to graze", and wɛdwɛd "to talk nonsense"), but maxmax and maṣmaṣ include phonemes with no regular proto-Berber sources, and I doubt any of them is really Berber in origin.

I don't mean to pick on the authors; notwithstanding this brief lapse, it's a good book, and worth reading. But I do want to hammer home to every linguist the message that etymology needs to be done properly. If you want to do etymology in a North African dialect, don't just assume that any word you don't recognize from Modern Standard Arabic or French is a Berber loanword; check other regional languages (especially Turkish), check existing publications on the subject, check the distribution of the word across different Berber and Arabic varieties. Etymology may not be a very trendy subject, but that doesn't mean it's easy.

Monday, August 28, 2017

Street math and diglossia

In "Mathematics in the streets and in schools" (Carraher et al. 1985), child street vendors were given paper and pencil and asked to calculate multiplications that they had, in fact, already done in their heads in the course of selling their wares. The results were often sobering, as in the following case:
Informal test
Customer: OK, I'll take three coconuts (at the price of Cr$ 40.00 each). How much is that?
Child: (Without gestures, calculates out loud) 40, 80, 120.

Formal test
Child solves the item 40 x 3 and obtains 70. She then explains the procedure 'Lower the zero; 4 and 3 is 7'.

As you can see, the children were perfectly capable of doing (some!) multiplication their own way, but when faced with school-style problems, this ability frequently deserted them. Confronted with a piece of paper, they attempted to apply the algorithm they had learned at school, without so much as checking their answers against the algorithm they had mastered as part of their daily life. In daily life, conversely, they presumably weren't getting much out of the multiplication algorithm they had learnt at school, even though it would let them tackle a much wider range of multiplication problems. School-learning that stays at school, and never affects real life despite having an obvious potential to be useful there: it's an educator's nightmare.
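
To make the contrast concrete, here is a small sketch of the two procedures in the example above. The first function is the repeated addition the child performs out loud; the second is my own hypothetical reconstruction of what "Lower the zero; 4 and 3 is 7" amounts to (copying the final zero, then adding the remaining digits where the school algorithm would multiply them). Both function names are made up for illustration, and nothing here comes from the study itself beyond the numbers quoted.

```python
def street_multiply(price, count):
    """Repeated addition, keeping the running totals said out loud."""
    total = 0
    running_totals = []
    for _ in range(count):
        total += price
        running_totals.append(total)
    return total, running_totals


def misapplied_school_procedure(a, b):
    """Hypothetical reconstruction of the child's written attempt.

    'Lower the zero' copies the trailing 0 of a; the remaining digit is
    then ADDED to b instead of being multiplied by it.
    """
    leading, last_digit = divmod(a, 10)
    if last_digit != 0:
        raise ValueError("this sketch only covers multiples of ten")
    return (leading + b) * 10 + last_digit


total, spoken = street_multiply(40, 3)
print(spoken, "->", total)                 # running totals and final answer
print(misapplied_school_procedure(40, 3))  # the written answer
```

The point of the comparison is that the street method is slow but self-checking at every step, while the misremembered written one yields 70 with nothing in the procedure to flag that the answer is smaller than two coconuts' worth.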

What this immediately reminded me of is diglossia. In a schoolroom or an essay, you obediently attempt to use Standard Arabic, and all the grammatical rules and vocabulary you learned for it. Almost anywhere else, you carefully avoid it, even while claiming to accept that Standard Arabic is correct and that what you actually make very sure to speak is wrong. To me, that seems to send a fundamentally problematic message: that what you learn in school is not supposed to be useful outside of some limited institutional contexts. I hope that's not the message most people get from it, but it would be great to know for sure. I don't suppose anyone knows of a study addressing the question?