Wednesday, April 27, 2011

An atom's weight of philology

One of the oldest motivations for studying the history of language is to better study the fixed texts of holy books or classics. We try to learn from such texts, but without an understanding of philology we misread them - because, while the words have remained the same, their content has changed. Ibn Quraysh is one case in point; Ruskin offers another:
"[I]n languages so mongrel of breed as the English, there is a fatal power of equivocation put into men's hands, almost whether they will or no, in being able to use Greek or Latin words for an idea when they want it to be awful [ie impressive]; and Saxon or otherwise common words when they want it to be vulgar… [C]onsider what effect has been produced on the English vulgar mind by the use of the sonorous Latin form "damn", in translating the Greek katakrínō, when people charitably wish to make it forcible; and the substitution of the temperate "condemn" for it, when they choose to keep it gentle; and what notable sermons have been preached by illiterate clergymen on - "He that believeth not shall be damned"; though they would shrink in horror from translating Heb. xi. 7, "The saving of his house, by which he damned the world"… "
Standard Arabic has no layer of prestige loanwords corresponding to Greek and Latin words in English - all the classics of the Arab world are themselves in Arabic, and great efforts have been expended to keep the grammar of Standard Arabic roughly constant since the pre-Islamic era. But, thanks to the many new meanings conferred upon old terms during episodes of massive translation - both in the modern era and the Abbasid era - it is fairly susceptible to another of Ruskin's complaints: misinterpreting the words of old texts thanks to their modern meanings.

Once a medical student at Cambridge told me in all seriousness that the Qur'ān anticipated modern science by centuries in mentioning the "atom" (فمن يعمل مثقال ذرة خيرا يره, "for he who does an atom's weight of good shall see it")! Of course, every modern educated Arab knows that a ذرة dharrah is an atom. But looking at a pre-modern dictionary, such as Lisān al-`Arab, gives a rather different picture: a dharrah then was a type of small red ant, a weight equivalent to 1/100 of a barley grain, or a mote of dust (as seen in sunbeams), not an elementary particle of which all matter is composed. In parts of Sudan the first of those meanings is still in regular use: dirr there means a type of ant. But elsewhere they all seem to have faded away from popular speech.

If I were interested in an English word, I could easily look it up in the OED and find a complete history of its different meanings and the dates at which they were attested. But for Arabic no such dictionary exists; to figure out when and how dharrah came to mean "atom" in the modern sense, I would have to look through a bunch of pre-modern works, or find an article on the subject. It's a gap that would be well worth filling.

Thursday, April 14, 2011

Why *h1 and *h2 were not valid onsets in late proto-Berber

I've been working on my hopefully-forthcoming book about Siwi and thinking more about Berber laryngeals (see also Phoenix's recent post), two tasks that intermesh rather handily. Now Siwi has a wide range of strategies for forming the intensive (ie, in Siwi, the realis imperfective) of verbs, not obviously related to one another. But it is usually possible to predict which will be used from the form of the root. Basically, to recap the relevant page of my thesis, ignoring the fəl verbs discussed in the previous post and some other synchronic irregularities (U=consonant or full vowel; either counts as a unit of the root):

Prefix t-:
- to geminate-initial roots
- to roots with the mediopassive prefix ən-
- to vowel-initial roots
- to vowel-medial (CVC) roots
Geminate U2:
- when U1 and U2 are distinct consonants, and U2/U3 is final
Put -a- after consonantal U3, changing any previous full vowels to a:
- when the last two units are distinct consonants (unless geminate-U2 / prefix-t applies), or
- when U2 is a full vowel (in which case prefix-t also applies)
Suffix -u:
- to geminate-final roots
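
For readers who prefer to see such conditions spelled out mechanically, here is a minimal Python sketch of the list above. It is only an illustration under stated assumptions, not the analysis in the thesis: the function name, the representation of a root as a list of units with schwa left out, the `mediopassive` flag, and the choice of letters counted as full vowels are all hypothetical. I also read "vowel-medial" as "U2 is a full vowel", and where the list leaves precedence open the sketch simply reports every strategy whose condition matches.

```python
# Assumption: these letters stand for the full vowels; every other unit is a consonant.
FULL_VOWELS = set("aiueo")


def is_consonant(unit):
    return unit not in FULL_VOWELS


def intensive_strategies(units, mediopassive=False):
    """Report which listed strategies match a root given as a list of units."""
    n = len(units)
    u1_c = is_consonant(units[0])
    u2_c = n >= 2 and is_consonant(units[1])
    u2_vowel = n >= 2 and not is_consonant(units[1])
    last_two_c = n >= 2 and is_consonant(units[-1]) and is_consonant(units[-2])
    geminate_initial = u1_c and u2_c and units[0] == units[1]
    geminate_final = last_two_c and units[-1] == units[-2]

    # Prefix t-: geminate-initial, mediopassive, vowel-initial or vowel-medial roots.
    prefix_t = geminate_initial or mediopassive or not u1_c or u2_vowel
    # Geminate U2: U1 and U2 are distinct consonants, and U2 or U3 is final.
    geminate_u2 = n in (2, 3) and u1_c and u2_c and units[0] != units[1]
    # Insert -a-: last two units are distinct consonants (unless one of the
    # two strategies above applies), or U2 is a full vowel.
    insert_a = (
        last_two_c and not geminate_final and not (geminate_u2 or prefix_t)
    ) or u2_vowel

    result = []
    if prefix_t:
        result.append("prefix t-")
    if geminate_u2:
        result.append("geminate U2")
    if insert_a:
        result.append("insert -a-")
    if geminate_final:
        result.append("suffix -u")
    return result


print(intensive_strategies(["l", "m", "d"]))  # the root of "learn"
print(intensive_strategies(["k", "k", "r"]))  # a made-up geminate-initial root
```

The first call reports only "geminate U2" and the second only "prefix t-"; the second root is invented purely to exercise the geminate-initial condition.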

Can we further simplify these conditions? In particular, what do the rather disparate environments to which t- is prefixed have in common?

Well, Siwi, like most Berber languages, shows the so-called “mobile schwa” phenomenon – ie, the position of schwa is mostly predictable solely from the consonants and long vowels of the word. (Basically, you put a schwa between any two adjacent consonants followed by a consonant or word boundary, starting from the left cyclically.) This also means that the coda/onset status of a given consonant in a stem is predictable, and depends on the affixes – for example, the k is a coda in əktər “bring!”, but an onset in kətr-ax “I brought”. However, there are a few exceptions to this principle – clusters that cannot be broken up by schwa, or, equivalently, codas that do not become onsets. These include:
- geminates: geminates cannot be broken up by schwa, and the first element of a geminate is always a coda.
- mediopassive ən-: the cluster ən+C that it forms cannot be broken up by schwa, and the n is always a coda (except before the borrowed voiced pharyngeal ʕ.)
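
To make the onset/coda point concrete, here is a small Python sketch that reads the status of each consonant off an already-syllabified surface form. It deliberately does not try to insert schwas itself. The function name and the vowel inventory are my own assumptions, and the rule used (a consonant is an onset exactly when a vowel or schwa immediately follows it) is a simplification for illustration.

```python
# Assumption: schwa plus these letters are the vowels; hyphens only mark morpheme boundaries.
VOWELS = set("əaiueo")


def onset_coda(form):
    """Label each consonant of a surface form as 'onset' or 'coda'."""
    segments = [ch for ch in form if ch != "-"]
    roles = []
    for i, seg in enumerate(segments):
        if seg in VOWELS:
            continue
        following = segments[i + 1] if i + 1 < len(segments) else None
        roles.append((seg, "onset" if following in VOWELS else "coda"))
    return roles


print(onset_coda("əktər"))    # k comes out as a coda
print(onset_coda("kətr-ax"))  # k comes out as an onset
```

On a form with a geminate, such as ləmməd, the same rule labels the first m a coda and the second an onset, in line with the first exception listed above.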

Full vowels are by definition not onsets (semivowels behave quite differently from full vowels in Siwi.) So we can reduce the first three conditions for t- to a single one: t- is used when the first element of the root is not an acceptable onset. The fourth condition seems to be separate.

The use of t-/tt- under two of the three conditions we have unified is reconstructible for proto-Berber (mediopassive ən-, or at least its syllabic structure, is a borrowing from Arabic), so it would be reasonable to reconstruct the No-Onset condition for proto-Berber too. Geminate-initial roots were clearly already geminate-initial in late proto-Berber (although Prasse, probably correctly, reconstructs them as *w-initial for pre-proto-Berber.) However, vowel-initial roots come from at least two sources: roots with vowel length (pre-proto-Berber h?) and roots with a glottal stop. The distinction is preserved in Zenaga, and t- shows up there in both cases. And, as it happens, Zenaga only allows the glottal stop in coda position. So it seems probable that late proto-Berber too allowed the glottal stop only in coda position.

Saturday, April 02, 2011

In search of the missing radical: a piece of Berber historical morphology

Berber normally has no glottal stops (ء = ʔ) – in fact, Chafik suggested that this was why North Africa favours the Warsh reading of the Qur'an, in which most glottal stops are omitted. However, it turns out* proto-Berber did have glottal stops - and you can still see their footprints on the verbal system.

Berber languages normally have three basic aspect/mood forms:
  • the “aorist” (or “simple imperfect”), used mainly for hypothetical events (“eat!”, “I will eat”, “I would eat”...);
  • the “preterite” (or “simple perfect”), used mainly for past events conceived of as wholes (“I ate”, “I have eaten”);
  • the “intensive” (or “intensive imperfect”), used for events ongoing at the time being referred to, irrespective of tense (“I eat”, “I am eating”, “I was eating”, “keep eating!”)
Usually, you can predict the preterite and intensive from the aorist. For three-consonant roots – eg lmd “learn”, a widespread Phoenician loanword – this is how it works in Tuareg (Tahaggart):
  • Aorist: ǎlməd “learn!”
  • Preterite: (y)-əlmǎd “(he) learned” (change the vowel pattern)
  • Intensive: (i)-lammǎd “he is learning” (double the middle consonant)
Tuareg has kept a distinction between two short vowels, ǎ and ə; but most varieties have just merged the two, so there is no difference in three-consonant roots between the aorist and preterite. So in Siwi, for example, you get:
  • Aorist: əlməd “learn!”
  • Preterite: (y)-əlməd “(he) learned”
  • Intensive: (i)-ləmməd “he is learning”
(Students of Akkadian/Assyrian/Babylonian will be getting a sense of déjà vu now...)
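
The Siwi pattern just shown can be written as a tiny Python sketch. It is generalised from the single example lmd, so treat it as an illustration rather than a rule that is guaranteed to hold for every three-consonant root. The function name and the convention of passing the root as a three-character string are hypothetical, and the person prefixes (y-, i-) are left off.

```python
def siwi_triconsonantal(root):
    """Build the three bare stems for a three-consonant root, following lmd."""
    c1, c2, c3 = root  # assumes exactly three single-character consonants
    plain = f"ə{c1}{c2}ə{c3}"
    return {
        "aorist": plain,
        "preterite": plain,  # merged with the aorist once ǎ and ə fall together
        "intensive": f"{c1}ə{c2}{c2}ə{c3}",  # double the middle consonant
    }


forms = siwi_triconsonantal("lmd")
print(forms["aorist"], forms["preterite"], forms["intensive"])
# prints: əlməd əlməd ləmməd
```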

But some verbs have two consonants rather than three. Looking at Siwi I noticed that, if the verb had two consonants and no long vowels, there seemed to be two possibilities for the intensive, not just one; contrast:
  • Aorist: fəl “leave!”
  • Preterite: (y)-əfla “(he) left”
  • Intensive: (i)-təffal “he is leaving”
vs.
  • Aorist: ləs “wear!”
  • Preterite: (y)-əlsa “(he) wore”
  • Intensive: (i)-ləss “he is wearing”
So why the split?

Well, looking at the intensive forms, you see that in fəl you double the first consonant, while for ləs you double the second one. If you wanted to try to relate these to three-consonant verbs, you might think of something like:
- fəl < *Xfl
- ləs < *lsX

But if you look at Siwi on its own, there seem to be a lot of problems with this idea: in particular, why would the preterite of fəl end in -a?

Looking wider provides some answers. It turns out that in Tuareg – like Kabyle, and Tashelhiyt, and Ghadamsi, and a few other varieties – these verbs are distinct in the preterite too, and they are distinguished in exactly the way you'd expect from that little piece of internal reconstruction:
  • Aorist: əfəl “leave!”; əǵən “kneel!”
  • Preterite: (y)-fǎl “(he) left”; (y)-ǵǎn “(it) knelt”
  • Intensive: (y)-ffal “he is leaving”; (y)-ǵǵan “it is kneeling”
vs.
  • Aorist: ǎls “wear!”; əsəl “hear!”
  • Preterite: (y)-lsa “(he) wore”; (y)-sla “(he) heard”
  • Intensive: (y)-lass “he is wearing”; (y)-sall “he is hearing”
It's just that in Siwi – and Mzabi, and Chaoui, and Tarifit, and all the other Zenati Berber languages – the preterites of these two verb classes are merged, so they both end in -a. So our internal reconstruction is looking good... but what consonant might have been lost?

Zenaga, the Berber language of Mauritania, gives us part of the answer. In Zenaga, they look like this:
  • Aorist: ägun “kneel!”
  • Preterite: (y)-ugän “(it) knelt”
  • Intensive: (y)-uggan / (yə)-ttugun “it is kneeling”
vs.
  • Aorist: ätyši “wear!”, ätyšaʔ-m “wear! (to a group)”
  • Preterite: (y)-ityša “(he) wore”; ityšäʔ-n “they wore”
  • Intensive: (yi)-yässä “he is wearing”; yässäʔ-n “they are wearing”
Notice the glottal stop ʔ that shows up when you add a consonant. That isn't automatic in Zenaga: contrast y-ugrah “he heard”, ugrān “they heard”. So it looks as though the original conjugation of “wear” was something like:
  • Aorist: *ǎlsəʔ “wear!”
  • Preterite: *(y)-əlsǎʔ “(he) wore”
  • Intensive: *(yə)-lassǎʔ “he is wearing”
We can also see that the missing first consonant in verbs like fəl, if they had one, was not ʔ – as far as I know, no Berber language has preserved evidence of what it may have been. (The t showing up in Siwi is probably not original, but rather borrowed from the intensive of vowel-initial roots.)

But there's still a problem here: why is *-ǎʔ reflected differently in the intensive vs. the preterite? A full answer for that would require a look at reflexes of the glottal stop in general, not just in the verbal system. But in several Berber languages, in fact, it's reflected identically. Compare, from opposite ends of the Berber world:

Tashelhiyt (southern Morocco):
  • Aorist: ls “wear!”
  • Preterite: (i)-lsa “(he) wore”
  • Intensive: (i)-lssa “he is wearing”
Awjila (eastern Libya):
  • Aorist: əsəl “hear!”
  • Preterite: (yə)-sla “(he) heard”
  • Intensive: (i)-səlla “he is hearing”
Clearly, Tashelhiyt and Awjila are not likely to form a subgroup! So my tentative interpretation would be that the form with -a is regular, and the form without -a found in Siwi, and Tuareg, and Kabyle, and almost every other Berber language between southern Morocco and Awjila is analogical – the intensive is always formed from the aorist, and it must have felt wrong to have one that looks as though it's based on the preterite. I've been looking at the always problematic subgrouping of Berber lately, and this would have interesting implications for that – it would suggest that Kabyle is more closely related to Zenati than to Moroccan Atlas Berber, since they share this innovation. But in Berber a lot of innovations seem to have spread areally, so it's scarcely conclusive.

* (All but the last bit of this post is an introductory summary of work by Prasse, Kossmann, and Taine-Cheikh that I've recently been digesting. It offers an interesting small-scale parallel to the story of Saussure's laryngeals.)