Thursday, November 27, 2014

Berber subclassification: Reading Nait-Zerrad

Kamal Nait-Zerrad's 2001 article "Esquisse d'une classification linguistique des parlers berbères" presents a good deal of useful data, but does so in a manner that I find makes it rather difficult to figure out what's going on without plenty of pencil work. In case anyone else has the same experience, here's my take on it. I will not focus on, or even necessarily present, his interpretation here - read the article for that; rather, I'm more interested in figuring out the implications of the data he presents in the light of other work before and since, and in the light of accepted principles of historical-comparative linguistics.

First, he looks at a number of morphological and phonetic isoglosses:

1. The 3rd person singular preterite of CC verbs: yərra vs. yərru. Following Kossmann (2001), we now know that these are actually CC+glottal stop, so the data exemplifies two different sound changes: the relatively trivial *-aʔ > -a, and the more surprising *-aʔ > o > u. The former is the commonest outcome; the latter is exemplified by: Ait Seghrouchen, Figuig, Beni Snous, Bissa, Timimoun, Mzab, Ouargla, Nefusa. (Ghadames still has o).

2. The proximal demonstrative suffix: -a vs. -u. Again, -a is the default, but -u appears in the same set of varieties as seen in 1, plus one more: Iznasen.

3. The 3rd person singular aorist of CCV verbs: ad yəbḍu vs. ad yəbḍa. Here, -u is the default, and is closer to the original, while -a has spread from the preterite. This applies to the same set of varieties as 2 (excluding Nefusi), plus several more: Rif, Metmata, Chaoui, Jerba.

4. Initial vowel dropping: a- vs. 0-. A number of *(t)a-CV-initial nouns drop the original vowel of the prefix in the same set of varieties as 3, plus Nefusi, Chenoua, and Siwa.

5. Velar softening: in many varieties, in many words, what would elsewhere be k/kk/g/gg corresponds to c/čč/j/ǧǧ. The latter outcome is observed in the same set of varieties as 4, minus Nefusi.

6. Final *-əv: this is retained as such in Ghadames and Awjila, and as contrastive length in Zenaga. Otherwise, it becomes -u in most varieties, but -i in the same varieties as listed in 4, plus El-Fogaha (with a few question marks where the author had insufficient data). Cf. Kossmann (1995).

All of 1-6 pick out Zenati varieties, but the exact set differs: 1-2 pick out a core Zenati consisting almost entirely of northern Saharan varieties, while 3-6 pick out a broader Zenati including the semi-arid mountainous lands stretching from the Rif to southern Tunisia, and vary in their inclusion of varieties further east (Nefusi, El-Fogaha, Siwi). Chaker (1972) cites 1-2 and 5 as possibly justifying a Zenati subgrouping, while Kossmann (1999) defines Zenati in terms of 3, 4, and one other morphological innovation, and then cites 5 and 6 as common phonological innovations.

7. Negative intensive theme: retention/loss. The negative intensive is retained in northwestern Morocco (Rif, Iznasen, Senhaja, Ait Seghrouchen, Figuig); in Bissa; in Tuareg and in the nearby oases of Mzab, Ouargla, and Ghadames; and in Jerba. Its loss everywhere else (according to his data, which should be re-checked) shows no prominent genetic patterning, and hence is probably relatively recent.
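Since isoglosses 1-6 are each described relative to the previous ones, it can help to expand them into explicit sets. The following is only a rough sketch of that pencil work: the variable names are ad hoc labels of mine, "Nefusa" and "Nefusi" are assumed to be the same variety, and the question marks mentioned under 6 are ignored.

```python
# Varieties showing the Zenati-type outcome for isoglosses 1-6, as listed above.
# Assumption: "Nefusa" (item 1) and "Nefusi" (items 3-5) name the same variety.
u_preterite = {"Ait Seghrouchen", "Figuig", "Beni Snous", "Bissa",
               "Timimoun", "Mzab", "Ouargla", "Nefusi"}              # 1
u_demonstrative = u_preterite | {"Iznasen"}                           # 2
a_aorist = (u_demonstrative - {"Nefusi"}) | {"Rif", "Metmata",
                                             "Chaoui", "Jerba"}       # 3
vowel_drop = a_aorist | {"Nefusi", "Chenoua", "Siwa"}                 # 4
velar_softening = vowel_drop - {"Nefusi"}                             # 5
final_i = vowel_drop | {"El-Fogaha"}                                  # 6 (question marks ignored)

isoglosses = {
    "1 u-preterite": u_preterite,
    "2 u-demonstrative": u_demonstrative,
    "3 a-aorist": a_aorist,
    "4 vowel dropping": vowel_drop,
    "5 velar softening": velar_softening,
    "6 final -i": final_i,
}

# Varieties that show all six features.
core = set.intersection(*isoglosses.values())

# For every variety mentioned, count how many of the six features it shows.
all_varieties = set.union(*isoglosses.values())
counts = {v: sum(v in s for s in isoglosses.values()) for v in all_varieties}

for variety, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{n}/6  {variety}")
```

On this tabulation, seven varieties (Ait Seghrouchen, Figuig, Beni Snous, Bissa, Timimoun, Mzab, Ouargla) show all six features, Iznasen shows five, and El-Fogaha only one.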

Then, he moves on to vocabulary, examining 11 lexical variables which I would summarise as follows:

Several forms appear specifically Zenati: irəḍ in the sense of "be dressed" (though it is more widespread in other senses), igur for "go", əɣs for "want", azəgrar for "long", anilti for "shepherd". Of these, El-Fogaha and Siwa share only əɣs for "want", whereas Nefusi shares all except "go in". adəf "go in" is Zenati-specific in the west, but more confusing in the east, being attested in Ghadames and (as an alternative to əggəz) in Air Tuareg.

Several forms appear specifically Tuareg: răgăz for "go", amaḍan for "shepherd", əggəz in the sense of "go in" (elsewhere "go down"), zəgrət (with the extra t) for "long".

One form unites southern/central Morocco with Kabyle: awtul "hare" (vs. pan-Berber a-yərẓiẓ.)

A couple of forms unite Libyan varieties with Tuareg, contrasting with Algerian and Moroccan varieties, in defiance of any plausible genetic classification, reminding us that a tree does not tell the whole story here:

  • iziḍ "donkey" (Tuareg, Ghadames, Nefusi, Siwa, Awjila) vs. aɣyul (everywhere else except El-Fogaha)
  • tufat/tifut/tafyi "tomorrow" (Tuareg, El-Fogaha, Siwa respectively) vs. azəkka (everywhere else except El-Fogaha, Awjila, and Zenaga)

Based somehow on all this, he proposes the following very odd tree:

  1. Group 1
    1. Senhaja, Middle Atlas, Shilha, Kabyle, Zenaga
    2. Tuareg
    3. El-Fogaha
    4. Awjila, Siwa
  2. Nefusa
  3. Ghadames
  4. Group 4 ("Zenati")
    1. Ait Seghrouchen, Beni-Snous, Timimoun, Figuig, Bissa, Mzab, Ouargla
    2. Iznasen, Jerba
    3. Rif, Metmata, Aures
    4. Chenoua

Apparently, to get this he operated by successively applying at each stage the criterion from his list that divided the data into the lowest number of groups possible, without attempting to distinguish innovations from retentions, much less judge the relative likelihood of independent innovation. The fact that even such a crude method was still able to produce a recognisable Zenati subgroup either says something about the robustness of this distinction or about the selection of features. What this data set actually tells us, bearing in mind that shared retentions have no implications for subgrouping and that Zenaga fails to participate in a number of innovations that otherwise seem pan-Berber or nearly pan-Berber, is something quite different:

  • There is definitely a Zenati subgroup, as has been known at least since Destaing (1915), but its boundaries are a bit fuzzy. (If this reminds you of the situation of "Hilalian" g-dialects, that's probably not a coincidence.)
    • Western Zenati:
      • Core (mainly Northern Saharan): Ait Seghrouchen, Figuig, Beni Snous, Bissa, Timimoun, Mzab, Ouargla
      • Transitional (the High Plateau and its edges): Rif, Metmata, Chaoui, Jerba
      • Peripheral:
        • Chenoua (north-central Algeria)
        • Nefusi (northwestern Libya)
    • Eastern Zenati (Libya/Egypt): El-Fogaha, Siwa
  • There is definitely a Tuareg subgroup, as has always been known: Ahaggar, Iwellemmeden, Air, Taneslemt.
  • There just might be a subgroup combining Kabyle with Senhaja, Central Morocco and Shilha: they share the innovation *-əv > -u, and the word awtul "hare". The evidence for it is very weak, though, especially since *-əv > -u is also found in some Tuareg varieties.

The rest of the common features almost all look like shared retentions.
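For comparison, the procedure that apparently lies behind Nait-Zerrad's tree (at each stage, split on whichever criterion yields the fewest groups) can be sketched as follows. This is my own guess at the method rather than anything stated in the article, and the varieties A-D and features f1/f2 are purely hypothetical toy data. Note that nothing in it distinguishes innovations from retentions.

```python
def split_by_fewest_groups(varieties, features, table):
    """Recursively partition `varieties` using, at each stage, the feature
    that divides them into the smallest number (>1) of groups.
    Ties go to the feature listed first."""
    best = None
    for feature in features:
        groups = {}
        for v in varieties:
            groups.setdefault(table[v][feature], []).append(v)
        # Only features that actually divide the set are candidates.
        if len(groups) > 1 and (best is None or len(groups) < len(best[1])):
            best = (feature, groups)
    if best is None:
        # No remaining feature distinguishes these varieties: a leaf.
        return sorted(varieties)
    feature, groups = best
    remaining = [f for f in features if f != feature]
    return {
        f"{feature}={value}": split_by_fewest_groups(members, remaining, table)
        for value, members in groups.items()
    }


# Hypothetical toy data: four varieties, two features.
toy_table = {
    "A": {"f1": "x", "f2": "p"},
    "B": {"f1": "x", "f2": "q"},
    "C": {"f1": "y", "f2": "r"},
    "D": {"f1": "y", "f2": "r"},
}

tree = split_by_fewest_groups(["A", "B", "C", "D"], ["f2", "f1"], toy_table)
print(tree)
```

Here f1 is applied first because it yields two groups where f2 would yield three, regardless of which value is the innovation; that is exactly the weakness discussed above.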

Sunday, November 16, 2014

Out now: The development of dative agreement in Berber

After about two years in the pipeline, an article summarising the results of my British Academy research on agreement in Berber has just come out in Transactions of the Philological Society. If you have access to Wiley Online Library, you can read it online: The development of dative agreement in Berber: beyond nominal hierarchies. If you're interested but don't have access, email me to ask for a copy. Here's the abstract:

Diachronically, agreement commonly emerges from clitic doubling, which in turn derives from topic shift constructions (Givón 1976) – a grammaticalisation pathway termed the Agreement Cycle. For accusatives, at the intermediate stages of this development, doubling constitutes a form of Differential Object Marking, and passes towards agreement as the conditions for its use are relaxed to cover larger sections of the Definiteness and Animacy Scales. Berber, a subfamily of Afroasiatic spoken in North Africa, shows widespread dative doubling with substantial variation across languages in the conditioning factors, which in one case has developed into inflectional dative agreement. Examination of a corpus covering eighteen Berber varieties suggests that low Definiteness/Animacy datives are less likely to be doubled. However, since most datives are both definite and animate, these factors account for very little of the observed variation. Much more can be accounted for by an unexpected factor: the choice of verb. “Say” consistently shows much higher frequencies of doubling, usually nearly 100 per cent. This observation can be explained on the hypothesis that doubling derives from afterthoughts, not from topic dislocation.

Sunday, November 02, 2014

Linguistics for high schools: what would a syllabus look like?

Today, just for fun, I'd like to invite you to discuss a topic a little off the beaten track for this blog: how much linguistics should a high school graduate know? The question may seem bizarre - there have been occasional efforts to introduce linguistics courses into high schools (MIT, Milwaukee), but you don't expect to see "linguistics" on a high school curriculum. Still, let's not get confused by labels. Linguistics is inextricably woven into language teaching, and even the most resolutely monolingual curriculum includes at least the school's own language. (I recently happened to come across an 8th grade final exam from 1895 from Kansas; no foreign languages were featured, but no less than two out of the six subjects tested, Grammar and Orthography, rely heavily on linguistic concepts.)

One useful way of separating linguistic education from language education is to look at universality. Some of what you learn in English class is useful across practically all languages, like the idea of a verb or of a vowel. Some of it is much more parochial; the fact that the plural of "child" is "children" is a historical accident relevant only to English and, at best, its closest relatives. Such parochial facts can be vital, of course; if you're going to grow up in an English-speaking country, you'd better be able to form your English irregular plurals correctly. But the more general concepts have a deeper interest; they help you analyse what you're saying, and make it easier to learn new languages. Unfortunately, those concepts are precisely the ones that have suffered most in recent decades. In the UK, at least, my own experience suggests that most high school graduates can't even reliably tell a noun from a verb. In theory, the latest changes to the English syllabus should change that - but given that many of the teachers were hardly taught any grammar either, one wonders how successful the reform will be.

In any case, if I were designing a syllabus, here is what I would suggest to start with. I'd be interested to see what other linguistically oriented people think:

Phonetics has never been a focus of early education, apart from the minimum necessary for teaching a child to read and write (and even that gets de-emphasised in some approaches). This is a shame, because the younger you are, the easier it is to learn to hear and pronounce unfamiliar sounds. Why not learn:
- The IPA, or at least the most commonly used symbols in it; be able to pronounce and recognise them. This should include tone if at all possible.
- Basic articulatory phonetics: how the configuration of your vocal organs relates to the sound produced, and how to use this knowledge to pronounce unfamiliar sounds. (If your language uses Devanagari, you should have an advantage, as this is practically built in to the alphabet anyway; students of tajweed too will come across this issue at some point.)
- Phonology: the concepts of the phoneme and of conditioned allophones. That way when you learn another language you'll at least know why some sounds give you so much more trouble than others.
- Metric structure: syllable, foot, etc. (Yes, I know the concept of syllable is controversial, but you'll need this to be able to study poetry anyway.)

Morphology is a lot more language-specific than the other topics here, but one should at least know:
- How to decompose a word into its component morphemes (prefixes, suffixes, templates, roots...), and guess its meaning from them if necessary.

Syntax: Unlike phonology, this has traditionally been deliberately taught, and you should certainly know:
- The parts of speech: noun, verb, adjective, preposition, etc... and how to tell them apart.
- Argument structure and case: subject, direct object, nominative, accusative, etc.
- How to break down a sentence into its phrase structure: what modifies what? What is a phrase, and what is its head? For best results, try being able to diagram it.

Unfortunately, it's not quite so simple: all three of those - especially the last - are the subject of major controversies between different syntactic theories... (Two good Language Log posts on this issue: parts of speech and sentence diagramming.) If you teach whatever theory happens to be traditional where you're from, you may not make any friends in academia, and you risk perpetuating some old misconceptions; but you will certainly leave your students much better prepared to learn any more current theory - or any language - than if they had studied no grammar at all.

Historical linguistics and sociolinguistics: The language you speak most likely has relatives, and certainly contains words borrowed from other languages. You should understand:
- That there is normally variation inside a single language, which people often use to signal their social position and to identify the social position of others, and over which people's control is limited.
- That languages change over time as some variants become obsolete and others emerge, and in what ways they change - sound shift, semantic shift, borrowing, morphological and syntactic change...
- That different changes accumulating in different areas can split what used to be one language into several, and that people can abandon one language and start speaking another one instead.
- That sound shifts are usually regular, and that this regularity can be used to identify potential cognates (making it easier to learn languages related to ones you know.)

There should certainly also be some semantics and pragmatics in this list, but I'm not feeling especially inspired on either subject at the moment - any thoughts?

Thursday, October 30, 2014

Some Tuareg-Songhay loans

I'm almost three-quarters of the way through Heath's Grammar of Tamashek (Tuareg of Mali). The main interest lies in its efforts to reduce the bewildering complexity of Tuareg morphology to some sort of order, an impossible task which it accomplishes more successfully than any other Tuareg grammar I've looked at so far. Aside from this, however, it's raised some interesting etymological issues.

I've wondered for years where the Korandjé verb wəy "gather (firewood)" comes from. It normally appears in the idiom a-wwəy-ts skudzi [3Sg-gather-hither wood] "she gathered in firewood". On p. 333 of Heath's grammar, I found the explanation, in the following example:

i-wwáy=ədd i-sǽɣer-æn
3MaSgS-bring.Reslt-Centrip Pl-firewood-MaPl
[He] has brought firewood here.

The Tamasheq verb in question, awəy in the imperative, is simply the normal Berber word for "take, bring" (which in Korandjé is expressed with a Songhay verb, zəw), so I would have hesitated to connect them based on a dictionary entry alone. But given this attested usage with "firewood", the semantic specialisation poses no problems. What does surprise me is that it was borrowed as a bare stem, rather than with a fossilised 3rd person prefix y/i - contrast yəf (Tashelhiyt y-arf "roast", not attested in Tamasheq), ikna "make" (Tamasheq i-kna). Usually, only stems that start with a syllabic onset are borrowed into Korandjé without the y/i.

Another probable loan into Korandjé that I noticed going through the grammar is Korandjé ləwləw "shine, gleam" - cp. Tamasheq m̀ələwləw "shine".

However, a number of words have gone the other way - from Songhay into Tuareg. Heath comments on many of these in his dictionary (eg kə̀rikəw "practice sorcery"), but not all. One that struck me is the verb ḍùkr-æt "become angry at", obviously related to Gao Songhay dukur "be angry"; I don't recall seeing this verb elsewhere in Berber (not even in Alojaly's dictionary of Tamajeq), whereas it's widespread in Songhay.

Obviously cognate are Tamasheq é-tæqq "male ostrich" and widespread Songhay forms such as Gao taatagey, Fulan Kirya taataɣey "ostrich" (the shift of g to ɣ next to non-high back vowels is regular in several Songhay varieties, and in Tamasheq qq is the geminate equivalent of ɣ). The word is generic in Songhay but specific in Tuareg - the opposite of what we saw with "bring" - which suggests to me that it was borrowed into the latter, as does the fact that I don't find the term in Alojaly's Tamajeq dictionary. However, since ostriches are extinct in most Berber-speaking areas, it's difficult to prove the direction of borrowing.
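To illustrate what "regular" means for the g > ɣ correspondence just mentioned, here is a deliberately crude sketch. The function name is mine, and treating "non-high back vowels" as just a and o is a simplifying assumption; geminates and everything else about Songhay phonology are ignored.

```python
import re

# Assumption: "non-high back vowels" is approximated here as a and o.
NON_HIGH_BACK = "ao"


def soften_g(word: str) -> str:
    """Replace g with ɣ when it directly follows or precedes a non-high back vowel."""
    pattern = rf"(?<=[{NON_HIGH_BACK}])g|g(?=[{NON_HIGH_BACK}])"
    return re.sub(pattern, "ɣ", word)


print(soften_g("taatagey"))  # taataɣey
```

Applied to the Gao form taatagey, the rule gives the Fulan Kirya form taataɣey cited above; a form with no g, such as dukur, passes through unchanged.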

Thursday, October 23, 2014

Berber: classification, Tasahlit, roots vs. stems

Today seems to be a good day for comparative Berber linguistics - the day's haul is worth sharing:

Maarten Kossmann has uploaded his preliminary classification of Berber varieties based on shared innovations: Berber subclassification (preliminary version). He divides Berber into seven blocks:

  1. Zenaga block (Zenaga of Mauritania, Tetserrét in Niger)
  2. Tuareg block
  3. Western Moroccan block (SW Morocco, Central Morocco, i.e. Tashelhiyt and most of Tamazight)
    possibly including NW Moroccan Berber (Ghomara, Senhadja de Sraïr)
  4. Zenatic block (Eastern Morocco, Western Algeria, Saharan oases, Tunisia, Zuara) extending towards the east with Sokna, Elfoqaha, Siwa
  5. Kabyle (N Algeria), possibly linked to the western Moroccan block
  6. Ghadames (Libya), probably to be linked to Djebel Nefusa (Libya)
  7. Awdjilah (Libya)
By and large, this appears very plausible, although it should be noted that Tunisian Berber and Zuwara are already somewhat peripheral to Zenati, not sharing western Zenati's innovative distribution of initial vowel dropping, and El-Fogaha is even more so than Siwa or Sokna. (As he notes, the much greater homogeneity and clearer boundaries of Zenati in the west imply that this group arrived in Algeria and Morocco from the east.) But, in principle, it is still necessary to identify specific innovations characteristic of each of these groups. It is also clear that the Zenaga block is by far the earliest split on the tree, and the list ought ideally to reflect that. But the moderately high degree of mutual intelligibility poses serious obstacles to applying the family tree model to Berber, as he discusses.

The most interesting Kabyle varieties for historical reconstruction are the little-known ones of the extreme east, "Tasahlit". As it happens, Abdelaziz Berkai has just uploaded his recent thesis, a dictionary and sketch grammar of the Tasahlit of Aokas: Essai d’élaboration d’un dictionnaire Tasaḥlit (parler d’Aokas)-français. The quality of his work appears excellent, and this will no doubt be a very useful resource. The choice of dialect, however, is not entirely ideal. It is clear from Basset's dialect atlas, and from the all too rare comments in Rabdi's grammar on neighbouring varieties, that the vocabulary of Aokas is still quite close to that of Bejaia; the really divergent varieties seem to be those of the Babor Mountains and Oued el Bared, approaching Jijel, and those are the ones most likely to give an insight into the dialect of the now largely Arabised Kutama.

I haven't yet had time to properly look at Samir Ben Si Said's thesis, De la nature de la variation diatopique en kabyle: étude de la formation des singulier et pluriel nominaux, but it tackles the synchronically as well as diachronically thorny problem of Berber non-concatenative morphology, and argues for an approach based more on roots than on stems, contrasting with another important study I've been working through lately, Heath's Grammar of Tamashek (Tuareg of Mali).

Tuesday, October 21, 2014

Subject-verb order in Tumzabt

Going through Brahim and Bekir Abdessalam's brief grammar of Tumzabt Berber (الوجيز في قواعد الكتابة والنحو الأمازيغية "المزابية": الجزء الأول) recently, I was struck by their discussion of the problem of subject-verb order. Berber in general allows both verb-subject and subject-verb order, with the case ("state") of the subject depending on which order is used. Determining which order is used under which circumstances, however, poses some difficulties; the same language may be described as VSO or SVO, depending on who you ask, and the determining factors certainly differ from one variety to another (cf. eg Mettouchi fc for Kabyle). Their take on the problem combines information structure with pragmatics and verbal mood. The latter two factors can very likely be reduced to information structure too, but that would require testing; in any case, the observation that VS order is required for serialization is interesting. Here's what they had to say, translated into English (pp. 129-130):

We observe that in the first set of examples, the subject precedes the verb; this is the usual form in an Amazigh clause consisting of a verb and a subject.

In the second set of examples, the subject follows the verb. This happens in the following cases:

  1. The subject may follow the verb when it is specific and known to the speaker and listener because there is a connection between speaking of it and a previous expression involving speaking of the same subject. For instance:

    twelleh! afunas-nni yetthaḍa - Watch out, that bull rampages.

    After the two parties have parted, they meet again the next day, and one says to the other:
    yak yhaḍ ufunas ay-tessečned asennaṭṭ! - Indeed that bull you showed me yesterday really did rampage!

    Here, the subject - the bull - is specific for both parties to the conversation in the second usage, since it had been spoken of earlier.

  2. For the sake of irony, which can only be deduced from the context surrounding this expression and from the circumstances of discourse, eg if we say:

    tiɣawsiwin-ess tqimant-edd ɣel wezğen, drus mi yefra igget, ay-tinid : yebṛem werğaz ! - His affairs stay half-done, rarely does he resolve even one, and you tell me: he's a careful man!

  3. The subject may follow the verb obligatorily in the serial aorist, eg:

    yuli tazdayt yuḍa-y-as wemjer - He climbed the date palm and the sickle fell from him [and dropped the sickle].

    It may also occur directly following the verb in the future tense aorist, eg:

    ad tatef teğrest ad yireḍ isemmuṛa n tḍuft or tağrest ad tatef ad yireḍ isemmuṛa n tḍuft - When winter comes, woolen clothes are worn.

They follow this up with an observation that seems quite astonishing from a comparative Berber perspective (p. 131):

A subject following the verb is put in the construct state if definite, this being the normal case for the postverbal subject, and is put in the free state if indefinite without any need for the [indefinite] article iggen / igget ["one"].

Unfortunately, they provide no examples to illustrate this claim.

Saturday, September 20, 2014

Neologisms in n- in Siwi Berber

(originally posted in French, as an experiment - opinions?)

Rather belatedly, I started this summer to organise my lexicographic notes on the Siwi Berber of Egypt better. Having reached 2300 words after transcribing three notebooks, I'm taking a break to offer an observation that might one day be useful for language planning, if such a thing is conceivable for so small a minority variety ... To form deverbal nouns, Siwi Berber often uses an analytic strategy rather different from the morphological strategies preferred elsewhere in Berber: the genitive particle, n, + the verbal noun. I have nine clear examples, not to mention other more opaque cases. The noun may be the object of the verb:

  • ačču eat : n-ačču food
  • aknaf roast : n-aknaf roasted offal / aubergine
  • alessa get dressed : n-alessa clothes
  • tiswi drink : n-tiswi drink (beverage)
or the instrument for performing the action of the verb:
  • ančlaħ slide : n-ančlaħ sandboard
  • asebded stop : n-asebded stop button
  • aṣṣey hold : n-aṣṣey handle
  • azerzi chase away (flies) : n-azerzi fly-whisk
or even, more rarely, the place:
  • aɛenɛen sit : n-aɛenɛen the crosswise plank of a cart on which one sits
As "sandboard" and "stop button" show, this formation is still productive. Most novelties naturally take the Arabic names used by their sellers, but if Siwis wanted to adopt purist forms, it would be easy to call, for example, the TV n-aẓeṛṛa - whereas, in fact, the best-known neologism in Siwa, among those who take an interest in such things, is the curious form elmeẓṛa, apparently derived from tiliẓṛi through oral transmission.