Friday, January 26, 2007

Aboriginal language of the Ahaggar

Tuareg oral tradition records that, when the Tuareg arrived in the Ahaggar (Hoggar) Mountains of southern Algeria sometime in the first millennium AD, they encountered people called the Isăbătăn - a sparse population of hunters and goatherds, stereotyped in popular anecdotes as stupid. Parts of the Dag-Ghali tribe claim descent from them. What language did they speak? We may never have enough data to know for sure, but Tuareg stories provide possible clues, sometimes putting seemingly nonsensical, or at least non-Tuareg, words into the mouths of Isăbătăn characters. I recently came across an example, quoted from Pandolfi 1998:132 in Kossmann 2005:15 (transcription slightly modified):
Ikkəršərmadən tangarən damadən.
The ants come in and go out.
The morphology of the sentence is Tuareg - -ən is the normal Tuareg masculine plural suffix, while i- is a normal Tuareg nominal plural prefix. But the lexical stems - -kkəršərmad- "ant" (?), tangar- "come in", damad- "go out" - have no obvious Tuareg or broader Berber cognates. Unlikely as it is, it would be interesting if one of my readers happened to recognise an obvious source for these...

Kossmann, Maarten. 2005. Berber Loanwords in Hausa. Köln: Rüdiger Köppe.
Pandolfi, Paul. 1998. Les Touaregs de l'Ahaggar (Sahara algérien). Parenté et résidence chez les Dag-Ghali. Paris: Karthala.

Saturday, January 20, 2007

Maghreb sociolinguistics

I've just come across a couple of interesting posts on the North African sociolinguistic situation. On Aqoul, Shaheen discusses a topic close to my heart - how the dominance of French hurts the Maghreb's economy and makes new ideas take longer to reach it:
The ubiquity of French not only aggravates the dependence of the Maghreb to France, it impedes the region's ability to develop beyond its traditional (colonial) ties. Worse, it not only serves as an economic chain, it culturally acts like an albatross around their necks: everything, from law to social models is borrowed from France. Considering how inefficient and sterile French models and intellectual production can be today in most fields when compared to their anglo-saxon counterparts, the Maghreb is riding a losing horse. It isolates individual Maghrebis from most scientific and economic literature...
Some of the comments are worth reading as well, and the author links to published online papers (bibliography in a blog post - always a trend worth promoting!)

In a more personal post, "Filjazair" describes Algeria's linguistic "schizophrenia", as it appears to a heritage learner:

I hear, a lot, things like: "Don't bother learning darija, it's not worth it"; "You don't need to learn fusha [formal Arabic], nobody speaks it"; "Our language [Arabic, formal Arabic] is being corrupted - we're still in the age of colonisation!" (this from one student who'd heard that another student had to write her university thesis in French); "Everybody speaks French, that's all you need"; "Everybody speaks darija, that's all you need."


So I'm in a French class with Algerians who already know a lot of French but speak it sloppily, and who feel a need - because of work, because of school, because of future prospects - to get better. This week I start formal Arabic lessons with a tutor downtown. And along the way I'm trying to pick up snippets of darija to use in the street, so as not to have to use too much of the French or Arabic I'm learning, because French marks you as a foreigner (or worse, a snob - an Algerian in contempt of her Algerianness) and because nobody actually speaks the formal Arabic.

-n't infixation in English

Consider the following word:
Finally one said "It's astonishing. Frankly astonishing. The man has actually got charisn'tma."
"Your meaning?"
"I mean he's so dreadful he fascinates people. Like those stories he was telling... Did you notice how people kept encouraging him because they couldn't actually believe anyone would tell jokes like that in mixed company?" - Terry Pratchett, Feet of Clay, 1996, p. 289
This word may not be original to Terry Pratchett, incidentally - this document suggests it appeared in a show called My Word! back in 1976. It gets 90 ghits (including one in Polish), concentrated disproportionately in reviews:
* "...cardboardy Paul Walker, whose lack of presence ('charisn'tma'?) sucked much of the life out of..."
* "Collectively, they exude charisn'tma, and one or two even possess what Ken Campbell once described as "the legendary Minus Quality" whereby when they exit, the stage somehow seems fuller."
* "Hard-edged r&b from the band with a front man who exudes charisma. Me? I exude charisn'tma."

In this structure, a common word's meaning is "inverted" for comic effect by adding -n't to a portion of it which resembles a common auxiliary verb. I can think of at least one other such case:
"I even attempted a beehive do, but I ran out of hairspray. So it kind of turned into a don't." - Douglas Coupland, Generation X, p. 82
Believe it or not, this one gets some ghits as well, though it's obviously hard to determine how many:
* "Scarlett's Hair-Don't"
* "Hair-Bow is a Hair-Don't"
* "Is it a hair DO or hair DON'T???"
and an entry in the Urban Dictionary.

And at the even less lexicalised end of the spectrum, we find stuff like this:
* "Another Republican’t paragon of virtue"
* "As it happens, my friend Tom Schaller noticed that many of you liked “Republican’ts” — and he suggests that we all start using it, at least as often as the GOP throws around “Democrat Party.”"
* "You’ve seen me in action. You know I’ll get you out. I’m a Mexican, not a Mexican’t!”"
* "did you hear about the lazy, unambitious bird? It was a whippoorwon't. (Yuck, I hate puns.)"
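
The pattern in these examples can be written down as a rough procedure. Here is a minimal sketch in Python, assuming the trigger is simply a substring spelled like an auxiliary; the `NEGATED` table and the `negate_word` function are my own illustrative names, and real coinages surely depend on sound and stress as well as spelling:

```python
# Hypothetical table: auxiliary-like letter sequences and their negated forms.
NEGATED = {"will": "won't", "can": "can't", "is": "isn't", "do": "don't"}


def negate_word(word):
    """Negate the rightmost auxiliary-like substring of a word.

    Longer auxiliaries are tried first. Returns None if nothing matches.
    """
    lowered = word.lower()
    for aux in sorted(NEGATED, key=len, reverse=True):
        i = lowered.rfind(aux)
        if i != -1:
            # Keep the original spelling on either side of the match.
            return word[:i] + NEGATED[aux] + word[i + len(aux):]
    return None


print(negate_word("charisma"))      # charisn'tma
print(negate_word("Republican"))    # Republican't
print(negate_word("whippoorwill"))  # whippoorwon't
```

A purely orthographic rule like this overgenerates wildly (it would happily negate the "is" in "list"), which is part of why the question below is not trivial.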

Your question for 10 points: do portmanteau words like this need to be accounted for by a full theory of morphology, or are they monstrosities that should be swept under the rug into psychologists' labs? These words' two components seem to both contribute to their meaning, but not exhaustively, in much the same way as the meaning of "catty" is partly but not completely predictable from "cat" and "-y"; how should that relation be characterised?

Monday, January 15, 2007

The plural-breaking mountains of Oman

Arabic is well-known in phonological circles for the diversity and complexity of its broken plurals (jam` at-taksiir جمع التكسير) - that is, plurals formed at least partly by internal modification of the word. The commonest type for four-consonant roots is mainly characterised by an -aa- after the second consonant. For example:

daftar-un > dafaatir-u "notebook"
kawkab-un > kawaakib-u "planet"

Certain rather regular complexities emerge when a long vowel is present in the singular; depending on position, it is either treated as an extra consonant or affects the length of an output vowel:

xaatam-un > xawaatim-u "ring"
risaal-at-un > rasaa'il-u "letter"
qaanuun-un > qawaaniin-u "law"

If the stem has more than four consonants and takes this plural type, the later ones get dropped off the edge, so to speak:

`ankabuut-un > `anaakib-u "spider"
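
The examples so far can be reproduced by a short procedure. Below is a minimal sketch in Python, assuming stems are given in the transcription used above (long vowels doubled, one character per consonant), begin with a consonant, and come without the case ending or feminine -at. The function names and the exact rule ordering are my own simplification, checked only against the six words listed here, not a general account of the Arabic plural:

```python
VOWELS = set("aiu")


def segment(stem):
    """Split a stem into (consonant, following vowel) pairs.

    The vowel may be empty, short ("a") or long ("aa").
    """
    pairs = []
    i = 0
    while i < len(stem):
        consonant = stem[i]
        j = i + 1
        while j < len(stem) and stem[j] in VOWELS:
            j += 1
        pairs.append((consonant, stem[i + 1:j]))
        i = j
    return pairs


def broken_plural(stem):
    """Build the C1-a-C2-aa-C3-i(i)-C4 plural for a transcribed stem."""
    pairs = segment(stem)
    consonants = [c for c, _ in pairs]
    vowels = [v for _, v in pairs]
    long_last = len(vowels[-2]) == 2   # long vowel before the final consonant

    slots = [consonants[0]]
    if len(vowels[0]) == 2:            # long vowel after C1 counts as w
        slots.append("w")
    slots.extend(consonants[1:])

    final_vowel = "i"
    if len(slots) < 4 and long_last:   # long vowel fills a slot as hamza
        slots.insert(-1, "'")
    elif len(slots) == 4 and long_last:
        final_vowel = "ii"             # otherwise it lengthens the output vowel
    if len(slots) < 4:
        raise ValueError("stem does not fit this plural pattern: " + stem)

    c1, c2, c3, c4 = slots[:4]         # extra consonants fall off the edge
    return f"{c1}a{c2}aa{c3}{final_vowel}{c4}"


for stem in ["daftar", "xaatam", "risaal", "qaanuun", "`ankabuut"]:
    print(stem, ">", broken_plural(stem))
```

Run on the stems above, this prints dafaatir, xawaatim, rasaa'il, qawaaniin and `anaakib.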

Now this has been the subject of some interesting work, notably in autosegmental phonology, where such phenomena have been taken as a strong argument for separating consonants and vowels into separate tiers. For Arabic, the plural morphology itself - in this case, the skeleton -a-aa-i[i]-, but there are many others - never seems to involve infixing a true consonant; diminutives in -u-ay-i- can be explained away by treating -y- as a semivowel. But that changes if we look beyond Arabic...

Jibbali/Sheri is a Semitic language spoken on the southern coast of Oman, and (despite its location) is neither descended from nor mutually intelligible with Arabic. Among other changes, it no longer has distinctive vowel length. Its commonest equivalent of the Arabic plural form described above involves the insertion of -ab-/-ɛb- instead of -aa-:

dəftɔr > defɛbtər "notebook"
kənsed > kenabsəd "shoulder"
mɛrkɛb > mirɛbkəb "boat"
muṣħar > muṣɛbħar "branding iron"

although it does have a more Arabic-like form with -o-/-ɔ- in some (mainly feminine) cases:

maħfer > moħofur "basket"
ħalḳũ-t > ħɔloḳum "Adam's apple"

Note that the -ab- plural is productive enough to apply to Arabic borrowings like dəftɔr. I would love to know how this form emerged; as far as I know, no other Semitic language has a b-infix plural.

Ratcliffe, Robert R. 1998. The 'Broken' Plural Problem in Arabic and Comparative Semitic. Amsterdam: John Benjamins.

Friday, January 05, 2007

Executable articles

Mark Liberman recently made an excellent suggestion on Language Log: scientific and technical papers should include an explicit, executable recipe for generating their numbers, tables and graphs from published data. However, I think if anything he undersells the possibilities. Where executable algorithms could make a big difference to linguistics papers is in testing proposed rules - phonological principles; historical sound shifts; morphological rules...

If someone proposes a vocabulary of protoforms and a set of regular sound shifts, they are writing an explicit algorithm already. With an accompanying executable version of it, taking protoforms as input and outputting descendant forms (and programs to do some of this have already been written, eg Geoff's Sound Change Applier, Phono), you would be able to know exactly what the predicted forms in each descendant language would be, and be able to spot irregularities (in other words, prediction failures) with ease.
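
To make the idea concrete, here is a minimal sketch in Python of what such an executable appendix might look like. Everything in it is invented for illustration: the rules, the protoforms and the "attested" reflexes are toy data, and `apply_rules` and `find_irregularities` are hypothetical names rather than the interface of any existing sound change applier:

```python
import re

# Toy sound changes as (regex pattern, replacement), applied in order.
RULES = [
    (r"p", "f"),           # *p > f everywhere
    (r"k(?=[ie])", "tʃ"),  # *k > tʃ before front vowels
    (r"a$", ""),           # word-final *a is lost
]

# Invented protoforms paired with invented attested reflexes.
COGNATES = {
    "pata": "fat",
    "kira": "tʃir",
    "kapa": "kaf",
    "piki": "piki",  # deliberately irregular
}


def apply_rules(protoform, rules=RULES):
    """Derive the predicted descendant form from a protoform."""
    form = protoform
    for pattern, replacement in rules:
        form = re.sub(pattern, replacement, form)
    return form


def find_irregularities(cognates, rules=RULES):
    """List (protoform, predicted, attested) for every prediction failure."""
    failures = []
    for protoform, attested in cognates.items():
        predicted = apply_rules(protoform, rules)
        if predicted != attested:
            failures.append((protoform, predicted, attested))
    return failures


print(find_irregularities(COGNATES))  # [('piki', 'fitʃi', 'piki')]
```

Because the rules are an ordered list, reordering them changes the predictions, so rule ordering becomes a claim that can be tested rather than one left implicit.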

Likewise, if someone proposes a phonological principle (whether conceived of as a rule, as a constraint, or otherwise), you could test its effect on arbitrary data and see if it makes the right predictions. Gaps in the theory (for example, the representation of clicks in Government Phonology) or non-computable theories (an accusation sometimes made against Optimality Theory) would be conspicuous: the input would be rejected, or the program would simply be unwritable. The phonology of a language would be accompanied as a matter of course by a program generating phonetic representations from phonemic inputs. In fact, a widespread and welcome trend in modern phonology is to regard individual languages' phonologies as specific instantiations of universal constraints or structures; so let each theory be fully specified as a programming language, and each individual language's phonology be represented in it (as an ordered list of constraints, or a set of parameter settings, or whatever your favorite theory does). Inconsistencies or ambiguities would stand out like sore thumbs as you played around with the program. In short, executable articles would make linguistic theories more accountable, and make it easier to spot gaps and places where further work is called for.
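
As an illustration of "an ordered list of constraints", here is a toy sketch in Python in the spirit of Optimality Theory. It rests on heavy assumptions: the candidate outputs are supplied by hand (which sidesteps exactly the computability worry mentioned above), syllable boundaries are marked with ".", and the three constraints are crude stand-ins that I have named `no_coda`, `max_io` and `dep_io` for the purpose:

```python
VOWELS = set("aeiou")


def no_coda(inp, out):
    """One violation per syllable that ends in a consonant."""
    return sum(1 for syllable in out.split(".") if syllable[-1] not in VOWELS)


def max_io(inp, out):
    """Crude deletion count: segments missing from the output."""
    return max(0, len(inp) - len(out.replace(".", "")))


def dep_io(inp, out):
    """Crude insertion count: segments added in the output."""
    return max(0, len(out.replace(".", "")) - len(inp))


def evaluate(inp, candidates, ranking):
    """Pick the candidate whose violation profile is best under the ranking.

    Profiles are tuples ordered by the ranking, so tuple comparison
    enforces strict domination of lower constraints by higher ones.
    """
    return min(candidates, key=lambda out: tuple(c(inp, out) for c in ranking))


candidates = ["pat", "pa", "pa.ta"]
print(evaluate("pat", candidates, [no_coda, dep_io, max_io]))  # pa
print(evaluate("pat", candidates, [no_coda, max_io, dep_io]))  # pa.ta
print(evaluate("pat", candidates, [max_io, dep_io, no_coda]))  # pat
```

The "grammar" of each hypothetical language here is nothing more than the order of the list passed as `ranking`, so two analyses that differ only in ranking can be compared by running them on the same inputs.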

Incidentally, I have occasionally been known to practice what I'm preaching here: way back when I was studying mathematics, I wrote a program to generate Algerian Arabic broken plurals from singulars, and found the experience very informative. (It worked, too.) Unfortunately, this program was written in Visual Basic 3.0, not a programming language that has aged very well... which is, of course, another issue that would have to be considered.

<g> in Arabic

A belated Eid Mubarak and Happy New Year to all my readers!

This post is brought to you by the letter G - a sound all too common in many languages, including many dialects of Arabic, yet absent from Classical Arabic, leading to a minor quandary for transcribers, and to substantial regional variation. In Morocco, [g] in names is typically written using a kaf ك with three dots (ڭ), as in this sign. In Algeria and Tunisia, it's typically a qaf ق with three dots (ڨ), a choice reflecting the sound shift q > g common in Bedouin dialects, but unfortunately easily confused with the fa with three dots (ڤ) often used elsewhere in the Arab world for [v]. In Egypt, a jim ج is generally used, since classical j is pronounced g in Egyptian dialect. Elsewhere in the Arab world, a kaf with a line on top (گ), as in Persian or Kurdish, is sometimes used. In adapting foreign loanwords, ghayn (eg بلغاريا Bulgaria) or jim (eg إنجيليزية English) are usual. In a Qatari mall recently, however, I saw yet another system: Osh Kosh B'Gosh was transcribed as أوش كوش بيڠوش, with a ghayn with three dots (Malay ng). I have no idea what country this may be characteristic of - even here it appears rather unusual. Any thoughts?