Wednesday, March 11, 2009

Arabic (and Berber?) loanwords in southern Italy

Just came across a little monograph on Arabic and Berber loanwords in the dialects of the Basilicata (southern Italy): Sopravvivenze lessicali arabe e berbere in un'area dell'Italia meridionale, la Basilicata by Luigi Serra. Most of the loans listed are from Arabic, some quite obvious (eg taūt "coffin" < تابوت, źir "a copper or terracotta container for liquids" < زير, zammîl "big pannier with which various goods are transported on a beast of burden's back" < زنبيل), others rather less clear-cut.

Only three loans (and one placename) are claimed as from Berber. Two of them look acceptable at first glance, but on closer inspection all of them seem questionable, and they all refer to objects that there would have been no obvious reason to borrow terms for. It's possible that Berber influence can be found in southern Italian dialects, but this monograph doesn't present a terribly convincing argument for it. Still, here they are:
  • źembr / źimbr / zimr / źimmr "billy-goat" (caprone, becco) < pan-Berber izimmər "ram", p. 39. (Looks good, but why the shift in species? - Also, see comments for an alternative Greek etymology.)
  • aččáta "big meal" (scorpacciata, mangiata, spanciata) < pan-Berber əčč "eat", p. 11. (The semantic and phonetic matches are great, but the word is so short that coincidence seems hard to rule out.)
  • šéḍḍa "wing" (ala) < Zenati Berber "bird", eg Siwi ašṭiṭ, p. 26. The author mentions an alternative possibility - deriving it from Italian ascella "armpit" - that seems much more plausible.
  • Zaza (placename) < Berber azəzzu "thorny broom (plant sp.)" - not discussed in any detail (author cites Renisio), p. 41.

Saturday, March 07, 2009

Tawalt closing down

Tawalt is a nine-year-old Libya-focused Amazigh/Berber website with a remarkable collection of audio recordings, sketch grammars, vocabularies, and resources for some of the least well documented Berber languages - those of Tunisia, Libya, and Egypt. It is thus rather a shame that Tawalt is shutting down: updates are stopping immediately, and the site is to go down by the end of the year. Sure, the Wayback Machine should preserve all the texts on it - but not its remarkable audio archives (which have already disappeared from the main page). Their plans are probably related to political problems - the site's political postings had gotten rather outspoken. If you have any interest in Berber linguistics, I suggest looking around now before it disappears...

Wednesday, March 04, 2009

No, Berber isn't descended from Arabic

A few days ago I got lent a copy of a recent book in Arabic by Othmane Saadi: Dictionary of the Arabic Roots of Amazigh (Berber) Words معجم الجذور العربية للكلمات الأمازيغية (البربرية) (Tripoli: Academy of Arabic Language 2007.) My reaction, in brief, is that it's unscientific jingoistic claptrap. But I happen to have friends (not linguists, of course) who take it seriously; and I am told that the author, a proud member of the Chaoui Berber Nememcha (Nmamša) tribe, genuinely believes his own theory. I will therefore try to explain as simply as possible where the book goes wrong.

His starting point is noting the existence of strong similarities between Arabic and Berber in the vocabulary and grammar (p. C: “90% of Amazigh Berber words are pure or Arabised Arabic, and the grammar of Berber agrees with the grammar of Arabic.”) This is substantially correct, and has been known for a long time (see, for example, Igor Diakonoff's Afrasian Languages, Moscow: Nauka 1988, or at a more basic level one of my first posts), except that 90% is a substantial exaggeration – many of the comparisons he puts forward are at best questionable, as will be seen below. But he claims that the explanation for these similarities is that Berber descends from Arabic. Not just Berber either, as he says on p. B: “The term Arabitic عروبية means the ancient Arabic languages which are wrongly called the Semitic languages and which branched out from the source language Arabic thousands of years ago, such as Babylonian, and Assyrian, and Akkadian, and Phoenician Canaanite, and Aramaic, and Himyaritic, and Sabaean, and Thamudic, and Lihyanite, and Ma'inic, and ancient Egyptian, and Berber, and others.” Linguists subscribe to a rather different explanation for the observed similarities: that Berber and Arabic (and all the other languages he listed, and many he doesn't list such as Hausa and Somali) are all descended from a single language, called for convenience Proto-Afroasiatic (Greenberg 1950), which was different (and probably about equally different) from any of them.

How would you choose between these two hypotheses? Well, if the original language was different from Arabic, then you would expect some original forms to have been lost in Arabic but kept in other languages. Oddly enough, Saadi himself gives evidence for exactly that: he links the Berber ur “not” to Akkadian ul (p. 12), and the Berber -as “to him/her” to Akkadian -šu (p. 12), and the Berber nəkk “I” to Ancient Egyptian ink and Akkadian 'anāku, none of which are attested in Arabic. Unless you believe that Akkadian and Berber each independently invented the same new forms, or that they are more closely related to each other than to Arabic – which Saadi (correctly) does not claim – you have to conclude that the common ancestor of Arabic and Berber included words like ur/ul for “not”, and 'anāku for “I”, and so on, and hence was different from what we know as Arabic, just as it was different from Berber.

So maybe this common ancestor was Arabic in a different sense: Saadi argues that it was originally spoken in Arabia, so Arabic would be the one language that stayed at home, and presumably got less affected by foreign influence. Unfortunately, he doesn't have much of a case. His first argument (p. 1) is frankly risible: “Europe and North Africa were covered with ice before [18000 BC], whereas the Arabian peninsula enjoyed a climate similar to that of southern Europe now. The ice melted in the former and drought hit the latter, so mankind left the Arabian peninsula and settled North Africa and southern Europe.” The quote he cites on this actually says nothing about North Africa, and for good reason: even at the last glacial maximum North Africa was never covered by ice (see map), and was if anything more habitable before 18000 BC than it is now. He also notes (p. 2) that Berber princes have long claimed Yemenite origins. Such claims are questionable for many reasons (the desire for prestige, the originally matrilineal traditions of many Berber tribes, and no pre-Islamic attestations) – but even if true, it would prove nothing about the language: people change their language all the time without changing their ancestry, as any emigrant can tell you. The rest of his argument is a hotchpotch of miscellaneous quotes which at best claim that various early North African peoples or languages or cultures originated in the Middle East; in a particularly ludicrous case, he blithely quotes Bousquet (1957) to the effect that the Berber language “came from Asia Minor” [Turkey!] None of these quotes so much as mention the Arabian peninsula.

In fact, the linguistic evidence suggests that Proto-Semitic may well have been spoken in Arabia and certainly was spoken in the Middle East, but the common ancestor of Berber, Egyptian, and Semitic was most likely located in Africa. You see, as noted above, these three language families are also quite closely related to Chadic (spoken mainly in Nigeria and Chad) and Cushitic (spoken around the Horn of Africa) – which means that 4 out of 5 branches of this family are native to Africa. It is more likely that one branch left Africa than that 4 branches each separately followed the same narrow path across Sinai or crossed the Red Sea. (For theoretical background, see Campbell 2004.)

In other words: whether the similarities this book gathers between Arabic and Berber are valid or not, they don't do anything to support the author's claim that Berber descends from Arabic. Do they at least have the merit of being valid comparisons? Sometimes, but not with any consistency. Many of his comparisons look rather far-fetched, eg on p. D:

taməṭṭuṯ “woman” < Ar. ṭāmiṯ طامث “menstruator”
argaz “man” < Ar. rakīza(tu l-'usrā) ركيزة الأسرى “pillar (of the family)”
ixəf “head” < Ar. xf' خفأ “appear”, because the head stands out
tadaγt “armpit” < Ar. daγdaγah دغدغة “tickling”
alγəm “camel” < Ar. luγām لغام “the foam that comes out of camels' mouths”

Many others are clearly genuine loanwords, often featuring sounds that cannot be reconstructed for Proto-Berber, though I don't think many of these are original suggestions, eg:

(p. D) axərraz “cobbler” < Ar. xaraza خرز “to sew leather”
(p. H) abrid “road” < Ar. barīd بريد (confirmed by the Tuareg pronunciation of this word, abărid)
(p. 38) ləbṣəl “onion” < Ar. baṣal بصل (Siwi happens to preserve an older word for "onion": afəllu)
(p. 78) taħzamt “belt” < Ar. ħizām حزام

A couple are known Phoenician loanwords:

(p. 57) agadir, ažadir "wall" - Ar. jidār جدار

A few are well-known Afroasiatic cognates, and scattered among them may be other valid cognates:

(p. 250) iləs “tongue” - Ar. lisān لسان
(p. 110) iđammən “blood” - Ar. dam دم
(p. 292) tiqqad “burning” - Ar. wqd وقد

But the book makes no attempt to distinguish between words taken from Arabic comparatively recently and words inherited from the common ancestor of Berber and Arabic, and seems to assume that any word found in both dialectal Arabic (Darja) and Berber must automatically be originally Arabic, rather than possibly being a borrowing from Berber into Arabic. There is a well-known technique for sorting out inherited cognates from loanwords from coincidental similarities: sound correspondences. Sounds don't usually change at random: they change systematically, just as all j's in Egyptian Arabic become g. You establish which Berber sounds normally correspond to which Arabic ones under what circumstances, based on looking at what happens in the clearest cases; that gives you a standard by which to judge the doubtful ones. Saadi has made no effort to do this, and the unfortunate result is that in his comparisons the chaff far outweighs the wheat.
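For readers who think better in code, here is a minimal sketch of that procedure, using the Egyptian Arabic j > g example rather than real Berber data. The word forms are simplified transcriptions (vowel length dropped), the position-by-position alignment is a crude assumption that only works for equal-length forms, and all function names are my own invention for illustration; real comparative work needs proper alignment and conditioning environments.

```python
from collections import Counter

def tally_correspondences(pairs):
    """Count which sound in the first variety lines up with which in the second.

    Assumption: forms of equal length are aligned position by position;
    pairs of unequal length are skipped rather than guessed at.
    """
    counts = Counter()
    for a, b in pairs:
        if len(a) != len(b):
            continue
        counts.update(zip(a, b))
    return counts

def regular_map(counts):
    """For each source sound, keep its most frequent counterpart."""
    best = {}
    for (src, tgt), n in counts.items():
        if src not in best or n > best[src][1]:
            best[src] = (tgt, n)
    return {src: tgt for src, (tgt, _) in best.items()}

def is_regular(pair, mapping):
    """Does a doubtful comparison obey the correspondences from the clear cases?"""
    a, b = pair
    return len(a) == len(b) and all(mapping.get(x) == y for x, y in zip(a, b))

# Clear cases: Standard Arabic vs Egyptian Arabic (simplified transcription).
CLEAR_CASES = [("jamal", "gamal"), ("jabal", "gabal"), ("jar", "gar")]

mapping = regular_map(tally_correspondences(CLEAR_CASES))
print(mapping["j"])                               # g
print(is_regular(("jabal", "gabal"), mapping))    # True
# A made-up comparison that breaks the j : g correspondence:
print(is_regular(("jamal", "kamal"), mapping))    # False
```

The point of the design is only that the standard comes from the clearest cases first, and the doubtful comparisons are then judged against it, not the other way round.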

Berber and Arabic both descend from the same language, but that language was neither Berber nor Arabic, and probably didn't come from Arabia - and if you want to know about that common source, then you'll learn more from the works of Diakonoff or Greenberg, or even from more problematic sources like Orel and Stolbova 1999 or Militarev's online database, than from Saadi 2007.

Wednesday, February 25, 2009

Endangered Languages Week

It's half-over already, but I really ought to mention: Endangered Languages Week is happening at SOAS this week, and may be of interest to readers in London.

Also, an interesting news story, a reminder that many countries still have legal restrictions on what language you can speak where: A prominent Kurdish lawmaker gave a speech in his native Kurdish in Turkey’s Parliament on Tuesday, breaking taboos and also the law in Turkey.

Sunday, February 22, 2009

`baskundza igwạḍən!

I don't suppose there are more than about two or three people on earth who care, but I just figured out an etymology that's been puzzling me for a while. In Kwaṛandzyəy, the word for "genie" is agwəḍ, plural igwạḍən. It looks Berber from its form alone, but I had never found it in any dictionary - until now, going through Taine-Cheikh's new Zenaga dictionary, when I came across ugṛuđ̣an (original singular *ugṛuḍ) "démons, diables (plus dangereux, plus forts que les autres)". It turns out to have been borrowed into Hassaniya too - īgṛäwṭən. The loss of ṛ is more or less regular in Kwarandzyey (usually it's restricted to intervocalic positions, but there are a few other examples like this); so is the shortening of a long vowel to ə in a final closed syllable, with a w remaining to indicate its former quality. Quite possibly the next commenter will tell me that actually this word is well-known in Kabylie or Morocco or something, but for now it's another piece of evidence for my claim that Kwarandzyey includes a number of loanwords specifically from the Zenaga branch of Berber.

UPDATE: see comments - it wasn't the next commenter, but the third one who established that this word is attested in southern Morocco too, which makes sense both since that region is also fairly close to Tabelbala and since it tends to be easier to find Zenaga cognates there than further north or east.

Friday, February 20, 2009

The Tyranny of Morphology

Coming out of an airport, you have to pick one of two exits: "Goods to Declare" or "Nothing to Declare". You have to go through one to get out; but (at least in Customs' eyes), by going through either exit, you state whether or not the contents of your luggage are legally subject to import duties. If you feel so scrupulously honest and so intensely secretive that you decide you have to leave that question unanswered - your only option is to stay inside.

Often your language does that too (Whorf said it first.) Just like the airports, the trick is to set things up in such a way that trying not to answer the question is either unacceptable (ungrammatical) or automatically interpreted as implying a particular answer. If you're talking about a friend in English, you don't have to indicate whether the friend is male or female until you refer back to the friend with "he" or "she"; in Arabic or Spanish, you have to state which it is from the start; and in Chinese or Songhay you can get away with never saying it at all. If you believe something definitely happens at some point, but don't want to say whether it's already happened or not yet, there's no simple way to say that. At best, you end up having to use cumbersome disjunctions like, if you're into apocalyptic prophecies, "The Antichrist either will be born some day or already has been"; and disjunctions like that will always be interpreted as meaning that you don't know which, not that you know but don't feel it's relevant.

In Korean (according to a talk by Peter Sells I heard today), a special verbal affix -si- (one among many, many politeness indicators) is used to indicate that the human subject of the verb (loosely speaking - it may also be a possessor of the subject, or a topic) is notionally of higher social status than the speaker. Thus:

sensayng-nim-i ka-si-ess-ta
teacher-HONORIFIC-NOMINATIVE go-SUBJECT.HONORIFIC-PAST-DECLARATIVE
"The teacher went."

vs.

koyangi-i ka-ess-ta
cat-NOMINATIVE go-PAST-DECLARATIVE
"The cat went."

The thing is, this means you can't be neutral about the subject. If you don't use this suffix with a subject that would normally take it, like "teacher" or "pastor", your listener will assume that you don't respect them so highly. You can't even get away with being ambiguous - I'm told that a disjunction of politeness levels, like *"The teacher went(honorific) or went(unmarked) away", is totally unacceptable. There are genres, such as academic writing or journalism, where politeness morphology is not normally used, allowing you to be neutral on this; but in a face-to-face conversation, as far as I understand, no such solution is available. (Any Korean readers should feel free to correct me!)

No language is likely to be able to stop you from saying what you want to say, if you try hard enough. But things like this can make it a lot harder to avoid saying what you don't necessarily want to say.

"Written in Islamic"

I don't usually do current events posts, but this one was cute enough to warrant a micro-post: egregious ex-Senator Rick Santorum declares that Muslims think that “The Quran is perfect just the way it is, that’s why it is only written in Islamic.” In most speeches, a sentence like that would be a major embarrassment; in this one, it's merely his only linguistics-related blooper.

(Via Angry Arab.)

Sunday, February 15, 2009

Fusha: the Straussian choice?

I came across a review of a book called Why are the Arabs not Free? The Politics of Writing, by an Egyptian psychoanalyst. I haven't read it (nor Adonis, whom he discusses below) but the quote presents an interesting perspective on Arabic diglossia:
“My understanding of the political significance of this divorce between political and demotic Arabic and the key place of writing in the perpetuation of despotism crystallised when I read the work of our great poet Adonis, entitled The Book. It is one of the most revolutionary books I've read in Arabic literature. Apart from its provocative title, it lays bare the truth of our political history as having been a series of assassinations in a struggle for power. But it's written in such a high style that it's a difficult text even for the educated, without taking into account the vast majority of illiterate folk. So, it's no wonder that The Book has remained a 'dead letter'. I may say that I once heard Adonis declare that he won't ever write except in 'grammatical' Arabic because he prefers writing in a 'dead language'. One may wonder if his choice doesn't also represent his method for dealing with the condition [the German-born American political philosopher] Leo Strauss describes in his Persecution and the Art of Writing. The authorities are happy to ignore such books because in the unlikely event that they themselves have understood them, they know that their message will only reach a very limited number of people.”
A tempting hypothesis in some ways, this idea that Fusha acts to insulate the majority of the population from the debates of intellectuals, keeping the powers that be safer from ideologically-inspired opposition and the intellectuals themselves safer (in the short term!) from popular reactions to their speculations. But is the issue really that people have trouble with the language, or just don't read much? Both are true to some degree, but in an era where TV shows and news programs in standard Arabic command large audiences across the Arab world, it's not plausible to blame everything on the difficulty of the language.

Elsewhere in the article he is said to imply that giving the colloquial greater status will "reduce any feeling of powerlessness as a result of a lack of formal linguistic expertise". That seems harder to argue with, given that many (probably most) people who can understand standard Arabic fine can't put together more than a sentence or two without mistakes, and certainly can't sound as eloquent or clear or at ease in it as in their colloquial language. But then again, what power does speaking standard Arabic well actually entail, when plenty of ministers and millionaires can't? Only the power to take part in debates that seem to have remarkably little effect on the society around them?

Friday, February 13, 2009

Why do historical linguistics?

Unraveling the details of a given language family's history is painstaking, detail-oriented work - comparing hundreds or thousands of words to each other, looking through different languages' grammars, coming up with hypotheses to explain what you see and hoping the next language you look at doesn't disprove them... Why do it?

Well, for one thing, you end up showing interesting things about the history of the relevant part of the world, often things it would be hard or impossible to show any other way - that Madagascar was settled by people from Borneo, for example, or that Ijo slaves from Nigeria ended up on the Berbice River in Guyana, or that Persians and Swedes (along with a lot of other people!) ultimately both got their language from a common source. But that depends on your being interested in a particular region; why would a person working on the historical linguistics of (say) the Sahara care about the historical linguistics of New Guinea, or Alaska, or even Europe?

It's because people are pretty similar everywhere - we all have roughly the same mouths and the same brains, and as a result we all tend to make roughly the same kinds of changes. Looking at changes in the languages of Europe, and at which direction they went, turns out to give you a pretty good idea of what kind of changes to expect in New Guinea - and vice versa; wherever you go, k is much more likely to change to g than to n, and a word meaning "want" is much more likely to become a future tense marker than a word meaning "jump".

That means that all these individual small-scale studies are so many pieces fitting together to form a map of how language works. Describing a language (no mean challenge in itself) shows you one set of possibilities; typology tells you the possible states of a language; but historical linguistics relates them to one another, showing you which states are closely linked and which are not. You can't predict what will happen to a language, but you can see in advance what kind of changes are likely and what kind are unlikely.

For sounds, this map of changes - this network linking different states of a language to one another - will seem familiar; it corresponds closely to articulatory and/or auditory similarity. You can mostly account for it by knowing how different sounds are made (with the lips, the tongue, etc...) and which sounds are hardest to distinguish. The key test for a theory of syntax (as far as I'm concerned) is whether it can account similarly for the attested map of syntactic change.

Thursday, January 22, 2009

Oldest Papuan writing?

What are the oldest written documents in a Papuan language (ie a non-Austronesian language of the New Guinea region)? I'm not totally sure, but a strong candidate has to be the court records of Ternate. The people of the islands of Ternate and Tidore in eastern Indonesia speak two closely related languages belonging to the non-Austronesian North Halmaheran family. They have been writing using the Jawi Arabic script since at least the 1500s; in fact, some of the earliest surviving Malay manuscripts are letters from the sultan of Ternate from about 1521.

Recently I came across an 1890 book on Ternate online: Ternate: The Residency and its Sultanate. The book includes a brief introduction to the language and a word list; it also gives reproductions of several manuscripts whose originals date back to the mid-1800s, along with translations. So if you want to try your hand at deciphering them, or just see what a Papuan language looks like in Arabic script, have a look! The page I've linked to (Arabic interpolation de-italicised) starts:

ma-dero toma hijratu-nnabiyy ṣallī `alayhi wa-sallim nyonyohi pariama calamoi si-raturomdidi si-nyagisio si-rara, tahun alif, toma-arah Sawal, i-fani futu nyagimoi si-tomodi, malam Jumaatu...

"In the year Alif of the Moslem era 1296, during the month of Sawal, on a Thursday night, the seventeenth night of the moon..."

Wednesday, January 21, 2009

Verbal adjectives in English

It may seem pretty exotic to English-speakers that in some languages adjectives behave almost exactly like verbs, but this strategy is not as un-English as it looks. Consider the following colloquial American English sentences, with more formal approximate "translations":

This rocks! - This is good. (and not *This is rocking!)
That would rock! - That would be good.
That rocked! - That was good.
That was a rockin' day. - That was a good day.

This sucks! - This is bad. (and not *This is sucking!)
That would suck! - That would be bad.
That sucked! - That was bad.
That was a sucky day. - That was a bad day.

In these, a property usually expressed with an adjective ("good", "bad") is being expressed using a stative verb, but only in predicative constructions (that is, to form a sentence.) In attributive function (that is, modifying a noun) an adjective derived from this verb is used. Like other stative verbs ("know", "be") but unlike non-stative verbs, it uses the simple present form to express a current situation, not the present continuous.

Within English, this pattern may seem pretty odd. But it corresponds rather well to how adjectives are expressed in Songhay languages, eg Koyra Chiini (Heath 1999:73). There, properties are expressed in predicative contexts just like verbs, with the same mood/aspect/negation particles, and in attributive contexts usually take a suffix:

ni beer - you are big (like ni koy - you went)
hal a ma beer - until it gets big (like a ma koy - he will go)
har beer - a big man

ni futu - you are bad
har futu-nte - a bad man

In Songhay the perfect aspect is used with stative verbs to express a current situation; but, like the English simple present tense, this is the simplest indicative verb form. The chief difference is that in Songhay the predicative verbs are used for inchoative senses too, as if "That rocks!" could mean "That is becoming good" as well as "That is good".

Typologically, I find it kind of interesting that what looks like a couple of verbal adjectives should be lurking in the recesses of the English lexicon. But it also has practical applications: if I were trying to teach Songhay or a typologically similar language to Americans, I would certainly start by discussing the example of these two English words.

Friday, January 16, 2009

Coptic adjectives

A little follow-up on the previous post, based mainly on Reintges' Coptic Egyptian (Sahidic Dialect): A Learner's Grammar:

In Coptic, predication of properties is handled exactly as for nouns, including the use of a determiner with the adjective:

hen-noc gar ne neu-polytia
indef.pl-great for are their-labours.
For their labours are great.

In attribution, the structure is Determiner - A - n - B, where A can be the noun and B the adjective, or vice versa:

ou-kohi n-soouhs: a-small n convent
t-parthenos n-sabê: the-virgin n prudent

To express the material of which something is made, you use the same structure, except that only B can be the material:

t-kloole n-ouein: the-cloud n light "the cloud of light"

Note that this is separate from the attributive construction:

ntof pe-iôt pahôm "He, our father Pahom"

So can adjectives be distinguished as a separate word class, when they behave so much like nouns? The answer is yes: an adjective is an item that can occupy either A or B in the attributive structure without a change in referential meaning. (See Coptic Grammatical Categories, Shisha-Halevy, p. 53.) If you reverse the constituents of a genitive or material construction, you change the referential meaning: "a vessel of wood" vs. "vessel wood (ie wood for vessels.)" If you do so for an adjective-noun attributive construction, the referential meaning stays the same: ou-noc n-polis or ou-polis n-noc both refer to the same entity, "a big city". So for this case, Dixon's hypothesis scrapes through.

Saturday, January 10, 2009

Adjectives - who needs 'em?

Most languages have a class of words that express properties and behave differently from other words. These are called adjectives. In English, for example, words like "red" or "old" or "tall" behave differently from nouns or verbs. For example, you add -s to verbs in the present tense if their subject is 3rd person singular, like "he sings" or "she eats"; but you can't add -s to an adjective, so you say "he is red" rather than *"he reds". You can put "very" before an adjective ("very red"), but not usually before a noun (you can't say *"very food".) Verbs can't be placed between "the" and the noun (unless you add an ending like -ing or -ed), but adjectives can (you can say "the red car", but not *"the move car").

It turns out, according to Dixon 2004, that practically every language - perhaps every language - has at least one separate class of words, definable purely on the grounds of their (morphosyntactic) behaviour rather than their meaning, that refer to properties. This class typically includes words expressing size, age, value, and colour, and sometimes more.

But often, a concept expressed using an adjective in one language is expressed only by a verb or a noun in another. For example, in Kwarandzyəy adjectives come between the noun and the plural marker:

ạdṛạ kədda yu
mountain small PL
"little mountains" (hills)

But there is no adjective "happy" in Kwarandzyəy; instead, you use a verb, yəfṛəħ "be happy, rejoice". And to say "the happy people", you say "the people who are happy/have rejoiced":

bạ γ i-ba-yəfṛəħ
person who they-PF-happy

Moreover, though adjectives may always be distinguishable by some test, they usually tend to behave very much like another word class. In fact, Stassen 1997:30 (link goes to 2003) postulates that in every language adjectives handle predication (saying "X is red", for example) in the same way as either verbs, nouns, or locations. For example, in English or Arabic, adjectives handle predication like nouns (you say "He is tall", just like "He is a footballer"); in Korean or Tamasheq, they do it like verbs; and some languages, like Japanese, have both verb-like and noun-like adjectives.

So clearly people can do without some adjectives, and clearly the behaviour of adjectives tends to be very similar to the behaviour of some other word class. Why not do without them altogether? It would be easy enough to construct a language where no morphological or syntactic tests could distinguish adjectives from verbs, or from nouns. So if practically every language does take the trouble to distinguish them, there must be some pretty powerful cognitive motivation for it - and some pretty powerful historical tendencies acting to separate adjectives from verbs and/or nouns. The question isn't directly relevant to my current work, but it's worth thinking about.

Sunday, December 28, 2008

Siwa and its significance for Arabic dialectology

Hope all my readers are having/have had a great holiday.

A paper of mine, "Siwa and its significance for Arabic dialectology", should (inshallah) be appearing in ZAL soon-ish. Basically, there's a whole lot of Arabic influence on Siwi, including things you wouldn't expect to be borrowed, like Arabic's rather unusual method of forming comparatives from adjectives. However, this influence shows clear signs of deriving, not from any dialect currently used in or even particularly near Siwa, but rather from a more archaic one, with some resemblance to the dialects of other Egyptian oases quite distant from it and some features not attested in any other Arabic dialect of Egypt or Libya. In the 1100s, according to al-Idrisi, Siwa was inhabited both by Berbers and by sedentary Arabs; I suspect that the Arabs got assimilated into the larger Berber community and that much of the Arabic element of Siwi derives from their now-extinct dialect. If this sort of thing interests you, have a look (you can download it from the link at the beginning of this paragraph) and please feel free to comment on it here or by email.

Sunday, October 12, 2008

Tifinagh at Leiden

There were two more talks at Leiden that I should have mentioned, on a subject I've always been interested in - Berber writing systems.

Ramada Elghamis is working on a thesis about Tuareg writing systems, and described the purpose of "ligatures" (a more appropriate term would be "conjuncts") in the Tifinagh of the Air region of Niger. Tuareg Tifinagh allows a number of letter pairs (rt, zt, nk...) to be combined into a single letter. It turns out that this is not artistic license, but an essential feature of the script. In traditional Tifinagh, no vowels are written - but if two letters are combined into a ligature, that means that there is no vowel between them, thus resolving a lot of ambiguities. For example (from memory, so details may be wrong), t-m-r-t is read "tamarit", a woman who is loved, whereas t-m-rt is read "tamart", beard; in unvocalised Arabic script, or in traditional Tifinagh minus the ligatures, there would be no way to distinguish the two.
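To make the disambiguating effect concrete, here is a rough sketch of the idea as I understood it from the talk. Everything about the encoding is my own assumption: a written word is modelled as a list of units, where a multi-letter unit stands for a ligature, the vowel inventory is a stand-in, and final vowel marking is ignored. The two readings are the ones recalled above, so the same caveat about details applies.

```python
import re

# Assumed vowel inventory, for illustration only.
VOWELS = "aeiouə"

def reading_pattern(units):
    """Build a regex for every reading compatible with a spelling.

    Vowels may (or may not) occur between separate units, but never
    inside a unit: a ligature like "rt" forces r and t to be adjacent.
    """
    gap = f"[{VOWELS}]*"
    return re.compile(gap + gap.join(re.escape(u) for u in units) + gap)

def compatible(units, reading):
    """Is this vocalised reading possible for this unvocalised spelling?"""
    return reading_pattern(units).fullmatch(reading) is not None

# Without the ligature, both readings are possible:
print(compatible(["t", "m", "r", "t"], "tamarit"))  # True
print(compatible(["t", "m", "r", "t"], "tamart"))   # True
# With r and t written as one ligature, only "tamart" survives:
print(compatible(["t", "m", "rt"], "tamart"))       # True
print(compatible(["t", "m", "rt"], "tamarit"))      # False
```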

Robert Kerr came up with a nice argument that Libyco-Berber, the pre-Roman script from which Tifinagh is descended, was adapted specifically from the Punic (early Carthaginian) variant of the Phoenician script, not the original Lebanese one and not the later Neo-Punic one. Basically, Old Phoenician marks no vowels at all; Punic marks a few vowels, almost always final ones; and Neo-Punic marks most vowels in all positions. Libyco-Berber (and traditional Tifinagh) also marks vowels only in final position; this rather odd idiosyncrasy is best interpreted as having been adopted from Punic rather than independently innovated.

Friday, October 10, 2008

Berberologie colloquium at Leiden

I've spent the past couple of days at the Berberologie colloquium in Leiden, and it's been great fun. There were plenty of very interesting speakers, but for me two languages stole the show: Tetserrét and Ghomara.

Tetserrét (discussed by Cécile Lux) is spoken by a Tuareg tribe, the Ayt-Tawari, in Niger. But it's not linguistically Tuareg at all - its closest relative is Zenaga, the Berber of Mauritania (not northern Berber, contrary to Wikipedia), and Tuaregs can't even understand it. It seems to be an isolated survival of the Berber language spoken in the region before the Tuareg got there. It's not in Ethnologue either. (Taine-Cheikh's new Zenaga dictionary is out, by the way, and was selling as fast as a book reasonably can in a conference of twenty people.)

But Ghomara, in northern Morocco, is something else. Across Berber, borrowed Arabic nouns typically behave as they do in Arabic (keeping their Arabic plurals, and not changing for case). In Ghomara (discussed by Jamal El Hannouche), Arabic adjectives take Arabic rather than Berber agreement marking - and even some Arabic verbs get conjugated fully in Arabic, not in chance code-switching but regularly by all speakers, up to and including pronominal object suffixes. It's not quite unprecedented worldwide, but that level of contact influence is pretty darn rare.

I didn't put Tadaksahak in the first paragraph because it's much less unfamiliar to me, but Regula Christiansen's paper on that had some interesting implications. Basically, Tadaksahak has all but lost the Songhay method of forming attributive adjectives; instead, it's substituted a simplified version of the Tuareg one (suffixing -an), which has become productive for Songhay adjectives too. The funny part is this: Songhay has a lot of CVC adjectives (stative verbs). Tuareg doesn't really do CVC adjectives; it prefers longer words. So when you add the -an to these, you typically reduplicate the adjective. For example, kan "be sweet" > kankanan "sweet". This comes worryingly close to invalidating a conjecture I had made on the borrowability of templatic morphology (but not quite!)

My own paper established that much of the Berber element of Kwarandzyey derives from an extinct close relative of Zenaga. In effect, the "Western Berber" genetic subgroup of Berber has four members: Zenaga itself (finally with a decent dictionary), Tetserrét (awaiting further publications), the large Berber element of Hassaniya, and part of the proportionally larger Berber element of Kwarandzyey.

Saturday, October 04, 2008

Translating from linguists' English to normal English

Machine translation between languages is hard, obviously. There are all sorts of reasons why just looking words up, constructing syntactic trees, and changing orders appropriately isn't enough to produce good output - mainly, the fact that resolving ambiguities often requires real-world knowledge, and that different vocabularies are not always organised in the same way. How much that matters becomes clear when you think about a slightly different problem: translation from a technical vocabulary to a non-technical one within the same language.

Take the following sentences, pulled at random from a grammar on my shelf (Stroomer's Grammar of Boraana Oromo):
"Nouns ending in -ni (mostly -aani) have ultimate or penultimate stress in free variation."

"Verbs with the verb extension -ad'd'-, -at- have an AFF.IMPER.sg: -ád'd'i, -ád'd'u and a NEG.IMPER.sg: -atín(n)i, see 10.10." (p. 72)

If you are, say, a foreign worker about to be posted to northern Kenya, or a second-generation emigrant Oromo planning to go back and visit, you may well want to try and learn some Oromo from this book. But the odds are you will not know what either of these English sentences means, and that applies to quite a lot of the book.

How could you translate these sentences into terms a wider audience would understand? If you can assume a certain amount of basic knowledge (traditional parts of speech, consonants and vowels) then that makes things easier:
"Nouns ending in -ni (mostly -aani) get stressed on the last or second-to-last vowel, it doesn't matter which."

"Verbs with -ad'd'-, -at- added at the end have an imperative singular: -ád'd'i, -ád'd'u and a negative imperative singular: -atín(n)i, see 10.10."
Realistically, you can't assume that level of knowledge, certainly not in Britain at any rate (I still can't believe that what little grammar gets taught in schools here only ever seems to get taught in foreign language classes, not in English ones; that no doubt explains part of the country's comparatively low foreign language skills.) So what does that leave you with? Something like:
"When you say a word that refers to a person, place, or thing* and ends in -ni (mostly -aani), you put the emphasis at the end or just before the end, it doesn't matter which."

"If you have a word that means doing something* that has -ad'd'-, -at- added at the end, then to order one person to do that you add -ád'd'i, -ád'd'u, and to order them not to do that you add -atín(n)i, see 10.10."
(*Yes, I know that syntactic tests like whether they can be the object of a preposition yield more accurate definitions, but in practice these are a good first approximation, and the former does work even on gerunds: "Killing is a bad thing", so "killing" is a noun, but *"Kill is a bad thing", so "kill" isn't.)

Could this be done algorithmically? A simple substitution table would certainly not be enough. Just try it with any set of definitions you can think of:
"Words referring to a person, place, or thing ending in -ni (mostly -aani) have final or pre-final emphasis such that it doesn't matter which."

"Words that mean doing something with the words that mean doing something extension -ad'd'-, -at- have an agreeing order-giving one-entity: -ád'd'i, -ád'd'u and a denying order-giving one-entity: -atín(n)i, see 10.10." (p. 72)
Not terribly helpful, I think you'll agree... To come up with something a little more helpful (and I'm sure my renditions could be improved on) we had to change the whole structure of the sentence. Even then, at some point it's probably going to be more effective to just teach the person the grammatical notions and let them go forward from there than to keep giving brief explanations of the same notion over and over again.
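To make the substitution-table experiment concrete, here is a minimal Python sketch of it. The names `GLOSSARY` and `naive_translate` are my own, and the glossary entries are simply read off the first mechanical rendition above, so treat this as an illustration of the naive approach, not as anyone's actual system.

```python
import re

# Hypothetical glossary: linguists' terms mapped to plain-English paraphrases.
GLOSSARY = {
    "Nouns": "Words referring to a person, place, or thing",
    "ultimate or penultimate": "final or pre-final",
    "stress": "emphasis",
    "in free variation": "such that it doesn't matter which",
}

def naive_translate(sentence, glossary=GLOSSARY):
    """Replace each glossary term with its paraphrase, and do nothing else."""
    # Try longer terms first so a multi-word entry beats any shorter overlap.
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(term) for term in terms))
    # One pass over the sentence, so inserted paraphrases are never re-substituted.
    return pattern.sub(lambda match: glossary[match.group(0)], sentence)

original = ("Nouns ending in -ni (mostly -aani) have ultimate or penultimate "
            "stress in free variation.")
print(naive_translate(original))
# Words referring to a person, place, or thing ending in -ni (mostly -aani)
# have final or pre-final emphasis such that it doesn't matter which.
```

The output is grammatical only by luck: the table knows nothing about sentence structure, so it can't turn "have ... stress" into "you put the emphasis ...", which is exactly the restructuring the hand-made versions needed.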

The problem is certainly not unique to linguistics. Medicine, law, ecology - most fields have technical vocabularies that pose an obstacle to non-specialists, who will often have good reason to be interested in trying to make sense of them. Is there any role for algorithms in this (apart from obvious things like hyperlinking technical terms to dictionary entries)? It's well outside my usual field, but it would be interesting to hear of any attempts.

Saturday, September 13, 2008

Overheard from the code-switching department...

...from an Algerian here in London:


kanu supplying-lna
they.were supplying-to.us


You have a non-finite English form ("supplying") in a past continuous construction, in accordance with the English pattern but contrary to the Algerian Arabic one, which would require a finite form ("they supply"). You have an Algerian Arabic clitic pronoun - a form that can't stand on its own, but has to be attached to the end of something else - being stuck onto a totally unadapted verb in another language; code-switching in the middle of a phonological word! The facility with which some Algerian long-term residents of the UK combine their two languages is really rather remarkable, and would merit further study.

Thursday, September 11, 2008

Fieldwork and address books

Linguistics, with its regular sound shifts, unidirectional grammaticalisation processes, and tree diagrams, is perhaps the most satisfyingly scientific of the social sciences. But today I found myself reminded that it is still emphatically social, particularly when you want to actually gather new data about undocumented languages. Mobile phones have become ubiquitous even in such far-flung corners of the Sahara as Tabelbala and Siwa, used even by illiterate people - making it possible to keep asking people about the language well after you've gotten back to the university. So over the past months of fieldwork my phone has accumulated quite a lot of numbers, which I backed up to my computer today. The final count? At least 84 phone numbers from Tabelbala and 43 from Siwa. To put this in perspective, there are only about 3000 Kwarandzyey speakers, so I can call something like 3% of the population.

The field linguistics courses at SOAS lay a commendable emphasis on teaching the practicalities of fieldwork - what microphone, what recorder, what software... But there's a gap in the courses: managing contacts. Going through these numbers I found a few casual contacts I could barely remember, or couldn't remember at all, and some people I could remember but whose relationships to one another I couldn't easily recall. There's some information in my field notebooks, but it's scattered and not always detailed. I should have been making concise but informative notes about all these people somewhere as I took their numbers - not something you can do easily with my already somewhat antiquated mobile, but that might be a reason in itself to take a more sophisticated one along, or even to use a paper address book, if you have space in your pocket for one alongside your field notebook. If you plan to do any fieldwork, bear this in mind!

Thursday, September 04, 2008

Desert lizards


If you're an Arabic speaker from the right part of southwestern Algeria, you probably call the smooth-skinned sand-burrowing lizard referred to in English as "skink" šəṛšmala شرشمالة. I recently found the original form of this word in Al-Hilali's Berber-Arabic lexicon from 1665: asmrkal or asrmkal أسرمكال, a word composed from asrm "worm" and akal "earth". In many Berber varieties (the so-called Zenati ones), akal becomes šal, and in some Arabic dialects if there's one š ش in a word any s's س have to become ش, so you'd get شرمشال, and by metathesis شرشمال.
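The chain of changes can be sketched as a few ordered rewrite rules in Python. This is only an illustration under my own assumptions: I work in Latin transliteration, I start from srmkal with the initial a- already dropped (a step not accounted for above), and the function names are made up for the occasion.

```python
def zenati_shift(form):
    # akal "earth" turns up as šal in Zenati varieties.
    return form.replace("kal", "šal")

def sibilant_harmony(form):
    # If the word contains a š, every s has to become š as well.
    return form.replace("s", "š") if "š" in form else form

def metathesis(form):
    # Swap the adjacent m and š (first occurrence only).
    return form.replace("mš", "šm", 1)

def derive(form):
    """Apply the three changes in order, returning every intermediate stage."""
    stages = [form]
    for change in (zenati_shift, sibilant_harmony, metathesis):
        stages.append(change(stages[-1]))
    return stages

print(" > ".join(derive("srmkal")))
# srmkal > srmšal > šrmšal > šršmal
```

The last two stages correspond to شرمشال and شرشمال respectively.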

Are any readers familiar with skinks? What would you call them?