Jabal al-Lughat

Wednesday, August 26, 2009

The Piraha discussion continues

Via Language Log/John Cowan: Dan Everett's finally gotten around to publishing a few more examples of his claims about Piraha - notably, that they have no recursion, and in particular no subordinate clauses. Even quoted speech and conditionals, he claims, are not embedded. Here it is: Pirahã culture and grammar: A response to some criticisms.

Now, recursion means being able to embed a given kind of phrase within another example of the same kind of phrase, as many times as you want. In "the door of the house", one noun phrase ("the door") is embedded within another one ("the door of the house"); in "I will visit you when it stops raining", a clause "it stops raining" is embedded within a larger one ("I will visit you when it stops raining"). You can also keep doing this ("the edge of the handle of the door of the house", "I will visit you when I know whether Khaled said that James is right about the forecast that it will rain tomorrow.") In Piraha, Everett reports that for noun phrases you can only do this once (no more than one possessor), and for clauses that you can't do it at all (he insists that all the examples that look like subordinate or adverbial clauses are actually separate sentences whose linkage is left for the listener to interpret, and in this paper presents some arguments for this.)
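To make the definition concrete, here is a minimal Python sketch of phrase embedding. The function names and the `max_possessors` cap are my own illustrative assumptions, not anything from Everett's paper; the cap is only meant to mimic the "no more than one possessor" restriction he reports.

```python
def of_phrase(nouns):
    """Build "the X of the Y of the Z ..." by embedding one noun phrase in another."""
    head, *rest = nouns
    if not rest:
        # Base case: a plain noun phrase with nothing embedded in it.
        return f"the {head}"
    # Recursive case: what follows "of" is itself a noun phrase.
    return f"the {head} of {of_phrase(rest)}"


def possessive_np(possessors, head, max_possessors=None):
    """Build "A's B's ... head"; optionally refuse more than max_possessors possessors."""
    if max_possessors is not None and len(possessors) > max_possessors:
        raise ValueError("too many possessors for this (hypothetical) grammar")
    if not possessors:
        return head
    *inner, last = possessors
    # The possessor is itself a (possibly possessed) noun phrase.
    return f"{possessive_np(inner, last)}'s {head}"


print(of_phrase(["edge", "handle", "door", "house"]))
print(possessive_np(["my brother", "wife", "cousin", "friend", "cat"], "teeth"))
```

With `max_possessors=1`, `possessive_np(["Khaled"], "house", max_possessors=1)` still works, while adding a second possessor raises an error - a crude stand-in for a grammar that allows the embedding step exactly once.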

The thing is, a language with such properties has obvious potential to be expanded into a language like English or Arabic. For possessors, all it would take is a little analogical expansion - that's what allows us to interpret a phrase like "my brother's wife's cousin's friend's cat's teeth" as grammatical, even though you may well never have heard a noun phrase with six possessors before. For subordinate clauses, all it would take is grammaticalising some kind of erstwhile adverb or intonation pattern or quotative marker into a signal that these two clauses are more closely bound than others; such changes occur all the time in languages that already have subordinate clauses (eg "with what" > "in order to" in Algerian Arabic.) If the Piraha haven't done this, then why not? If they used to speak a language with multiple possessors and subordinate clauses in the past, why and how did they abandon these features - and if they never have, then why have most languages gained these features? In short, what motivates the expansion of grammar, and how does it happen?

One place (doubtless not the only one) where I think you can see expansion of grammar in action is technical terminology; consider mathematics. "The set of all p/q such that q!=0 and p, q are integers" is perfectly clear mathematical English, but is rather unlikely to be heard in everyday English (? "the set of all couples such that the husband is not an accountant and both the husband and wife are from Belgium"). The needs of mathematical communication have motivated the use of a kind of relative clause, with a complementiser and neither a gap nor a resumptive pronoun nor a relative pronoun, which is at best marginal in normal English; if enough people were trained as mathematicians, it might get used more widely. Maybe multiple possessors and subordinate clauses are technical features to cope with the demands of socialising with large numbers of people. Or maybe Piraha has a little more embedding than Everett reports. Speculation is fun, but a nice big, searchable, publicly available corpus would be a lot more convincing.

Friday, July 17, 2009

More on Nile Valley Berber [?]

I finally got around to borrowing Bechhaus-Gerst's Sprachwandel durch Sprachkontakt am Beispiel des Nubischen im Niltal. It's tough going because I don't really speak German, but she briefly suggests (p. 37) that the C-Group Culture of 2200 BC-1500 BC in lower Nubia, known as Temehu to the Egyptians, were Berbers (referencing Behrens 1984/5), and that Nobiin-speaking Nubians came in about 1500 BC and replaced them. This would explain the possible Berber loanwords in Nobiin, notably aman "water". Apparently, the archeology shows a change of cultures and of body types around 1500 BC, and ancient Egyptian paintings first begin depicting their southern neighbours as black around this period, while the Egyptian loanwords in Nobiin seem to date to the New Kingdom or later.

The identification of the Temehu with the Berbers is not based on linguistic evidence, as far as I know, and the small inventory of possible Berber loans in Nubian has not been conclusively established, nor need it date from as early as 1500 BC. So I don't know how much confidence to put in this scenario. However, it points to an interesting avenue for studies of Berber to explore. A lot of evidence suggests that Afroasiatic originated further east than North Africa, so it would make sense for there to have been Berber speakers in the Nile Valley - that could even be where Berber spread from in the first place. I previously discussed this issue in The Berbers of Southern Egypt.

The book is interesting for other reasons, incidentally - if her scenario for the development of Kenzi/Dongolawi is correct, it has borrowed an astonishing amount of grammatical material from Nobiin.

References:
Behrens, P. 1984/5. "Wanderungsbewegungen und Sprache der frühen saharanischen Viehzüchter", SUGIA 6:135-216.

Saturday, June 13, 2009

Open to interpretation

Songhay's lexical economy - the way it keeps its lexicon rather smaller than its neighbours' by using a single word to fulfill the functions of what in most languages would be several different words - has attracted the attention of several of those who have written about the language from the 1850s onwards. While Kwarandzyey (Korandje) is so full of Berber and Arabic loanwords that the size issue probably no longer applies, it still has many striking examples of polysemy. Take "open", for example.

fya (from Songhay *feeri) is best translated as "open" (its commonest sense). Of course, to open one's mouth can be to start eating - hence the frozen compound fya-mmi "open-mouth" means "breakfast". But opening is also what you do to release something from an enclosed space; hence to "open water (for something)" (fya iri), or just "open", is to irrigate, and to "open for an animal or person" is to release them. Likewise, to "open a rope (for something)" is to untie it. To release something from your grasp is to let it fall - hence to "open for something" is also to drop it. And for a man to release his wife from her obligations towards him is to end the marriage - hence to "open for a woman" is to divorce her.

We can map the connections between these easily enough, making it clear that they form a coherent network of meaning:


breakfast untie
    \    /    \
     open - release
       \      / \
       irrigate divorce

But not only will any single English translation applied literally and consistently yield ludicrous results for at least some of these cases - translating it differently in different circumstances will force you to choose a single meaning in cases where the text is ambiguous. "He opened for the woman" probably means he divorced her, but in principle it could mean he released her (eg from prison), or untied her, or (literally) dropped her; in fact, since Songhay has no gender distinctions in pronouns, it should even be able to mean "It (eg an automatic door) opened for her". And of course, this kind of ambiguity can be deliberately exploited for effect, as in puns.
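For what it's worth, the sense network drawn above can be written down as a small graph and checked mechanically. This is only a sketch: the edge list is my reading of the ASCII diagram (which, note, leaves out "drop"), and the function names are made up for the illustration.

```python
from collections import deque

# Edges as read off the diagram above - an interpretation, not dictionary data.
SENSE_LINKS = [
    ("open", "breakfast"),
    ("open", "untie"),
    ("open", "release"),
    ("open", "irrigate"),
    ("untie", "release"),
    ("release", "irrigate"),
    ("release", "divorce"),
]


def build_graph(links):
    """Turn a list of undirected links into an adjacency dictionary."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def reachable(graph, start):
    """Breadth-first search: every sense connected to `start` by some chain of links."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


graph = build_graph(SENSE_LINKS)
# "Coherent network" here just means: every sense can be reached from "open".
print(reachable(graph, "open") == set(graph))
```

A translator's problem is the reverse lookup: given one form, every node in this graph is a candidate reading until context rules it out.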

In Kwarandzyey, this is never likely to cause serious ambiguity - the language is almost never written down, and it's a small enough community that the context is usually known to everyone anyway. But imagine worrying about this kind of thing in a millennia-old text in a language that no one today speaks natively, and you can really see why even the most literal translation of such a text is unavoidably an act of interpretation.

Friday, June 05, 2009

Why dead snakes are like clothes

What would you say if, in some science-fiction novel, you read of a language where the situations that in English would be described as "The clothes blew down from the clothesline", "Push that dead snake away with a stick", and "I see where he's carrying the rabbits he killed hung from his belt" were all naturally expressed with the same root, plus nothing more than different affixes? What about "I slammed together the hunks of clay I held in either hand", "I slung away the rotten tomatoes, sluicing them off the pan they were in", and "I picked up in my mouth the already chewed gum from where it was stuck on the table"? My inclination would have been to dismiss it as a neat but implausible idea, placing some strain on the reader's suspension of disbelief. But - until no more than thirty years ago - such a language existed right in California. Go to Part III of Leonard Talmy's dissertation Semantic Structures in English and Atsugewi to get the data; here's a slightly less surprising example as a taster:

s-'-w-	cu-	lup-	hiy-ik:-	a
Subject=I, Object=3rd person	from a linear object moving axially [with one end] non-obliquely against the FIGURE	for a small shiny spherical object to move	out of a snug enclosure/a socket	factual
I poked his eye out (with a stick.)

s-'-w-	pri-	lup-	nik-iy-	a
Subject=I, Object=3rd person	from the mouth/interior of a person, working ingressively, acting on the FIGURE	for a small shiny spherical object to move	all about, here and there, back and forth	factual
I rolled the round candy around in my mouth.

Of course, people are people; after explanation, the similarities are easy enough to make out, and presumably given enough time anyone can learn to look at a situation and decompose it into elements like these, rather than the elements that "leap out" at an English speaker. In fact, I suspect that having to learn to see things the way the people you talk to do is one of the subtler drivers behind contact-induced language change. But cases like this provoke thought: just how much can the attributes of a situation most relevant to formulating a sentence vary from language to language?

Friday, May 29, 2009

More downloadable Berber books online

A few more old online books in lieu of a proper post (coming soon):

Märchen der Berbern von Tamazratt in Südtunisien (1900) (to just download the file in DjVu format: here)
Poésies populaires de la Kabylie du Jurjura (1867) (or download from here)
Dichtkunst und Gedichte der Schluh (1895) (or download from here)
Manuel de berbère marocain (dialecte chleuh) (1914) (or download from here)
Loqmân berbère (1891)
Grammaire et dictionnaire abrégés de la langue berbère (1844)

Wednesday, May 20, 2009

Eastern Berber vocabularies on Google Books

Some digitised Eastern Berber vocabularies from the first half of the ~~18th~~ 19th century for your perusal, if you're into that sort of thing. I was particularly impressed to find a Sokna vocabulary - I haven't yet read any other source on that language, though admittedly I haven't looked that hard.

* Lyon's vocabulary of the Berber of Sokna, from 1820
* Hornemann's vocabulary of Siwi, from 1798 (at my homepage)
* Caillaud's vocabulary of Siwi, from 1826
* Minutoli's vocabulary of Siwi, from 1827
* Koenig's vocabulary of Siwi, from 1839 (lots of other vocabularies in here - Somali, for example, and Nubian and even Fur)

Friday, May 08, 2009

Some Zenaga (Mauritanian Berber) words

Zenaga is the barely surviving Berber language of southwestern Mauritania around Boutilimit. Here are a few words I think are found only in Zenaga (and in some cases Tetserret), all from Taine-Cheikh. Unfortunately, I haven't found any really comprehensive dictionaries of (for example) Tashelhit, so I could well be wrong. If I am (as I was with agwəḍ), I'd love to hear it!

ämkän "young herd animal (eg sheep, goat)" - p. 308
ārwiy "scorpion" (< *arwəl) - p. 452
täygaḌ "young she-goat" (< *talgaḍ) - p. 577
agaḏ̣iy "Moor, bidani (white man)" - p. 181
täššänḍuḌ "mirror" - p. 129
taʔgaṛḏ̣aS "paper". (Other varieties have similar forms, but without any final s.) - p. 24
tämärwuS "bride" (Ahaggar Tuareg has rwəs "to be in rut" - obviously related, but not quite the same sense!) - p. 451

Saturday, April 25, 2009

French among Algeria's elite

The key issue in Algerian linguistic politics - substantially overshadowing the question of the role of Berber - is what should be the language of bureaucracy and education: Standard Arabic (the official language, and the primary pre-colonial language of literacy for all Algeria) or French (the colonial language, and hence ironically the language which most of the few educated Algerians at independence had studied in.) In practice, the country has settled on the one setup most certain to minimise social mobility: Standard Arabic is the primary language of education and symbolism, and French of bureaucracy and social climbing. On top of that, the language of everyday life is Algerian Arabic or Berber, and for speakers of either, reaching fluency even in Standard Arabic, let alone in French (a far more different language), is an uphill struggle.

I recently came across a very illustrative quote from a survey specifically focusing on minor political actors in Algeria - party cadres, journalists, bureaucrats, businessmen, trade unionists, etc:

"To a limited extent, the only space open to [political] actors with little or no knowledge of French were independent unions, independent NGOs, the Arabic press and Islamist parties. This tendency was illustrated by the fact that third-generation elites barely speaking French - only one out of ten interviewees - came from one of these domains. Most other interviewees were either Francophone or bilingual, the latter having difficulties determining which language they considered to be their mother tongue [a footnote suggests she means "primary language"]. The same interviewee often gave different answers depending on whether he filled in this author's questionnaire prior to the interview, or whether he was asked in the course of an interview what language he felt most comfortable speaking and writing. A huge majority of the third-generation interviewees according to their own assessment were better with written French than Standard Arabic. As far as oral skills went, a third of the interviewees said they spoke Standard Arabic as well as or better than French. Over half the interviewees put their oral French skills at the same level as their command of Algerian Arabic or Kabyle Berber dialect, and one out of ten claimed to speak French better than anything else." (Isabelle Werenfels, Managing Instability in Algeria, pp. 85-6)

This kind of situation is a recipe for resentment. The government has spent years educating people to be better at Standard Arabic and telling them that it was everyone's duty to use it rather than French; but unfortunately their passion for reform, after creating legions of eager Standard Arabic-using job-seekers, stopped at the gates of the Civil Service. Check out Algerian government websites sometime - many of them don't so much as have Arabic versions (eg Energy, Health, ~~CNRC~~ Finance), and most default to French.

As always, I think language skills should be a barrier only when they're necessary in themselves, not merely as a badge of class membership (and regionalism - people from Algiers or Kabylie are enormously more likely to speak good French than people from, say, the Sahara.) I'd certainly prefer Standard Arabic to French - it's much more like Algerian Arabic than French is, and more a part of Algeria's identity - but in the long run it would be better to create a situation where people could use their own mother tongue for official purposes.

Friday, April 24, 2009

Healed by the right words

We all know that placebos can be surprisingly effective. But - though it's not exactly surprising - I hadn't realised that there is experimental evidence that simply saying the right thing can have a curative effect.

"Two hundred patients with abnormal symptoms, but no signs of any concrete medical diagnosis, were divided randomly into two groups. The patients in one group were told 'I cannot be certain what is the matter with you', and two weeks later only 39% were better; the other group were given a firm diagnosis, with no messing about, and confidently told they would be better within a few weeks. 64% of that group got better in two weeks." (Bad Science, p. 75, citing Thomas 1987)

I can imagine a lot of factors that could affect the effectiveness of the doctor's words here - mainly anthropological, but some of them would certainly fall within the domain of linguistics. For example, the intonation pattern will affect the patient's perception of the doctor's confidence; does that affect the efficacy? Likewise, the accent and the choice of vocabulary could both affect comprehension and perceived competence, and hence presumably the efficacy. Not really my field, but it could be a line of research with unusually clear-cut potential benefits. The obvious problem with this example is that it involves doctors lying to patients, but if the effect could be reproduced without that it would certainly be worth doing.

Bibliography:
Thomas KB. General practice consultations: is there any point in being positive? BMJ (Clin Res ed) (9 May 1987); 294 (6581): 1200-2.

Thursday, April 23, 2009

"Political complexity predicts the spread of ethnolinguistic groups"

An interesting paper: Political complexity predicts the spread of ethnolinguistic groups. Two basically unsurprising claims that it's good to have calculations supporting: "pastoralists were found to have larger language areas than agriculturalists" and "languages associated with more politically complex societies cover significantly larger areas than those of less complex societies". They also present arguments that "although regions of high biological and cultural diversity do overlap to a striking degree, it is unlikely that biological diversity has any direct effect on cultural diversity on a global scale." Surprisingly, mountainousness was found to correlate with larger language areas, not smaller ones - seems a little suspicious that, though some mountainous areas are pretty un-diverse. Flaws: well, it relies on Ethnologue data and GMI maps, both of which are often unreliable, and systematically more splittist in some areas than in others; but it's not obvious that that would substantially affect the result. Also, ethnic groups, languages, and political units very often don't match up, and their measure of political complexity is based on data for ethnic groups rather than for languages.

(Via GNXP.)

Friday, April 17, 2009

A Fulani village in Algeria

Anyone acquainted with West African history will be aware of the remarkable extent of the Fulani diaspora, stretching from their original homeland in Senegal all the way to Sudan. However, I was surprised to read the following note in a history of the Tidikelt region of southern Algeria (around In-Salah):

"Le village actuel de Sahel a été créé en 1779 par Sidi Abd el Malek des Foullanes, venu à Akabli dans l'intention de se joindre à une pèlerinage, dont le départ n'eut pas lieu... Les Foullanes sont des Arabes originaires du Macena (Soudan); il y a encore des Foullanes au Sokoto; Si Hamza, le cadi d'Akabli appartient à cette tribu." (L. Voinot, Le Tidikelt, Oran:Fouque 1909, p. 63)

(The current village of Sahel was created in 1779 by Sidi Abd el Malek of the Fulani, who had come to Akabli with the intention of joining a pilgrimage whose departure never occurred... The Fulani are Arabs originating from Macina (Sudan [modern-day Mali]); there are still Fulani at Sokoto; Si Hamza, the qadi of Akabli, belongs to this tribe.)

I very much doubt there would be any traces of the language left - even assuming that Sidi Abd el Malek came with a large enough entourage to make a difference - but wouldn't it be interesting to check?

Sunday, April 12, 2009

How many words are there in a language?

In a recent discussion, the question came up of whether a language's vocabulary could be tallied (briefly addressed at Language Log a while back, and at FEL.) I have no firm answer to that (and it's logically independent of whether or not you can estimate the proportion of the vocabulary coming from a given language - that's a sampling problem.) But, notwithstanding the bizarre if occasionally entertaining acrimony of that discussion, it's actually a rather interesting question.

Clearly, any given speaker of a language - and hence any finite set of speakers - can know only a finite number of morphemes, even if you include proper names, nonce borrowings, etc. ("Words" is a different matter - if you choose to define compounds as words, some languages in principle have productive systems defining potentially infinitely many words. The technical vocabulary of chemists in English is one such case, if I recall rightly.) Equally clearly, it's practically impossible to be sure that you've enumerated all the morphemes known by even a single speaker, let alone a whole community; even if you trust (say) the OED to have done that for some subset of English speakers (which you probably shouldn't), you're certainly not likely to find any dictionary that comprehensive for most languages. Does that mean you can't count them?

Not necessarily. You don't always have to enumerate things to estimate how many of them there are, any more than a biologist has to count every single earthworm to come up with an earthworm population estimate. Here's one quick and dirty method off the top of my head (obviously indebted to Mandelbrot's discussion of coastline measurement):

Get a nice big corpus representative of the speech community in question. ("Representative" is a difficult problem right there, but let's assume for the sake of argument that it can be done.)
Find the lexicon size required to account for the 1st page, then the first 2 pages, then the first 3, and so on.
Graph the lexicon size for the first n pages against n.
Find a model that fits the observed distribution.
See what the limit as n tends to infinity of the lexicon size, if any, would be according to this model.
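The first three steps above can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: it counts orthographic word forms rather than morphemes (real morpheme segmentation is a much harder problem), treats each string as a "page", and the function name `vocabulary_growth` is mine.

```python
import re


def vocabulary_growth(pages):
    """Return [V(1), V(2), ...]: distinct word forms seen in the first n pages."""
    seen = set()
    sizes = []
    for page in pages:
        # Crude tokenisation: lowercase runs of word characters stand in for morphemes.
        seen.update(re.findall(r"\w+", page.lower()))
        sizes.append(len(seen))
    return sizes


# Three made-up "pages", just to show the shape of the output.
pages = [
    "the door of the house",
    "the handle of the door",
    "it will rain tomorrow",
]
print(vocabulary_growth(pages))  # [4, 5, 9]
```

Plotting that list against n gives the growth curve; steps 4 and 5 are then a matter of choosing and extrapolating a model, which is where the real difficulty lies.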

A bit of Googling reveals that this rather simplistic idea is not original. On p. 20 of An Introduction to Lexical Statistics, you can see just such a graph. An article behind a paywall (Fan 2006) has an abstract indicating that for large enough corpora you get a power law.

But if it's a power law, then (since the power obviously has to be positive) that would predict no limit as n tends to infinity. How can that be, if, for the reasons discussed above, the lexicon of any finite group of speakers must be finite? My first reaction was that that would mean the model must be inapplicable for sufficiently large corpus sizes. But actually, it doesn't imply that necessarily: any finite group of speakers can also only generate a finite corpus. If the lexicon size tends to infinity as the corpus size does, then that just means your model predicts that, if they could talk for infinitely long, your speaker community would eventually make up infinitely many new morphemes - which might in some sense be a true counterfactual, but wouldn't help you estimate what the speakers actually know at any given time. In that case, we're back to the drawing board: you could substitute in a corpus size corresponding to the estimated number of morphemes that all speakers in a given generation would use in their lifetimes, but you're not going to be able to estimate that with much precision.
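To see why a power law gives no finite limit, write the model as V(n) = K * n^beta with beta > 0 and try extrapolating it. The sketch below fits K and beta by ordinary least squares on the log-log curve; the data are synthetic numbers I generated from an exact power law, not a real corpus, and `fit_power_law` is a hypothetical helper name.

```python
import math


def fit_power_law(sizes):
    """Least-squares fit of log V = log K + beta * log n; returns (K, beta).

    sizes[i] is the lexicon size for the first i + 1 pages (at least two points needed).
    """
    xs = [math.log(n) for n in range(1, len(sizes) + 1)]
    ys = [math.log(v) for v in sizes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


# Synthetic growth curve: V(n) = 10 * n ** 0.5 (made up for the illustration).
sizes = [10 * n ** 0.5 for n in range(1, 51)]
k, beta = fit_power_law(sizes)

# Extrapolating: the predicted lexicon keeps growing however large n gets.
for n in (10**3, 10**6, 10**9):
    print(n, round(k * n ** beta))
```

Whatever positive beta the fit returns, the extrapolated values never level off, so the model by itself cannot hand you "the" size of the lexicon; at best you can plug in some plausible finite corpus size.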

The main application for a lexicon size estimate - let's face it - is for language chauvinists to be able to boast about how "ours is bigger than yours". Does this result dash their hopes? Not necessarily! If the vocabulary growth curve for Language A turns out to increase faster with corpus size than the vocabulary growth curve for Language B, then for any large enough comparable pair of samples, the Language A sample will normally have a bigger vocabulary than the Language B one, and speakers of Language A can assuage their insecurities with the knowledge that, in this sense, Language A's vocabulary is larger than Language B's, even if no finite estimate is available for either of them. Of course, the number of morphemes in a language says nothing about its expressive power anyway - a language with a separate morpheme for "not to know", like ancient Egyptian, has a morpheme for which English has no equivalent morpheme, but that doesn't let it express anything English can't - but that's a separate issue.
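The "large enough" in that comparison can be made explicit if both curves are assumed to be power laws, V_A(n) = K_A * n^beta_A and V_B(n) = K_B * n^beta_B with beta_A > beta_B: the two meet at n* = (K_B / K_A)^(1 / (beta_A - beta_B)), and Language A stays ahead beyond that. A quick sketch, with invented parameter values and a hypothetical function name:

```python
def crossover_size(k_a, beta_a, k_b, beta_b):
    """Corpus size beyond which k_a * n**beta_a exceeds k_b * n**beta_b.

    Requires beta_a > beta_b; a result below 1 means A is ahead from the start.
    """
    if beta_a <= beta_b:
        raise ValueError("curve A must grow faster than curve B")
    return (k_b / k_a) ** (1.0 / (beta_a - beta_b))


# Invented numbers: B starts with the bigger lexicon, A grows faster.
n_star = crossover_size(k_a=5, beta_a=0.6, k_b=20, beta_b=0.5)
print(n_star)

for n in (n_star / 2, n_star * 2):
    print(n, 5 * n ** 0.6 > 20 * n ** 0.5)
```

So the boast is only safe for samples past the crossover, which may be a very long way out if the exponents are close.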

OK, that's enough musing for tonight. Over to you, if you like this sort of thing.

Houhou yentakheb rouhou

(Warning: this post contains no significant linguistic content.)

The results are in: Bouteflika has been “re-elected” as President of Algeria with a staggering 90.24% of votes cast. According to Government figures, 74.54% of eligible voters voted (although oddly enough, the polling booths looked deserted in all the main towns.) He had already served two terms, which had been the limit, so, to let himself run for re-election, he had had the constitution changed shortly beforehand. I would start mocking the guy, but why bother? With figures like that, he's making a fool of himself with no help from me. Time was when he was willing to settle for figures that naive observers might be capable of taking seriously; as he turns senile either his intelligence or his capacity for shame must be declining. The best measure of the glory of his achievements is the 50% of Algerian youths who intend to try to leave the country.

In case you were wondering how this result was achieved, here's my best somewhat informed guess: In the countryside, especially in areas like the Sahara where tribalism is still present, the local patriarchs simply tell everyone to vote en masse for the President, on the basis that he will stay in power no matter what they do and a conspicuous display of loyalty will earn them government investment (although even that wouldn't be enough to produce things like the 97% turnout in Tissemsilt without further fraud.) In the cities or the larger towns of the north, practically nobody bothers to vote apart from people on government payrolls, so they simply exaggerate the participation figures. In Kabylie, uniquely, we have a largely rural, somewhat tribal region fed up enough with the government that even the villages have organised themselves to refuse it legitimacy, so conspicuously that even government figures acknowledge a much lower turnout. If we assume that the government figures are broadly accurate regarding relative turnout (though certainly not absolute), then the situation shows up in the negative slope on this plot of population against turnout (participation); the two 30% wilayas are Tizi-Ouzou and Bejaia, the main Kabyle regions.

Another post on this worth looking at: Victory over the People.

Wednesday, April 08, 2009

When goals create blind spots

You're watching a ball game attentively. A person in a gorilla suit walks right through the middle, remaining visible for 5 seconds. Can you imagine not noticing the gorilla guy? Well, it turns out that nearly half of all ~~people~~ undergraduate volunteers don't, if they're busy trying to count passes - and the authors of that study cite 7 other experiments confirming the same principle.

It strikes me that there's a lesson there for linguists. Often linguists study a language for a specific theoretical goal - looking at Malagasy primarily to see what VOS syntax is like, or Oneida primarily to learn how polysynthesis works, or Songhay primarily to see whether it's related to Nilo-Saharan or not. That's fair enough; no one can focus on everything at once. But we can miss some really interesting stuff by focusing on one aspect of the language to the exclusion of others. For example, when Laoust studied Siwi, he was interested almost exclusively in its Berber origins - and as a result, his generally excellent study somehow ignored the vowels e and o (which are found even in Berber words, but are not phonemic in the Moroccan Berber varieties he was more familiar with), and mistakenly attributed the Arabic elements of Siwi to the adjacent Bedouin dialects, when in fact they show some very distinctive non-Bedouin characteristics. This is something we all need to watch out for.

Sunday, April 05, 2009

Flora of the Central Sahara and elsewhere

Ever found yourself trying to sort out a plant name you've elicited, not knowing any botany worth mentioning? Well, it turns out the botanists are a step ahead of the linguists on the digital libraries game, at least in Spain: the Digital Library del Real Jardín Botánico CSIC has a pretty remarkable array of books to browse online. The one that just saved my etymology of the Kwarandzyey plant name tsifəṛfəẓ is Etudes sur la flore et la végétation du Sahara central. Vol. III: Hoggar, which gives both Tamasheq and binomial names for each plant mentioned. Unfortunately it's clear that not all the works give translations of the names, but it's still worth a look.

On a similar note, I've found Sahara-Nature handy sometimes.

Thursday, March 19, 2009

Beni-Snous: Two unrelated phonetic forms for every noun?

I was flabbergasted recently by a casual statement in Destaing's (1907:212) grammar of the Berber dialect of Beni Snous in western Algeria (near Tlemcen). I nearly missed it as I skimmed it; see if you can spot it. (The translation is mine, as are the bits in brackets.) All the numerals above 1 are from Arabic here, but that's nothing surprising - the same is true in Tarifit, and few Berber varieties have retained the numbers above 3.

"The numbers from 2 to 9 inclusive are followed by the Berber noun in the plural [eg]:

two men ..... θnāịẹ́n ịírgǟzĕn
six women ... sttá n tsénnạ̄n
[...]

From "10" to "19" inclusive, the number is followed by the Arabic singular substantive:

eleven women ... aḥdăɛâš ĕrmra (Algerian Arabic mṛa "woman" مرة; contrast Beni Snous Berber θä́mĕṭṭūθ "woman")
fifteen cows ... ḫamstaɛâš ĕrbégra (Algerian Arabic bəgṛa "cow" بڨرة)
sixteen mares ... sttɛâš ĕrɛấuda (Algerian Arabic `əwda "mare" عودة; contrast Beni Snous Berber θáimārθ "mare")

After the number nouns "twenty, thirty, forty" etc., one uses the Arabic substantive[...]

twenty women ... ɛašrîn ĕmra
fifty mules ... ḫamsîn beγla (Algerian Arabic bəγla بغلة "mule")

a thousand rams: âlĕf kebš (Algerian Arabic kəbš كبش "ram"; contrast Beni Snous Berber išérri "ram")"

If I thought it were remotely possible for Destaing's claim to be true of counting every noun in the language - rather than, say, just the six nouns he gives appropriate examples for - I would be putting together an application to head out to Tlemcen instead of making this posting. (I might still do that anyway some time, mind you.) But for rather a lot of minority languages, all or nearly all speakers are bilingual. And if all speakers are bilingual, what in principle is there to prevent the grammar from containing a rule like this?
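Stated as a procedure, the rule is almost embarrassingly simple, which is part of what makes it striking. Here is a rough Python sketch of it as I read Destaing's description. The assumptions are mine: numeral 1 is left out because the excerpt doesn't cover it, everything from 10 upwards is lumped together as "Arabic" even though the excerpt only exemplifies 10-19, the round tens and a thousand, and the dictionary holds citation forms from the examples rather than the actual counted forms (Berber plural after 2-9, Arabic noun with or without a prefixed article above that).

```python
# Citation forms copied from the examples above; not the counted forms themselves.
NOUNS = {
    "woman": {"Berber": "θä́mĕṭṭūθ", "Arabic": "mṛa"},
    "ram": {"Berber": "išérri", "Arabic": "kəbš"},
}


def noun_lexicon(n):
    """Which lexicon supplies the noun after numeral n, per the quoted description."""
    if n < 2:
        raise ValueError("the excerpt only describes numerals from 2 upwards")
    if n <= 9:
        return "Berber"  # 2-9: Berber noun, in the plural
    return "Arabic"      # 10 and up: Arabic noun (assumed to hold for all higher numerals)


def citation_form(gloss, n):
    """Look up the citation form of the noun in whichever lexicon the numeral selects."""
    return NOUNS[gloss][noun_lexicon(n)]


for n in (6, 11, 20, 1000):
    print(n, noun_lexicon(n), citation_form("woman", n))
```

The point of the sketch is just that the choice of root depends on nothing but the numeral - no property of the noun enters into it.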

So I ask: have you ever come across anything similar elsewhere?

Wednesday, March 18, 2009

Scanned Multi-Alphabet Arabic Manuscript Online

The Princeton Digital Library of Islamic Manuscripts has put a large number of Arabic, Persian, and Turkish scanned manuscripts online. Plenty of interesting stuff there, but one that particularly stood out for me was the untitled Treatise on ancient, alchemical and magical alphabets. Behold the Omniglot of its day! (Well, it's apparently only from the 1700s, but probably a copy of an older work.) It gives tables for the supposed alphabets of each prophet, with the letter names on one page and the letter forms on the next. I'll just point you to a few of the highlights:

"Ifranji" (ie Frank) letters, that is to say lower case Latin
Greek (also "Sabi" and "Rumi") and Coptic
Hieroglyphics (barbāwī) - see also "Suli" and "Qinani". Needless to say, none of the values given bear any discernible relation to their actual sound values.
The "letters of India", rather reminiscent of the Maldivian thaana
Syriac, listed as the language of Adam (putting it several generations back from Ibn Hazm's more conservative description of it as the language of Abraham...)
"Jacobite", basically Hebrew, and the "letters of Aaron", basically Samaritan
Armenian
Kufic (early Arabic)
A table of the directionality of various scripts
A comparative table of magical alphabets
A Hermetic alphabet (attributed to Hermes, that is) called "Secrets of the Stones"

Knowing my readers, I suspect I'll have identifications of several of the alphabets I didn't recognise coming soon - although many, perhaps most, of them are certainly made up. Extra points for anyone who can come up with a picture of a magic bowl or something actually using one of the made-up alphabets.

Two other Arabic manuscripts there of potential interest: The conquest of Africa, from Qayrawan to Zab; Book of the Roman months.

Wednesday, March 11, 2009

išni: a Berber ovine, or a Songhay goat?

In Kwarandzyey (Tabelbala), the non-specific word for a sheep or goat is išni. It looks kind of Berber, and the words for different ages or sexes of sheep and goat are definitely from Berber, so I had assumed it must be Berber. But I've never found a term like it in any Berber dictionary. Maybe some reader will tell me that the word is familiar from his/her own hometown, but I just realised that there's an alternative explanation...

The word for "(female) goat" across Songhay may be reconstructed as *hìnčìnì (Nicolai 1981 gives *hìnkìnì, but in all the Songhay languages he cites except Kwarandzyey, original *k and *č both turn into the same sound before front vowels.) Nicolai 1981 gives amkkən "male goat" as the Kwarandzyey reflex of this word, but in fact (as Kossmann first pointed out to me) that turns out to be another one of the Berber etymologies that only Zenaga seems to explain: ämkän "jeune bête (tout animal de pâturage)" (Taine-Cheikh 2008). Instead, I'd like to propose that išni is the Kwarandzyey reflex.

*n is occasionally lost in Kwarandzyey (eg gwa "see" < *guna); I don't know any rule for this so far, but here it might be motivated by dissimilation. Initial *h is lost fairly commonly (at least in the words for "water", "man", "two", "three", "hunger"), so that's not necessarily a problem. Short vowels, most commonly (but not always) *i and *u, are frequently deleted, according to a rule whose conditioning I've been investigating lately. *č regularly becomes ts, but when immediately followed by a consonant regularly simplifies to s for all but some of the most conservative speakers. And s and š are not phonologically distinct (except for younger speakers, under heavy Arabic influence); the consistent use of š here would be explained by the i's flanking it. So that would yield *hìnčìnì > *inčni > *itsni > isni = išni.
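For anyone who wants to check the chain step by step, here is a minimal sketch in Python that applies the changes described above as ordered rewrite rules. It is only an illustration, not an established analysis. The rule ordering, the regular-expression environments, and the names `RULES`, `strip_tones` and `derive` are all assumptions made for the sketch, and tone marks are simply dropped. In particular, the n-loss and vowel-deletion rules are hard-coded for this one word, since their real conditioning is not yet known.

```python
import re
import unicodedata


def nfc(text):
    """Normalise to composed form so that č and š are single characters."""
    return unicodedata.normalize("NFC", text)


def strip_tones(form):
    """Remove grave accents (the low-tone marks) while keeping č intact."""
    decomposed = unicodedata.normalize("NFD", form)
    return nfc(decomposed.replace("\u0300", ""))


# (description, regex pattern, replacement), applied in this order.
# The ordering and the exact environments are assumptions made for this sketch.
RULES = [
    ("initial *h lost", r"^h", ""),
    ("short *i deleted between č and n", r"(?<=č)i(?=n)", ""),
    ("*n lost before č (sporadic, perhaps dissimilation)", r"n(?=č)", ""),
    ("*č > ts", r"č", "ts"),
    ("ts > s before a consonant", r"ts(?=[^aeiou])", "s"),
    ("s pronounced š next to i", r"(?<=i)s", "š"),
]


def derive(proto_form):
    """Return every stage of the derivation, starting from the toneless proto-form."""
    stages = [strip_tones(proto_form)]
    for _description, pattern, replacement in RULES:
        stages.append(re.sub(nfc(pattern), nfc(replacement), stages[-1]))
    return stages


if __name__ == "__main__":
    print(" > ".join(derive("hìnčìnì")))
```

Run on *hìnčìnì, the stages pass through inčni, itsni and isni before ending in išni, matching the chain given above. Reordering the rules (for instance, putting the s/š rule before the simplification of ts) gives a different result, which is a reminder of how much the proposal depends on relative chronology.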

Of course, if išni is attested in Berber then all this reasoning may have to be rethought - so if you speak Berber and have heard the word before, please tell me now!

Arabic (and Berber?) loanwords in southern Italy

Just came across a little monograph on Arabic and Berber loanwords in the dialects of the Basilicata (southern Italy): Sopravvivenze lessicali arabe e berbere in un'area dell'Italia meridionale, la Basilicata by Luigi Serra. Most of the loans listed are from Arabic, some quite obvious (eg taūt "coffin" < تابوت, źir "a copper or terracotta container for liquids" < زير, zammîl "big pannier with which various goods are transported on a beast of burden's back" < زنبيل), others rather less clear-cut.

Only three loans (and one placename) are claimed as from Berber. Two of them look acceptable at first glance, but on closer inspection all of them seem questionable, and they all refer to objects that there would have been no obvious reason to borrow terms for. It's possible that Berber influence can be found in southern Italian dialects, but this doesn't present a terribly convincing argument. Still, here they are:

źembr / źimbr / zimr / źimmr "billy-goat" (caprone, becco) < pan-Berber izimmər "ram", p. 39. (Looks good, but why the shift in species? - Also, see comments for an alternative Greek etymology.)
aččáta "big meal" (scorpacciata, mangiata, spanciata) < pan-Berber əčč "eat", p. 11. (The semantic and phonetic matches are great, but the word is so short that coincidence seems hard to rule out.)
šéḍḍa "wing" (ala) < Zenati Berber "bird", eg Siwi ašṭiṭ, p. 26. The author mentions an alternative possibility - deriving it from Italian ascella "armpit" - that seems much more plausible.
Zaza (placename) < Berber azəzzu "thorny broom (plant sp.)" - not discussed in any detail (author cites Renisio), p. 41.

Saturday, March 07, 2009

Tawalt closing down

Tawalt is a nine-year-old Libya-focused Amazigh/Berber website with a remarkable collection of audio recordings, sketch grammars, vocabularies, and resources for some of the least well documented Berber languages - those of Tunisia, Libya, and Egypt. It is thus rather a shame that Tawalt is shutting down - updates stopping immediately, and site to go down by the end of the year. Sure, the Wayback Machine should preserve all the texts on it - but not its remarkable audio archives (which have already disappeared from the main page.) Their plans are probably related to political problems - the site's political postings had gotten rather outspoken. If you have any interest in Berber linguistics, I suggest looking around now before it disappears...