Monday, July 31, 2006

Mountains of Lebanon - some etymologies

Lebanon's cities and villages, tragically now in the news, have some interesting etymologies. I always used to wonder about the different names: why is the same city called Tyre in English, but Ṣuur in Arabic? or Byblos in English, and Jubayl in Arabic? The reasons illustrate the sheer length of these towns' history, and the time depth of Greece's contact with them.

Sometime before the characteristic sound shifts of Proto-Canaanite happened - perhaps 1200 BC or so? - Tyre would have been known by a Semitic term meaning something like "peak" or "crag": θ'uur-u. This was borrowed by the early Greeks as tur-os > tyros (when u got fronted to y) > Latin Tyrus > English Tyre. Meanwhile, in Lebanon, that glottalised θ', perhaps unsurprisingly, was among the first sounds to disappear; in Canaanite (that is, Phoenician, Hebrew, and assorted minor languages of the area), it became s' (or ṣ - it's hard to be certain whether the Canaanite emphatics were glottalised or pharyngealised). Case endings also vanished. This gave the Phoenician name: S'uur (or Ṣuur). It was adopted without change into Aramaic, and thence Arabic, as the region's languages shifted over time. The regular cognate of Proto-Semitic θ'uur-u in Arabic would have been ظور đ̣uur; this root is unattested as far as I know. However, in Aramaic *θ' became ṭ, and from this source the root entered pre-Arabic as طور ṭuur "mountain", a rare but well-attested term used in the Quran, notably for Mount Sinai. (In Ugaritic, freakishly enough, *θ' became γ (gh), and γuur- "mountain" is a well-attested Ugaritic word. In Ugaritic, incidentally, Tyre was actually called ṣuur-; so either my etymology here is wrong, which is possible, or Ugaritic borrowed the name after it had already changed *θ' to γ.)

Likewise, Byblos would have started out as gubl-u (attested in Ugaritic and Akkadian), which (judging by possible Arabic cognates) may have meant "mountain" as well. This went into early Greek as gwubl-os > byblos (Mycenaean gw > Greek b, u > y) > English Byblos. In Arabic, the g of course became j; and, for some reason (maybe it was a little town at the time?), it looks like a diminutive got added, turning it from *jubl to jubayl (colloquial žbeyl), which in Arabic just means "little mountain".
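For anyone who likes to see correspondences laid out mechanically, here is a minimal sketch of the reflexes of *θ' described in the Tyre paragraph above. The table just restates what the post says; the names `REFLEXES`, `PROTO_ROOT`, and `expected_reflex` are my own illustrative choices, and the sketch ignores everything except the initial consonant.

```python
# Reflexes of Proto-Semitic *θ' as given in the Tyre paragraph.
# All identifiers here are illustrative assumptions, not an established tool.
REFLEXES = {
    "Canaanite": "ṣ",  # Phoenician, Hebrew: θ' > s' (or ṣ)
    "Aramaic": "ṭ",
    "Arabic": "đ̣",     # the regular but apparently unattested cognate
    "Ugaritic": "γ",
}

PROTO_ROOT = "θ'uur"  # "peak, crag", cited without the case ending -u


def expected_reflex(language: str) -> str:
    """Swap the initial *θ' of the proto-root for the language's reflex."""
    return REFLEXES[language] + PROTO_ROOT[len("θ'"):]


for lang in REFLEXES:
    print(lang, expected_reflex(lang))
```

The Aramaic and Ugaritic outputs are the forms cited above for "mountain"; the Canaanite output is the city name itself.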

Let's hope the day that these towns appear on the news for their history or their beaches, not for the bombs being dropped on them, comes more quickly than looks likely.

Thursday, July 20, 2006

Polysemy vs. homonymy: some Algerian Arabic examples

I'm recently back from Algeria (hence the blog gap), so I thought I'd post some more meditations on Algerian Arabic...

Q: Which of the following words from Algerian Arabic are cases of polysemy (different meanings with a shared conceptual core) and which of homonymy (different meanings coincidentally identical in phonetic shape)?


`ṛuṣa عروصة - bride; daughter-in-law
ħjəṛ حجر - stone; lap
bakuṛ باكور - early-ripening figs; young bonito fish


A: `ṛuṣa, from Classical Arabic `aruusah عروسة, is a case of polysemy; a new bride traditionally goes to live in her husband's family house together with her new parents-in-law, so the extension is natural.

ħjəṛ is a case of homonymy: "stone" comes from Classical ħajar حجر, and "lap" from Classical ħijr حجر. Though it would be amusing to try and find a common conceptual core, I can't see any plausible one.

bakuṛ is etymologically a case of polysemy: both derive from Classical baakuur باكور, "coming early, early; premature; precocious" (Wehr). But synchronically, given the two independent restrictions of its meaning - it isn't used to mean first fruits in general, or young fish in general - I can only take it to be a case of homonymy.

Saturday, June 24, 2006

Ohlone

I lived in the Bay Area for a while, so naturally I tried to find out about its pre-colonial language group, Ohlone. This turned out to have been a set of fairly closely related dialects/languages stretching from San Francisco down beyond Monterey, plus the coast of the East Bay. Their only reasonably close relative is Miwok, another small language family spoken to its north and east, although wider relations with languages further north along the Pacific coast are likely. Among the more noteworthy features of Ohlone are regular metathesis processes - for example, the plural suffix can be either -mak or -kma, depending on whether it's preceded by a consonant or a vowel.
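The plural alternation can be sketched as a one-line rule. This is only an illustration under loud assumptions: I take -mak to be the post-consonantal form and -kma the post-vocalic one (following the order of the sentence above), the five-vowel set is a simplification, and the stems below are made-up placeholders rather than real Ohlone words.

```python
# Assumption: -mak after a consonant, -kma after a vowel.
# VOWELS and the sample stems are hypothetical placeholders.
VOWELS = set("aeiou")


def pluralize(stem: str) -> str:
    """Attach the metathesizing plural suffix according to the stem's final sound."""
    suffix = "kma" if stem[-1] in VOWELS else "mak"
    return stem + suffix


print(pluralize("tata"))   # vowel-final placeholder stem
print(pluralize("tatan"))  # consonant-final placeholder stem
```

Either way, the choice of allomorph keeps three consonants from piling up at the stem boundary.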

Dave Kaufman has just posted some interesting Ruminations on Rumsien, one of the southern dialects; or, if you speak Spanish, you can read a grammar of Mutsun, a southeastern dialect. Wikipedia has a map.

Wednesday, June 21, 2006

Tunisian Berber

Amazing things turn up at the University of Western Sydney: a complete thesis online offering An outline of the Shilha (Berber) vernacular of Douiret (Southern Tunisia). Check it out; the rather endangered Berber varieties of Tunisia are quite ill-documented.

Friday, June 16, 2006

North African language policy

MoorishGirl has an interesting post on an article on a round-table debate on Moroccan Arabic, or Darija, as "a medium of cultural expression". She comments:

I'm fully in favor of using Darija, because of the huge impact it would have on the creation of a reading culture. Imagine: All children's books right now are in Modern Standard Arabic, which is a language no one learns until first grade (i.e. age 6 or 7), by which time reading habits are already in place for many kids.


I think this is a crucial point. Developing a literature of sorts in Darja would allow kids to get into the habit of reading way earlier. A fair number of kids in the West are reading by the age of three; for an Algerian or Moroccan kid to even understand much of the language his/her books are written in at that age would be unheard of. With Darja literature for them to use, they could start reading before they ever started school; it might even lead to them acquiring literary Arabic faster. Moreover, an oral literary tradition already exists, best exemplified by the traditions of melhoun poetry and chaabi lyrics; the language used in these is recognizably a literary register, and all that would be needed would be to write it. My puristic instincts would also rejoice in a move with the potential to stem the tragic loss of inherited vocabulary, and overuse of French, now afflicting Darja. And after all, why should Arabic-speaking kids continue to be deprived of the chance to read in their native language now that Tamazight-speaking ones are finally getting that chance?

However, I would envision Darja as a supplement to literary Arabic, not a replacement. Arabic connects Algeria (and no doubt Morocco), not only to the Arab world but to its own past, not to mention allowing it to engage more fully with its religion. The language in which Amir Abdelkader and Ibn Khaldun wrote - and of which generations were deprived by French rule - should always be a crucial part of an Algerian education. Also - as the ongoing struggle to get adequate higher educational textbooks published even in literary Arabic reminds us - a written Darja would take centuries at least to build up a literature comparable to major languages.

As long as I'm pondering educational policy, what should be done with foreign languages is obvious: end the domination of French. Nothing wrong with French per se, but an all-French policy is a handicap in a global context, isolating Algeria in the ghetto of Francophony at a time when English is a prerequisite to serious scientific work even in Paris, and an embarrassment at home, where it remains a scandal in conservative eyes. From 3rd grade on, have a choice between French and English (and maybe even Spanish) as the second language, and raise a generation of educated North Africans that do not all share a single foreign language; only thus can the domination of French in North Africa, with all its attendant sociological divisions and economic problems, be ended. Of course, in an educational system that has a serious shortage of good teachers as it is, this is a distant dream... but dreaming can be useful.

Saturday, June 10, 2006

"-gate" suffix reaches Arabic

Algerian football fans (that is to say, probably most of the population) are up in arms about not being able to watch the World Cup unless they subscribe to ART - a Saudi company which bought up the rights to World Cup footage for the MENA region and is selling it so expensively most terrestrial stations (including Algeria's) can't afford it. I don't particularly care myself, to be honest, but I was impressed to see the following headline in the newspaper Ech Chourouk:

الجزائر على أبواب فضيحة "آرتي-غايت"!


al-Jazaa'ir `alaa 'abwaab faḍiiħat "aartii-gaayt"!
(Algeria is on the verge of an ART-gate scandal!)

The development of "-gate" from a random morpheme at the end of a hotel name into a suffix indicating a political mess (Monicagate, Fostergate, etc.) is remarkable enough; that it should be borrowed into Arabic, even in the weird world of headline idiom, is incredible to me. I guess bound morphemes aren't necessarily as hard to borrow as one might think.

Tuesday, June 06, 2006

Nandi relatives and Arabic center-embedding

Two random interesting bits thrown up by my current research:

Nandi, a Nilotic language of Kenya with VSO order, would appear to allow you to relativize virtually any constituent of a sentence. I was particularly impressed by examples like:

nikò ce:pyó:sé:t ne â:-nken ci:tà ne kí:-ká:ci kitâ:pú:t
this woman Rel 1s-know person Rel Past-give book
"This is the woman that I know the person that gave [her] a book / that [she] gave a book to."

á-ké:r-é ci:tà ne pè:nt-í: àk la:kwe:-nyi: kâpsá:pit
1sg-see-impfv person Rel 3pl-go and child-his Kapsabet
"I see the person who [he] and his son are going to Kapsabet."

Take that, Subjacency Constraints! (Well, more seriously, I'm guessing ce/ne is probably not a fronted relative pronoun, especially since it agrees in case with the head noun and not with the position of the gap, so maybe no movement is involved - but that just raises other issues, like what does the gap consist of? Surely not pro? And what is ce/ne - a complementizer?)

And, in case you've ever wondered what an incomprehensibly double center-embedded Arabic sentence would look like, here's one:

رأى الولد الذي كتب الرجل الذي عينه الرئيس الرسالة إليه أخاه
ra'aa lwaladu lladhii kataba arrajulu lladhii `ayyanahu rra'iisu rrisaalata 'ilayhi 'axaahu
saw [the-boy [that wrote [the-man [that chose-him the president]] the letter to him]] brother-his.
“The boy the man the president chose wrote to saw his brother.”

Note that Arabic's VSO order renders it less vulnerable to subject- and object-relativization in this regard, but leaves it helpless against relativization of other positions - which is nonetheless permissible.

(Nandi examples from Creider & Creider 1989.)

Saturday, June 03, 2006

A little Algerian Arabic folk poetry

I recently came across a nice book (in English for once!) on the Algerian folk poet Muhammad ben Tayeb el-Alili, The Graying of the Raven. It's titled after this stanza, from a poem about a drought:
məššərq ləlməɣrib
fiha lɣ°ṛab yšib
a `aləm əlɣib
wətħənn bəttisir


من الشرق للمغريب
فيها الغراب يشيب
ها عالم الغيب
وتحن بالتيسير

From the east to the west
The raven turns white
O Knower of the Unseen
Grant us respite

(I've substituted my slightly more literal translation.)

His works are not particularly famous, and, while worth a look, are not in the top rank of the genre - but I'll bet they're the only ones available in English. For a perhaps better example, consider Dahmane El Harrachi's famous song - I was going to try and translate the whole thing, but frankly it's not easy, so I'll just give a sampler:
šħal šəft əlbəldan əl`amrin wəlbərr əlxali
šħal ð̣iyyə`t əwqat wəšħal tzid mazal ətxəlli


اشحال شفت البلدان العامرين والبر الخالي
اشحال ضيعت اوقات واشحال تزيد مازال تخلي

How many crowded cities and empty wilds you've seen,
How much time you've wasted - and how much more will you waste?

Incidentally - yes, the pessimism of both examples is characteristic.

Wednesday, May 31, 2006

Thursday, May 25, 2006

Algerians sure can code-switch

Algerians are rightly renowned for their code-switching wherever they go (or should be). I disapprove of it in general - it often reflects the unjustly low esteem Algerians tend to have for their mother tongue, and encourages the abandonment of less commonly used Algerian Arabic (Darja) terms in favor of unnecessary French loanwords. But you can't help but love an example like this one that I just heard here in London today:

gal-li y-ḥəbb to move
say+PF+3MSg-DAT+1Sg 3sg+IMPF-want "to move"
He told me he wants to move.

What's so weird about that? The thing is, while standard English want requires a non-finite complement, Algerian Arabic ḥəbb "want, like" takes a finite complement. In fact, there are no infinitives in Algerian Arabic - only finite verbs and verbal nouns. So it looks as if the non-finiteness (presumably generated in T) of the complement in the English half is being selected, not by the Arabic verb which precedes it, but by the English translation equivalent of it. I still can't quite believe I heard this sentence.

If you found that fun, you may wish to ruminate over another sentence (Arabic/French switching) from the same conversation:

`ənd-i un problème ta` wəqt
at-me "a problem" of time
"I've got a problem of time."

and, in particular, on what syntactic tree it suggests, and whether this really fits the idea of a DP. Note also that, while Algerian Arabic does have a sort of indefinite article (waḥəd əl-), its distribution is quite different from the French one, and I don't think it would occur in the corresponding code-switching-less sentence.

Monday, May 22, 2006

Center-embedding and Japanese

Lately I've been reading some of John Hawkins' A Performance Theory of Order and Constituency, which puts forward some very appealing ideas about how to predict the relative frequency of different word orders (both cross-linguistically and within a language) by quantifying how easy they are for humans to parse. (For example, he derives such phenomena as Heavy-NP shift, the relativization hierarchy, and even the relative frequency of the six possible basic word orders: SVO, SOV, VSO, etc.) Parsing issues certainly severely affect the grammaticality of sentences, as people who follow titles posts Language Log authors write have know.

I tried out a similar example in Japanese on a friend - going by the grammar books, one would expect "John said Mary thinks Bill came" to be translated as "Jon-wa Merii-ga Biru-ga kita to omou to itta", with three successive subjects followed by three successive verbs. She unhesitatingly went for, as I recall, "Biru-ga kita to Merii-ga omou to Jon-ga itta" - moving the subjects to the "wrong" places to make the sentence processable - and said that the three-successive-subject one was "difficult". I can't think of any Arabic parallels offhand - postverbal objects and resumptive pronouns in relative clauses together stop most of the obvious possibilities - and Sylheti turns out to rather cleverly block almost (not quite) all possible ways in which problematic center-embedding might emerge. So my question to you is: in your language, can you think of similar examples of incomprehensible yet nominally grammatical sentences?
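To make the contrast between the two Japanese orders concrete, here is a rough sketch that counts how many subjects are still waiting for their verb at the worst point in the sentence. This is emphatically not Hawkins' own metric, just a crude proxy of my own; the function name and the S/V tagging are assumptions made for illustration.

```python
def max_pending_subjects(roles):
    """roles: 'S' (subject) and 'V' (verb) tags in surface order.

    Returns the largest number of subjects that have been heard
    but not yet matched with a verb at any point in the sentence.
    """
    pending = 0
    worst = 0
    for role in roles:
        if role == "S":
            pending += 1
            worst = max(worst, pending)
        elif role == "V":
            pending -= 1
    return worst


# Jon-wa Merii-ga Biru-ga kita to omou to itta
textbook = ["S", "S", "S", "V", "V", "V"]
# Biru-ga kita to Merii-ga omou to Jon-ga itta
preferred = ["S", "V", "S", "V", "S", "V"]

print(max_pending_subjects(textbook))   # 3
print(max_pending_subjects(preferred))  # 1
```

On this toy measure the grammar-book order makes the listener hold three subjects at once, while the order my friend preferred never needs more than one.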

Friday, May 19, 2006

National/common/unifying language for the US?

As you may have heard on Language Log, on May 17th-18th, the US Senate approved not one but two amendments - one Republican, one Democrat - on the status of English. The first amendment, by Sen. Inhofe (R-Oklahoma), amends sections 161-2 of Title 4 of the United States Code to state:

English is the national language of the United States. The Government of the United States shall preserve and enhance the role of English as the national language of the United States of America. Unless specifically stated in applicable law, no person has a right, entitlement, or claim to have the Government of the United States or any of its officials or representatives act, communicate, perform or provide services, or provide materials in any language other than English. If exceptions are made, that does not create a legal entitlement to additional services in that language or any language other than English. If any forms are issued by the Federal Government in a language other than English (or such forms are completed in a language other than English), the English language version of the form is the sole authority for all legal purposes.

The second, by Senator Salazar (D-Colorado), makes the same section rather more reasonably, if vacuously, say:

English is the common and unifying language of the United States that helps provide unity for the people of the United States. The Government of the United States shall preserve and enhance the role of English as the common and unifying language of America. Nothing herein shall diminish or expand any existing rights under the law of the United States relative to services or materials provided by the government of the United States in any language other than English.

The bill is still under debate, so it remains to be seen what, if any, of this will be left - but, after 230 years of doing just fine without one, the USA may or may not soon have a national language. Either way, it's an interesting debate to follow. I remember in San Francisco just about any governmental document seemed to be printed in English, Chinese, and Spanish; that approach - choosing the language according to what people actually spoke on a local level, rather than a national one - strikes me as eminently sensible. What I can't seem to figure out is what the plan is now that both have passed - do they stick both texts in the section, or do they just hash it out later?

Wednesday, May 17, 2006

Maya and Amnesty International

I went to the AI site just now, without a linguistic thought in my head, and what do I find?

Watemaal: Li risinkileb’ laj nat’ol na’ajej moko a’an ta li xb’ehil re xtuqub’alkil ru li ch’a’ajkilal chi rix li ch’och’

I applaud this, although I should point out that putting an international press release in Mayan (dunno which Mayan language - Chol?) is somewhat self-defeating...

Incidentally, if you haven't already seen it, check out the site Language Hat just found. I particularly liked the San Zi Jing.

Sunday, May 14, 2006

Shawi blog

Shawiyya (Chaouia) is a Zenati Berber language of eastern Algeria, spoken inland on the Sahara-facing side of the Atlas Mountains. While spread over a far larger area than Algeria's other main Berber language, Kabyle, it has only about half the population (1.4 million or so). Unlike the Kabyles, the Shawis, as their Arabic name suggests, were traditionally seminomadic (transhumant, to be exact); after independence, many seized the opportunity to settle down in the cities, and, from what I hear, this major change of lifestyle led to widespread language shift to Arabic. Shawiyya, like other Zenati dialects of northern Algeria (Chenoua, Bissa, etc.), but unlike Kabyle or the Berber varieties of the Sahara, has the interesting sound change t > h initially in many contexts. Anyway, I found a Shawi-language-focused blog the other day, to my immense surprise, which I figured was worth linking:

Awal nu Shawi

It seems to mainly post lyrics, sometimes with translations.

Friday, May 12, 2006

A new primate and a nice talk

I went to a nice inaugural talk by Prof. Jaggar here at SOAS yesterday about African linguistic diversity, Afroasiatic, and SOAS linguists' resistance, then ultimate capitulation, to Greenberg's groundbreaking African classification - the talk was rendered especially notable by his getting up along with his choir to sing Nkosi sikelele iAfrika afterwards! However, I haven't really got time to summarize it (for the classification part, you could check out my previous post Beja and beyond), so instead I'll post a link to the discovery of a new species of primate: the kipunji, a close relative of the baboon which lives in trees instead of on plains. No news yet on its communication system :)

Wednesday, May 10, 2006

Sylheti word order

I've been working on Sylheti - a highly divergent dialect of Bengali / language very closely related to Bengali spoken around Sylhet in northeastern Bangladesh - for my field methods class for a while. The particular point I'm focusing on at the moment is the positioning of complement clauses, which obeys a simple rule: if the complement clause has a separate subject, it follows the verb; otherwise, it precedes the verb. The language is otherwise SOV, I should note, so you get contrasts like:


ami exṭa apol sai.
I an apple want-1.
“I want an apple.”

ami exṭa apol xaitam sai.
I [an apple eat-COND-1] want-1.
“I want to eat an apple.”

ami sai he exṭa apol xaok.
I want-1 [he an apple eat-3-OPT].
“I want him to eat an apple.”


This doesn't fit my Japanese-based expectations of "proper" SOV languages (in Japanese, the subordinate clause would always precede the verb) but it turns out that German has basically the same word order (if you factor out the main-clause V2 order by having an initial complementizer). There are some obvious processing motivations for such an order, but it doesn't really fit the head-position parameter idea so well. I was wondering: has anyone seen similar patterns in other SOV languages?
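The Sylheti placement rule stated above is simple enough to write down as a function. This is a minimal sketch of the descriptive generalization only, not an analysis; the function name and its arguments are my own invention, and the word lists are taken from the examples in this post.

```python
def order_clause(subject, verb, complement, complement_has_own_subject):
    """Linearize a main clause with a complement clause.

    The complement follows the verb if it has a separate subject;
    otherwise it precedes the verb (the usual SOV position).
    """
    if complement_has_own_subject:
        return [subject, verb] + complement
    return [subject] + complement + [verb]


# "I want to eat an apple."
print(" ".join(order_clause("ami", "sai", ["exṭa", "apol", "xaitam"], False)))
# "I want him to eat an apple."
print(" ".join(order_clause("ami", "sai", ["he", "exṭa", "apol", "xaok"], True)))
```

Both calls reproduce the word order of the corresponding example sentences.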

Saturday, May 06, 2006

West African grammars in Arabic script

I want to see this talk by Nikolai Dobonravine (though I'm not likely to be in Dublin for it):

Arabic and Arabic-script writing tradition in West Africa dates back to the 12th century AD, if not earlier. Local scholars were familiar with the linguistic ideas which formed part of Islamic education. Arabic grammars and dictionaries were popular in the region. The interest in the study of Arabic resulted in the development of local Arabic and bilingual vocabularies, sometimes written in verse, as well as some works on Arabic grammar. A few versified vocabularies and grammars of West African languages were also composed. Almost all of them were written in Arabic and used Arabic linguistic terminology.

In the late 19th and early 20th centuries several works were written in West African languages using Arabic script. One such work, "Littafen nahwowin Hausance" ("The book of Hausa grammar"), is analysed in the paper. The work demonstrates a special approach to the parts of speech in Hausa (the verb deprived of the "person-aspect complex" is seen as a noun, although it may be used independently in the Imperative). This is a larger work of traditional lexicography, with notes on folk etymology, pragmatic rules, grammatical gender and possessive pronouns in Hausa.

The shift from Arabic to Roman script and the decline in the use of Arabic did not lead to the disappearance of the earlier linguistic tradition. New grammatical works and vocabularies in Arabic script (including a Fula-French vocabulary in Arabic script) were published. All these writings have been largely ignored by the linguists working at the universities in West Africa and abroad.

Whorf meets warmongering

Pop Whorfianism (usually in forms that Whorf would have been the first to laugh at) is something I usually associate with a slightly hippy-ish multiculturalism. However, it seems to have a certain appeal to Islamophobes as well.

The thesis they find so appealing is summarized in one James Coffman's question: "Does the Arabic Language Encourage Radical Islam?". Apparently, he did a survey in 1988 in Algiers which confirmed a number of fairly obvious facts - notably, that the younger students that year, who were the first cohort of students whose secondary education had been mainly in Arabic, were more "Islamist" than their predecessors who had gone through a partly or wholly Francophone educational system. From this, he concluded that the Arabic language encouraged "radical Islam" - not, for example, that Arabic-literate students had much easier access to "Islamist" literature (and Islamic literature in general), or that the transition to Arabic had been accompanied by a vast expansion of the school system to cover more conservative rural areas, or that many of the imported Arabic teachers who helped tide Algeria over the transition period were Muslim Brotherhood members fleeing crackdowns in Egypt, or indeed (most importantly) that the collapse of the Algerian economy in the late 1980's was encouraging the growth of anti-government ideologies. It's an old, old saw, but one that apparently still bears repeating: correlation does not equal causation.

Mind you, like most people who cite the Sapir-Whorf hypothesis, he doesn't seem to have a very clear idea of its content. On my reading of Whorf, his core idea is (plausibly enough) that a language might make its speakers more conscious of some grammaticalized categories by forcing its speakers to mark them, or less conscious of them by not providing any simple way to describe them; it would thus render some ideas more intuitive than others. For this sort of deep influence to be plausible, the speaker has to do most of his/her thinking in the language in question. But both classical Arabic and French in Algeria are only ever used by most speakers in writing, or in highly formal contexts - scarcely the sort of situation Whorf had in mind...

(PS: It seems Language Log have also just done another post on "No word for X" fallacies. Another example of ham-handed anti-Arab efforts at Whorfian analysis is alluded to on Linguistic Life.)

Friday, May 05, 2006

How to find linguistic universals

I couldn't resist posting this quote:

[In this book] I examine the general conditions under which verbal complements are licensed, and provide a possible explanation for their limited distribution. The primary reference language is English, though the proposed licensing conditions for verbal complements are assumed to hold universally.

Fortunately, the author adds:

That the main proposals of this study and the analyses do indeed carry over to other languages is shown in Chapter 5, which takes a cross-linguistic perspective.

The title of Chapter 5? "Direct Perception Complements in Other European Languages". The languages considered are German, Dutch, Italian, French, Spanish, and Portuguese, representing a grand total of two neighboring subfamilies of Indo-European.

I don't mean to poke fun at this book specifically - it looks like a very thorough analysis of clausal complements of perception verbs in English - but this so neatly encapsulates what in practice is one of the main problems of the generative program: over-reliance on English in particular and what Whorf used to call "Standard Average European" in general.

Tuesday, May 02, 2006

Reduplication in Siouan

I've finished, handed in, and now uploaded that essay I was working on, on reduplication in Siouan. The main conclusions were that:

* Proto-Siouan-Catawban (and Proto-Siouan-Yuchi, but not Proto-Macro-Siouan) productively formed pluractionals from verb stems by full stem reduplication. Every branch of the family exhibits reflexes of this process, although these have often been affected by semantic extensions and morphological contractions.
* Stoney "adversative" reduplication is most probably borrowed from a Salish language.