Wednesday, August 30, 2006

Naguib Mahfouz dies

Nobel Prize-winning Egyptian novelist Naguib Mahfouz has died at the age of 94. Literature will be the poorer without him. His best-known work is the Cairo Trilogy, recounting three generations of life in a middle-class Cairo family as the social and political changes of the early twentieth century swirl around them. The final volume's humanist Marxist message seems outdated now - and, indeed, Mahfouz would move towards mysticism in later works, ironically attracting much more hostility. However, the Cairo Trilogy tells a more timeless story as well, portraying the slow development of the characters' very different personalities as they all move away from the cheerful but self-serving hypocrisy of the first generation, taking risks and making sacrifices for national independence or personal fulfillment, for Marxism or the Muslim Brotherhood, for idealism or stupid desires, that would have been unthinkable to their (grand)father Sayyid Ahmad Abd-el-Jawad, secure in his status and unworried by the contradiction between the strict religiosity he imposed on his house and the relaxed hedonism he indulged in outside of it.

And what has this to do with linguistics, you may well ask? Well, Edward Said's efforts to persuade a New York publishing house to risk translating the Cairo Trilogy back before Mahfouz won the Nobel Prize prompted the memorably stupid response "that Arabic was a controversial language".

What? Still not linguisticky enough? Then I'll throw in the etymology of his name, نجيب محفوظ. najiib (nagiib in Egyptian dialect) literally means "noble, learned", from the root نجب njb "be noble, be excellent". maHfuuZ is a passive participle meaning "protected", from the root حفظ HfZ "guard, protect, keep, memorise", from which the word HaafiZ "a person who has memorised the Qur'an" derives.

Now go and read some Arabic literature :)

Sunday, August 27, 2006

Myths about Darja (Algerian Arabic): 1 "Darja has no rules."

In light of the interest attracted by the previous post, and of several discussions I've had about this topic in real life lately, I'll be posting regularly (?) on a few of the several myths widely believed in Algeria about Algerian Arabic, and often elsewhere about other Arabic dialects.

1. "Darja has no rules."

Every language has rules. You can see some of these rules in action by examining the effects of changing word order: for example خالد شاف روحو (khaled shaf RuHu - Khaled saw himself) is perfectly fine, but خالد روحو شاف (khaled RuHu shaf - Khaled himself saw) is totally bizarre. This is not because of some inevitable law of human thought: in Japanese, "Khaled himself saw" would be the correct order. Likewise, ما شفْتْشْ الطّونوبيل (ma sheftsh eTTunubil - I didn't see the car) is fine; شفتش ما الطونوبيل (sheftsh ma TTunubil) or ما شفت الطونوبيلش (ma sheft eTTunubilsh) are ridiculous. أنا راني نكتب (ana Rani nekteb - I'm writing) is fine; حنا راني نكتب (Hna Rani nekteb) or حنا رانا نكتب (Hna Rana nekteb) are absurd. If you speak Darja, you'll be able to see this instantly, even though no one ever taught you that one was right and the other wrong, and even though all of them would be wrong in FuSHa. The difference between Darja and FuSHa is not that FuSHa has rules and Darja breaks them; rather, Darja has different rules. The rules of FuSHa are usually learned with great effort from teachers, who learned them from grammar books written hundreds of years ago by people like Sibawayh, who themselves had to go and spend hours in the desert with the few Bedouins who still spoke "proper" FuSHa. The rules of Darja, by contrast, are usually learned unconsciously from your own parents and relatives and followed effortlessly from the moment you're old enough to talk - as those of FuSHa were back in the 7th century, when some people still spoke it as a mother tongue.
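For the programmers in the audience, the negation pattern in the example above can be written down as an explicit rule. This is only a toy sketch in Python: the function names and the romanised spelling are my own illustration, and it covers nothing beyond the simple case shown here (ma before the verb, -sh attached to the verb itself rather than to the object).

```python
def negate(verb, obj=""):
    """Wrap a romanised Darja verb in the ma ... -sh negation frame."""
    negated = f"ma {verb}sh"
    return f"{negated} {obj}".strip()


def is_well_formed(sentence, verb):
    """Toy check: 'ma' must come first and -sh must sit on the verb."""
    words = sentence.split()
    return len(words) >= 2 and words[0] == "ma" and words[1] == verb + "sh"


print(negate("sheft", "eTTunubil"))
for s in ["ma sheftsh eTTunubil", "sheftsh ma TTunubil", "ma sheft eTTunubilsh"]:
    print(s, "->", is_well_formed(s, "sheft"))
```

Only the first of the three test sentences passes the check, which matches the judgement any Darja speaker would give without ever having been taught the rule.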

Monday, August 21, 2006

Al Jazeera and Reuters discover Algerian Arabic

Reuters and Al Jazeera English are both carrying a story about Algerian Arabic, apparently written by Algeria's El Khabar journalist Lamine Chikhi. I suppose I should be glad to see anything at all about this in the media, but unfortunately it confirms the first law of linguistics in the media: linguistics reporting is always shaky on the linguistics.
"Unlike neighbours in Morocco and Tunisia, Algerians speak a dense patois, a mixture of Arabic, Berber, French and sometimes Turkish, that most Arabs cannot fathom."

First of all, Algerian Arabic is still overwhelmingly Arabic; but reporters rarely seem to grasp the difference between true mixed languages like Michif and languages with extensive loanwords like English or Algerian Arabic. More importantly, what do you mean, "unlike"? Moroccan Arabic has more Berber than Algerian, and Tunisian more Turkish; how much French is in any of those three very much depends on class, cultural/political orientation, and region.

Let's try this: A car hit Mohamed, who was taken to hospital. In Algerian patois: Mohamed darbattou tonobile, dattou direct el sbitar. In this example, the verb is in Algerian dialect, the word car is in a kind of French, sbitar is Turkish, and the intonation is taken from the Berber Kabyle language.


sbitar is quite obviously Romance (Lingua Franca?) in origin - it might have come in via Turkish, but I'd like to see evidence for that. dda "took" (I assume that's the verb he had in mind) is not just Algerian but pan-Maghreb (certainly Moroccan, anyway), and has classical Arabic roots (أَدَّى) although its meaning has shifted significantly. Claiming that "the intonation is taken from the Berber Kabyle language" is a total cop-out; some elements of Algerian Arabic intonation may well derive from Berber, but there are noticeable differences as well, with Kabyle intonation tending to have a higher pitch range (from what I recall of Chaker's analysis, anyway.)

But more to the point, when will reporters (and indeed politicians) figure out the basic issue here? Language change is normal, and not unique to Algeria; borrowing foreign words is normal, and not unique to Algeria; having a substantial difference between the literary and spoken languages is common to the whole Arab world, and not unique to Algeria; a Syrian would have as much trouble understanding Moroccans or Tunisians as they would Algerians; and having been occupied by "Phoenicians, Romans, Byzantines, Arabs, Turks and French" is common to half the Mediterranean! A real story would focus on what is unique to Algerian Arabic, or at least Maghreb Arabic, and provide an account of how it got that way that wasn't limited to an indiscriminate recital of the country's history; it would at least mention the noteworthy pre-Hilali/Hilali dialect distinction, the elements shared with Andalusi Arabic, the first person singular n- shibboleth, the retention of classical words lost in the east (such as Haanuut "shop"), the Lingua Franca influence, the two or three Roman loanwords, the widely differing degree of Berber influence... I mean, why not consult an academic text first?

Sunday, August 20, 2006

Hail native Language - clothe my thoughts

I recently came across a forgotten poem by Milton addressing his mother tongue (as you do!), written to open the English section of a day of speeches at College after the Latin one was completed. English then, of course, was far from being the global lingua franca it is now: it hadn't even had a significant literary output for all that long (Shakespeare had only died in Milton's childhood), and the nascent "Anglosphere" was a few scattered coastal settlements here and there. A poem in this vein now would surely be far more boastful, and contain repeated allusions to, come to think of it, Shakespeare and Milton; but the absence of such allusions here lends it a certain universality that a modern version would lack.

Hail native Language, that by sinews weak
Didst move my first endeavouring tongue to speak,
And mad'st imperfect words with childish tripps,
Half unpronounc't, slide through my infant-lipps,
Driving dum silence from the portal dore,
Where he had mutely sate two years before:
...
I have some naked thoughts that rove about
And loudly knock to have their passage out;
And wearie of their place do only stay
Till thou hast deck't them in thy best aray;
That so they may without suspect or fears
Fly swiftly to this fair Assembly's ears...


The metaphor of language as a clothing for thought contrasts interestingly with the well-known "conduit metaphor" (IDEAS ARE OBJECTS, LANGUAGE IS A CONTAINER), even though clothes technically do contain their wearer. A container and its archetypal contents are equally non-sentient, and the container's primary purpose is to allow the transport and storage of its contents; clothes, on the other hand, archetypally adorn and protect a sentient being, who is likely to choose clothes that somehow reflect how they wish to be perceived. On the conduit metaphor, the bare idea is mere substance; on the clothing metaphor, the bare idea is a personality in its own right, a sort of homunculus getting ready to go out and meet the world. On the conduit metaphor, an idea is successfully transmitted if what it provokes in the listener accords with the author's intent; on the clothing metaphor, one can envisage the idea as having a life of its own, perhaps misunderstood by the author as well as the hearer. (And what are the thoughts to the thinker/author in this metaphor - his/her children, or servants, or perhaps even constituents?)

Friday, August 18, 2006

Quechua hits The Economist

The Economist reports on a Peruvian Congresswoman trying to raise the social status of Quechua by only speaking Quechua to Congress, forcing them to hire translators.

There are a couple of good Quechua sites out there: Runa Simi, for example, or Quechua.org.uk.

Saturday, August 12, 2006

Ayin-less in Gaza

Gaza, Arabic غزّة ghazzah, is another extremely old city of the eastern Mediterranean, having been in existence for at least three millennia. After a period of Egyptian rule, it became a member of the Philistine Pentapolis. Its name has been recorded in several forms over the years, including:

  • Hieroglyphic: q3d3ti, g3d3y, g3d3tw (says Wallis Budge);
  • Akkadian (Tell el-Amarna): Az-za-ti;
  • Akkadian (Assyrian): Kha-az-zu-tu;
  • Biblical Hebrew: `azzah;
  • Greek (Herodotus): Cadytis (probably Gaza, but some dispute);
  • Greek (Septuagint): Gaza (Γάζα);
  • Latin (Pliny): Gaza.

Some sources derive the town's Hebrew name, `Azzah, from the root `zz "be strong". However, this is a folk etymology. The two proto-Semitic consonants *` (pharyngeal voiced fricative/approximant) and *gh (uvular voiced fricative) merged to ` in Biblical Hebrew as we know it; but `zz "be strong" had ` in proto-Semitic (compare Arabic عزّ `azza), whereas Gaza clearly had gh (note that Akkadian had no gh, so a null/kh alternation in transcribing it is expected.) As a matter of fact, the Septuagint provides evidence that some dialects of Hebrew retained the `/gh distinction well into the classical period; some instances of written `ayin are left untranscribed in the Septuagint's Greek (eg Yehoshua` = Ἰησοῦς; Bet-`Araba = Βαιθαραβα), while others are transcribed as gamma (`Amora = Γομορρα; `Azza = Γάζα), clearly suggesting that the pronunciation was still distinguished.
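The Septuagint argument above can be laid out as a small lookup. This is just a sketch in Python using the four name pairs already cited; the function name and the gamma test are my own shorthand, and the heuristic only works because none of these four Hebrew names contains a g of its own.

```python
# (Hebrew name, Septuagint spelling) pairs, all taken from the paragraph above.
PAIRS = [
    ("Yehoshua`", "Ἰησοῦς"),
    ("Bet-`Araba", "Βαιθαραβα"),
    ("`Amora", "Γομορρα"),
    ("`Azza", "Γάζα"),
]


def ayin_source(greek):
    """Guess the older consonant behind a written `ayin.

    Assumption: a gamma anywhere in the Greek spelling reflects *gh,
    and no gamma means plain *`. Valid for these four names only.
    """
    return "*gh" if "γ" in greek.lower() else "*`"


for hebrew, greek in PAIRS:
    print(f"{hebrew:12} {greek:12} {ayin_source(greek)}")
```

The two names left without a gamma fall on the *` side and the two written with gamma fall on the *gh side, which is the split the folk etymology from `zz cannot explain.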

The interesting thing is that Arabic has preserved the gh in Gaza, which would be impossible if it had taken the word from 7th-century Aramaic, which has no gh either (Hebrew was almost surely extinct as a spoken language by the time Islam arrived.) Could it have been borrowed from Greek? Maybe; but, given that Herodotus notes that "Arabians" dominated the coast between Gaza and al-Arish even in his time, another obvious possibility is simply that the word Gaza entered Arabic from one Canaanite language or another well before the loss of the `/gh distinction, and didn't change.

As an interesting coda, the name Gaza may apparently be the source of the English word gauze.

Monday, August 07, 2006

Sumerian grammatical texts

Sumerian Grammatical Texts available online! The title is a misnomer - most of the texts given are early Sumerian-Akkadian lexica arranged by topic, or just plain Sumerian texts - but there are other interesting things, such as a phonetically organised syllabary (vowel order: u-a-i), and a series called "ana ittišu" (p. 30) with some rather paradigm-looking stuff, such as:

Sumerian          Akkadian            English
ùr                sûnu                lap, bosom
ùr-bi             sûn-šu              his bosom
ùr-bi-šú          ana sûni-šu         upon his bosom
ùr-bi-šú in-gar   ana sûni-šu iškun   he placed upon his bosom

which I guess offers a clue about the teaching methods used. These tablets were used to teach young Akkadian-speaking would-be scribes Sumerian, long after Sumerian itself had become extinct.

Friday, August 04, 2006

Kurdish giving way to Turkish in some areas?

Found a telling first-hand account of language shift. I had no idea the last decade or two had made such a difference.

until the end of 1980s the kurdish language was still preserved, because the kurds were still in their villages [...] most of them would not know one single word turkish and the women, in specific, did not know one single word turkish! [...] but at the beginning of 1990s, and since then going on, we have been losing the kurdish language [...] and it is mainly because around four or some say five thousand kurdish villages were forcibly evacuated, i should use "they were destroyed by the turkish army" instead. and more than three million people(kurds) were displaced! and of course it had its consequences! [...]

all the kurds started to go to school, where they would only speak turkish, and if, in any way someone were to speak kurdish s/he would punished for speaking kurdish and this way it would have a deterring effect on the other children(students) as well! kurdish students were despised and made fun of because of their accent so the families of those kurdish students thought that if they spoke only turkish at home it would help their children and they would be able to speak turkish better, and nobody would be able to fun of them. [...]

they only watched the turkish tv channels! and especially the mothers were very badly affected by this, because they wree the ones who would stay at home and when they did not have anything to do they would watch the tv and improve their turkish, but after a while they started to use turkish words while speaking kurdish, keep in mind that their children were not taught kurdish, so even if some of those children wanted to learn kurdish they would learn it wrong because their parents would not speak appropriate kurdish! i still cant believe that some kurds would say "qapi qepamiş bike" for "close the door" in kurdish: i have a very hard time understanding this, qepi originally is kapı(it is pronounced liek qepi in kurdish) "qepamiş" means nothing, it is supposed to mean "close", they combine turkish root of "to close" and add a kurdish suffix to it and make it kurdish. when i see people using those words, and killing kurdish it really hurts me very badly!


The extreme borrowing is an interesting point - and probably a universal of low-status languages. I can sympathise - excessively Frenchified Arabic really grates on my ears...

Monday, July 31, 2006

Mountains of Lebanon - some etymologies

Lebanon's cities and villages, tragically now in the news, have some interesting etymologies. I always used to wonder about the different names: why is the same city called Tyre in English, but Ṣuur in Arabic? or Byblos in English, and Jubayl in Arabic? The reasons illustrate the sheer length of these towns' history, and the time depth of Greece's contact with them.

Sometime before the characteristic sound shifts of Proto-Canaanite happened - perhaps 1200 BC or so? - Tyre would have been known by a Semitic term meaning something like "peak" or "crag": θ'uur-u. This was borrowed by the early Greeks as tur-os > tyros (when u got fronted to y) > Latin Tyrus > English Tyre. Meanwhile, in Lebanon, that glottalised θ', perhaps unsurprisingly, was among the first sounds to disappear; in Canaanite (that is, Phoenician, Hebrew, and assorted minor languages of the area), it became s' (or ṣ - it's hard to be certain whether the Canaanite emphatics were glottalised or pharyngealised). Case endings also vanished. This gave the Phoenician name: S'uur (or Ṣuur). It was adopted without change into Aramaic, and thence Arabic, as the region's languages shifted over time. The regular cognate of Proto-Semitic θ'uur-u in Arabic would have been ظور đ̣uur; this root is unattested as far as I know. However, in Aramaic *θ' became ṭ, and from this source the root entered pre-Arabic as طور ṭuur "mountain", a rare but well-attested term used in the Quran, notably for Mount Sinai. (In Ugaritic, freakishly enough, *θ' became γ (gh), and γuur- "mountain" is a well-attested Ugaritic word. In Ugaritic, incidentally, Tyre was actually called ṣuur-; so either my etymology here is wrong, which is possible, or Ugaritic borrowed the name after it had already changed *θ' to γ.)
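The different outcomes of *θ' described above amount to a small correspondence table, which can be sketched in Python. The reflexes are the ones given in the paragraph; the function name is my own, and dropping the case ending in every language is a simplification for illustration rather than a claim about each language's history.

```python
# Reflexes of Proto-Semitic *θ' as given in the paragraph above.
REFLEX = {
    "Canaanite": "ṣ",
    "Aramaic": "ṭ",
    "Arabic": "đ̣",
    "Ugaritic": "γ",
}

PROTO = "θ'uur-u"  # reconstructed form; the hyphen marks the case ending


def descend(language):
    """Apply one language's reflex of *θ' to the bare stem."""
    stem = PROTO.split("-")[0]  # simplification: ignore the case ending
    return stem.replace("θ'", REFLEX[language], 1)


for language in REFLEX:
    print(f"{language:10} {descend(language)}")
```

Seen this way, the Arabic name of Tyre matches the Canaanite row and طور matches the Aramaic row, while the expected native Arabic outcome is the one that is unattested.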

Likewise, Byblos would have started out as gubl-u (attested in Ugaritic and Akkadian), which (judging by possible Arabic cognates) may have meant "mountain" as well. This went into early Greek as gwubl-os > byblos (Mycenaean gw > Greek b, u > y) > English Byblos. In Arabic, the g of course became j; and, for some reason (maybe it was a little town at the time?), it looks like a diminutive got added, turning it from *jubl to jubayl (colloquial žbeyl), which in Arabic just means "little mountain".

Let's hope the day when these towns appear in the news for their history or their beaches, not for the bombs being dropped on them, comes sooner than looks likely.

Thursday, July 20, 2006

Polysemy vs. homonymy: some Algerian Arabic examples

I'm recently back from Algeria (hence the blog gap), so I thought I'd post some more meditations on Algerian Arabic...

Q: Which of the following words from Algerian Arabic are cases of polysemy (different meanings with a shared conceptual core) and which of homonymy (different meanings coincidentally identical in phonetic shape)?


`ṛuṣa عروصة - bride; daughter-in-law
ħjəṛ حجر - stone; lap
bakuṛ باكور - early-ripening figs; young bonito fish


A: `ṛuṣa, from Classical Arabic `aruusah عروسة, is a case of polysemy; a new bride traditionally goes to live in her husband's family house together with her new parents-in-law, so the extension is natural.

ħjəṛ is a case of homonymy: "stone" comes from Classical ħajar حجر, and "lap" from Classical ħijr حجر. Though it would be amusing to try and find a common conceptual core, I can't see any plausible one.

bakuṛ is etymologically a case of polysemy: both derive from Classical baakuur باكور, "coming early, early; premature; precocious" (Wehr). But synchronically, given the two independent restrictions of its meaning - it isn't used to mean first fruits in general, or young fish in general - I can only take it to be a case of homonymy.

Saturday, June 24, 2006

Ohlone

I lived in the Bay Area for a while, so naturally I tried to find out about its pre-colonial language group, Ohlone. This turned out to have been a set of fairly closely related dialects/languages stretching from San Francisco down beyond Monterey, plus the coast of the East Bay. Their only reasonably close relative is Miwok, another small language family spoken to their north and east, although wider relations with languages further north along the Pacific coast are likely. Among the more noteworthy features of Ohlone are regular metathesis processes - for example, the plural suffix can be either -mak or -kma, depending on whether it's preceded by a consonant or a vowel.
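The metathesis rule is simple enough to sketch in Python. Two loud caveats: the paragraph above does not say which shape follows which, so the default here (-mak after a consonant, -kma after a vowel, which would avoid three consonants in a row) is my assumption and can be swapped via the parameters; and the stems in the demo are invented placeholders, not real Ohlone words.

```python
VOWELS = set("aeiou")


def plural_suffix(stem, after_consonant="-mak", after_vowel="-kma"):
    """Pick the plural allomorph from the stem's final sound.

    The default pairing is an assumption, not something stated in the post.
    """
    return after_vowel if stem[-1].lower() in VOWELS else after_consonant


# Placeholder stems for illustration only (not attested Ohlone words).
for stem in ["pat", "pata"]:
    print(stem + plural_suffix(stem))
```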

Dave Kaufman has just posted some interesting Ruminations on Rumsien, one of the southern dialects; or, if you speak Spanish, you can read a grammar of Mutsun, a southeastern dialect. Wikipedia has a map.

Wednesday, June 21, 2006

Tunisian Berber

Amazing things turn up at the University of Western Sydney: a complete thesis online offering An outline of the Shilha (Berber) vernacular of Douiret (Southern Tunisia). Check it out; the rather endangered Berber varieties of Tunisia are quite ill-documented.

Friday, June 16, 2006

North African language policy

MoorishGirl has an interesting post on an article on a round-table debate on Moroccan Arabic, or Darija, as "a medium of cultural expression". She comments:

I'm fully in favor of using Darija, because of the huge impact it would have on the creation of a reading culture. Imagine: All children's books right now are in Modern Standard Arabic, which is a language no one learns until first grade (i.e. age 6 or 7), by which time reading habits are already in place for many kids.


I think this is a crucial point. Developing a literature of sorts in Darja would allow kids to get into the habit of reading way earlier. A fair number of kids in the West are reading by the age of three; for an Algerian or Moroccan kid to even understand much of the language his/her books are written in at that age would be unheard of. With Darja literature for them to use, they could start reading before they ever started school; it might even lead to them acquiring literary Arabic faster. Moreover, an oral literary tradition already exists, best exemplified by the traditions of melhoun poetry and chaabi lyrics; the language used in these is recognizably a literary register, and all that would be needed would be to write it. My puristic instincts would also rejoice in a move with the potential to stem the tragic loss of inherited vocabulary, and overuse of French, now afflicting Darja. And after all, why should Arabic-speaking kids continue to be deprived of the chance to read in their native language now that Tamazight-speaking ones are finally getting that chance?

However, I would envision Darja as a supplement to literary Arabic, not a replacement. Arabic connects Algeria (and no doubt Morocco) not only to the Arab world but to its own past, not to mention allowing it to engage more fully with its religion. The language in which Amir Abdelkader and Ibn Khaldun wrote - and of which generations were deprived by French rule - should always be a crucial part of an Algerian education. Also - as the ongoing struggle to get adequate higher educational textbooks published even in literary Arabic reminds us - a written Darja would take centuries at least to build up a literature comparable to those of major languages.

As long as I'm pondering educational policy, what should be done with foreign languages is obvious: end the domination of French. Nothing wrong with French per se, but an all-French policy is a handicap in a global context, isolating Algeria in the ghetto of Francophony at a time when English is a prerequisite to serious scientific work even in Paris, and an embarrassment at home, where it remains a scandal in conservative eyes. From 3rd grade on, have a choice between French and English (and maybe even Spanish) as the second language, and raise a generation of educated North Africans that do not all share a single foreign language; only thus can the domination of French in North Africa, with all its attendant sociological divisions and economic problems, be ended. Of course, in an educational system that has a serious shortage of good teachers as it is, this is a distant dream... but dreaming can be useful.

Saturday, June 10, 2006

"-gate" suffix reaches Arabic

Algerian football fans (that is to say, probably most of the population) are up in arms about not being able to watch the World Cup unless they subscribe to ART - a Saudi company which bought up the rights to World Cup footage for the MENA region and is selling it so expensively most terrestrial stations (including Algeria's) can't afford it. I don't particularly care myself, to be honest, but I was impressed to see the following headline in the newspaper Ech Chourouk:

الجزائر على أبواب فضيحة "آرتي-غايت"!


al-Jazaa'ir `alaa 'abwaab faḍiiħat "aartii-gaayt"!
(Algeria is on the verge of an ART-gate scandal!)

The development of "-gate" from a random morpheme at the end of a hotel name into a suffix indicating a political mess (Monicagate, Fostergate, etc.) is remarkable enough; that it should be borrowed into Arabic, even in the weird world of headline idiom, is incredible to me. I guess bound morphemes aren't necessarily as hard to borrow as one might think.

Tuesday, June 06, 2006

Nandi relatives and Arabic center-embedding

Two random interesting bits thrown up by my current research:

Nandi, a Nilotic language of Kenya with VSO order, would appear to allow you to relativize virtually any constituent of a sentence. I was particularly impressed by examples like:

nikò ce:pyó:sé:t ne â:-nken ci:tà ne kí:-ká:ci kitâ:pú:t
this woman Rel 1s-know person Rel Past-give book
"This is the woman that I know the person that gave [her] a book / that [she] gave a book to."

á-ké:r-é ci:tà ne pè:nt-í: àk la:kwe:-nyi: kâpsá:pit
1sg-see-impfv person Rel 3pl-go and child-his Kapsabet
"I see the person who [he] and his son are going to Kapsabet."

Take that, Subjacency Constraints! (Well, more seriously, I'm guessing ce/ne is probably not a fronted relative pronoun, especially since it agrees in case with the head noun and not with the position of the gap, so maybe no movement is involved - but that just raises other issues, like what does the gap consist of? Surely not pro? And what is ce/ne - a complementizer?)

And, in case you've ever wondered what an incomprehensibly double center-embedded Arabic sentence would look like, here's one:

رأى الولد الذي كتب الرجل الذي عينه الرئيس الرسالة إليه أخاه
ra'aa lwaladu lladhii kataba arrajulu lladhii `ayyanahu rra'iisu rrisaalata 'ilayhi 'axaahu
saw [the-boy [that wrote [the-man [that chose-him the president]] the letter to him]] brother-his.
“The boy the man the president chose wrote to saw his brother.”

Note that Arabic's VSO order renders it less vulnerable to subject- and object-relativization in this regard, but leaves it helpless against relativization of other positions - which is nonetheless permissible.

(Nandi examples from Creider & Creider 1989.)

Saturday, June 03, 2006

A little Algerian Arabic folk poetry

I recently came across a nice book (in English for once!) on the Algerian folk poet Muhammad ben Tayeb el-Alili, The Graying of the Raven. It's titled after this stanza, from a poem about a drought:
məššərq ləlməɣrib
fiha lɣ°ṛab yšib
a `aləm əlɣib
wətħənn bəttisir


من الشرق للمغريب
فيها الغراب يشيب
ها عالم الغيب
وتحن بالتيسير

From the east to the west
The raven turns white
O Knower of the Unseen
Grant us respite

(I've substituted my slightly more literal translation.)

His works are not particularly famous, and, while worth a look, are not in the top rank of the genre - but I'll bet they're the only ones available in English. For a perhaps better example, consider Dahmane El Harrachi's famous song - I was going to try and translate the whole thing, but frankly it's not easy, so I'll just give a sampler:
šħal šəft əlbəldan əl`amrin wəlbərr əlxali
šħal ð̣iyyə`t əwqat wəšħal tzid mazal ətxəlli


اشحال شفت البلدان العامرين والبر الخالي
اشحال ضيعت اوقات واشحال تزيد مازال تخلي

How many crowded cities and empty wilds you've seen,
How much time you've wasted - and how much more will you waste?

Incidentally - yes, the pessimism of both examples is characteristic.

Wednesday, May 31, 2006

Thursday, May 25, 2006

Algerians sure can code-switch

Algerians are rightly renowned for their code-switching wherever they go (or should be). I disapprove of it in general - it often reflects the unjustly low esteem Algerians tend to have for their mother tongue, and encourages the abandonment of less commonly used Algerian Arabic (Darja) terms in favor of unnecessary French loanwords. But you can't help but love an example like this one that I just heard here in London today:

gal-li y-ḥəbb to move
say+PF+3MSg-DAT+1Sg 3sg+IMPF-want "to move"
He told me he wants to move.

What's so weird about that? The thing is, while standard English want requires a non-finite complement, Algerian Arabic ḥəbb "want, like" takes a finite complement. In fact, there are no infinitives in Algerian Arabic - only finite verbs and verbal nouns. So it looks as if the non-finiteness (presumably generated in T) of the complement in the English half is being selected, not by the Arabic verb which precedes it, but by the English translation equivalent of it. I still can't quite believe I heard this sentence.

If you found that fun, you may wish to ruminate over another sentence (Arabic/French switching) from the same conversation:

`ənd-i un problème ta` wəqt
at-me "a problem" of time
"I've got a problem of time."

and, in particular, on what syntactic tree it suggests, and whether this really fits the idea of a DP. Note also that, while Algerian Arabic does have a sort of indefinite article (waḥəd əl-), its distribution is quite different from the French one, and I don't think it would occur in the corresponding code-switching-less sentence.

Monday, May 22, 2006

Center-embedding and Japanese

Lately I've been reading some of John Hawkins' A Performance Theory of Order and Constituency, which puts forward some very appealing ideas about how to predict the relative frequency of different word orders (both cross-linguistically and within a language) by quantifying how easy they are for humans to parse. (For example, he derives such phenomena as Heavy-NP shift, the relativization hierarchy, and even the relative frequency of the six possible basic word orders SVO, SOV, VSO, etc.) Parsing issues certainly severely affect the grammaticality of sentences, as people who follow titles posts Language Log authors write have know.

I tried out a similar example in Japanese on a friend - going by the grammar books, one would expect "John said Mary thinks Bill came" to be translated as "Jon-wa Merii-ga Biru-ga kita to omou to itta", with three successive subjects followed by three successive verbs. She unhesitatingly went for, as I recall, "Biru-ga kita to Merii-ga omou to Jon-ga itta" - moving the subjects to the "wrong" places to make the sentence processable - and said that the three-successive-subject one was "difficult". I can't think of any Arabic parallels offhand - postverbal objects and resumptive pronouns in relative clauses together stop most of the obvious possibilities - and Sylheti turns out to rather cleverly block almost (not quite) all possible ways in which problematic center-embedding might emerge. So my question to you is: in your language, can you think of similar examples of incomprehensible yet nominally grammatical sentences?
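The difference between the two Japanese orders is purely mechanical, as a short Python sketch shows. The function names are my own, and the sketch does nothing cleverer than reproduce the two sentences quoted above from the same list of clauses.

```python
# (subject, verb) pairs from the outermost clause to the innermost.
CLAUSES = [("Jon", "itta"), ("Merii", "omou"), ("Biru", "kita")]


def center_embedded(clauses):
    """Grammar-book order: all subjects first, then verbs from the inside out."""
    subjects = [f"{s}-{'wa' if i == 0 else 'ga'}" for i, (s, _) in enumerate(clauses)]
    verbs = [v for _, v in reversed(clauses)]
    return " ".join(subjects) + " " + " to ".join(verbs)


def flattened(clauses):
    """The speaker's preferred order: each subject right next to its own verb."""
    return " to ".join(f"{s}-ga {v}" for s, v in reversed(clauses))


print(center_embedded(CLAUSES))
print(flattened(CLAUSES))
```

In the first order a listener has to hold three subjects in memory before any verb arrives; in the second, every subject is resolved by its verb before the next clause begins.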

Friday, May 19, 2006

National/common/unifying language for the US?

As you may have heard on Language Log, on May 17th-18th, the US Senate approved not one but two amendments - one Republican, one Democrat - on the status of English. The first amendment, by Sen. Inhofe (R-Oklahoma), amends sections 161-2 of Title 4 of the United States Code to state:

English is the national language of the United States. The Government of the United States shall preserve and enhance the role of English as the national language of the United States of America. Unless specifically stated in applicable law, no person has a right, entitlement, or claim to have the Government of the United States or any of its officials or representatives act, communicate, perform or provide services, or provide materials in any language other than English. If exceptions are made, that does not create a legal entitlement to additional services in that language or any language other than English. If any forms are issued by the Federal Government in a language other than English (or such forms are completed in a language other than English), the English language version of the form is the sole authority for all legal purposes.

The second, by Senator Salazar (D-Colorado), makes the same section rather more reasonably, if vacuously, say:

English is the common and unifying language of the United States that helps provide unity for the people of the United States. The Government of the United States shall preserve and enhance the role of English as the common and unifying language of America. Nothing herein shall diminish or expand any existing rights under the law of the United States relative to services or materials provided by the government of the United States in any language other than English.

The bill is still under debate, so it remains to be seen what, if any, of this will be left - but, after 230 years of doing just fine without one, the USA may or may not soon have a national language. Either way, it's an interesting debate to follow. I remember in San Francisco just about any governmental document seemed to be printed in English, Chinese, and Spanish; that approach - choosing the language according to what people actually spoke on a local level, rather than a national one - strikes me as eminently sensible. What I can't seem to figure out is what the plan is now that both have passed - do they stick both texts in the section, or do they just hash it out later?