Monday, November 07, 2005

Demonstratives in Semitic and beyond

Rishon Rishon just posted a table comparing the words in my previous post to Hebrew. Most of them are correct; mo`ed and g'vul are not cognate, and SaH and loa` I'm not sure about. However, one is particularly interesting: ha- = 'al- "the". You often find this seeming cognate cited in works on Semitic: after all, ha- induces gemination of a subsequent non-guttural consonant - suggesting a lost consonant in the prefix assimilating to the subsequent letter - and Hebrew h- occasionally seems to soften to '- in Arabic (the causative measure hiph`iil corresponds to 'af`ala, for example.) Trouble is, the only letter in Hebrew that regularly assimilates to a following consonant is n (although l does admittedly assimilate in the verb laqaH), and the Safaitic inscriptions seem to reveal an early pre-Islamic northern dialect of Arabic which did have a definite article h- (hn- before gutturals, thus hn'lt for Al-Lat.) So are there any other possibilities?

I think so. ha- corresponds pretty well to Arabic haa "here is", which is also the obligatory prefix to the demonstrative "this" (haadhaa, haadhihi, haa'ulaa'). Compare also Syriac haanaa "this (m. sg.)" - which appears to reveal an added n which could explain the Hebrew doubling (and the Safaitic form) nicely. Conversely, 'al- corresponds well to Hebrew 'elleh, Arabic haa-'ulaa', Syriac haaleyn "these". The vowel doesn't correspond exactly, but then it doesn't in the certain cognate 'elleh = haa-'ulaa' either. This idea has probably already been put forward (or indeed knocked down) somewhere in the literature, for all I know, but there you go.

In either case, both definite articles would derive originally from unstressed demonstratives - a process so common it's barely worth commenting on. For example, all the Romance languages derive their definite articles from Latin demonstratives - usually illum/illa "that" (before the noun except in Romanian), but istum "that" in Sardinian. Likewise, the Coptic definite article pe- (m.)/te- (f.)/ne- (pl.) derives from the ancient Egyptian demonstrative pn (m.), tn (f.), nn (pl.) "this". Indeed, English "the" derives from the same Old English word as "that". Come to think of it, I don't know of any definite articles offhand that don't derive from demonstratives; can you think of any?

16 comments:

David Boxenhorn said...

I've been looking for this information on the web for the last two days without success - you've written exactly what I hoped to find, but didn't!

A couple of questions:

The behaviour of the Hebrew and Arabic articles is eerily similar - they are applied to both nouns and their adjectives (how many languages do that?) and they have the same odd behaviour in the construct state. Doesn't that seem like evidence for some kind of relationship?

And what do you make of this?

"The Arabic, Canaanite and Modern South Arabian definite article has a common origin and goes back to an original demonstrative pronoun which was a compound inflected for gender, number and probably also for case. It can be reconstructed as *han(V)- for masc. sing., *hat(V)- for fem. sing. and *hal(V)- for plural. Assimilations of -n- and -t- to the following consonant (including -n-l- > -ll- and -t-l- > ll) neutralized the opposition of gender and number and led to a reinterpretation of either hal/’al- or han/’an->’am- synchronically as basic variant. In Aramaic the suffixed definite article was due not to simple suffixation of ha but to a resegmentation of the postposed compound demonstrative ha-ze-[n(a)] and suffixation of enclitic ha > -a which has been generalized."

Lameen Souag الأمين سواق said...

I haven't yet read the Zaborski article you linked to, but at first blush I'd emend it only slightly: make it *han- m. and *'il- pl. (I'm not aware of any evidence bearing on what its feminine form would have been), following the Hebrew and Arabic plural demonstratives whose roots start with ' rather than h. I'd want to see what his argument is for 'an > 'am: that sound shift looks very ad hoc, but then again I haven't got any better suggestion.

David Boxenhorn said...

What is the importance of 'an > 'am with respect to articles?

David Boxenhorn said...

Also, is Arabic haadhaa cognate to Hebrew ze (זה) - 'this' (usually dh ~ z)?

Lameen Souag الأمين سواق said...

'am was the definite article in the Himyari dialect (?) of early Arabic; it's still the article in some Yemeni dialects of Arabic, I think. I'm not sure what the definite article in the Modern South Arabian languages is offhand. And yes, the -dhaa of haadhaa is cognate with Hebrew zeh (just don't ask me about the vowel change!)

sadiq said...

I just want to say, in reference to your combined intellects--

We're not worthy!

Anonymous said...

Just three comments from your friendly neighborhood Romance scholar:

"all the Romance languages derive their definite articles from Latin demonstratives - usually illum/illa "that" (before the noun except in Romanian), but istum "that" in Sardinian"

First, "illum" is an accusative and "illa" a nominative form: it would be usual to use the same case-form for both genders, so ille/illa (nominative) or illum/illam (accusative); most Romance forms of the definite article derive from the accusative, so the latter would be the appropriate citation form.

Second, more seriously: the Sardinian definite article derives from ipsum/ipsam, not istum/istam.

Third, it is clear that ipsum/ipsam is the older Romance definite article, traces of which are found in a number of Romance languages, and illum/illam the newer definite article.

Oh, a fourth comment: great blog, great post. Keep it up!

Lameen Souag الأمين سواق said...

Thanks - illa vs. illum was a careless mistake, though I'm more surprised about the Sardinian article coming from ipsum (I thought I had read that it was istum, but I must be misremembering.) I'd love to hear more about these traces of ipsum in other Romance languages...

Anonymous said...

Your friendly neighborhood Romance scholar is back, and in answer to our linguistic mountain-climber's request "I'd love to hear more about these traces of ipsum in other Romance languages...", has this to offer: outside of Sardinia, one finds reflexes of IPSUM/IPSAM used as definite article in Catalan dialects, especially those spoken on the Balearic Islands (in Catalan linguistics they are called ARTICLE SALAT): they coexist with (recently introduced) reflexes of ILLUM/ILLAM. Elsewhere (Gascony, Portugal, Southern Italy) traces of IPSUM/IPSAM as definite article are found in toponymy; in Old Provençal reflexes of IPSUM/IPSAM are sometimes found as definite article (SA DOMNA "the lady" instead of the usual reflexes of ILLUM/ILLAM: "the lady" is normally LA DOMNA); the same is true in Old Spanish. Some individual words in other Romance languages also point to a stage when IPSUM/IPSAM was used as definite article, or at least as a "default" demonstrative: thus, Old French ANÇOIS "before" derives from Latin ANTE IPSUM (TEMPUS) "Before the/this (time)".

Finally, and perhaps of greatest interest to Lameen Souag and his readers, there is one Berber loan from Latin, /tseburt/ "door", which seems to derive from IPSAM PORTAM, indicating that reflexes of IPSUM/IPSAM were used as definite articles in the now-extinct Romance language of North Africa.

P.S. I'd love to learn more about this word and its distribution + pronunciation in various Berber languages: one Berber speaker (Kabyle, if I remember properly) I asked about this realized the word with an initial interdental fricative and intervocalic /w/.

Lameen Souag الأمين سواق said...

Very interesting! However, I'm sceptical about Berber "door" being a loanword. It is variously /tabburt/, /taggurt/, and /tawwurt/ in Berber languages (with t > th regularly in certain contexts in many northern languages), which would normally reflect a proto-Berber form along the lines of *tawwurt. There is no Berber language, as far as I know, where this would be realized with initial ts- (except following the particle d-); however, many 19th-century French authors transcribe Arabic th as ts. In any case, ta- -t is the normal feminine affix in Berber.

Lameen Souag الأمين سواق said...

Then you have to postulate a process dissimilating geminates to -lC-, despite the fact that geminates are perfectly acceptable everywhere else in Arabic, and despite the extreme phonetic unnaturalness of a sound change that would (for instance) take kk > lk (in fact, I am not aware of any instance of such a change in any language.) On top of this, you have to explain why the epenthetic vowel is a-, rather than i- like everywhere else in Arabic (ibn, iftataHa, etc.). I consider this significantly less plausible than any of the theories discussed above.

Lameen Souag الأمين سواق said...

It doesn't explain the ha- in Hebrew; it doesn't explain the -l- in Arabic (there is no reason to believe that the -l- in julmuud was originally a dissimilation, rather than an infix, even assuming that it does derive from jmd); it doesn't explain the hn- article in Safaitic; "Philippi's law" is not operative in Arabic to begin with; in fact, the only thing this theory would explain is the gemination.

In Arabic, of course, determiners regularly precede the noun - demonstratives (hadha, dhalika), numbers, and qualifiers (kull, 'ayy, etc), exactly as you would expect if the article developed from a demonstrative.

Lameen Souag الأمين سواق said...

There is, of course, no general law assimilating lt > tt. There is no law dissimilating kk > lk either (compare shakkala.) The question therefore becomes: which is more natural? lt > tt is a very normal sound change - total regressive assimilation, which you'll find many examples of in any historical linguistics textbook. Arabic itself has other examples of lt>tt, as Sibawayh notes. tt > lt is decidedly less normal - and kk > lk, or jj > lj, are completely unprecedented. In a choice between postulating two irregular sound shifts, you have to pick the more natural one.

"auxiliary vowel to resolve 2-consonant clustering at the beginning of a word" is ad hoc. Hebrew broke up Proto-Semitic 2-consonant clusters by inserting a vowel _after_ the 1st consonant, not before (consider ben, shem, shnayim) - and, as I noted, Arabic did so by inserting i-, not a-.

"auxiliary consonant preceding the -a-" is not motivated at all. What other cases of Hebrew randomly prefixing an h to prevent an initial vowel are there?

Safaitic (a North Arabian language, I should note, not a South Arabian one) is much more closely related to Arabic than Hebrew is; in fact, it may be a dialect of Arabic. It certainly is relevant to this problem.

Short unstressed a and i vowels also need to be considered. Just because the sound shifts they have undergone are confusing doesn't mean they can be ignored entirely.

Anonymous said...

In response to herman:
In Egyptian Arabic we have ikkursi "the chair" and iggawaami` "the mosques". Is Egyptian an example of even further assimilation, or an example of (more conservative) original gemination that has not dissimilated yet?

There are similar examples of gemination (or traditionally "assimilation" :o) beyond the حروف شمسية/قمرية system in other Arabic dialects as well. The most notable example is that of Maltese, with gemination also affecting z (pronounced as "ts") and ċ (pronounced as "ch" in "chin"). Please note that both z and ċ are innovations (imported from Romance, probably Sicilian). How would you reconcile this conservative feature with two new consonants?
The real 64k question is, why do some consonants assimilate/dissimilate and others do not?

how do you explain:
(i)ltaqaa
stem VIII of root LQY

How about with a different set of rules for verbs and nouns? Not much of an explanation, I know, but it is one.

"it is hardly possible that the alif of the "article" al- has quiesced, having once been a real glottal stop, because Arabic has kept a lot, or most, of its original glottal stops."
Why is it not possible?
If we assume that the glottal stop is a real consonant and reflects real 8th century usage (which is by no means certain, consider the way it is written), one cannot help but notice that word-initial and word-final glottal stops have evolved differently from the word-central glottal stop. The latter is mostly replaced by "ي" in modern Arabic dialects, while the former have disappeared altogether (consider words like "'akala"). Consider also the well-known shift q > ' in a number of dialects. Would that be possible if glottal stop were still alive and kicking? The word-initial glottal stop may have been dropped long before the 8th century.
All of that assuming the glottal stop ever existed...

"Obviously, the way the Arabic article is written (alif laam) doesn't mean anything, as I hope you agree."
I'm afraid I must disagree. It certainly does mean something. It may not mean what we expect it to mean. The comparison with modern Dutch is rather inappropriate - some of the stuff you write may have to do with the fact that you never accepted Hus' diacritics and had to rely on clusters (like "oe" for [u] or "ie" for [i:]), other stuff may be due to the fact that the spelling has not caught up with the changes yet (like the infinitive).

"One problem with the idea that it is a real word is, why does it precede a noun, and never follow it? In Arabic, a word that gives further information about another word usually follows that word."
I think we're venturing into the dark waters of language universals here. I am no expert in this particular field, but I believe that it is very unlikely for a VSO language to have a suffixed article. Also, Swedish and Bulgarian are both SVO languages and in both of them the adjective etc. precede the noun, yet both have a suffixed article.

Shame on me for not bringing up Zaborski's article first. My own country's academy of sciences' journal... :o) Anyone wants a copy, just let me know.

And by the way, awesome blog!

John Cowan said...

I know this post is several years old, but I just thought I'd mention that in Balearic Catalan there is a contrast between ILLUM and IPSUM articles: la mort is 'Death', but sa mort is 'the death [of which we were speaking]'. The unmarked article is the IPSUM one, and not all nouns can take an ILLUM article at all.