Monday, November 07, 2005

Demonstratives in Semitic and beyond

Rishon Rishon just posted a table comparing the words in my previous post to Hebrew. Most of them are correct; mo`ed and g'vul are not cognate, and SaH and loa` I'm not sure about. However, one is particularly interesting: ha- = 'al- "the". You often find this seeming cognate cited in works on Semitic: after all, ha- induces gemination of a subsequent non-guttural consonant - suggesting a lost consonant in the prefix assimilating to the subsequent letter - and Hebrew h- occasionally seems to soften to '- in Arabic (the causative measure hiph`iil corresponds to 'af`ala, for example.) Trouble is, the only letter in Hebrew that regularly assimilates to a following consonant is n (although l does admittedly assimilate in the verb laqaH), and the Safaitic inscriptions seem to reveal an early pre-Islamic northern dialect of Arabic which did have a definite article h- (hn- before gutturals, thus hn'lt for Al-Lat.) So are there any other possibilities?

I think so. ha- corresponds pretty well to Arabic haa "here is", which is also the obligatory prefix to the demonstrative "this" (haadhaa, haadhihi, haa'ulaa'). Compare also Syriac haanaa "this (m. sg.)" - which appears to reveal an added n which could explain the Hebrew doubling (and the Safaitic form) nicely. Conversely, 'al- corresponds well to Hebrew 'elleh, Arabic haa-'ulaa', Syriac haaleyn "these". The vowel doesn't correspond exactly, but then it doesn't in the certain cognate 'elleh = haa-'ulaa' either. This idea has probably already been put forward (or indeed knocked down) somewhere in the literature, for all I know, but there you go.

In either case, both definite articles would derive originally from unstressed demonstratives - a process so common it's barely worth commenting on. For example, all the Romance languages derive their definite articles from Latin demonstratives - usually illum/illa "that" (before the noun except in Romanian), but istum "that" in Sardinian. Likewise, the Coptic definite article pe- (m.)/te- (f.)/ne- (pl.) derives from the ancient Egyptian demonstrative pn (m.), tn (f.), nn (pl.) "this". Indeed, English "the" derives from the same Old English word as "that". Come to think of it, I don't know of any definite articles offhand that don't derive from demonstratives; can you think of any?

22 comments:

David Boxenhorn said...

I've been looking for this information on the web for the last two days without success - you've written exactly what I hoped to find, but didn't!

A couple of questions:

The behaviour of Hebrew and Arabic articles is eerily similar - they are applied to both nouns and their adjectives (how many languages do that?) and they have the same odd behaviour in the construct state. Doesn't that seem like evidence for some kind of relationship?

And what do you make of this?

"The Arabic, Canaanite and Modern South Arabian definite article has a common origin and goes back to an original demonstrative pronoun which was a compound inflected for gender, number and probably also for case. It can be reconstructed as *han(V)- for masc. sing., *hat(V)- for fem. sing. and *hal(V)- for plural. Assimilations of -n- and -t- to the following consonant (including -n-l- > -ll- and -t-l- > ll) neutralized the opposition of gender and number and led to a reinterpretation of either hal/’al- or han/’an->’am- synchronically as basic variant. In Aramaic the suffixed definite article was due not to simple suffixation of ha but to a resegmentation of the postposed compound demonstrative ha-ze-[n(a)] and suffixation of enclitic ha > -a which has been generalized."

Lameen Souag said...

I haven't yet read the Zaborski article you linked to, but at first blush I'd emend it only slightly: make it *han- m. and *'il- pl. (I'm not aware of any evidence bearing on what its feminine form would have been), following the Hebrew and Arabic plural demonstratives whose roots start with ' rather than h. I'd want to see what his argument is for 'an > 'am: that sound shift looks very ad hoc, but then again I haven't got any better suggestion.

David Boxenhorn said...

What is the importance of 'an > 'am with respect to articles?

David Boxenhorn said...

Also, is Arabic haadhaa cognate to Hebrew ze (זה) - 'this' (usually dh ~ z)?

Lameen Souag said...

'am was the definite article in the Himyari dialect (?) of early Arabic; it's still the article in some Yemeni dialects of Arabic, I think. I'm not sure what the definite article in the Modern South Arabian languages is offhand. And yes, the -dhaa of haadhaa is cognate with Hebrew zeh (just don't ask me about the vowel change!)

sadiq said...

I just want to say, in reference to your combined intellects--

We're not worthy!

Anonymous said...

Just three comments from your friendly neighborhood Romance scholar:

"all the Romance languages derive their definite articles from Latin demonstratives - usually illum/illa "that" (before the noun except in Romanian), but istum "that" in Sardinian"

First, "illum" is an accusative and "illa" a nominative form: it would be usual to use the same case-form for both genders, so ille/illa (nominative) or illum/illam (accusative); most Romance forms of the definite article derive from the accusative, so the latter would be the appropriate citation form.

Second, more seriously: the Sardinian definite article derives from ipsum/ipsam, not istum/istam.

Third, it is clear that ipsum/ipsam is the older Romance definite article, traces of which are found in a number of Romance languages, and illum/illam the newer definite article.

Oh, a fourth comment: great blog, great post. Keep it up!

Lameen Souag said...

Thanks - illa vs. illum was a careless mistake, though I'm more surprised about the Sardinian article coming from ipsum (I thought I had read that it was istum, but I must be misremembering). I'd love to hear more about these traces of ipsum in other Romance languages...

Anonymous said...

Your friendly neighborhood Romance scholar is back, and in answer to our linguistic mountain-climber's request "I'd love to hear more about these traces of ipsum in other Romance languages...", has this to offer: outside of Sardinia, one finds reflexes of IPSUM/IPSAM used as definite article in Catalan dialects, especially those spoken on the Balearic Islands (in Catalan linguistics they are called ARTICLE SALAT): they coexist with (recently introduced) reflexes of ILLUM/ILLAM. Elsewhere (Gascony, Portugal, Southern Italy) traces of IPSUM/IPSAM as definite article are found in toponymy; in Old Provençal reflexes of IPSUM/IPSAM are sometimes found as definite article (SA DOMNA "the lady" instead of the usual reflexes of ILLUM/ILLAM: "the lady" is normally LA DOMNA); the same is true in Old Spanish. Some individual words in other Romance languages also point to a stage when IPSUM/IPSAM was used as definite article, or at least as a "default" demonstrative: thus, Old French ANÇOIS "before" derives from Latin ANTE IPSUM (TEMPUS) "Before the/this (time)".

Finally, and perhaps of greatest interest to Lameen Souag and his readers, there is one Berber loan from Latin, /tseburt/ "door", which seems to derive from IPSAM PORTAM, indicating that reflexes of IPSUM/IPSAM were used as definite articles in the now-extinct Romance language of North Africa.

P.S. I'd love to learn more about this word and its distribution + pronunciation in various Berber languages: one Berber speaker (Kabyle, if I remember properly) I asked about this realized the word with an initial interdental fricative and intervocalic /w/.

Lameen Souag said...

Very interesting! However, I'm sceptical about Berber "door" being a loanword. It is variously /tabburt/, /taggurt/, and /tawwurt/ in Berber languages (with t > th regularly in certain contexts in many northern languages), which would normally reflect a proto-Berber form along the lines of *tawwurt. There is no Berber language, as far as I know, where this would be realized with initial ts- (except following the particle d-); however, many 19th-century French authors transcribe Arabic th as ts. In any case, ta- -t is the normal feminine affix in Berber.

Herman said...

My opinion regarding the origin of the "definite article" in Arabic, Hebrew, etc. follows Ullendorff: the gemination of the first consonant of the noun is itself the "definite article". Then we have in some cases dissimilation to (a)l. The vowel -a- is only needed to resolve 2-consonant clustering at the beginning of a word, which is not allowed in Arabic.
This is most logical, as in Hebrew there is no dissimilation at all.

Lameen Souag said...

Then you have to postulate a process dissimilating geminates to -lC-, despite the fact that geminates are perfectly acceptable everywhere else in Arabic, and despite the extreme phonetic unnaturalness of a sound change that would (for instance) take kk > lk (in fact, I am not aware of any instance of such a change in any language.) On top of this, you have to explain why the epenthetic vowel is a-, rather than i- like everywhere else in Arabic (ibn, iftataHa, etc.). I consider this significantly less plausible than any of the theories discussed above.

Herman said...

Dissimilation, too, is a very natural phenomenon. For example, we have the root GMD, but also a word gulmuud/galmuud (Arabic, and the Hebrew book of Job). Dissimilation happens only in the case of the Qamar-consonants, and in Egyptian Arabic doesn't even happen for -K- and -G-. If there is no original gemination, why is there always alifu l-wasl? Apparently the article is not really a word, at most only a letter, and in my opinion just a gemination. The vowel -a- instead of -i- is immaterial: how can a short vowel in Arabic be evidence of anything? "Philippi's Law" allows for -i- and -a- to change places all the time. This theory I have here is the simplest and most elegant theory, because it also explains Hebrew's article very easily: always gemination.

Herman said...

The interesting thing I observe is, it seems so obvious to most people that there must be a "definite article" in Arabic just like in many European languages; have you ever thought of the possibility that things work just a little differently in Hebrew and Arabic? Hundreds of languages have no article at all: Turkish, Korean, Japanese, Latin, why would Arabic have a word for it? The only "article" there originally was in Arabic, is first consonant gemination. The fact it has been interpreted later on as a "word", is irrelevant.
One problem with the idea that it is a real word is, why does it precede a noun, and never follow it? In Arabic, a word that gives further information about another word usually follows that word. Why is this so different in the case of the "article"?
Compare Arabic to (biblical) Hebrew and you'll see that primary gemination just fits.

Lameen Souag said...

It doesn't explain the ha- in Hebrew; it doesn't explain the -l- in Arabic (there is no reason to believe that the -l- in julmuud was originally a dissimilation, rather than an infix, even assuming that it does derive from jmd); it doesn't explain the hn- article in Safaitic; "Philippi's law" is not operative in Arabic to begin with; in fact, the only thing this theory would explain is the gemination.

In Arabic, of course, determiners regularly precede the noun - demonstratives (hadha, dhalika), numbers, and qualifiers (kull, 'ayy, etc), exactly as you would expect if the article developed from a demonstrative.

Herman said...

Thanks Lameen, I'll comment.

Lameen: "It doesn't explain the ha- in Hebrew; it doesn't explain the -l- in Arabic"

H.: I would say, why not?
First we have a proto-Central-Semitic noun, say:

*quds

~> C1-Gemination:
*qquds "the 'quds' "

~> auxiliary vowel to resolve 2-consonant clustering at the beginning of a word:
*( )aqquds

~> auxiliary consonant preceding the -a-
~> -h- in Hebrew: haqquds (haqqodesh)
~> nothing in Arabic: (a)qquds

In Arabic, dissimilation of some double consonants (for instance, -qq- ~> -lq-)
~> *(a)qquds ~> (a)lquds
Of which the laam was written for convenience in all cases.

I think this is a perfectly plausible theory.
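
Read as an ordered list of rewrite steps, the derivation proposed above can be sketched in a few lines of Python. This only restates the hypothesis; it is not evidence for it. The function names, the one-character-per-consonant transliteration, and the `SUN` set are assumptions made for the illustration.

```python
# Hypothetical sketch of the gemination-first derivation described above.
# Assumes a transliteration with exactly one character per consonant.

# Consonants before which Arabic keeps the geminate instead of showing -l-
# (an assumed, simplified set for this illustration).
SUN = set("tdrzsSDTZln")

def geminate(stem: str) -> str:
    """Step 1: double the first consonant (*quds -> *qquds)."""
    return stem[0] + stem

def add_vowel(form: str) -> str:
    """Step 2: prefix a- to break up the initial cluster (*qquds -> *aqquds)."""
    return "a" + form

def hebrew(stem: str) -> str:
    """Step 3a: Hebrew adds h- in front of the auxiliary vowel."""
    return "h" + add_vowel(geminate(stem))

def arabic(stem: str) -> str:
    """Step 3b: Arabic dissimilates the geminate CC to lC unless C is in SUN."""
    form = add_vowel(geminate(stem))
    if stem[0] not in SUN:
        # Replace the first half of the geminate with l (aqquds -> alquds).
        form = "al" + form[2:]
    return form
```

Under these assumptions, `hebrew("quds")` gives `haqquds` and `arabic("quds")` gives `alquds`, while `arabic("dars")` keeps the geminate as `addars`.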

Lameen: "(there is no reason to believe that the -l- in julmuud was originally a dissimilation, rather than an infix, even assuming that it does derive from jmd)"

H.: An infix is rather conjectural and, as you must admit, very unlikely. If you take a good look at the semantics of GMD and GLMD, it is very plausible the roots are related/one root.

Lameen: "it doesn't explain the hn- article in Safaitic;"

H.: Safaitic is irrelevant, it is of a much farther branch of Semitic than Arabic and Hebrew. We cannot use Safaitic to either prove or disprove anything regarding this.

Lameen: ""Philippi's law" is not operative in Arabic to begin with;"

H.: I agree, but we can't seriously say that short, unstressed vowels -a- or -i- in Arabic or Hebrew are evidence of anything at all, can we? The vowel -a- preceding the gemination of the article hardly exists in Arabic, because it is "overruled" by any short vowel ending of a preceding word.

Lameen: "in fact, the only thing this theory would explain is the gemination."

H.: Which is very important, i.e. the only thing that really matters in the article! The rest is secondary.

Lameen: "In Arabic, of course, determiners regularly precede the noun - demonstratives (hadha, dhalika), numbers, and qualifiers (kull, 'ayy, etc), exactly as you would expect if the article developed from a demonstrative."

H.: OK, but we can also argue these words form a different construction when they are combined with another word.
Your example kull is already out, btw.: it's a construct. Numbers aren't really relevant either, because they can both precede and follow a noun.

I have a question for you: if you say that in
(a)t-tuqaa
"the piety" (written: al-tuqaa), we are dealing with laam assimilating to taa, how do you explain:
(i)ltaqaa
stem VIII of root LQY? Apparently, there is no need for laam to assimilate to taa.
Don't tell me (what I always hear when I discuss this) this is an exception or something; if you disagree, I need a strong refutation. Because this is convincing evidence that, as there is no law lt- > tt-, we are dealing with dissimilation, rather than assimilation in the case of the "article" -(a)l-.

best regards,
Herman

Lameen Souag said...

There is, of course, no general law assimilating lt > tt. There is no law dissimilating kk > lk either (compare shakkala.) The question therefore becomes: which is more natural? lt > tt is a very normal sound change - total regressive assimilation, which you'll find many examples of in any historical linguistics textbook. Arabic itself has other examples of lt>tt, as Sibawayh notes. tt > lt is decidedly less normal - and kk > lk, or jj > lj, are completely unprecedented. In a choice between postulating two irregular sound shifts, you have to pick the more natural one.
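
The assimilation account can be stated as a single rule: the article is basically l-, and it assimilates completely to a following coronal ("sun") consonant. A minimal Python sketch, assuming the rough transliteration used in this thread (digraphs such as sh, th, dh; capitals for emphatics); `SUN_ONSETS` and `with_article` are names invented for the illustration, and the hyphen is only there to mark the boundary.

```python
# Hypothetical sketch of the assimilation account of the Arabic article.
# Digraphs are listed before single letters so "sh" is matched before "s".
SUN_ONSETS = ("sh", "th", "dh", "t", "d", "r", "z", "s",
              "S", "D", "T", "Z", "l", "n")

def with_article(noun: str) -> str:
    """Prefix the article, assimilating l- to an initial sun consonant."""
    for onset in SUN_ONSETS:
        if noun.startswith(onset):
            # Total regressive assimilation: l + C -> CC.
            return f"a{onset}-{noun}"
    # Before any other consonant the l- surfaces unchanged.
    return f"al-{noun}"
```

With this rule, `with_article("tuqaa")` returns `at-tuqaa` and `with_article("quds")` returns `al-quds`. The rule is restricted to the article, so it says nothing about verb forms such as (i)ltaqaa.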

"auxiliary vowel to resolve 2-consonant clustering at the beginning of a word" is ad hoc. Hebrew broke up Proto-Semitic 2-consonant clusters by inserting a vowel _after_ the 1st consonant, not before (consider ben, shem, shnayim) - and, as I noted, Arabic did so by inserting i-, not a-.

"auxiliary consonant preceding the -a-" is not motivated at all. What other cases of Hebrew randomly prefixing an h to prevent an initial vowel are there?

Safaitic (a North Arabian language, I should note, not a South Arabian one) is much more closely related to Arabic than Hebrew is; in fact, it may be a dialect of Arabic. It certainly is relevant to this problem.

Short unstressed a and i vowels also need to be considered. Just because the sound shifts they have undergone are confusing doesn't mean they can be ignored entirely.

Herman said...

You are right: the question is, what is more natural in the case of the "article", assimilation or dissimilation? Then you call the examples kk > lk and jj > lj "unprecedented".

That's interesting, because in Egyptian Arabic we have: ikkursi "the chair" or iggawaami`, "the mosques". Is Egyptian an example of even further assimilation, or an example of (more conservative) original gemination that did not dissimilate yet? I think the latter is the case.

Obviously, the way the Arabic article is written (alif laam) doesn't mean anything, as I hope you agree. If we have, in a number of consonants, dissimilation to (a)lC, the spelling may well be based on the dissimilation, not on the original gemination. In my own language, Dutch, we write quite a lot of things that have nothing to do with original phonetics.

If we then look purely at phonology, we find that, when we follow your point of view, the original Arabic article is not 'al, but l. After all, there may often be the short -a- vowel preceding -l-/the gemination, but this -a- is always overruled by any preceding short vowel in an open, ending syllable. The alif is always alifu l-wasl, meaning that it is no consonant at all, it's just a letter signifying the absence of any consonant, easy for spelling.

I don't think we can argue with that; it is hardly possible that the alif of the "article" al- has quiesced, having once been a real glottal stop, because Arabic has kept a lot, or most, of its original glottal stops. I guess شمال shimaal vs. شمأل sham'al (found this word in "Qifaa nabki" of Imru'u l-Qays) may be an exception I can think of, where hamz has disappeared, but this may be also the force of three-consonantal root building, or it may be that the poet, for reasons of rhythm, couldn't use a long vowel there.

So we have the undeniable fact that in your conviction, only the consonant -l- is the article in Arabic, not *'al. Where, then, is the link with -han-, -ham-, or -hal- or whatever in other Semitic or other than standard Arabic languages? If we present it in writing, we can make people believe that *'al has something to do with *hal, *han, and the like, but once we present the phonological facts, i.e. the Arabic "article" is -l-, and the Hebrew sister "article" is -ha(n/l/m?), it doesn't fit anymore: the parallel is too far-fetched.

The actual Hebrew "article" is just first consonant gemination, pure and simple. No problems with dissimilation here, making it even clearer.
The hey letter (which may have been unpronounced, just like the alifu l-wasl in Arabic!), and the vowel -a- are hardly surprising. The fact that we have words like shem, ben and shnayim/shtayim is no real evidence. Those haven't been geminates, first of all, second, we don't know at which point in time these forms arose and when the "article" arose. For example, the word shtayim has a non-aspirated tav, meaning that according to the masoretic rules, the shwa under shin does not represent a vowel at all, or when it does, it's a rather late development, otherwise the tav would have been aspirated. These two little words do not convince me.

Considering the above, I don't think you can maintain that Arabic uses only the auxiliary vowel -i- to resolve initial 2-consonant clustering. In the article, we have in your theory *lbayt ~> (a)lbayt. Unless you would want to argue that in (a)l, there was once a real hamz *'al, a theory that is so improbable we can disregard it here. So the argument that auxiliary vowels resolving 2-cons. clustering in Arabic are always i-, not a-, cannot stand.

My theory then, that we have *ddars ~> (a)ddars, *nnuur ~> (a)nnuur etc., and the equiv. in Hebrew, is the simplest solution.

Herman said...

I'll add something to that: earlier on I wrote:
"in Egyptian Arabic we have: ikkursi "the chair" or iggawaami`, "the mosques". Is Egyptian an example of even further assimilation, or an example of (more conservative) original gemination that did not dissimilate yet? I think the latter is the case."

I'm not an expert on Arabic/comp. Semitic linguistics at all, but I noticed that in Egyptian Arabic we have the ج Jiim of several other dialects and Modern Standard Arabic pronounced as Giim, so -G- in "give", not -J- in "jest". In Hebrew the corresponding ג gimel, pronunciation -G- in "give", seems to represent, like Egyptian Arabic ج, the older version (cf. Greek's gamma Γ, also pronounced that way). If so, Egyptian Arabic's double -KK- and -GG- in the article may also preserve the older, not the younger state, in comparison to MSA. Any thoughts?

bulbul said...

In response to herman:
in Egyptian Arabic we have: ikkursi "the chair" or iggawaami`, "the mosques". Is Egyptian an example of even further assimilation, or an example of (more conservative) original gemination that did not dissimilate yet?

There are similar examples of gemination (or traditionally "assimilation" :o) beyond the حروف شمسية/قمرية system in other Arabic dialects as well. The most notable example is that of Maltese, with gemination also affecting z (pronounced as "ts") and ċ (pronounced as "ch" in "chin"). Please note that both z and ċ are innovations (imported from Romance, probably Sicilian). How would you reconcile this conservative feature with two new consonants?
The real 64k question is, why do some consonants assimilate/dissimilate and others do not?

how do you explain:
(i)ltaqaa
stem VIII of root LQY

How about with a different set of rules for verbs and nouns? Not much of an explanation, I know, but it is one.

it is hardly possible that the alif of the "article" al- has quiesced, having once been a real glottal stop, because Arabic has kept a lot, or most, of its original glottal stops.
Why is it not possible?
If we assume that the glottal stop is a real consonant and reflects real 8th century usage (which is by no means certain, consider the way it is written), one cannot help but notice that word-initial and word-final glottal stops have evolved differently from the word-central glottal stop. The latter is mostly replaced by "ي" in modern Arabic dialects, while the former have disappeared altogether (consider words like "'akala"). Consider also the well-known shift q > ' in a number of dialects. Would that be possible if glottal stop were still alive and kicking? The word-initial glottal stop may have been dropped long before the 8th century.
All of that assuming the glottal stop ever existed...

Obviously, the way the Arabic article is written (alif laam) doesn't mean anything, as I hope you agree.
I'm afraid I must disagree. It certainly does mean something. It may not mean what we expect it to mean. The comparison with modern Dutch is rather inappropriate - some of the stuff you write may have to do with the fact that you never accepted Hus' diacritics and had to rely on clusters (like "oe" for [u] or "ie" for [i:]), other stuff may be due to the fact that the spelling has not caught up with the changes yet (like the infinitive).

One problem with the idea that it is a real word is, why does it precede a noun, and never follow it? In Arabic, a word that gives further information about another word usually follows that word.
I think we're venturing into the dark waters of language universals here. I am no expert in this particular field, but I believe that it is very unlikely for a VSO language to have a suffixed article. Also, Swedish and Bulgarian are both SVO languages and in both of them the adjective etc. precede the noun, yet both have a suffixed article.

Shame on me for not bringing up Zaborski's article first. My own country's academy of sciences' journal... :o) Anyone wants a copy, just let me know.

And by the way, awesome blog!

John Cowan said...

I know this post is several years old, but I just thought I'd mention that in Balearic Catalan there is a contrast between ILLUM and IPSUM articles: la mort is 'Death', but sa mort is 'the death [of which we were speaking]'. The unmarked article is the IPSUM one, and not all nouns can take an ILLUM article at all.