
Re: AW: Voynichese = Old/unknown/extinct kind of Chinese dialect

To: voynich@xxxxxxxx
Subject: Re: AW: Voynichese = Old/unknown/extinct kind of Chinese dialect
From: Jorge Stolfi <stolfi@xxxxxxxxxxxxx>
Date: Sun, 8 Oct 2000 22:09:50 -0200 (EDT)
Delivered-to: reeds@research.att.com
In-reply-to: <39D2AE9C.446EF5C3@gte.net>
References: <BBB0694EA12CD311B2F300508B2C1DFA01BD5EC8@acnt07.ac1.dsh.de> <39D2AE9C.446EF5C3@gte.net>
Reply-to: stolfi@xxxxxxxxxxxxx
Sender: jim@xxxxxxxxxxxxx

    > [Jacques Guy:] [EVA "daiin" = Chinese "de"] would be too good to
    > be true. Chinese de is (roughly) a possessive, like English 's.
    > But daiin often occurs reduplicated, impossible in Chinese, at
    > least the Chinese(s) I know.

Those frequent (?)  repetitions may be a problem for the Chinese theory,
but they are even more of a problem for the "nomenclator" and "artificial
language" theories.

In fact, the Chinese theory has something of an advantage
over the other two, as Brian points out:

    > Actually, that duplication is perfectly possible in Chinese,
    > though they tend to avoid it because it sounds bad. There are
    > three different 'de' characters that all perform very similar
    > grammatical functions [...] If you want to play along that line
    > of thinking, look for a similar pattern of 'yi' at the beginning
    > of words, these actually have very pronounced and obvious tones
    > ('de's all slide toward neutral) but there are a bunch of them
    > and I can see how they might be appended at word beginnings.

Moreover, the very first sample of Chinese pinyin which I got 
had this repetition on line 2:

  ... ping2 jia1 zhi1 yi1 yi1 ba1 ba1 yi1 nian2 chu1 ...
                      ^^^^^^^^^^^^^^^^^^^
                                     
But now, having made the case for Chinese, let me point out what I
think is the most serious problem with that theory. Unless I have made
another big blooper, I believe the following are true:

  (1) half of the tokens have one "gallows" letter (<k>, <t>,
      <p>, or <f>), half have none;
      
  (2) half of the tokens have one or more "bench" letters 
      (<ch>, <sh>, <ee>), half have none;
      
  (3) there is no obvious statistical correlation between 
      presence of gallows and presence of benches.
      
These equalities are not perfect (and they couldn't be, given the rate
of transcription errors), but the discrepancies seem to be within the
range expected for random sampling of a fair coin.
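
For anyone who wants to check properties (1-3) against a transcription,
here is a minimal Python sketch. It assumes the tokens are already
available as a list of EVA strings, and it uses a plain substring test
for gallows and benches (so a platform gallows such as <cfh> counts only
as a gallows); both choices are my assumptions, and the function names
are made up for illustration. The four tokens at the end are a toy
sample, not real counts.

```python
from collections import Counter
from math import sqrt

GALLOWS = ("k", "t", "p", "f")   # EVA gallows letters
BENCHES = ("ch", "sh", "ee")     # EVA bench groups

def has_gallows(token):
    """True if the token contains at least one gallows letter."""
    return any(g in token for g in GALLOWS)

def has_bench(token):
    """True if the token contains at least one bench group."""
    return any(b in token for b in BENCHES)

def presence_table(tokens):
    """Count tokens by (has_gallows, has_bench): a 2x2 contingency table."""
    return Counter((has_gallows(t), has_bench(t)) for t in tokens)

def phi(table):
    """Phi coefficient of the 2x2 table; 0 means no linear association."""
    a, b = table[(True, True)], table[(True, False)]
    c, d = table[(False, True)], table[(False, False)]
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0

# Toy sample only; replace with tokens from a real transcription.
toy_tokens = ["okchody", "daiin", "okaiin", "chedy"]
toy_table = presence_table(toy_tokens)
toy_phi = phi(toy_table)
```

Properties (1) and (2) correspond to the row and column totals of the
table each being about half the token count, and property (3) to a phi
value near zero.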

It is hard to imagine how these balanced statistics could arise in an
alphabetic (phonetic or semi-phonetic) system. Since typical words
have at most one gallows, and the gallows and benches make a compact
cluster in the word, we would be almost forced to conclude that the
gallows/bench cluster encodes the syllable's consonant, while the
remaining letters encode the vowels and tones.

But properties (1-3) above would then say that syllables with an empty
consonant (no gallows and no benches) account for a full 25% of the
text (50% x 50%), while syllables with non-empty consonants account
for the remaining 75%. Obviously this is not the case for Chinese, nor
would we expect it to be (note that there is only 1 empty consonant
but some 20 non-empty ones).
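
The arithmetic behind the 25% figure can be spelled out in a few lines
of Python. The 50% rates and the independence come from properties
(1-3); the "uniform" baseline, where each of the roughly 21 possible
consonant slots is equally likely, is only my crude assumption for
comparison and not a measured Chinese frequency.

```python
def empty_share(p_gallows, p_bench):
    """Share of tokens with neither gallows nor bench, assuming independence."""
    return (1.0 - p_gallows) * (1.0 - p_bench)

# Properties (1) and (2): each feature is present in about half the tokens.
share = empty_share(0.5, 0.5)

# Crude baseline (assumption): 1 empty consonant among 1 + 20 equally likely ones.
uniform_share = 1 / (1 + 20)

print(share)                    # 0.25
print(round(uniform_share, 3))  # 0.048
```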

The same problem appears when we analyze only the words that have
gallows: properties (2) and (3) above say that 50% of those words have
no benches, and 50% have them. But there is only one way of placing 0
benches in a word, while there are at least four ways of placing one
bench (ch or sh, before or after the gallows), and many more ways of
placing two benches.

Said another way, it is hard to imagine a code that reserves "no
gallows" for 50% of the words, and all other 8 gallows combinations
(k, t, p, f, ke, te, pe, fe) to the other 50%.  Ditto for the 
benches.

Come to think of it, properties (1-2) are quite strange under any of the
theories I know of.  They seem to suggest that the number and type of
the gallows/benches are irrelevant, and only their presence or absence
matters; i.e., that

  okchody = ochtedy = okeeedy = oshcfhedy = ...
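
To make that equivalence concrete, here is a small Python sketch that
reduces each word to a presence/absence signature. The substring test
and the name `signature` are my assumptions for illustration; it only
checks that the four example words above fall in the same class, and
says nothing about the rest of the word.

```python
GALLOWS = ("k", "t", "p", "f")   # EVA gallows letters
BENCHES = ("ch", "sh", "ee")     # EVA bench groups

def signature(word):
    """Return (has_gallows, has_bench), ignoring how many and which kind."""
    return (
        any(g in word for g in GALLOWS),
        any(b in word for b in BENCHES),
    )

words = ["okchody", "ochtedy", "okeeedy", "oshcfhedy"]
signatures = {w: signature(w) for w in words}

# All four words carry both a gallows and a bench, so they share one signature.
all_same = len(set(signatures.values())) == 1
```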
  
Any thoughts? 

--stolfi

Follow-Ups:
- Re: AW: Voynichese = Old/unknown/extinct kind of Chinese dialect
  - From: Brian Eric Farnell
