
Re: Sukhotin's algorithm etc



Mark Perakh wrote:
> Did anybody check the ratio of the percentage of Sukhotin's based
> supposed vowels in the text to that in the VMS alphabet? Is it within the range
> typical of natural languages? 

I found the old Cryptologia, July 1991 (Vol. XV No. 3),
with Sukhotin's algorithm explained (pp. 258-261), but
also a lengthy article: "Statistical properties of two
folios of the Voynich manuscript" (pp. 207-218). It
dates from before I invented Frogguy, and the transcription
system is the one in Bennett's "Scientific and engineering
problem-solving with the computer". There are frequency
counts there. The first five "vowels" identified were,
with relative frequencies rounded to the nearest
percentage point:

1. <o>  15% 
2. <y>  13%
3. <a>   8%
4. <c>   7%
5. <cc>  4%
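
For anyone who wants to try this on another transcription, here is
a minimal sketch of Sukhotin's algorithm as it is commonly described.
It is not necessarily the exact variant printed in Cryptologia, and
the function name and tie-breaking rule are my own assumptions. Each
word may be a string, or a list of tokens if the transcription has
multi-character units such as <cc>.

```python
from collections import defaultdict

def sukhotin(words):
    """Return the letters classified as vowels, in the order they are picked."""
    # Symmetric counts of how often two different letters stand side by side.
    adj = defaultdict(lambda: defaultdict(int))
    letters = set()
    for word in words:
        letters.update(word)
        for a, b in zip(word, word[1:]):
            if a != b:  # the diagonal of the matrix is kept at zero
                adj[a][b] += 1
                adj[b][a] += 1

    # Every letter starts out as a consonant; its score is its row sum.
    score = {c: sum(adj[c].values()) for c in letters}
    vowels = []
    while score:
        # Highest score wins; ties go to the alphabetically first letter
        # (an arbitrary choice made here for reproducibility).
        best = max(sorted(score), key=score.get)
        if score[best] <= 0:
            break
        vowels.append(best)
        del score[best]
        # Contacts with the new vowel now count against being a vowel.
        for c in score:
            score[c] -= 2 * adj[c][best]
    return vowels

print(sukhotin(["papa", "pipi", "tata", "titi"]))
```

On the toy word list the two letters picked are a and i. A real
run needs a much longer text, and the result depends on the
transcription alphabet chosen.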


>  Hence, if the ratio in
> question falls within the right range, it is an argument in favor of VMS being a
> meaningful text.

There is nothing in the above distribution that suggests
an unnatural language. In the same article I suggested
that <y>, which occurs almost exclusively word-finally and
accounts for 50% of word finals, is in fact a variant
of some of the other vowels in word-final, unstressed position.
Once again, many natural languages do that: Russian, for instance,
where <o> and <a> are pronounced alike when unstressed.


> For example, in English the percentage of vowels in a typical
> sufficiently long text is about 37%, while in the alphabet it is 23%.  In
> Finnish, which is at the top of the range (excluding those exotic tongues
> Frogguy has mastered in scores) the percentage of vowels in a text is about 65%,

This percentage, 65%, is that high only because Finnish doubles
long vowels in spelling.

> but in most of other languages it does not exceed 40-45%.

If you took Polynesian languages, where vowel length either
is not recorded in writing, or is indicated by a macron over
the letter, you would have more than 50% vowels, since a
consonant is _always followed_ by at least one vowel. If you wrote
long vowels as double vowels, e.g. (I'm opening a Tahitian
dictionary at random here) `aihaaruumaa`a, instead of
`aihârûmâ`a, you'd end up with perhaps 75% vowels. (Note:
` is a consonant; it denotes the glottal stop.)
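
To make the effect of the spelling convention concrete, here is a
small sketch that counts the vowel share of that one word in both
spellings. The helper name vowel_share is made up for the
illustration, and ` is counted as a consonant, as noted above. A
single word is of course not a corpus, so the figures only show the
direction of the shift, not the percentage for the language.

```python
import unicodedata

VOWELS = set("aeiou")

def vowel_share(word):
    """Fraction of the letters in `word` that are vowels, diacritics ignored."""
    # NFD splits "â" into "a" plus a combining mark; the marks are dropped.
    base = [ch for ch in unicodedata.normalize("NFD", word.lower())
            if not unicodedata.combining(ch)]
    return sum(ch in VOWELS for ch in base) / len(base)

marked = "`aihârûmâ`a"      # long vowels marked with a diacritic
doubled = "`aihaaruumaa`a"  # long vowels written double

print(round(vowel_share(marked), 2))   # 6 vowels out of 11 letters
print(round(vowel_share(doubled), 2))  # 9 vowels out of 14 letters
```

For this word the share rises from about 55% to about 64% when the
long vowels are written double.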

Tahitian, BTW, has nine consonants. As for its vowels,
it depends: five or ten, depending on whether you count long
vowels as separate.
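
For the "percentage in the alphabet" half of the comparison, the
arithmetic for Tahitian under the two counts is sketched below. The
function name is hypothetical, and the only inputs are the figures
given above.

```python
def alphabet_vowel_share(vowels, consonants):
    """Share of the alphabet's letters that are vowels."""
    return vowels / (vowels + consonants)

for n_vowels in (5, 10):
    share = alphabet_vowel_share(n_vowels, 9)
    print(f"{n_vowels} vowels, 9 consonants: {share:.0%}")
```

That is roughly 36% with five vowels and 53% with ten.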