VMs: Re: Numbered transcription

To: voynich@xxxxxxxxxxxxxx (Voynich Ms. mailing list)
Subject: VMs: Re: Numbered transcription
From: Jorge Stolfi <stolfi@xxxxxxxxxxxxx>
Date: Wed, 2 Oct 2002 00:31:02 -0300 (EST)
In-reply-to: <Pine.LNX.4.44.0210012108100.2697-100000@ufal.ms.mff.cuni.cz>
References: <Pine.LNX.4.33.0210011723340.16093-100000@astro.as.wsp.krakow.pl> <Pine.LNX.4.44.0210012108100.2697-100000@ufal.ms.mff.cuni.cz>
Reply-to: stolfi@xxxxxxxxxxxxx
Sender: owner-voynich@xxxxxxxxxxxxxx

Hi all,

I agree with Rene that format and encoding are non-issues to some
extent. We have powerful conversion tools, such as bitrans, and
everyone is free to create and post an alternative version of the
interlinear file, in his favorite encoding and format.

For myself, I would rather stick to EVMT as the "reference" format for
the time being, because it is truly platform-independent, and pretty
much the simplest one that carries all the essential information ---
not to mention that I already have tons of scripts and programs based
on it. 

As I see it, switching to XML as the reference format would only make
the file more difficult to edit (needs special editors), or to process
(needs lots of extra code just to parse and generate), or even to view
(needs an XML-enabled mail tool). As for MS-Word, well, I don't use MS
software --- and I cannot understand why anyone would put their
important data into such a "trapdoor" format, and then be forced to
buy an expensive piece of software every few years just in order to
get the data back...

Ditto for EVA as the encoding. Glen is quite right in saying that by
choosing EVA we are prematurely committing to certain assumptions about
glyph parsing and classification which may turn out to be wrong.
(I have my own pet peeves in this area, such as the hooked versus
straight <p>s and <f>s.) Moreover, most of the "easy" EVA code space has
been used, so it may not be possible to accommodate transcriptions
which make finer distinctions, such as those which Glen feels
are needed. I also understand that such "glyph aliasing" and "glyph
count" errors, which are only a nuisance for "language" believers, are
a very serious problem for people who see Voynichese as cipher.

On the other hand, EVA does make all the distinctions which were made
by previous encodings, and can be fairly easily mapped to them. EVA is
even compatible with some alternative views, e.g. that <a> is the same
as <ci>, or that <ee> is a single letter (just use bitrans). So,
again, I think I will stick to EVA for the time being --- possibly
until the "code" is cracked, or at least until we feel more confident
about which finer distinctions are real. Needless to say, to address
this issue we need at least some partial transcriptions with much
finer categories than EVA --- such as the one which Glen is building.
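
To make the "just use bitrans" remark concrete, here is a minimal
sketch of that kind of re-encoding. It is NOT bitrans and does not use
its rule syntax; the function name recode, the RULES table, and the
choice of "E" as a stand-in for a single-letter <ee> are all made up
for illustration. It simply scans the text left to right and applies
the longest matching rule at each position.

```python
def recode(text, rules):
    """Rewrite text by greedy, left-to-right, longest-match substitution."""
    # Try longer patterns first; skip empty patterns, which would never advance.
    keys = sorted((k for k in rules if k), key=len, reverse=True)
    out = []
    i = 0
    while i < len(text):
        for k in keys:
            if text.startswith(k, i):
                out.append(rules[k])
                i += len(k)
                break
        else:
            # No rule applies here: copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical rule table for two of the alternative views:
# <ci> is treated as the same glyph as <a>, and <ee> as one letter "E".
RULES = {"ci": "a", "ee": "E"}

print(recode("qokeedy", RULES))  # qokEdy
print(recode("cidaiin", RULES))  # adaiin
```

Note that a run such as <eee> comes out as "Ee" under this greedy scheme;
whether that is the right parse is exactly the kind of assumption that
each alternative view has to spell out.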

As for fonts, I personally find the ASCII transcribed text (EVA,
Currier, or whatever) easier to read and analyze than a VMS-fontified
page. In any case, much of the Voynichese which I have to look at is
in odd places --- in scripts and programs, in tables, in intermediate
data files, in debugging messages --- where one cannot expect it to be
properly XML-formatted and fontified.

All the best,

--stolfi

Follow-Ups:
- VMs: RE: Numbered transcription
  - From: GC

References:
- VMs: Re: Numbered transcription
  - From: Greg Stachowski
- VMs: Re: Numbered transcription
  - From: Jan Hajic
