VMs: VMS Transcriptions

A general comment about the recent "transcription" threads:

I think a lot of the controversies - "what constitutes a word?" - "one
letter or two?" - "XML vs HTML vs ..." - hinge on the fact that there are
two different types of transcription, each useful in its own way.

The ultimate test of all hypotheses about the VMS will be their ability
to produce a reading of it which appears reasonable, consistent and
appropriate to the circumstances of the manuscript. (Exact definitions
of these terms are TBD.)

A transcription file can help in two ways:
1.    By capturing as much detail as possible from the written
manuscript in a format which can easily be copied and disseminated (I
think of this as a "variorum edition" since it may need to contain
multiple interpretations of ambiguous features.)
2.    By providing a version which can be processed with programs in the
attempt to develop and test various hypotheses about the VMS.

To me, these should not be expected to be the same file. To allow
testing of a particular hypothesis (e.g. that "cc" is a single
character), it is necessary to create a test transcription file using
this assumption and then perform analysis on it. Trying to do such
analysis directly on a transcription which expresses all possibilities
seems pretty impractical.
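
As a rough sketch of what such a test transcription might look like in
practice (Python, with made-up sample text; the function names
tokenize and glyph_counts are purely illustrative and not taken from
any existing tool), one could re-tokenize the same words under the
assumption that "cc" is one character and compare the glyph counts:

```python
from collections import Counter

def tokenize(word, ligatures=()):
    """Split a transcribed word into glyph tokens.

    `ligatures` lists the letter sequences that the hypothesis under
    test treats as ONE character, e.g. ("cc",). Longest match wins.
    """
    # Ignore empty strings so the scan always advances.
    ordered = sorted((lig for lig in ligatures if lig), key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(word):
        for lig in ordered:
            if word.startswith(lig, i):
                tokens.append(lig)
                i += len(lig)
                break
        else:
            # No ligature starts here: take a single letter.
            tokens.append(word[i])
            i += 1
    return tokens

def glyph_counts(text, ligatures=()):
    """Count glyph tokens over whitespace-separated words."""
    counts = Counter()
    for word in text.split():
        counts.update(tokenize(word, ligatures))
    return counts

# Made-up sample, NOT real VMS text.
sample = "occo cco ccc"
print(glyph_counts(sample))            # every letter is its own glyph
print(glyph_counts(sample, ("cc",)))   # hypothesis: "cc" is one glyph
```

The same source text yields a different glyph inventory under each
assumption, which is exactly why each hypothesis wants its own file.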

Ideally, the type #1 transcription would capture enough detail to allow
different type #2 transcriptions to be created from it for testing
different theories. However, this is also a "slippery slope" -
ultimately, you could end up merely going for finer and finer-grained
images of the original, allowing each user the complete flexibility (and
responsibility) to decide for him/herself what constitutes a "glyph".
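
To make the type #1 to type #2 step concrete, here is a minimal
sketch in Python. It assumes, purely for illustration, that the type #1
file marks an ambiguous feature as alternatives in square brackets,
e.g. "[cc|e]"; that notation and the function name resolve are
assumptions for this example, not a proposal for the actual format.

```python
import re

# Matches one bracketed group of alternatives such as "[cc|e]".
ALTERNATIVES = re.compile(r"\[([^\[\]]*)\]")

def resolve(line, choice=0):
    """Collapse a 'variorum' line to a single reading.

    `choice` is the index of the alternative to keep in every group;
    if a group has fewer alternatives, its last one is used.
    """
    def pick(match):
        options = match.group(1).split("|")
        return options[min(choice, len(options) - 1)]
    return ALTERNATIVES.sub(pick, line)

# Made-up sample line, NOT real VMS text.
variorum = "o[cc|e]o da[r|s]"
print(resolve(variorum, 0))  # first reading everywhere
print(resolve(variorum, 1))  # second reading everywhere
```

A real derivation would presumably choose per feature rather than by
one global index, but even this shows how several flat test files could
be generated from one detailed master.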

It may turn out that a highly structured format like XML is great for
the type #1 transcription, but less useful for the type #2 because of
processing overhead.

As far as the question of technologies is concerned - open source vs.
Microsoft, XML vs. HTML vs. flat text - may I suggest that we "let a
hundred flowers bloom"?

Why not take a small section of the VMS, say a couple of pages, let each
exponent of a particular approach prototype it using that text (with the
understanding that it is provisional, of course) and see how useful each
prototype appears to be to the group?

Bruce Grant