VMs: Re: VMS Transcriptions
At 14:43 09/10/02 +0000, Philip Neal wrote:
Not being a computer wizard I've stayed out of this thread, but I
think Bruce has put his finger on something. I think that Glen is
doing valuable work and if a group collaboration can help I want to
be part of it, but I too think that various conflicting requirements
are being bounced around.
1. The transcription alphabet. Glen has drawn attention to the
neglected topic of infrequent characters and seemingly significant
variants of frequent characters. This work depends on the use of a
stroke-based transcription: but proposals for collaborative input
to a file seem to involve the EVA scheme used by most of us.
If the database holds a core transcription as EVA, then as long as (1)
there is sufficient ligature information built into the transcription, and
(2) there is a unique remapping between the ligatured EVA and a glyph
representation that can be done "on the fly", no-one need actually know -
they can view/edit/update the transcription in EVA or in their preferred
These are not small conditions, however. :-/
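To make the two conditions concrete, here is a rough Python sketch of a reversible remapping. Everything in it is an assumption for illustration only: the parenthesised ligature notation, the table entries, and the glyph codes G01 to G07 are made up and are not part of EVA or of anyone's actual glyph scheme. The point is only that if every ligature-marked EVA unit maps to exactly one glyph code, the round trip is lossless.

```python
import re

# Hypothetical table: each EVA unit (a single letter or a parenthesised
# ligature) maps to exactly one made-up glyph code. Not a real scheme.
EVA_TO_GLYPH = {
    "(ch)": "G01",
    "(sh)": "G02",
    "o": "G03",
    "k": "G04",
    "e": "G05",
    "d": "G06",
    "y": "G07",
}

# The inverse only exists if no two EVA units share a glyph code.
GLYPH_TO_EVA = {glyph: eva for eva, glyph in EVA_TO_GLYPH.items()}
assert len(GLYPH_TO_EVA) == len(EVA_TO_GLYPH), "mapping is not one-to-one"

# A token is either a parenthesised ligature or a single lowercase letter.
TOKEN = re.compile(r"\([a-z]+\)|[a-z]")


def eva_to_glyphs(word):
    """Split a ligature-marked EVA word into glyph codes."""
    tokens = TOKEN.findall(word)
    if "".join(tokens) != word:
        raise ValueError(f"unparseable EVA: {word!r}")
    return [EVA_TO_GLYPH[token] for token in tokens]


def glyphs_to_eva(glyphs):
    """Rebuild the ligature-marked EVA word from glyph codes."""
    return "".join(GLYPH_TO_EVA[glyph] for glyph in glyphs)
```

With a table like this, glyphs_to_eva(eva_to_glyphs(word)) gives back the original word for anything the table covers, which is what "on the fly" viewing in either scheme would need.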
2. The master file. If various different individuals are going to
input their personal readings of disputed text on a Wiki principle,
there is a danger of the master file or master database getting
I don't advocate a Wiki for transcriptions - I think that could get out of
hand very quickly. :-O
However, Wiki for other information and resources would be very cool indeed
- we're making progress there, fingers crossed it'll be up and running
I would want to be able to extract a running
text using fairly simple scripts much as I can now extract e.g. the
Takahashi version of each line. I also do not want to be restricted
to custom scripts: I do quite a lot of work with Python scripts of
my own devising. I have no strong opinions about XML and its rivals,
but please keep it fairly simple.
XML lends itself readily to transformation into whatever format suits you,
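As a rough illustration of how simple the extraction could stay, here is a Python sketch using only the standard library. The XML layout (the line and reading elements, the id and by attributes), the transcriber codes A and B, and the sample text are all hypothetical placeholders; no actual file format has been agreed.

```python
import xml.etree.ElementTree as ET

# Hypothetical layout and placeholder readings, for illustration only.
SAMPLE = """\
<transcription>
  <line id="p1.1">
    <reading by="A">okedy.chedy</reading>
    <reading by="B">okedy.shedy</reading>
  </line>
  <line id="p1.2">
    <reading by="A">dy.oke</reading>
  </line>
</transcription>
"""


def running_text(xml_source, transcriber):
    """Return (line id, text) pairs for one transcriber's readings.

    Lines for which that transcriber has no reading are skipped.
    """
    root = ET.fromstring(xml_source)
    lines = []
    for line in root.iter("line"):
        for reading in line.findall("reading"):
            if reading.get("by") == transcriber:
                lines.append((line.get("id"), reading.text or ""))
                break  # take the first matching reading per line
    return lines


if __name__ == "__main__":
    for line_id, text in running_text(SAMPLE, "A"):
        print(line_id, text)
```

The same dozen lines would work from a custom script or from inside someone's own Python tools, so nobody is tied to one way of getting the text out.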
3. The word divisions. I think that any word division is better than
none, but I don't see an easy way to hold variant divisions of a line
in a file or database in a way which could be modified collaboratively.
Whether to have a pure collaborative database (i.e., with live changes) or a
moderated database (suggested changes get sent through to one or more
moderators), now that's a good question. :-)
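One possible way to hold variant divisions, sketched in Python under my own assumptions: store the line once without any word breaks, keep each contributor's division as a list of break offsets into it, and park suggested changes in a pending list until a moderator approves them. The class, field, and method names here are hypothetical, and the sample text is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    glyphs: str                                    # undivided text of the line
    divisions: dict = field(default_factory=dict)  # contributor -> break offsets
    pending: list = field(default_factory=list)    # suggestions awaiting approval

    def suggest(self, contributor, breaks):
        """Queue a proposed division; nothing changes until it is approved."""
        breaks = sorted(set(breaks))
        if any(b <= 0 or b >= len(self.glyphs) for b in breaks):
            raise ValueError("break offset outside the line")
        self.pending.append((contributor, breaks))

    def approve(self, index):
        """Moderator step: accept the pending suggestion at this index."""
        contributor, breaks = self.pending.pop(index)
        self.divisions[contributor] = breaks

    def divided(self, contributor, sep="."):
        """Render the line with one contributor's approved word breaks."""
        pieces, start = [], 0
        for b in self.divisions[contributor] + [len(self.glyphs)]:
            pieces.append(self.glyphs[start:b])
            start = b
        return sep.join(pieces)
```

For example, after line = Line("okedychedy"), line.suggest("A", [5]) and line.approve(0), calling line.divided("A") returns "okedy.chedy". A pure live-edit database would simply skip the pending list and write straight into divisions.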
Also: the *intention* of a transcription has to be agreed by all parties
involved. Is it to compile (a) the closest visual matches to the
glyphs/strokes as they appear on the page, or (b) the most likely set of
glyphs/strokes, also based on the adjacent stroke context?
Looking at the transcriptions we have already, ISTM as though individuals'
intentions may have varied from time to time. :-O
Cheers, .....Nick P.....