Markup of the Archimedes Palimpsest
(C. Blackwell, D.N. Smith)
This paper presents our work on the TEI-conformant XML texts that accompany the publication of the images of the Archimedes Palimpsest, and the challenges posed both by this complex document and by the philosophical principles guiding the project. We discuss our team's choices in implementing TEI-XML, with particular attention to the scheme of stand-off markup that associates Regions of Interest (ROIs) on images with text in the transcriptions. The scheme uses Canonical Text Services URNs to point to ROIs, at a granularity that ranges from the level of folio and column down through lines of text to individual characters.
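The association between text passages and ROIs can be illustrated with a minimal sketch. It is not the project's implementation: it assumes only the general CTS URN shape `urn:cts:namespace:work:passage`, and the `example` namespace, the work identifier, the folio.column.line passage scheme, the image names, and the fractional ROI coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CtsUrn:
    namespace: str
    work: str
    passage: str  # dot-separated citation, "" when the URN names a whole work

    @property
    def depth(self) -> int:
        """Number of citation levels (e.g. folio.column.line gives 3)."""
        return len(self.passage.split(".")) if self.passage else 0


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a CTS URN into its colon-delimited components."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    passage = parts[4] if len(parts) == 5 else ""
    return CtsUrn(parts[2], parts[3], passage)


@dataclass(frozen=True)
class Roi:
    image: str  # hypothetical image identifier
    x: float    # rectangle given as fractions of image width and height
    y: float    # (an assumption of this sketch)
    w: float
    h: float


def rois_within(links: dict, urn: str) -> list:
    """Return (urn, roi) pairs for a passage and every finer-grained passage inside it."""
    target = parse_cts_urn(urn)
    hits = []
    for key, roi in links.items():
        cand = parse_cts_urn(key)
        if (cand.namespace, cand.work) != (target.namespace, target.work):
            continue
        if (not target.passage
                or cand.passage == target.passage
                or cand.passage.startswith(target.passage + ".")):
            hits.append((key, roi))
    return hits


# Hypothetical stand-off links: one column and two individual lines.
LINKS = {
    "urn:cts:example:archimedes.method:3r.1": Roi("img-3r", 0.10, 0.05, 0.40, 0.90),
    "urn:cts:example:archimedes.method:3r.1.12": Roi("img-3r", 0.10, 0.40, 0.40, 0.03),
    "urn:cts:example:archimedes.method:3r.2.1": Roi("img-3r", 0.55, 0.05, 0.40, 0.03),
}

for key, roi in rois_within(LINKS, "urn:cts:example:archimedes.method:3r.1"):
    print(key, roi)
```

Because the links live outside the transcription, the same text can be addressed at folio, column, line, or character level simply by adding entries with deeper passage components, without touching the XML itself.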
The Archimedes Palimpsest Project is entirely focused on bringing the multi-spectral imaging of this precious artifact to the public; all other work exists only to support that goal. The electronic transcriptions of the Archimedean text of the Palimpsest are intended to give the public as much help as possible in interpreting the images. This goal has defined and limited our choices as we set about employing TEI XML to represent a very complex text.
At some time before 1220 CE, a vellum codex containing at least seven works of Archimedes--Equilibrium of Planes, Spiral Lines, The Measurement of the Circle, Sphere and Cylinder, On Floating Bodies, The Method of Mechanical Theorems, and the Stomachion--was unbound, the pages scraped clean (and shuffled in the process), rebound, and overwritten as a book of prayers in Greek, or Euchologion. The resulting palimpsest presents a challenge for the imaging scientists, but also for the would-be editors of an XML transcription in support of that imaging, since the text must support at least three overlapping hierarchies: the Euchologion foliation, the original foliation of the Archimedean text (with its columns and lines), and the organic structure of Archimedean works, books, propositions, and paragraphs.
The principal witnesses to the text of the Euchologion and its underlying works of Archimedes are, of course, the images. Any XML text will exist only as an aid to interpreting those images, and to that end we have set out to record the judgements of two careful readings: that of Johan Ludvig Heiberg, who studied the palimpsest between 1906 and 1915, and that of Reviel Netz and Nigel Wilson, who have studied the palimpsest since its rediscovery in 1998. Heiberg did not have the advantage of modern imaging, but he did have access to the codex before it suffered a century of damage and neglect.
The anticipated audience for this publication includes both human readers from a variety of backgrounds and machine-assisted processes of analysis. Both require a clear association between the electronic transcriptions and the images, including specific regions of interest on those images. Automated processing will also benefit from additional editorial intervention, such as the re-association of words divided across lines. To help both kinds of readers, the markup in the XML files should be as lightweight, generic, and human-readable as possible.
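The re-association of divided words can be sketched as follows. This is an illustration only, not the project's method: it assumes a diplomatic transcription held as a list of line strings, and it assumes that a trailing hyphen marks a word continued on the next line, which is a convention of this sketch rather than of the manuscript or of the project's markup. Each rejoined word keeps the numbers of the lines it spans, so the link back to the image lines is not lost.

```python
def rejoin_divided_words(lines, marker="-"):
    """Return (word, line_numbers) pairs, joining words split across line ends.

    A token that ends a line and carries the trailing `marker` is treated as
    the first part of a word continued on the following line.
    """
    words = []
    pending = ""          # accumulated fragments of a divided word
    pending_lines = []    # line numbers those fragments came from
    for n, line in enumerate(lines, start=1):
        tokens = line.split()
        for i, tok in enumerate(tokens):
            is_last = i == len(tokens) - 1
            if is_last and tok.endswith(marker) and len(tok) > len(marker):
                # Word continues on the next line: hold the fragment.
                pending += tok[: -len(marker)]
                pending_lines.append(n)
            elif pending:
                # First token after a division completes the held word.
                words.append((pending + tok, tuple(pending_lines + [n])))
                pending, pending_lines = "", []
            else:
                words.append((tok, (n,)))
    if pending:
        # Text ended mid-word; keep the fragment rather than drop it.
        words.append((pending, tuple(pending_lines)))
    return words


# Placeholder text, not a transcription of the palimpsest.
sample = ["the quick bro-", "wn fox"]
for word, line_numbers in rejoin_divided_words(sample):
    print(word, line_numbers)
```

Keeping the line numbers alongside each rejoined word lets a machine process search whole words while a human reader can still be pointed to every line on which the word physically appears.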