The myth of Sisyphus is well known, even to schoolboys fascinated by Greek mythology: for having defied the gods and put Death in chains, Sisyphus was condemned to push a huge boulder up a mountain slope; whenever he reached the top, the boulder would roll back down, and Sisyphus would have to push it up again and again, for all eternity. In this paper, I will argue that this myth is an apt allegory for the work of digital editors working with the TEI. When they decide to create a digital edition, they defy the limitations of traditional print editions and give their work a better chance of accessibility and longevity. But like Sisyphus, they have to pay a price for that. A scholar who publishes a print edition goes through the usual process of critical editing and prepares his work for publication; once it is printed, he no longer needs to care for it. No further funds need to be sought unless a second edition is planned; libraries all over the world will keep copies safe in their collections and make them reasonably available to potential readers; and the layout and typography of the work benefit from a centuries-old tradition that is not likely to be questioned during the scholar's lifetime.
In a word, the critical editor can move on to his next work without giving much thought to his published edition. The scholar who engages in a digital edition, on the other hand, is a modern Sisyphus: publishing a digital edition is an exacting, never-ending task! For instance:
I will review these issues, showing how scholars publishing digital editions find themselves in an absurd situation in which nothing guarantees that their work will remain available unless they maintain it throughout their careers, not to mention its uncertain future once they retire.
I will discuss potential ways to address these issues and to relieve the critical editor of his Sisyphean task, among them a better-defined status for published digital editions and the creation of public institutions offering digital editions the equivalent of a legal deposit.
Over the past few years, the Library for Research on Educational History (Bibliothek für Bildungsgeschichtliche Forschung, BBF) in Berlin, Germany, has offered scholars in the history of education a service for putting online editions of sources that have been transcribed and prepared but never published (in contrast to the more numerous born-digital editions and retrodigitizations of older works). These editions, ranging from the 18th to the 20th century, were mostly begun with a conventional printed publication in mind, which did not prove feasible. The texts are converted from legacy formats (usually an older version of Microsoft Word) to XML, applying a limited set of semantic TEI markup. The online editions generated from the XML version are in some cases accompanied by a printed volume with selections from the full corpus. As a research library, the BBF regards close cooperation with the research community as one of its main tasks: it provides the TEI expertise, while the scholars can concentrate on revising the text and preparing supplementary material such as indexes and annotations.
The micropaper will show the specific problems and pitfalls encountered in such a conversion process, and will also focus on how unpublished legacy transcriptions and editions of source material can be adapted to the TEI Guidelines and put online with limited resources. Examples will be taken from the correspondence of the educator Friedrich Fröbel and the educational philosopher Eduard Spranger.
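The conversion step described above can be illustrated with a minimal sketch. It assumes that the legacy text has already been exported from the word processor as a list of plain paragraphs; the helper name `paragraphs_to_tei`, the placeholder header statements and the sample letter are illustrative assumptions, not the BBF workflow. The sketch only shows the shape of a minimal TEI document with a header and a body of `<p>` elements.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def q(tag):
    """Return the tag name qualified with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def paragraphs_to_tei(title, paragraphs):
    """Wrap plain-text paragraphs in a minimal TEI document (hypothetical helper)."""
    tei = ET.Element(q("TEI"))

    # Minimal header: title, publication and source statements (placeholder wording).
    file_desc = ET.SubElement(ET.SubElement(tei, q("teiHeader")), q("fileDesc"))
    ET.SubElement(ET.SubElement(file_desc, q("titleStmt")), q("title")).text = title
    ET.SubElement(ET.SubElement(file_desc, q("publicationStmt")), q("p")).text = (
        "Unpublished draft."
    )
    ET.SubElement(ET.SubElement(file_desc, q("sourceDesc")), q("p")).text = (
        "Converted from a legacy word-processor file."
    )

    # Body: one <p> per non-empty source paragraph.
    body = ET.SubElement(ET.SubElement(tei, q("text")), q("body"))
    for paragraph in paragraphs:
        stripped = paragraph.strip()
        if stripped:  # skip empty paragraphs left over from the export
            ET.SubElement(body, q("p")).text = stripped
    return tei


if __name__ == "__main__":
    # Serialize with TEI as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", TEI_NS)
    doc = paragraphs_to_tei("Sample letter", ["Dear friend,", "", "Yours sincerely"])
    print(ET.tostring(doc, encoding="unicode"))
```

Semantic markup such as names or dates would be added in a later pass; starting from a structurally valid skeleton keeps that later work separate from the format conversion itself.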
The starting point of our DFG-funded project (2007-2009) was two series of digitized volumes: the „Allgemeine Deutsche Biographie“ in 55 volumes (ADB, publ. 1875-1912) and 23 volumes of the „Neue Deutsche Biographie“ (NDB, publ. since 1953), comprising some 47,000 articles and 88,000 persons mentioned, the latter in a separate non-XML database. The raw text carried typographically encoded features, which we used to restructure the text automatically. Each article (in the NDB) consists of functional parts such as genealogy, life, works, etc. The most challenging part was the realignment of articles to persons: we had to identify both persons with biographies and those merely mentioned in the text. We made heavy use of <persName>, <birth> and <death> tags to identify strings as persons. The TEI Lite schema was relaxed in order to allow those tags within the text. When proofreading the automatically encoded articles, these tags could be told apart more easily than <name> tags with several types. Nonetheless, the texts can be validated against the TEI Lite schema after a simple transformation. In addition, abbreviations and short titles have been identified. Almost all persons mentioned in both series have been manually identified in bibliographic authority files (Personennamendatei, PND). Places of birth and death are currently being aligned with geodatabases (namely OpenStreetMap). Both identifications are stored in concordance files, partly so that they can be maintained separately and partly to reduce the amount of code within the TEI-encoded texts and so ease readability. The next steps consist of a) „parsing“ the genealogy, to make relations between the persons mentioned more explicit, and b) breaking up the series/volume structure, so that articles can be ordered and edited by person.
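The idea of keeping identifications in a separate concordance, rather than in the encoded text, can be sketched as follows. This is an illustration under stated assumptions only: the `key` attribute on <persName>, the sample sentence, the local keys and the placeholder authority identifier are all invented for the example, and the project's actual concordance format is not described here.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI fragment: names and keys are invented for illustration.
SAMPLE = (
    '<p>Letter from <persName key="p001">A. Example</persName> to '
    '<persName key="p002">B. <hi>Sample</hi></persName>.</p>'
)

# Hypothetical concordance, maintained outside the text:
# local key -> identifier in an authority file (placeholder value).
CONCORDANCE = {"p001": "authority-id-placeholder-1"}


def resolve_persons(xml_fragment, concordance):
    """Return (name, key, authority_id) for every <persName> in the fragment.

    authority_id is None when the concordance has no entry for the key.
    """
    root = ET.fromstring(xml_fragment)
    resolved = []
    for pers in root.iter("persName"):
        # itertext() also collects text inside nested markup such as <hi>.
        name = "".join(pers.itertext()).strip()
        key = pers.get("key")
        resolved.append((name, key, concordance.get(key)))
    return resolved


if __name__ == "__main__":
    for name, key, authority_id in resolve_persons(SAMPLE, CONCORDANCE):
        print(name, key, authority_id or "not yet identified")
```

Because the text carries only a short local key, the concordance can be corrected or extended without touching the encoded articles.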
In publishing online two digitised biographical dictionaries, containing biographies of about 40,000 historical persons in 47,000 articles, a major challenge is making the data available. Besides making the material freely available online and feeding metadata into academic search engines and OA registries, we chose to create Linked Open Data from our biographical repository. Funded by PUBLINK (part of LOD2.eu), the AKSW helped us to provide biographical metadata in RDF. Since almost all persons are aligned with the German Name Authority File (PND, already part of LOD) and a majority of places of birth and death are identified in geodatabases (OpenStreetMap), we created a first data set using common ontologies (FOAF, DCMES) to express statements like „was born in“, „died in“, „knows“. In a second step we defined a set of mapping rules to CIDOC-CRM (actually using the OWL-DL variant Erlangen CRM). Motivation has been
The project Virtual Scriptorium St. Matthias aims to reunite electronically the codices, now scattered worldwide, from the library of the Benedictine abbey of St. Eucharius or St. Matthias in Trier. The project has been running since summer 2010 at the Stadtbibliothek and Stadtarchiv Trier and at the Center for Digital Humanities at the University of Trier. About 450 codices from the period between the eighth and fifteenth centuries will be digitized in three years. These codices cover a wide range of topics from various traditions. Besides theological and religious writings, there is a large number of Latin classics such as Cicero, Priscian, Sallust and Martianus Capella. A prestigious example of the inculturation of the ancient, pagan spirit is an illustrated edition of Aesop's fables. No other abbey possessed as many manuscripts of Hildegard of Bingen as St. Matthias. There are also three important specimens of the Decretum Gratiani, one of which includes 60% of all glosses ever written on this work. But the richly illustrated Trierer Apokalypse from Carolingian times may be the most famous of all these codices. The project will present an electronic catalogue that sums up the knowledge from older descriptions and combines it with a presentation of the digitized codices. In this context TEI is used as the standard for the XML description of manuscripts. The number of objects requires these descriptions to be synchronized with a dynamic database in order to correlate them with other digitized catalogues, editions and databases such as the PND. The results will be integrated into Manuscripta Mediaevalia and TextGrid. In this way the project will not only provide images and metadata but will also become part of a virtual working space in which further research and exploration will be possible, e.g. with concurrent TEI transcriptions of selected works. The project homepage www.stmatthias.uni-trier.de will be released on a trial basis on the first of August 2011.
The project is to be presented in a short talk and a poster. The poster will cover the project thoroughly, while the short talk will sketch the advantages and some practical limits of TEI in such an enterprise.
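As a rough illustration of what synchronizing TEI manuscript descriptions with a database might involve, the following sketch flattens a minimal <msDesc> into a record. The elements used (<msIdentifier>, <settlement>, <repository>, <idno>, <msContents>, <msItem>, <title>) belong to the TEI manuscript description module, but the shelfmark, the titles, the record layout and the function name `msdesc_to_record` are assumptions made for the example and do not reflect the project's data or schema.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical description: shelfmark and titles are placeholders.
SAMPLE_MSDESC = """
<msDesc xmlns="http://www.tei-c.org/ns/1.0">
  <msIdentifier>
    <settlement>Trier</settlement>
    <repository>Stadtbibliothek</repository>
    <idno>EXAMPLE-SHELFMARK</idno>
  </msIdentifier>
  <msContents>
    <msItem><title>Example work A</title></msItem>
    <msItem><title>Example work B</title></msItem>
  </msContents>
</msDesc>
"""


def msdesc_to_record(xml_text):
    """Flatten a TEI <msDesc> into a dict suitable for a database row."""
    root = ET.fromstring(xml_text.strip())
    ident = "tei:msIdentifier/tei:"
    return {
        "settlement": root.findtext(ident + "settlement", default="", namespaces=NS),
        "repository": root.findtext(ident + "repository", default="", namespaces=NS),
        "shelfmark": root.findtext(ident + "idno", default="", namespaces=NS),
        # One entry per item listed in the manuscript's contents.
        "titles": [
            t.text for t in root.findall("tei:msContents/tei:msItem/tei:title", NS)
        ],
    }


if __name__ == "__main__":
    print(msdesc_to_record(SAMPLE_MSDESC))
```

A record of this kind could then be matched against other catalogues by shelfmark, while the TEI file remains the authoritative description.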
Much research and practical effort has gone into the development and maintenance of a digital format that could form a stable foundation for texts in the digital age; the results of this work, in the form of the /Guidelines for Electronic Text Encoding and Interchange/, have been widely adopted in the community. While these can indeed serve as a foundation for a digital edition of a text, the publication of texts encoded in this way is still much less well understood and researched. The most common practice for digital publication today is to publish in some web-accessible form, with CD-ROM publication quickly becoming obsolete. In some rare cases, the XML source of the edition is also available. For a researcher, this situation is in some respects much worse than it was when critical editions were published only in print, since in most cases the texts can only be *browsed* online (and every site has its own idiosyncratic way of displaying and navigating a text) and cannot be physically owned. This not only invalidates many of the potential advantages of digital texts, namely making the digital edition available for machine-mediated analysis, but even denies the reader the most basic form of scholarly activity, that is, "active reading" or annotation of the text. What is needed here is the digital equivalent of a "college edition" of a text (and yes, we need to and can do much better than simply converting the text into ePub for consumption by electronic reading devices, but this option nevertheless deserves attention as such devices become more sophisticated and widely adopted). To remedy this situation, a new publication form for digital texts is proposed. At its core, this is a plain text format that contains only very few traces of markup but serves to make the textual content available to the reader.
The text is published through a distributed version control system, which allows the researcher to create branches and to annotate, edit or translate the text without losing the connection to the established digital edition and thus to all the other researchers who are working on this text. If there are differing editions of a text, these editions can be represented as 'branches' in this system, but the assumption is still that there is one privileged 'master' branch that corresponds to the reading text of a critical edition. In some respects, such a text is similar to a college or paperback edition of the text established in a critical edition: the reader knows that the text is based on a rigorous editing process and thus forms a safe foundation for further research, but she is not burdened with all the details that might get between her and the text, and at every stage of her work she can refer back to the critical edition if that becomes necessary. In other respects, it more closely resembles the interactive communities or "social networks" that have sprung up on the internet recently and already carry a significant amount of scholarly communication. There is, however, a critical difference between such services and the model proposed here: in the model described here, and implemented as a proof of concept as part of the Mandoku project, the researcher who publishes annotations in the form of additional 'branches' of the master branch of a text retains control and ownership of all these additions, which constitute an essential part of his scholarly work, without compromising the ability to share the results quickly with interested colleagues. Earlier versions of these experiments used the TEI XML format as the base for the texts, but it turned out to be a bad fit for the line-oriented model of texts used in version control systems. Currently, an enhanced version of the Emacs org-mode file format is used. This has the additional advantage of providing a flexible user interface as well as options for direct export to other popular text formats, such as HTML, PDF, OpenOffice XML and DocBook XML. A back converter to TEI XML, which will offer the option of rolling the different "branches" of a text maintained in the version control system back into one single file, is planned.
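Why a line-oriented plain text format suits version control better than XML kept on long lines can be illustrated with a small sketch. It uses Python's `difflib` as a stand-in for the line-based comparison a version control system performs; it is not the Mandoku implementation, and the sample lines and the function name `changed_lines` are invented for the example.

```python
import difflib


def changed_lines(master, branch):
    """Return (operation, old_lines, new_lines) for every non-identical region."""
    matcher = difflib.SequenceMatcher(a=master, b=branch, autojunk=False)
    return [
        (tag, master[i1:i2], branch[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Plain text, one unit of text per line: an annotation touches a single line.
master_plain = ["line one", "line two", "line three"]
branch_plain = ["line one", "line two, annotated", "line three"]

# The same content as XML kept on one long line: the whole line is reported.
master_xml = ["<p>line one line two line three</p>"]
branch_xml = ["<p>line one line two, annotated line three</p>"]

if __name__ == "__main__":
    print(changed_lines(master_plain, branch_plain))
    print(changed_lines(master_xml, branch_xml))
```

In the plain text case the difference between master and branch is confined to the annotated line, so independent branches that touch different lines can be combined without conflict; in the single-line XML case every change appears as a replacement of the entire line.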
Mandoku: cf. [http://www.mandoku.org]; see also "Mandoku – An Incubator for Premodern Chinese Texts – or How to Get the Text We Want: An Inquiry into the Ideal Workflow", in: Digital Humanities 2010. Conference abstracts. London, 2010, pp. 271-273.
Emacs org-mode: cf. [http://www.orgmode.org]