Geography in TEI/EpiDoc Document Collections
(Tom Elliott, Sean Gillies, Sebastian Rahtz, Hafed Walda, Gabriel Bodard)
Most projects treating historical documents eventually find that they must cope with personal and geographic names. Search, analysis and presentation benefit not only from systematic tagging and regularization of names and descriptions in the documents, but also from the description and contextualization of the persons, groups, places and spaces to which they correspond. For geographical features, expectations and opportunities continue to expand rapidly alongside the "neogeographical" revolution now transforming the World Wide Web through location-based search, map mashups and virtual globe software. The past decade has also seen the maturation of "Historical GIS" as a research methodology and community of practice positioned at the intersection of geographic information science and the past-oriented humanities. With its most recent revision, P5, the Text Encoding Initiative has kept pace with these developments by extending its markup guidance for names and descriptive phrases so that it also provides for the encoding of other data about people and places.
Scholarly analysis of the texts typically treated by a TEI/EpiDoc project demands not just the rigorous encoding of the geographic names found in documents and the historical places to which they correspond, but also the recording of finely nuanced observations on the documents' geographic contexts. At a minimum, these contexts include the place of creation or original display, the place of finding and the place of last observation. For any non-trivial collection of ancient documents, the proper encoding of this information inevitably engages both modern and ancient toponymy and runs the gamut from the familiar to the obscure. The degree to which the requisite geographic assertions and identifications can be made often varies from document to document, demanding mechanisms for the representation of sparse data as well as of uncertainty and speculation.
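As a rough illustration of the requirement just described, the following Python sketch records the three geographic contexts as optional values, each with an optional certainty, and writes only what is known into a TEI-style history fragment. The function name `build_history`, the tuple layout and the placement of a `cert` attribute on the place name are our assumptions for the example, not the encoding adopted by any of the projects discussed; the element names follow TEI P5 and should be checked against a project's own schema.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("tei", TEI_NS)


def tei(tag):
    """Qualify a tag name with the TEI namespace."""
    return f"{{{TEI_NS}}}{tag}"


def build_history(origin=None, found=None, observed=None):
    """Build a <history> element from optional (name, cert) pairs.

    Each argument is None when nothing is known (no element is written),
    or a tuple (place name, certainty), where certainty may itself be None.
    This layout is a hypothetical convention for the sketch.
    """
    history = ET.Element(tei("history"))

    def add_place(parent, name, cert):
        place = ET.SubElement(parent, tei("placeName"))
        place.text = name
        if cert is not None:
            # Assumption: certainty is expressed with a cert attribute.
            place.set("cert", cert)

    if origin is not None:
        origin_el = ET.SubElement(history, tei("origin"))
        add_place(ET.SubElement(origin_el, tei("origPlace")), *origin)

    for kind, value in (("found", found), ("observed", observed)):
        if value is not None:
            prov = ET.SubElement(history, tei("provenance"), {"type": kind})
            add_place(prov, *value)

    return history


if __name__ == "__main__":
    # A document whose origin is conjectural and whose last observation
    # is unrecorded: the sparse case simply omits the missing element.
    fragment = build_history(origin=("Aphrodisias", "low"),
                             found=("Aphrodisias", None))
    print(ET.tostring(fragment, encoding="unicode"))
```

Leaving unknown contexts out altogether, rather than writing empty elements, keeps "not recorded" distinct from "recorded with low certainty".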
This paper addresses the geographic aspects of several existing or emerging TEI/EpiDoc collections, including the Duke Databank of Documentary Papyri and the inscriptions of Aphrodisias, Roman Cyrenaica and Roman Tripolitania. The new P5 place encoding guidance supports best practice for projects like these: the development of collection-oriented gazetteers or geodatabases, populated through progressive tagging and iterative analysis of the documents and subsequently normalized or fleshed out through linkages with other authoritative gazetteers or even geospatial coordinates derived from fieldwork and map digitization. We will also consider modes for the use and presentation of this information in web contexts; in particular, we will address the use of the Pleiades project as an authoritative, web-facing gazetteer for Greek and Roman culture, as well as web feeds incorporating GeoRSS tagging for discovery, interoperability and revision notification. We will compare these techniques with those employed by a major onomastic resource, the Lexicon of Greek Personal Names. The LGPN team, in refashioning their database to employ TEI encoding methods for documenting names and persons, is also taking steps to control and express the geographic aspects of their data.
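To make the idea of a GeoRSS-tagged web feed concrete, the following Python sketch builds a single Atom entry for a gazetteer place and attaches a GeoRSS Simple point when coordinates are available. The function name `place_entry`, the example URI and the coordinate values are placeholders invented for the illustration; this is not the feed format actually published by Pleiades or by any of the collections named above.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("atom", ATOM_NS)
ET.register_namespace("georss", GEORSS_NS)


def place_entry(uri, title, updated, lat=None, lon=None):
    """Return an Atom <entry> describing one gazetteer place.

    A georss:point child is added only when both coordinates are known,
    so places without a surveyed location can still appear in the feed.
    """
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}id").text = uri
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = title
    # The updated timestamp is what lets consumers detect revisions.
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = updated
    ET.SubElement(entry, f"{{{ATOM_NS}}}link", {"href": uri})
    if lat is not None and lon is not None:
        # GeoRSS Simple writes a point as "latitude longitude".
        point = ET.SubElement(entry, f"{{{GEORSS_NS}}}point")
        point.text = f"{lat} {lon}"
    return entry


if __name__ == "__main__":
    # Placeholder values only; not real gazetteer data.
    entry = place_entry("https://example.org/places/1", "Example Place",
                        "2008-01-01T00:00:00Z", lat=10.5, lon=20.25)
    print(ET.tostring(entry, encoding="unicode"))
```

A feed reader or mapping client that understands GeoRSS can plot such an entry directly, while the `updated` value supports the revision notification mentioned above.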