An Encoding Model for Genetic Editions


About this Document

This document describes a draft encoding model for Genetic Editions and Genetic Editing. The document is the product of a Workgroup on Genetic Editions (chair: Fotis Jannidis), which is part of the TEI MS SIG (chairs: Elena Pierazzo, Malte Rehbein, Amanda Galley).

The workgroup's goal was to develop an Application Profile for the encoding of genetic editions and, in general, genetic phenomena. It is expressed as a TEI P5 conformant customization, integrating material from the existing TEI Guidelines, chiefly Chapter 11. Representation of Primary Sources and Chapter 12. Critical Apparatus, together with additional new material. It may eventually, at the end of the process described in the following section, constitute a free-standing new chapter of the Guidelines or remain a set of recommendations for how to customize the Guidelines, but that is a decision for the TEI Council.

The document reflects discussions held at a number of different meetings:

The document was subsequently extensively revised by Elena Pierazzo and Lou Burnard, for presentation at a panel held at the Annual TEI Members Meeting in November 2009; at a Genetic Edition workgroup held in Oxford (February 2010); at a Genetic Edition workgroup held in Würzburg (March 2010); and finally in Oxford in April of the same year for presentation to the TEI Council meeting in Dublin.

The workgroup was initially inspired by HNML, the ‘HyperNietzsche Markup Language’, and its later versions (GML, the Genetic Markup Language) produced by Paolo D'Iorio and colleagues from the HyperNietzsche project. We would like to thank Paolo D'Iorio for his invaluable contribution in the early stages of the work.

Major contributors to this document, in addition to those already cited, include Gregor Middel and Moritz Wissenbach, from the Faustprojekt at the University of Würzburg.

Work Plan

The planned evolution of this document and the encoding model it describes may be summarized as follows:

This draft document is publicly available for discussion and feedback from the community. The document source is maintained in the TEI subversion repository at http://tei.svn.sourceforge.net/viewvc/tei/trunk/genetic/ ; background information about the development of the proposals, together with associated materials, is hosted on the TEI Wiki at http://wiki.tei-c.org/index.php/Category:Genetic_Editions .

Conventions used

Although the entire document is a draft and therefore subject to change, some sections are less stable than others. In particular, when a section or a particular element requires further discussion or is considered an open problem, it is marked with an asterisk (*).

As required for TEI conformance, non-TEI elements are defined in a distinct non-TEI namespace. In the usage examples and throughout this document that namespace is mapped to the prefix ge:, while TEI elements are not marked by any namespace prefix.

1 Theoretical Framework

The genetic approach differs from other approaches to the study of texts because it aims not only to identify ‘what is on the page’, but also to reconstruct the process necessary to produce ‘what is on the page’.

The encoding model for Genetic Editing must therefore handle:

A Genetic Edition may be prepared by producing a full transcription of all extant witnesses, or by combining a full transcription of only one document (a base-text) with information derived from other witnesses by means of automatic collation.

Because our model aims to be independent of presuppositions associated with any particular theoretical framework, we begin by reviewing some typical dichotomies in editorial theory.

1.1 Fact vs. Interpretation

In German editorial theory there is a well known opposition between what is there in the source document, the record (Befund), and the interpretation of this phenomenon (Deutung). This opposition implies that there is a way to talk about the record without any interpretation. Yet at some possibly simplistic level, everything we say about a text is based on interpretation, particularly in the realm of genetic criticism. 1 At the same time, there is an obvious difference between the interpretation that some trace of ink is indeed a specific letter and the assumption that a change in one line of a manuscript must have been made at the same time as a change in another line because their effects are textually related (for example, the first change was to a rhyming word, which necessitated the second change). Therefore we propose to talk about differing levels of interpretation, thus differentiating between ‘what’s there’ (document/fact) and ‘how does it relate’ (text/interpretation).

1.2 Status vs. Process

When a scholar examines a written text, especially a manuscript, the object of the investigation is usually to discover the final result of that writing and rewriting process. The examination can take various forms: it can be approached from a codicological or documentary point of view, ‘photographing’ the resulting product, from a textual point of view, or from a genetic point of view, by trying to describe the flow of authoring.

The present proposal addresses both approaches: sections 3.1 Transcription of a document and 3.2 Textual Alterations are presented from the codicological or documentary point of view, while the other sections investigate the process of writing and re-writing of the text, thus constituting the purely genetic parts of the present proposal.

1.3 Document vs. Text

In Manuscript Studies (Editing, Codicology, Palaeography, Art History, History) the first level of enquiry is always the document, the physical support that lies in front of the scholar’s eyes.

To understand the text that is contained in the manuscript, a deep study of the manuscript itself is fundamental: the layout, the type of script, the type of support, the binding, and many other aspects are able to tell us about when, where, and why this particular text was composed. The text however represents a different level of enquiry: it is a construct, derived from the reading of the documents.

In the case of modern draft manuscripts scholars must give detailed consideration to the layout, the different stratifications of writing and the disposition of these in the physical space; all of these, together with an understanding of the text, are required to gain insight about the composition, time of revisions, and flow (flux) of the text. Furthermore, in some cases, we know that the kind of physical support used to record it not only influences but may also actually determine the text itself. For instance, the content and the length of letters are often determined by the size and quantity of the paper available to the writer; even more so for items such as postcards.

The TEI has traditionally prioritised the text level. Of the two possible views available to someone transcribing a primary source (text and document), the TEI privileges the text (hence Text Encoding Initiative). Such physical or topographical information as a typical TEI encoding provides is subordinate to the main structural encoding, whether because it is represented by empty elements (<pb/>, <lb/>, <cb/>) or attributes (<add place="">, <note place="">, or rend). The TEI thus reflects the not uncommon view that, while relevant, documents are somehow less relevant than the texts they embody; to use a bibliographical metaphor, texts are ‘substantial’ while documents are ‘accidental’.

However, for genetic editions, a focus on the document is crucial. In many cases, the only way to reconstruct the process of writing and re-writing which leads to a new text is to examine a specific document. We therefore propose to complement the existing text-focussed approach with a new encoding scheme focussed instead on the document.

We should then clarify the way we will use the following words:

1.4 Writing Acts vs. Text Stages

As noted in the previous section, the focus on purely documentary aspects in a genetic edition often serves to reconstruct the writing process, which in turn allows the editor to justify a genetic analysis, for example to identify which text passage was altered first or which textual variant predates another. The encoding of the writing process, it must be observed, is not simply a different point of view about the text in hand, but rather regards as relevant a set of information fundamentally different from that captured by a purely textual encoding.

When talking about alterations on a textual level, it is the stages of the text which are of major interest. These stages are the results of various alterations applied to the draft; each addition, deletion, substitution etc. is viewed from the perspective of the possibly new text state it yields. For example, the following substitution is well-defined from this perspective:

<subst>
 <del>landscape</del>
 <add>scenery</add>
</subst>
When considering the writing process however, a number of different possible writing acts can be imagined leading to the same change of state. For example:
<p>
 <del rend="strikethrough">landsca</del>pe <handShift new="#new_material"/>scenery
</p>

This example stresses the writing process and provides two additional pieces of information: firstly the fact that the author did not strike through the whole of the first word and secondly that a change in writing material happened just before the word scenery was written. In current encoding practice, which tends to favour the textual view, the two perspectives are often integrated, subordinating aspects of the writing process to their textual outcome or regularizing them so they do not interfere with a text-driven evaluation of the encoding.

<subst>
 <del>
  <hi rend="strikethrough">landsca</hi>pe</del>
 <add hand="#new_material">scenery</add>
</subst>

As the example shows, this integrative approach can be a viable one in many cases, but taking this pragmatic shortcut blurs a specific editorial distinction, maybe exactly the distinction one wants to emphasize (e.g. to separate Befund from Deutung). Furthermore, it makes it particularly hard to be as precise about the documentation of ‘what’s there’ as one is about the interpretation of ‘what’s the possible outcome’. Ideally in a genetic edition the editor would prefer to encode the two perspectives separately and align them later on instead of trying to integrate them right away, and thereby potentially neglect information of importance in their genetic analysis.

2 Aspects of Genetic Editions

Modern genetic editions describe the genetic process within one manuscript and over the course of two or more manuscripts, which are said to form part of a dossier, that is, the set of documents which a genetic editor considers as having contributed to the evolution of a particular text, including associated diaries, letters, etc. Usually a view of a manuscript as a single self-contained object is also offered, because the single-manuscript view provides the material basis for the relationships established between manuscripts. Therefore we propose to differentiate between the following aspects of a genetic edition:

Document level
  • Topological description: describes the layout of the text and provides the basis for rendering a text as a diplomatic transcription.
  • Textual Alterations: additions, deletions, substitutions and the like.
  • Grouping Modifications: groups changes at different locations of one document into sets which express the editorial assumption that these changes were undertaken in one stage.
Dossier level
  • Genetic Grouping: groups phenomena in more than one document, in order to express the editorial assumption that these phenomena are related in some way.
  • Genetic Relation: describes the genetic relation between different parts of a text, 2 or across several documents, as a series of steps on a path.
  • Comparison or Collation: expresses the differences between texts as the result of a comparison between documents.
Document and Dossier levels
  • Chronology, Date and Time: the encoding of the chronology of the text or parts of it in absolute or relative time.
  • Documenting Editorial Decisions: documents the arguments which are the basis for editorial decisions to encode the text in a specific way, including ways to express uncertainty and alternatives.
  • Text Stage: a reconstructable stage in the evolution of a text, represented by a document or by a revision campaign within one or more documents, possibly assigned to a specific point in time.

3 The document level

3.1 Transcription of a document

A document-based transcription is hierarchically organised in the following way:
  • document
    • Writing Surface (page, double page, folium, etc.)
      • zone
        • Text, lines or tables
We propose to introduce a new element, <ge:document>, to encode a document-based transcription, at the same level as the existing TEI text element. A full TEI document may thus comprise:
  • a TEI Header, containing metadata
  • a TEI facsimile element, containing and describing visual representations of a document
  • a <ge:document> element, containing a genetic transcription of a document
  • a TEI text element, containing an encoded version of the text constructed from the document.
The header and at least one of the other three components must be present. We do not discuss facsimile or text elements here; for these refer to the published TEI Guidelines.
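As a rough sketch of how these four components might sit together, the skeleton below shows a single TEI element holding all of them. The namespace URI bound to the ge: prefix (http://example.org/ns/genetic) is a placeholder assumed for illustration only, since this document does not state the actual URI; the header content and the image name page1.jpg are likewise invented.
<TEI xmlns="http://www.tei-c.org/ns/1.0"
  xmlns:ge="http://example.org/ns/genetic">
 <!-- metadata: always required -->
 <teiHeader>
  <fileDesc>
   <titleStmt>
    <title>Sample diary page</title>
   </titleStmt>
   <publicationStmt>
    <p>Unpublished sketch</p>
   </publicationStmt>
   <sourceDesc>
    <p>Imaginary diary</p>
   </sourceDesc>
  </fileDesc>
 </teiHeader>
 <!-- visual representation of the document -->
 <facsimile>
  <surface>
   <graphic url="page1.jpg"/>
  </surface>
 </facsimile>
 <!-- document-based (genetic) transcription -->
 <ge:document>
  <surface>
   <zone>
    <ge:line>Fed Birds in the park today.</ge:line>
   </zone>
  </surface>
 </ge:document>
 <!-- text constructed from the document -->
 <text>
  <body>
   <p>Fed Birds in the park today.</p>
  </body>
 </text>
</TEI>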
In the simplest case, a document contains one or more written surfaces, of various types (pages, for example). Each surface may contain one or more distinct zones of writing, each comprising one or more identifiable topographic lines. The following elements are used to represent these basic components:
  • document contains a document-centric transcription of a primary source, providing topographical information as well as transcription
  • surface defines a written surface in terms of a rectangular coordinate space, optionally grouping one or more graphic representations of that space, and rectangular zones of interest within it.
    type characterizes the element in some sense, using any convenient classification scheme or typology.
  • zone defines a rectangular area contained within a surface element.
    rotate indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent surface element as implied by the dimensions given in the msDesc section or by the coordinates of the surface itself. The orientation is expressed in arc degrees.
  • line contains the transcription of a topographic line in the source document
    type characterizes the element in some sense, using any convenient classification scheme or typology.
  • table contains text displayed in tabular form, in rows and columns.

Like a facsimile, a <ge:document> contains information about the written surfaces constituting a document. Because of this similarity, we would like to use the same elements (surface and zone) as proposed in the existing TEI scheme, although these place limits on what can be described. Specifically, the zone element as currently defined can represent only a rectangular area; it also lacks any way of stating the baseline applicable to any writing contained within it.

The size of the writing surface is defined by a set of Cartesian coordinates measured from the top left corner. The coordinates of all zones identified within the writing surface are given in terms of the same coordinate system, as further discussed in the TEI proposals for facsimile. It will often be the case that explicit dimensions for a manuscript page (expressed in mm for example) are also supplied in a msDesc element in the TEI Header, but this is not a requirement; in particular there is no assumption that the coordinate system defined by a surface maps to any particular external dimensions, nor that the coordinate systems of different documents necessarily correspond.

A surface element may contain any number of zone, graphic, <ge:line> or table elements. The graphic element is used to point to any graphic (non-textual) component forming part of the page, in the usual TEI manner. The zone element is used to delimit any contiguous section of writing which the encoder wishes to identify for some purpose.

Zones can be nested and grouped, and can also overlap. Their positioning with respect to the surface element is defined by coordinate values taken from the same coordinate system as the surface itself, measured from the top left corner. The element carries a rotate attribute which describes (in degrees) the orientation of the surface, relative to its normal orientation, when the content (writing, images) of that zone was supplied. Note that the mechanism aims to describe the process by which the content of a specific zone has been supplied (i.e. the author has physically rotated the writing surface) rather than the orientation of the writing.

Zones are arbitrarily defined by the encoder according to the layout of the writing surface and can make use of a standardised vocabulary (e.g. the top margin).

To overcome the inherent limitations of the existing zone and surface elements, we need to extend their capability to include the definition of arbitrary polygons, using an attribute such as svg:points, from the Scalable Vector Graphics (SVG) XML namespace. This would also provide a way of defining a baseline for the writing. This work is not yet fully elaborated.
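Since this extension is not yet elaborated, the following is only a guess at how it could look. It assumes that a zone may carry a points attribute in the SVG namespace, holding a list of x,y pairs in the coordinate space of the parent surface, in the manner of the points attribute of the SVG polygon element. Neither the attribute's placement nor its syntax is fixed by this proposal, and the coordinate values are invented.
<surface
  xmlns:svg="http://www.w3.org/2000/svg"
  ulx="0"
  uly="0"
  lrx="200"
  lry="300">

 <!-- hypothetical: a triangular zone in the top left corner of the page -->
 <zone svg:points="0,0 80,0 0,60">
  <ge:line>a note squeezed</ge:line>
  <ge:line>into the corner</ge:line>
 </zone>
</surface>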

The attribute stage is used to indicate the stage in a writing campaign to which this zone has been assigned by the encoder, as further discussed in 6.1 Stages/ Revision campaigns below.

Within a zone, individual lines of writing are usually, but not necessarily, distinguished using the <ge:line> element. Zones can also include table elements, as lines and tables represent the principal ways of organising texts on a surface. In the case of an interlineated text, interlinear writing can be treated as a line on its own (perhaps characterised by a type attribute) or as a textual addition, encoded with an add (see below 3.2 Textual Alterations).
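The two options for interlinear writing can be compared in a small sketch. The wording of the lines and the value interlinear for the type attribute are invented for illustration; this document does not prescribe a vocabulary for type.
<!-- option 1: the interlinear writing as a line of its own -->
<zone>
 <ge:line type="interlinear">brown</ge:line>
 <ge:line>The quick fox jumps</ge:line>
</zone>
<!-- option 2: the interlinear writing as a textual addition -->
<zone>
 <ge:line>The quick <add place="above">brown</add> fox jumps</ge:line>
</zone>
The first form records only where the writing sits on the surface; the second also asserts where it belongs in the text.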

In the following imaginary example, there are two main areas of writing, the diary entry (black ink) and another (supposedly later) annotation in blue ink.
Note that the diary entry forms a zone which itself contains two zones: one containing the date, and the other containing three lines about birds. The five lines of annotation in blue ink form another zone, to record which the diary page has been rotated 90° clockwise by the author. Using the elements discussed so far, the page might be transcribed as follows:
<ge:document>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <zone
    ulx="10"
    uly="43"
    lrx="185"
    lry="84"
    rotate="0">

   <zone>
    <ge:line rend="right"> 1 April 2009 </ge:line>
   </zone>
   <ge:line>Fed Birds in the park today.</ge:line>
   <ge:line>Might write an article about </ge:line>
   <ge:line>the Thick-billed Warbler. </ge:line>
  </zone>
  <zone
    ulx="9"
    uly="20"
    lrx="70"
    lry="60"
    rotate="90">

   <ge:line>Samaria is a Greek </ge:line>
   <ge:line>brand of water that</ge:line>
   <ge:line>comes from the natural</ge:line>
   <ge:line>springs of Stilos, in </ge:line>
   <ge:line>Crete </ge:line>
  </zone>
 </surface></ge:document>
On the other hand, if the encoder considers it inconvenient to mark the text within <ge:line> elements, one could encode the same text as follows:
<ge:document>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <zone
    ulx="10"
    uly="43"
    lrx="185"
    lry="84"
    rotate="0">

   <zone>1 April 2009 </zone>
   <lb/>Fed Birds in the park today.<lb/> Might write an article about
  <lb/> the Thick-billed Warbler. </zone>
  <zone
    ulx="9"
    uly="20"
    lrx="70"
    lry="60"
    rotate="90">

   <lb/>Samaria is a Greek <lb/> brand of water that <lb/> comes from the
     natural <lb/> springs of Stilos, in <lb/> Crete </zone>
 </surface></ge:document>
For comparison, here is a typical TEI transcription of the same page, focussing on its textual structure:
<div type="diary-entry">
 <dateline>
  <date when="2009-04-01"> 1 April 2009 </date>
 </dateline>
 <p>
  <lb/>Fed Birds in the park today.<lb/> Might write an article about <lb/>
   the Thick-billed Warbler. </p>
</div>
<div type="note" rend="rotated">
 <p>
  <lb/>Samaria is a Greek <lb/> brand of water that <lb/> comes from the
   natural <lb/> springs of Stilos, in <lb/> Crete</p>
</div>

Is it possible to combine both perspectives (documentary and textual views) within a single encoding? In general a document-based transcription, which is done page-by-page and possibly line-by-line, is almost certain to overlap with some part of the text-based structure. The cleanest solution may be to encode both structures separately, providing both a <ge:document> and a distinct text solution, perhaps using some form of external pointing to link the two, and minimizing redundancy of encoding by using XInclude. This option is further discussed below and also in the TEI Guidelines.
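A minimal sketch of the separate encoding follows. It assumes that the link is made with the standard TEI corresp attribute, pointing from the textual element to the topographic lines. The xml:id values are invented, and other pointing mechanisms would serve equally well.
<ge:document>
 <surface>
  <zone>
   <ge:line xml:id="l1">Fed Birds in the park today.</ge:line>
   <ge:line xml:id="l2">Might write an article about </ge:line>
   <ge:line xml:id="l3">the Thick-billed Warbler. </ge:line>
  </zone>
 </surface>
</ge:document>
<text>
 <body>
  <!-- the paragraph points back at the three lines that carry it -->
  <p corresp="#l1 #l2 #l3">Fed Birds in the park today. Might write an
     article about the Thick-billed Warbler.</p>
 </body>
</text>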

A further, and possibly simpler, approach is to apply to the textual elements such as p exactly the same kind of ‘flattening’ approach as has been applied to the <ge:line> elements in the preceding example. Instead of marking the textual paragraphs as full elements, we mark only their frontiers, using the standard TEI milestone element, with the addition of a spanning attribute, as follows:
<surface
  ulx="0"
  uly="0"
  lrx="200"
  lry="300">

 <zone
   stage="#stage1"
   seq="0"
   ulx="10"
   uly="43"
   lrx="185"
   lry="84">

  <zone>
   <milestone unit="date" spanTo="#endDate"/>1 April 2009 <anchor xml:id="endDate"/>
  </zone>
  <milestone unit="p" spanTo="#p2"/>
  <ge:line>Fed Birds in the park today.</ge:line>
  <ge:line> Might write an article about </ge:line>
  <ge:line>the Thick-billed Warbler.</ge:line>
 </zone>
 <zone
   stage="#stage2"
   ulx="9"
   uly="20"
   lrx="70"
   lry="60"
   rotate="90">

  <milestone unit="p" xml:id="p2" spanTo="#end"/>
  <ge:line>Samaria is a Greek</ge:line>
  <ge:line>brand of water that</ge:line>
  <ge:line>comes from the natural</ge:line>
  <ge:line>springs of Stilos, in</ge:line>
  <ge:line>Crete</ge:line>
  <anchor xml:id="end"/>
 </zone>
</surface>
Elements zone, <ge:line>, and cell may all contain a selection of the phrase-level elements which have been considered essential to a document-based transcription. These elements are all drawn from the following existing TEI classes:
  • model.pPart.transcriptional groups phrase-level elements used for editorial transcription of pre-existing source materials.
  • model.pPart.editorial groups phrase-level elements for simple editorial interventions that may be useful both in transcribing and in authoring.
  • model.hiLike groups phrase-level elements which are typographically distinct but to which no specific function can be attributed.
  • model.gLike groups elements used to represent individual non-Unicode characters or glyphs.
  • model.global groups elements which may appear at any point within a TEI text.
A subset of the elements provided by these classes needs to be defined. In addition, a mechanism is needed to constrain the content of the selected elements such that they contain only elements which are allowed within <ge:line>. The generic seg element can also be used to encode any semantic aspect of the transcribed document (e.g. dates or names), though it is recommended that such an approach is kept to a minimum. In case extensive semantic markup is to be applied, the textual perspective should be preferred or included alongside, as suggested above.
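For instance, a name and a date occurring in a topographic line might be flagged with seg as follows. The wording of the line and the type values are illustrative assumptions rather than part of the proposal.
<ge:line>Met <seg type="name">Paolo</seg> on <seg type="date">1 April 2009</seg>
</ge:line>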
The written surfaces of which a document is composed are not always homogeneous. In the following example, taken from the Walt Whitman archive, two pieces of newsprint have been glued to a piece of blue paper on which a poem is being drafted:
Figure 1. Image from http://www.whitmanarchive.org/resources/sleepers/duk.00258.001.jpg
The two pieces of newsprint might perhaps be regarded as special kinds of zone, but they are effectively new surfaces, since they might contain additional written zones themselves (such as the numbers in this case). We therefore propose a distinct element, <ge:patch>, which can appear within a surface, and behaves effectively like one, except that it contains specialized attributes to provide additional information.
  • patch contains a part of a written surface which was originally physically distinct but became attached to it at the time that one or more written zones were created.
    binder describes the method by which a patch is or was connected to the main surface.
    type characterizes the element in some sense, using any convenient classification scheme or typology.
    height height of the patch in mm
    width width of the patch in mm
Using this element, the Whitman draft above might be encoded as follows:
<surface>
 <zone>
  <ge:line>Poem</ge:line>
  <ge:line>As in Visions of — at</ge:line>
  <ge:line>night —</ge:line>
  <ge:line>All sorts of fancies running through</ge:line>
  <ge:line>the head</ge:line>
 </zone>
 <ge:patch
   type="newsprint"
   binder="glue"
   height="40"
   width="90">
Spring has
   just set in here, and the weather.... a steamer <zone>
   <ge:metaMark function="sequence">2</ge:metaMark>
  </zone></ge:patch>
 <ge:patch
   type="newsprint"
   binder="glue"
   height="35"
   width="90">
"The shores
   on either side of the Sound are... The In- <zone>
   <ge:metaMark function="sequence">3</ge:metaMark>
  </zone></ge:patch>
</surface>
The <ge:metaMark> element used in this example is further discussed below (3.2.3 Metamarks).
Surfaces can also be damaged, perforated, and cut. In such cases a combination of damageSpan and gap elements may be used.
  • damageSpan/ (damaged span of text) marks the beginning of a longer sequence of text which is damaged in some way but still legible.
    group assigns an arbitrary number to each stretch of damage regarded as forming part of the same physical phenomenon.
  • gap (gap) indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible.
Suppose that a piece of paper has been cut from the page encoded by a surface element, and that we can infer the dimensions of the missing piece. This could be encoded as follows:
<surface
  ulx="0"
  uly="0"
  lrx="200"
  lry="300">

 <zone>...</zone>
 <damageSpan
   hand="#author"
   spanTo="#zoneEnd"
   agent="knife_or_scissors"
   group="1"
   extent="3x5"
   unit="cm"/>

 <zone
   ulx="9"
   uly="20"
   lrx="70"
   lry="60"/>

 <anchor xml:id="zoneEnd"/>
</surface>
In the example an empty zone element has been provided in order to give the coordinates of the missing piece of surface, assuming that those are reconstructible. If a full page is missing, damageSpan can be used within <ge:document>, perhaps in conjunction with a gap element.
<ge:document>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <zone>...</zone>
 </surface>
 <damageSpan spanTo="#p3"/>
 <gap extent="1" unit="folio">
  <desc>Stub of a missing folio</desc>
 </gap>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300"
   xml:id="p3">

  <zone>...</zone>
 </surface></ge:document>
Now suppose that the missing folio is by chance found somewhere; in this case it is likely that the editor might want to encode the two statuses of the document, before and after the damage, tracing the genesis of the document as well as the genesis of the text. In this case, the same mechanism used to describe the genetic history of the text can be used to describe the genesis of the document (see below 6.1 Stages/ Revision campaigns): a stage attribute linked to a <ge:stageNote> element can identify the different states in the history of the document.
<ge:document>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <zone>...</zone>
 </surface>
 <damageSpan spanTo="#P3" stage="#stage2"/>
 <gap extent="1" unit="folio" stage="#stage2">
  <desc>Stub of a missing folio</desc>
 </gap>
 <surface corresp="folio.xml#p1" stage="#stage1"/>
 <surface corresp="folio.xml#p2" stage="#stage1"/>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300"
   xml:id="P3">

  <zone>...</zone>
 </surface></ge:document>
In this example, we assume that the two recovered pages (i.e. both sides of the missing folio) have been encoded in some separate XML document which can be obtained from the URL folio.xml. This could be referenced via the corresp attribute, transcluded by means of a ptr or included by XInclude. The content could also simply be encoded within the same document, as content for the otherwise empty surfaces. Note that the damageSpan and gap belong to a second stage in the life of the document; the descriptions of #stage1 and #stage2 can be found in the header, within the <ge:stageNote> section.
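The XInclude alternative might look roughly as follows. This sketch assumes that p1 and p2 are xml:id values on surface elements inside folio.xml, so that the XPointer shorthand form can resolve them, and that those included surface elements carry their own stage attributes. Neither assumption is confirmed by the example above.
<ge:document xmlns:xi="http://www.w3.org/2001/XInclude">
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300">

  <zone>...</zone>
 </surface>
 <damageSpan spanTo="#P3" stage="#stage2"/>
 <gap extent="1" unit="folio" stage="#stage2">
  <desc>Stub of a missing folio</desc>
 </gap>
 <!-- each include is replaced by the surface it points to in folio.xml -->
 <xi:include href="folio.xml" xpointer="p1"/>
 <xi:include href="folio.xml" xpointer="p2"/>
 <surface
   ulx="0"
   uly="0"
   lrx="200"
   lry="300"
   xml:id="P3">

  <zone>...</zone>
 </surface>
</ge:document>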

3.2 Textual Alterations

Traces of authorial alteration (correction, addition, deletion, etc.) are frequently found within a single document, and may also be inferred when different documents are compared. It is however an open question as to whether inter-document discrepancies at the dossier level should be regarded in the same way as intra-document alterations. If two witnesses are collated, we may observe that a word present in one is missing from the other: does it necessarily follow that this is an addition or a deletion, which we would not hesitate to mark with an add or del tag if we are transcribing a single manuscript? We return to this question below.

In this section we discuss elements introduced for the markup of alterations at the document level, within a single document, complementary to the elements already provided for this purpose by the TEI scheme. We discuss specifically:
  • ‘meta-marks’, that is a kind of authorial markup present in the source and indicating how it should be read;
  • rewritings, where a passage has been rewritten to fix or clarify it;
  • deletions, where a passage has been struck through to indicate that it has been removed, or where a deletion has itself been cancelled;
  • transpositions, where passages have been reorganized or resequenced.
The TEI already provides the following basic elements for transcription, which constitute the model.pPart.transcriptional class:
  • add (addition) contains letters, words, or phrases inserted in the text by an author, scribe, annotator, or corrector.
  • app (apparatus entry) contains one entry in a critical apparatus, with an optional lemma and at least one reading.
  • corr (correction) contains the correct form of a passage apparently erroneous in the copy text.
  • damage contains an area of damage to the text witness.
  • del (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector.
  • orig (original form) contains a reading which is marked as following the original, rather than being normalized or corrected.
  • reg (regularization) contains a reading which has been regularized or normalized in some sense.
  • restore indicates restoration of text to an earlier state by cancellation of an editorial or authorial marking or instruction.
  • sic (Latin for thus or so) contains text reproduced although apparently incorrect or inaccurate.
  • subst (substitution) groups one or more deletions with one or more additions when the combination is to be regarded as a single intervention in the text.
  • supplied signifies text supplied by the transcriber or editor for any reason, typically because the original cannot be read because of physical damage or loss to the original.
  • unclear contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source.
The present proposals extend this list by adding the following elements to the same class:
  • mod represents any kind of modification identified within a text at a documentary level.
    rend (rendition) indicates how the element in question was rendered or presented in the source text.
    type characterizes the element in some sense, using any convenient classification scheme or typology.
  • modSpan/ represents any kind of modification identified within a text at a documentary level, where this extends over other XML markup constructs in the document.
    spanTo indicates the end of a span initiated by the element bearing this attribute.
  • metaMark contains any textual or graphical mark in a written text intended to signal how the text should be read and not forming part of the text itself.
    function describes the function (e.g. add, delete, alternate) of the mark.
  • used/ a passage of text which has been marked as used, usually meaning that it has been transcribed to a fair copy.
    spanTo indicates the end of a span initiated by the element bearing this attribute.
  • undo/ points to any marked-up intervention in a text which has subsequently been marked as to be cancelled or undone.
    spanTo indicates the end of a span initiated by the element bearing this attribute.
  • redo/ points to a marked-up intervention in a text which has subsequently been marked for a second time in a different way.
    spanTo indicates the end of a span initiated by the element bearing this attribute.
  • rewrite contains a sequence of text which has been rewritten by the author, for example by over-inking, to clarify or fix it.
    cause documents the presumed cause of the repeated act of writing.
  • transposeGrp supplies a list of transpositions indicated at some point in the text, typically by means of metamarks.
  • transpose describes a single textual transposition as an ordered list of at least two pointers specifying the order in which the elements indicated should be re-combined.

Most of these elements imply a certain level of semantic interpretation; for instance the usage of the add element to encode, say, interlinear insertions involves a decision that the interlinear text has been deliberately inserted rather than simply misplaced. As discussed above (see 1.4 Writing Acts vs. Text Stages), the use of these elements when transcribing from a documentary perspective is a pragmatic shortcut. In cases where it is felt desirable to keep the recording of ‘what is on the page’ entirely separate from ‘what is the editor’s interpretation’ (see 1.1 Fact vs. Interpretation), we provide two generic elements, <ge:mod> and <ge:modSpan>, which can be used to record any kind of modification identified in the document. These elements can be categorised by means of their type attribute, and visual aspects of their appearance can be described by means of the rend attribute, but they provide no further interpretation of the function or intention of the passage so marked up.

Whether such a modification, for instance a struck-out passage, is to be interpreted as a deletion or as some other phenomenon (e.g. as being already used) should be expressed using the other, more semantically motivated, elements, which function at a different level of description.

3.2.1 Additions, fixations and clarifications

A writer may sometimes rewrite material a second time without significant change and in the same place. We consider this a distinct activity from addition as usually defined because no new textual material results but the status of existing material changes. We distinguish two variants of this: fixation where the first version was a tentative draft which is subsequently fixed, for example by inking it over; and clarification, where the first version was badly written and has been rewritten for clarity. The element <ge:rewrite> is provided to cover both cases.

In this simple example, taken from the papers of Henrik Ibsen, the writer wrote the word skuldren hastily, and then returned to it to make the letter l larger and clearer:
Figure 2. Image from a ms of Peer Gynt, Collin 2869, 4°, I.1.1, the Royal Library of Copenhagen
We might transcribe this word as follows:
<ge:line>... Sku<ge:rewrite cause="unclear">l</ge:rewrite>dren </ge:line>
The following example, taken from a manuscript of Jane Austen's Sanditon, shows a rewriting where a pencilled passage has been fixed with ink, with some modification:
Figure 3. Image from page 70 of the Sanditon manuscript
In this example, Austen takes the fixation as an opportunity to revise the text previously written, changing the pencilled could but get a young to the inked could get a young: she writes the inked get on top of the pencilled but and strikes through the pencilled get in black ink. A simple way of encoding this might be as follows:
<ge:rewrite cause="fix" hand="#ja2" stage="#s1">Now, if we could <subst stage="#s1">
  <add>get</add>
  <del>but</del>
 </subst>
 <del stage="#s1" rend="overstrike">get</del> a young Heiress</ge:rewrite>
where the stage attribute groups the operations that belong to the rewriting stage, assuming an implied previous stage (the first layer of writing). If the first layer needs to be addressed independently, another pointer can be added to the stage attribute:
<ge:line stage="#s0">
 <ge:rewrite cause="fix" hand="#ja2" stage="#s1">Now...</ge:rewrite></ge:line>
where #s0 and #s1 both point to the declaration of such stages in the header, the first one conventionally representing the first layer of writing and the second the rewriting stage.
A single rewrite may not be sufficient, and it may be that the document becomes almost unreadable as a result of repeated clarification. In the following example, we can distinguish at least two attempts to write the letters er in the word bægerklang:
Figure 4. Image from http://www.emunch.no/tei-mm-2008/ms.html
We might encode this by nesting the rewrite element as follows:
<ge:line>ved Bæg<ge:rewrite cause="unclear" stage="#stage2">
  <ge:rewrite cause="unclear" stage="#stage1">er</ge:rewrite></ge:rewrite> ...</ge:line>
The stage attribute used here is discussed further below (6.1 Stages/ Revision campaigns).

Metamarks and other markup-like strokes can also be inked over for the same purpose as the fixation or clarification of text passages. For instance, in a draft version of Goethe’s Faust, a passage was struck through once in pencil during one revision and then again in ink during a later revision, presumably to fix the deletion.

Figure 5. Fixation of a deletion in Goethe’s Faust

We propose an element redo (the opposite of undo, see 3.2.7 Undoing alterations), which can be used to encode this process as follows:

<ge:line>
 <ge:redo
   xml:id="redo_3"
   hand="#g_t"
   target="#mod_1"
   cause="fix"/>

 <ge:modSpan
   xml:id="mod_1"
   rend="strikethrough"
   spanTo="#anchor_1"
   hand="#g_bl"/>
Ihr hagren, triſten, krummgezog<ge:mod rend="strikethrough">nen</ge:mod>ener Nacken</ge:line>
<ge:line>Wenn ihr nur piepſet iſt die Welt ſchon matt.<anchor xml:id="anchor_1"/></ge:line>

3.2.2 Deletions and marked as used

In general, deletion in a source is marked using the del or delSpan element. However, it is useful to distinguish cases where a passage has been ‘indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector’ (TEI P5, s.v. del) from cases where a passage has been struck through or otherwise marked as having been used or copied to another location. In this latter case, the author does not intend to suppress the content, but only to mark that it has been transferred or reused. The element <ge:used> is provided to mark this kind of ‘deletion’.

The following page from the Walt Whitman archive has been crossed through to indicate used material:
Figure 6. Page from http://www.whitmanarchive.org/resources/sleepers/20051105_0650.jpg
This page contains many internal deletions, but these should be distinguished from the ‘deletion’ signalled by the large cross, which actually shows that the page has been transferred or re-used, not deleted.
Material marked as re-used in this way often spans more than one zone or line. For that reason, the <ge:used> element is a spanning element, indicating the end of the used area by means of a spanTo attribute. We might encode the above page as follows:
<surface>
 <ge:used rend="cross" spanTo="#X2"/>
 <zone>
  <ge:line rend="underline">The Poet</ge:line>
  <ge:line>
   <del rend="strikethrough">I think</del> His sight is
     the</ge:line>
  <ge:line> sight of the ? and</ge:line>
  <ge:line>has sent the instinct of the</ge:line>
  <ge:line>? dog</ge:line>
 </zone>
 <zone>
  <ge:line>I think <ge:rewrite>ten</ge:rewrite> million</ge:line>
<!-- ... -->
  <ge:line>well; those <subst>
    <del rend="strikethrough">supple-fingered gods</del>
    <add>journeymen divine.</add>
   </subst></ge:line>
  <anchor xml:id="X2"/>
 </zone>
</surface>

3.2.3 Metamarks

By metamark we mean marks such as numbers, arrows, crosses, or other symbols introduced by the writer into a document expressly for the purpose of indicating how the text is to be read. Such marks thus constitute a kind of markup of the document, rather than forming part of the text.

Unlike marginal notes or other additions to the text, meta-marks indicate a deliberate alteration of the writing (e.g. ‘move this passage over there’). We also consider as metamarks dates introduced to mark the beginning of a manuscript or a revision, but not forming part of it.

The <ge:metaMark> element carries a function attribute which specifies the function of the meta-mark and a target attribute which points to the element or elements concerned.

The following example is taken from Kundige bok 2, a 15th-century legal book from the city of Göttingen, containing regulations of everyday life issued by the city council:
Figure 7. Malte's example
In the second paragraph, the sentence beginning ‘Ock en schullen de bruwere...’ was first written along with the word lege ("read") in the left hand margin, functioning as a metamark to indicate that this sentence forms part of the regulations. A further sentence was then added, while at some later stage or stages the text and also the metamark were deleted. We might encode this as follows:
<del>
 <ge:metaMark function="flag" target="#s1">lege</ge:metaMark>
 <s xml:id="s1">Ock en schullen de bruwere des hilgen dages nicht over
 <lb/>setten noch uppe den stillen fridach bruwen.</s>
 <add>
  <s>Noch nymande <lb/>over setten, se en sehin denne erst, dat uppe den
     bonen <lb/>neyn stro noch, huw noch flaß ligghe, by pine eyner
     halven <lb/>roden, deme bruwere so wol alse dem bruwheren to
     murende.</s>
 </add>
</del>
Here are some further examples showing the use of this element, taken from the manuscript drafts of Thomas Moore's Lalla Rookh (1817). The first shows a simple use of a metamark used to take stock of the progress of composition:
Figure 8. Lalla Rookh

At regular points throughout the various drafts of the work, a number occurs, usually in the right margin (in this instance, "100"). These numbers result from the author counting the number of verse lines he has composed to the given point, and are not part of the text, but represent a stage at which Moore is taking stock of the progress of his composition.

<surface>
 <zone>
<!-- main zone -->
  <ge:line>Be this she cried &amp; wing’d her flight</ge:line>
  <ge:line>My offering at the Gates of Bliss</ge:line>
  <ge:line>
   <del>Fully to know the odours <gap extent="1" unit="word" reason="illegible"/>
   </del></ge:line>
  <ge:line>
   <del>Tho foul to heaven the vapour went</del></ge:line>
  <ge:line>
   <del>From vulgar</del>
   <add>common</add>
   <del>victors</del>, blood like this</ge:line>
  <ge:line> Shed out for freedom, flows so bright. <ge:metaMark function="count">100</ge:metaMark></ge:line>
  <ge:line> It would not stain the purest <subst>
    <del>fount</del>
    <add>rill</add>
   </subst> .</ge:line>
  <ge:line>“That sparkles thro the fields of light. <del>
    <ge:metaMark function="count">100</ge:metaMark>
   </del></ge:line>
  <ge:line> Behold her in the skies again —</ge:line>
  <ge:line>
   <subst>
    <del>But</del>
    <add>And</add>
   </subst>, tho so fleet her pinions bore</ge:line>
  <ge:line> The spirit of the Warriors slain,</ge:line>
  <ge:line>Now reach’d &amp; pass’d the gates before her</ge:line>
 </zone>
 <zone>
<!-- left zone -->
  <ge:line> Tho foul too oft the</ge:line>
  <ge:line>tears that still</ge:line>
  <ge:line>Tho foul the droppings <add>weepings</add></ge:line>
  <ge:line>that distil</ge:line>
  <ge:line>Tho foul the tears that</ge:line>
  <ge:line>oft distil</ge:line>
  <ge:line>From glory’s faulchion’</ge:line>
 </zone>
</surface>
Figure 9. Lalla Rookh 2
This example demonstrates the use of a common proof-correction mark: in the left margin of the page, adjacent to a group of three cancelled lines of verse, the word "stet" is written. "Stet", a Latin word meaning "let it stand", is commonly used by authors, editors and proofreaders to indicate that a previous alteration should be disregarded. In this instance, Moore is indicating that the three deleted verse lines should stand, as evidenced by their appearance in the first printed edition of Lalla Rookh. The word "stet" does not form part of the text, but rather declares that a certain function be performed upon the text.
<surface>
 <zone>
  <ge:line>
   <gap extent="1" unit="word" reason="illegible"/>
   <del>in his light</del>
   <add>within</add> eyelids, within the spray</ge:line>
  <ge:line>From Eden’s fountain, when it lies</ge:line>
  <ge:line>
   <hi rend="underline">On that blue</hi>
   <del>before that</del> flower, which, Brahmins say</ge:line>
  <ge:line> Can only <add>blooms nowhere but</add> bloom in
     Paradise,</ge:line>
  <ge:line>
   <del xml:id="del1">“Nymph of a bright, fair but erring line!</del></ge:line>
  <ge:line>
   <ge:metaMark function="undo" target="#del1 #del2 #del3">stet</ge:metaMark>
   <del xml:id="del2">(He gently <add>gently</add> he said) one hope
       is thine</del></ge:line>
  <ge:line>
   <del xml:id="del3">One hope (he gently said) is thine</del></ge:line>
 </zone>
</surface>
Other examples of <ge:metaMark> can be seen in marked-up proofs such as the following, taken from the Walt Whitman archive:
Figure 10. http://www.whitmanarchive.org/resources/sleepers/loc.00295.jpg

3.2.4 Alternative Readings

Figure 11. Lalla Rookh 3
In this example two alternative readings are provided, without either one being prioritised or subordinated. While the author apparently first composed the line "Alone before his native river -", at some later point he entertained the possibility of using the word "beside" instead of "before". In the context of this manuscript, there is no indication of which word Moore favours, so the status of these words as possible alternative readings needs to be encoded. The evidence of the first edition of Lalla Rookh shows that the word "beside" was chosen, but for the purposes of encoding this manuscript, the facility to encode two equally possible alternative readings needs to be available.
<zone>
 <ge:line>Alone <seg type="alternative" xml:id="alt1">before</seg>
  <add place="above" type="alternative" xml:id="alt2">beside</add> his
   native river —</ge:line>
 <alt targets="#alt1 #alt2" mode="excl" weights="0.5 0.5"/>
</zone>

3.2.5 Transpositions

Metamarks are commonly used in the context of transposition, that is, the moving of words or blocks by the author to a different position using arrows, asterisks or numbers or other metamarks. One possible approach (used, for instance in HNML) would be to regard such transpositions as a special kind of substitution, and actually to represent the result of the transposition indicated by the metamarks in the encoding, for example by considering the segment previous to the transposition as deleted, and substituted by the one after the transposition.

Our recommendation is to record the actual state of the witness, but in such a way as to facilitate its reorganization as a distinct processing step. We propose to represent the re-alignment of transposed blocks or segments by means of a stand-off mechanism. The elements <ge:transposeGrp> and <ge:transpose> are provided for this purpose. Consider the following extract from an Ibsen manuscript:
Figure 12. Extracted from http://www.emunch.no/tei-mm-2008/ms.html
Here the underlined numbers 1 and 2 indicate that, although the word bör precedes the word hör in the text, the order of the two words should be reversed. We may encode this as follows:
<ge:line>
 <seg xml:id="ib01">bör</seg>
 <ge:metaMark
   rend="underline"
   function="transposition"
   target="#ib01"
   place="above">
2.</ge:metaMark>
og <seg xml:id="ib02">hör</seg>
 <ge:metaMark
   rend="underline"
   function="transposition"
   target="#ib02"
   place="above">
1.</ge:metaMark></ge:line>
<ge:transposeGrp>
 <ge:transpose>
  <ptr target="#ib02"/>
  <ptr target="#ib01"/></ge:transpose></ge:transposeGrp>
Note the use of the generic seg element to identify the sections of text being transposed. When (as in the following example) the whole of a line is to be transposed, there is no need to delimit the sections concerned:
Figure 13. Extracted from http://www.emunch.no/tei-mm-2008/ms3.html
<ge:line xml:id="ib3">
 <ge:metaMark function="transposition" place="margin-left">2.)</ge:metaMark> thi da er du med Himmelen i
Pagt; — </ge:line>
<ge:line xml:id="ib4">
 <ge:metaMark function="transposition" place="margin-left">1.)</ge:metaMark> da kan du Folkets Jøkelhjerter tine;</ge:line>
<ge:transposeGrp>
 <ge:transpose>
  <ptr target="#ib4"/>
  <ptr target="#ib3"/></ge:transpose></ge:transposeGrp>
When a transposition is made, the whole element indicated is understood to be moved, not just its contents. In the above example, the metamarks are thus understood to be moved along with the lines to which they apply.

If the area to be transposed overlaps with some other kind of markup, the generic milestone element can be used instead of seg or any other existing element.

One or more transposeGrp elements may be supplied either embedded within the text or in the profileDesc of the header, depending on local preference. Each transposeGrp can contain one or more transpose elements, each of which defines a single transposition.
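
The "distinct processing step" mentioned above can be illustrated with a small script. The following is only a minimal sketch in Python, not part of the proposed model: the namespace URI bound to the ge prefix is a placeholder, the function name apply_transpositions is hypothetical, and the sketch assumes that all elements named in one <ge:transpose> share the same parent. It moves whole elements (as stipulated above) into the positions they jointly occupy, in the order given by the pointers, and leaves any intervening text where it was.

```python
import xml.etree.ElementTree as ET

GE = "urn:example:genetic-editions"  # placeholder namespace URI (assumption)
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified version of the second Ibsen example (metamarks omitted).
SAMPLE = f"""
<div xmlns:ge="{GE}">
  <ge:line xml:id="ib3">thi da er du med Himmelen i Pagt;</ge:line>
  <ge:line xml:id="ib4">da kan du Folkets Jøkelhjerter tine;</ge:line>
  <ge:transposeGrp>
    <ge:transpose>
      <ptr target="#ib4"/>
      <ptr target="#ib3"/>
    </ge:transpose>
  </ge:transposeGrp>
</div>
"""


def apply_transpositions(root):
    """Reorder, in place, the elements named by each ge:transpose."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    for transpose in root.iter(f"{{{GE}}}transpose"):
        # Elements in the order in which they should be re-combined.
        wanted = [by_id[p.get("target").lstrip("#")] for p in transpose.iter("ptr")]
        parent = parent_of[wanted[0]]
        if any(parent_of[el] is not parent for el in wanted):
            raise ValueError("this sketch only handles sibling elements")
        # The slots are the positions currently occupied by the targets.
        slots = sorted(list(parent).index(el) for el in wanted)
        # Text following a slot stays with the slot, not with the moved element.
        tails = [parent[i].tail for i in slots]
        for slot, el, tail in zip(slots, wanted, tails):
            parent[slot] = el
            el.tail = tail


root = ET.fromstring(SAMPLE)
apply_transpositions(root)
for line in root.iter(f"{{{GE}}}line"):
    print(line.get(XML_ID), line.text)
```

Because the transposition is resolved from the stand-off pointers alone, the transcription itself continues to record the order found in the witness.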

3.2.6 Substitution

In the current model for the TEI subst element, one or more additions and deletions may be combined if they are considered as representing a single editorial act, a substitution. Without extension, this model cannot accommodate cases such as the following example, taken from Thomas Moore's Lalla Rookh.
Here the word pondering is deleted and the phrase she mus'd is added, while the word thus remains unchanged. It seems appropriate to treat all of this as a single substitution. This would require a modification to the content model of subst so as to permit text along with other members of model.pPart.transcriptional, so that this example could be encoded as follows:
<ge:line>While <subst>
  <del>pondering</del> thus <add>she
     mus'd</add>
 </subst>, her pinions fann'd</ge:line>

3.2.7 Undoing alterations

In some cases an author indicates that an alteration is itself to be altered: for example, a struck through passage may be restored via a dotted underlining, or the underlining of a passage may be deleted by a wavy line.

The TEI provides an element restore for one specific kind of alteration to an alteration, namely the undoing of a deletion. We propose a more general element, <ge:undo>. The element <ge:undo> usually encloses the element (e.g. the add, del etc.) to be undone. If it appears within such an element, the implication is that only this part of the parent element has been cancelled. For example, in this passage taken from Giacomo Leopardi's Zibaldone (p. 3595), the phrase si rechi á was underlined word by word, and then the underlining of the word si was cancelled.
This could be encoded as follows:
<ge:line> che e’ <hi rend="underline">
  <ge:undo spanTo="#x2"/>si <anchor xml:id="x2"/> rechi a’</hi>
 <del rend="overstrike">dotti</del>
 <hi rend="underline">denti</hi> l’un d’essi cibi</ge:line>

To make explicit the relation between the undoing act and the initial act, we propose an alternative way of encoding these scribal acts, namely an empty <ge:undo> element, which points to the element whose effect is being cancelled. If more than one non-contiguous part of the deletion is undone, more than one undo element will be needed, and each part undone must be given an identifier.

In the following example, three text stages can be identified:
  • s1 (initial): This is just some sample text, we need a real example.
  • s2: This is not a real example.
  • s3: This is just some text, not a real example.
This can be encoded using empty <ge:undo> elements as follows:
<ge:line>This is <del stage="#s2" xml:id="del_1" rend="overstrike">
  <ge:undo
    target="#del_1"
    spanTo="#X02"
    rend="dotted"
    stage="#s3"/>
just some
 <anchor xml:id="X02"/>sample <ge:undo
    target="#del_1"
    spanTo="#x4"
    rend="dotted"
    stage="#s3"/>
text,<anchor xml:id="x4"/> we need
 </del>
 <add stage="#s2">not </add>a real example.</ge:line>
The target attributes on undo point to the elements representing the initial acts (the deletions) which are undone. The rend attribute indicates the way this reversion is indicated. The spanTo attribute points to the anchor which marks the point in the text where this reversion finishes. Since two non-contiguous parts of the deletion are undone, there are two <ge:undo> elements, each with the appropriate attribute values. If spanTo is not supplied, the <ge:undo> is understood to refer to the whole of the text contained by the element indicated by its target attribute.
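
To show how such an encoding might be processed, the following minimal Python sketch derives the three text stages listed above from the encoded line. It is an illustration under stated assumptions, not a definitive implementation: the ge namespace URI is a placeholder, the function names text_runs and stage_text are hypothetical, and the mapping of markup to stages (deletions and additions belong to s2, undone spans to s3) is hard-coded for this one example rather than read from the stage attributes. An undo without spanTo, which applies to the whole target, is not handled.

```python
import xml.etree.ElementTree as ET

GE = "urn:example:genetic-editions"  # placeholder namespace URI (assumption)
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = (
    f'<ge:line xmlns:ge="{GE}">This is '
    '<del stage="#s2" xml:id="del_1" rend="overstrike">'
    '<ge:undo target="#del_1" spanTo="#X02" rend="dotted" stage="#s3"/>just some '
    '<anchor xml:id="X02"/>sample '
    '<ge:undo target="#del_1" spanTo="#x4" rend="dotted" stage="#s3"/>text,'
    '<anchor xml:id="x4"/> we need</del> '
    '<add stage="#s2">not </add>a real example.</ge:line>'
)


def text_runs(root):
    """Return (text, deleted, added, restored) tuples in document order."""
    runs = []
    open_span = None  # xml:id of the anchor closing the current undo span

    def walk(el, deleted, added):
        nonlocal open_span
        if el.tag == f"{{{GE}}}undo":
            open_span = el.get("spanTo", "").lstrip("#") or None
        elif el.tag == "anchor" and el.get(XML_ID) == open_span:
            open_span = None
        deleted = deleted or el.tag == "del"
        added = added or el.tag == "add"
        if el.text:
            runs.append((el.text, deleted, added, open_span is not None))
        for child in el:
            walk(child, deleted, added)
            if child.tail:  # the tail belongs to the context of the parent
                runs.append((child.tail, deleted, added, open_span is not None))

    walk(root, False, False)
    return runs


def stage_text(runs, stage):
    """Assemble the reading text of stage 's1', 's2' or 's3'."""
    kept = []
    for text, deleted, added, restored in runs:
        if stage == "s1":
            keep = not added                 # first layer: no later additions
        elif stage == "s2":
            keep = not deleted               # deletions and additions applied
        else:
            keep = not deleted or restored   # undone parts of the deletion return
        if keep:
            kept.append(text)
    return " ".join("".join(kept).split())   # normalize whitespace


runs = text_runs(ET.fromstring(SAMPLE))
for stage in ("s1", "s2", "s3"):
    print(stage, stage_text(runs, stage))
```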

3.2.8 Instant corrections

The use of tags such as del and add necessarily implies that the modification concerned was made at some time after the original writing. An exception to this is where a false start or ‘instant’ correction has been identified: the author starts to write, and then immediately corrects what has been written. A special mechanism is provided for this case: an instant attribute has been added to att.editLike, with datatype data.xTruthValue and a default value of false. When the value of instant is true, the addition or deletion is considered to belong to the same writing stage as the rest of the unmodified document; false means that it belongs to some later stage.

An example of false start can be seen in the following line:
Figure 14. http://www.whitmanarchive.org/resources/sleepers/uva.00256.001.jpg
in which we can detect the following sequence of events:
  1. The letter T is written and then immediately deleted
  2. The word The is written, deleted, and replaced by the word His
  3. The added word His is then deleted
  4. The initial letter i of the words iron necklace is overwritten with a capital I
To indicate that the first of these acts must have taken place before the others, we might encode this revision campaign as follows:
<ge:line>
 <del instant="true">T</del>
 <subst>
  <del>The</del>
  <add place="above">
   <del rend="overstrike">His</del>
  </add>
 </subst>
 <subst>
  <del rend="overwritten">i</del>
  <add place="superimposed">I</add>
 </subst>ron necklace</ge:line>

4 The dossier level

The term dossier is used to refer to the set of documents which a genetic editor considers as having contributed to the evolution of a particular text. These may include drafts, revisions, or documents related in other ways.

4.1 Assembling a dossier

Since a dossier needs to be assembled from many documents, which will most probably be encoded in distinct TEI documents, one way to assemble a dossier’s contents for further processing would be to use the existing teiCorpus element in combination with XML Inclusions (XInclude). The teiCorpus construct provides for a common place to record metadata regarding the organization of the dossier itself, independently of the metadata regarding each particular document contained within it, which would be held in a discrete TEI Header attached to that particular encoded document. The XInclude mechanism provides a convenient means of managing the many separate resources which are likely to be needed to constitute a complete dossier. For example, supposing we have a dossier comprising three documents, each of which has been encoded in its own document:

<teiCorpus xmlns:xi="http://www.w3.org/2001/XInclude">
<teiHeader>
<!-- information about the dossier -->
</teiHeader>
<xi:include href="document1.xml"/>
<xi:include href="document2.xml"/>
<xi:include href="document3.xml"/>

</teiCorpus>

(Note that each of the documents referenced (document1.xml etc) should contain a complete TEI element.)
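
As an illustration of how such a dossier might be resolved for further processing, the following Python sketch expands the inclusions using the standard library's xml.etree.ElementInclude module. It is a sketch under stated assumptions: the three documents are replaced by tiny in-memory stand-ins (the FILES dictionary and the load function are hypothetical), and the headers are left empty, which a real TEI document would not allow.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

TEI = "http://www.tei-c.org/ns/1.0"
XI = "http://www.w3.org/2001/XInclude"

# In-memory stand-ins for document1.xml etc. (hypothetical content).
FILES = {
    name: f'<TEI xmlns="{TEI}"><teiHeader/><text><body><p>{name}</p></body></text></TEI>'
    for name in ("document1.xml", "document2.xml", "document3.xml")
}

CORPUS = f"""<teiCorpus xmlns="{TEI}" xmlns:xi="{XI}">
  <teiHeader/>
  <xi:include href="document1.xml"/>
  <xi:include href="document2.xml"/>
  <xi:include href="document3.xml"/>
</teiCorpus>"""


def load(href, parse, encoding=None):
    """Loader for ElementInclude that reads from FILES instead of the disk."""
    if parse != "xml":
        raise ValueError("this sketch only supports parse='xml'")
    return ET.fromstring(FILES[href])


corpus = ET.fromstring(CORPUS)
ElementInclude.include(corpus, loader=load)  # replaces each xi:include in place

members = corpus.findall(f"{{{TEI}}}TEI")
print(len(members), "documents in the dossier")
```

With real files, the loader argument can be omitted so that the default loader reads each href from disk.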

4.2 Genetic Graphs

Looking at the documents which constitute a given dossier, there are many types of relationships which can be identified, both amongst complete documents, and amongst parts of those documents, including alterations, revisions, stages and other compositional phenomena. A further complexity arises if, for example, an author chooses to correct two different versions at the same time. 3 We may thus need to express that two or more documents are related in different ways; for instance, one document may be the sequel of another, one may have been drafted at the same time as another, or one may contain material or treat topics related to those of another (for example, a newspaper article may inspire or be quoted by a given work).

We propose to model the ‘genetic relations’ one can deduce from such phenomena by means of a graph. 4 A graph is a mathematical representation composed of nodes and arcs. Nodes represent the objects which are related: typical objects in our case are documents, document components, text stages (as defined in 6.1 Stages/ Revision campaigns below) or even single phenomena (acts of writing or textual alterations). Each arc between two nodes then represents a relation, in our case a genetic relation, between the two objects it links. Arcs may be typed and directed to further specify the kind of relationship they represent. For our purposes one can assume that a typical genetic graph is directed and acyclic, because we generally expect the paths through a genetic graph (formally defined as an ordered set of nodes traversed by following its arcs) to represent genetic processes with an implied chronological order. In this respect a genetic graph resembles a family tree, in that there is a single terminal node, representing the final or published version of a text, with many preceding nodes linking to it, either directly or via other nodes, each of which represents a draft version of some part of the text. Alternatively, there may be more than one final state, e.g. in the case of multiple published texts or unpublished fragments.

A visual representation of an exemplary genetic graph might look as follows:

At a very abstract level, the nodes (drawn as rectangular shapes with letters) represent parts of the dossier, and the directed arcs (drawn as lines with arrows) represent relationships between the parts (probably of an evolutionary type). The graph can thus be read as follows: A and B are precursors of C; C in turn leads to F, F to G, G to H, and H to the final state Z, which additionally draws on E and D.

All this abstract information can be encoded using the following elements (see also the Guidelines chapter 19. Graphs, Networks, and Trees):
  • graph encodes a graph, which is a collection of nodes, and arcs which connect the nodes.
    type describes the type of graph.
  • node encodes a node, a possibly labeled point in a graph.
    value provides the value of a node, which is a feature structure or other analytic element.
  • arc encodes an arc, the connection from one node to another in a graph.
    from gives the identifier of the node which is adjacent from this arc.
    to gives the identifier of the node which is adjacent to this arc.
    value provides the value of an arc, which is a feature structure or other analytic element.
<graph type="directed">
 <node xml:id="A" value="http://edition.net/witness/A">
  <label>A</label>
 </node>
 <node xml:id="B" value="http://edition.net/witness/B">
  <label>B</label>
 </node>
 <node xml:id="C" value="http://edition.net/witness/C">
  <label>C</label>
 </node>
 <node xml:id="D" value="http://edition.net/witness/X#part1">
  <label>D</label>
 </node>
 <node xml:id="E" value="http://edition.net/witness/X#part2">
  <label>E</label>
 </node>
 <node xml:id="F" value="http://edition.net/witness/F">
  <label>F</label>
 </node>
 <node xml:id="G" value="http://edition.net/witness/G">
  <label>G</label>
 </node>
 <node xml:id="H" value="http://edition.net/witness/H">
  <label>H</label>
 </node>
 <node xml:id="Z" value="http://edition.net/text/">
  <label>Z</label>
 </node>
 <arc
   xml:id="AC"
   from="#A"
   to="#C"
   value="http://edition.net/genetic/analysis#ac"/>

 <arc
   xml:id="BC"
   from="#B"
   to="#C"
   value="http://edition.net/genetic/analysis#bc"/>

 <arc
   xml:id="CF"
   from="#C"
   to="#F"
   value="http://edition.net/genetic/analysis#cf"/>

 <arc
   xml:id="FG"
   from="#F"
   to="#G"
   value="http://edition.net/genetic/analysis#fg"/>

 <arc
   xml:id="GH"
   from="#G"
   to="#H"
   value="http://edition.net/genetic/analysis#gh"/>

 <arc
   xml:id="DZ"
   from="#D"
   to="#Z"
   value="http://edition.net/genetic/analysis#dz"/>

 <arc
   xml:id="EZ"
   from="#E"
   to="#Z"
   value="http://edition.net/genetic/analysis#ez"/>

 <arc
   xml:id="HZ"
   from="#H"
   to="#Z"
   value="http://edition.net/genetic/analysis#hz"/>

</graph>

Such an abstract graph structure can be easily imported into or exported from a graph database, transformed into other XML-based representations of graph-based data models like RDF/XML, or it can be serialized to SVG for interactive visualization.
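
As one illustration of such processing, the following Python sketch reads a reduced copy of the graph above (labels and value attributes omitted), checks that it is acyclic, and reports one chronological ordering together with the terminal nodes. It is a minimal sketch under stated assumptions: the function names are hypothetical, the elements are taken to be in no namespace, and the graphlib module of the Python standard library (Python 3.9 or later) is used for the topological sort.

```python
import xml.etree.ElementTree as ET
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Reduced copy of the example graph: same nodes and arcs, no labels or values.
GRAPH = """<graph type="directed">
 <node xml:id="A"/><node xml:id="B"/><node xml:id="C"/>
 <node xml:id="D"/><node xml:id="E"/><node xml:id="F"/>
 <node xml:id="G"/><node xml:id="H"/><node xml:id="Z"/>
 <arc from="#A" to="#C"/><arc from="#B" to="#C"/>
 <arc from="#C" to="#F"/><arc from="#F" to="#G"/>
 <arc from="#G" to="#H"/><arc from="#D" to="#Z"/>
 <arc from="#E" to="#Z"/><arc from="#H" to="#Z"/>
</graph>"""


def predecessors(graph_el):
    """Map each node id to the set of node ids with an arc leading to it."""
    preds = {node.get(XML_ID): set() for node in graph_el.iter("node")}
    for arc in graph_el.iter("arc"):
        preds[arc.get("to").lstrip("#")].add(arc.get("from").lstrip("#"))
    return preds


def genetic_order(graph_el):
    """Return one ordering in which every node follows all its precursors.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    return list(TopologicalSorter(predecessors(graph_el)).static_order())


def terminal_nodes(preds):
    """Return the ids of nodes from which no arc departs (final states)."""
    sources = set().union(*preds.values())
    return sorted(node for node in preds if node not in sources)


graph = ET.fromstring(GRAPH)
print("one possible order:", genetic_order(graph))
print("final states:", terminal_nodes(predecessors(graph)))
```

A CycleError at this point would indicate that the arcs as encoded cannot all represent chronological succession, which may be a useful consistency check on the genetic analysis.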

4.3 Genetic analysis

With the graph’s abstract structure defined, one can now specify what the nodes and arcs actually represent. This is done by assigning properties to them via the value attribute. In the example, the nodes A-H point to individual witnesses, with D and E as exceptions, since they point to different parts of the same witness. The node Z (the final state in our example) is assumed to point to the edited text, and each arc’s value points to a resource containing the genetic analysis which underlies the genetic relation expressed by that arc.

[...]

4.4 Constructing genetic graphs

It is highly unlikely that editors will construct and encode genetic graphs by hand, unless the dossier is as simply structured as the given example. Rather, a typical workflow might be:

  1. The editor transcribes each witness in a separate TEI document.
  2. During the transcription process, phenomena of genetic relevance are annotated with identifiers that classify these phenomena as belonging to a certain genetic process or that simply relate these phenomena in a well-defined way.
  3. By automatically collecting and evaluating all those annotations from all transcribed documents, a preliminary graph is constructed that can be visualized and navigated.
  4. Steps 1-3 are repeated up to the point where the graph cannot be further refined via this procedure.
  5. Optionally, the automatically constructed graph can be annotated manually.

For example, in the genetic edition of Goethe’s Faust, the final text is known and can therefore serve as a canonical reference system for constructing genetic relations. Each verse has a unique number:

<speaker>Faust.</speaker>
<lg>
 <l n="354">Habe nun, ach! Philosophie,</l>
 <l n="355">Juristerei und Medicin,</l>
 <l n="356">Und leider auch Theologie!</l>
 <l n="357">Durchaus studirt, mit heißem Bemühn.</l>
 <l n="358">Da steh' ich nun, ich armer Thor!</l>
 <l n="359">Und bin so klug als wie zuvor;</l>
</lg>

This makes it possible to express genetic relations by using these same verse numbers to index corresponding verses or ranges of verse numbers throughout their various witnesses, stages or variants. Although a genetic graph generated from the coindexed verses will not be of very fine granularity, it can be a useful preliminary base for subsequent refinement. By incorporating metadata for the witnesses (especially dating information) and results of collation with other relevant information, the graph can be enriched in an iterative manner and periodically checked for consistency. Finally, together with the transcriptions, collation results, and a critical apparatus, it can be archived as an integral part of the genetic edition.
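
The coindexing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sigla W1 and W2 are hypothetical, the two fragments simply reuse verses from the passage quoted above, each n value is taken to hold a single verse number (ranges are not handled), and the function names are invented for this sketch.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Two hypothetical witness fragments; the sigla W1 and W2 are assumptions.
WITNESSES = {
    "W1": '<lg><l n="354">Habe nun, ach! Philosophie,</l>'
          '<l n="355">Juristerei und Medicin,</l></lg>',
    "W2": '<lg><l n="354">Habe nun, ach! Philosophie,</l>'
          '<l n="356">Und leider auch Theologie!</l></lg>',
}


def verses_by_number(witnesses):
    """Map each verse number to the sorted list of witnesses that attest it."""
    index = defaultdict(list)
    for siglum, xml_text in witnesses.items():
        for line in ET.fromstring(xml_text).iter("l"):
            index[line.get("n")].append(siglum)
    return {n: sorted(sigla) for n, sigla in index.items()}


def shared_verse_edges(index):
    """Return one (witness, witness, verse) edge per pair sharing a verse."""
    edges = set()
    for n, sigla in index.items():
        for i, first in enumerate(sigla):
            for second in sigla[i + 1:]:
                edges.add((first, second, n))
    return sorted(edges)


index = verses_by_number(WITNESSES)
edges = shared_verse_edges(index)
print(edges)
```

The resulting edges are undirected; assigning them a direction would require the dating metadata and collation results mentioned above.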

5 *Collation and Critical Apparatus

As noted above, not all kinds of variation within and between documents are equivalent. For example, most people would regard authorial modifications within a single draft or between subsequent drafts as having a different significance from modifications assigned to scribal variation within a long textual tradition, despite their formal similarities.

When a passage has been visibly deleted in one version of a text we will generally mark it explicitly; if however a passage present in one version (A) is omitted in another (B), it may be a matter of uncertainty as to whether it has been deleted from B, or added to A. Even if this is certain (perhaps because the order of the two versions is known), the omission from B of material in A is not entirely the same phenomenon as an explicit deletion.

The addition of a segment to a version, or its deletion from one, is normally a deliberate act of the author, and we would like to be able to record that in a positive way; whether we need another set of editorial elements or should use the same set that is used for transcription remains an open question.

Identifying additions or deletions on the basis of a comparison of different versions of a text is possible, using existing TEI elements for critical editing such as app and its child rdg elements. This method uses the argument e silentio: for example, to identify that something is missing from a witness, all available readings must be compared, and there is no way of explicitly marking an absent (or additional) reading. For instance, the 1856 edition of Leaves of Grass omits the words ‘all, all’, which are included in the 1881-82 version. We may record this in an apparatus entry:
<l n="22">And the enraged and treacherous dispositions <app>
  <rdg wit="#Leaves81-82">, all, all</rdg>
  <rdg wit="#Leaves56"/>
 </app> sleep</l>
but this does not indicate whether the words were deleted consciously from the 1856 edition, or added in the 1881 version. Furthermore, if we decided to use the existing add or del elements within the rdg, for example:
<l n="22">And the enraged and treacherous dispositions <app>
  <rdg wit="#Leaves81-82">
   <add>, all, all</add>
  </rdg>
  <rdg wit="#Leaves56"/>
 </app>...</l>
the result would be ambiguous: it might indicate that there is an explicit addition (for example, by interlinear or marginal interpolation) within the 1881 text, or it might indicate that this addition appears as a result of collating the 1881 and 1856 texts.
One solution might be to use a different element (say <ge:interAdd>) for an addition implied by collation, reserving add for additions that happen at the document level. Another, if all the documents have been fully transcribed, might be to use stand-off techniques to represent the collation. This is a more promising possibility which the workgroup has not yet fully explored. If all the alterations occurring at document level are already encoded within each transcription, the dossier-level collation will only need to point to passages within the separate files and classify the types of readings resulting from the collation:
<app>
 <rdg wit="#Leaves81-82">
  <span from="#v22-5" to="#v22-8" type="add"/>
 </rdg>
 <rdg wit="#Leaves56">
  <span from="#v22-5" type="del"/>
 </rdg>
</app>
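
To illustrate what a collation by itself can and cannot establish, here is a minimal Python sketch that compares the two versions of line 22 with the standard difflib module. The tokenization rule and the function names are assumptions made for this illustration; they are not part of the proposed encoding.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Split a line into words and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


# Line 22 as quoted in the apparatus above.
LEAVES_56 = tokenize("And the enraged and treacherous dispositions sleep")
LEAVES_81 = tokenize("And the enraged and treacherous dispositions, all, all sleep")


def implied_variants(a, b):
    """Return (opcode, tokens_in_a, tokens_in_b) for every non-matching span."""
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    return [
        (tag, a[a1:a2], b[b1:b2])
        for tag, a1, a2, b1, b2 in matcher.get_opcodes()
        if tag != "equal"
    ]


forward = implied_variants(LEAVES_56, LEAVES_81)
backward = implied_variants(LEAVES_81, LEAVES_56)
print(forward)
print(backward)
```

The same four tokens are reported as an insertion in one direction and as a deletion in the other, depending only on which version is treated as the base. This is the ambiguity described above: the comparison shows presence and absence, not which authorial act took place.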

6 Manuscript and Dossier Levels

6.1 Stages/ Revision campaigns

A major purpose of genetic editing is the identification of ‘revision campaigns’ or, more generally, stages. A genetic editor needs to be able to assign a set of alterations (deletions, additions, substitutions, transpositions, etc.) and/or an act of writing to a particular stage, to indicate both that one or more of such phenomena preceded or followed another and also to indicate that they are related in some way, for example that one is a consequence of the other. To document this we need:
  • a system to assign phenomena to a particular stage
  • a way to characterize a stage, in itself and in relation to other stages.

The existing element creation (within the TEI Header profile description) is defined as the appropriate location for all information relating to the genesis or production of a text. We modify it slightly to permit a new stageNotes element which contains a number of stageNote elements, one for each identified stage:

  • stageNotes contains one or more descriptions of the stages which have been identified in the genesis of a text.
    ordered indicates whether or not the order in which the children of this element are presented is significant
  • stageNote documents a particular stage in the genesis of a text.
    In the following example, taken from the genetic edition of Goethe’s Faust, the editor has identified four distinct stages:

    <profileDesc>
     <creation>
      <ge:stageNotes ordered="true">
       <ge:stageNote xml:id="ST-1">First stage, written in ink by a
           writer</ge:stageNote>
       <ge:stageNote xml:id="ST-2">Second stage, written in Goethe's hand using
           pencil</ge:stageNote>
       <ge:stageNote xml:id="ST-3">Fixation of the revised passages and
           further revisions by Goethe using ink</ge:stageNote>
       <ge:stageNote xml:id="ST-4">Addition of
           another stanza in a different hand, probably
           at a later stage</ge:stageNote></ge:stageNotes>
     </creation>
    </profileDesc>

    The stageNotes element carries an attribute ordered, which can take the values true or false (the default). The attribute specifies whether the order of child elements signifies a temporal order for the revision campaigns which they document. In the Faust example above, the editor has asserted that the four stages distinguished are ordered chronologically according to the order of the stageNote elements. Note that asserting a specific order early on, though probably one of the hardest tasks in a genetic analysis, can considerably reduce the encoding effort in assigning textual alterations to stages during the transcription, as we will see below. For instance, deletions can only be assigned to a stage that follows the one in which the passage being deleted was written down. Hence, having a certain order of stages in place before transcription begins allows the encoder to reduce verbose tagging wherever default assumptions based on the natural order of actions can be made.
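
The constraint on deletions can be sketched as a simple check. The following Python fragment is only an illustration: it assumes the four stage identifiers of the Faust example in document order, applies the rule exactly as stated above (a deletion belongs to a strictly later stage than the writing of the deleted passage), and uses hypothetical function names.

```python
# Stage identifiers in the document order of the stageNote elements above.
STAGE_ORDER = ["ST-1", "ST-2", "ST-3", "ST-4"]


def deletion_is_consistent(written_in, deleted_in, order=STAGE_ORDER):
    """True if the deletion stage strictly follows the stage of writing."""
    return order.index(deleted_in) > order.index(written_in)


def candidate_deletion_stages(written_in, order=STAGE_ORDER):
    """Stages to which a deletion of the passage could still be assigned."""
    return order[order.index(written_in) + 1:]


print(deletion_is_consistent("ST-1", "ST-2"))
print(candidate_deletion_stages("ST-3"))
```

A processor could use the second function to supply a default: when only one candidate stage remains, an explicit stage assignment on the deletion may be unnecessary.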

    If necessary, stageNotes elements can be nested hierarchically. This may be helpful in two cases. Firstly one can build up hypotheses about related revisions step-by-step, starting with stages of smaller coverage, whose members are certainly related, and then in a subsequent pass grouping these stages in turn, thereby extending their reach.

    <profileDesc>
     <creation>
      <ge:stageNotes>
       <ge:stageNote xml:id="o">An unrelated stage note</ge:stageNote>
       <ge:stageNotes xml:id="m" cert="low">
        <ge:stageNote xml:id="m1">Alterations on one manuscript page, certainly
             related</ge:stageNote>
        <ge:stageNote xml:id="m2">Alterations on another manuscript page,
             certainly related</ge:stageNote>
       </ge:stageNotes>
       <ge:stageNote xml:id="p">Another unrelated stage note</ge:stageNote>
      </ge:stageNotes>
     </creation>
    </profileDesc>

    Another use case for nested stageNotes elements would be the need to express a partial ordering of revision campaigns.

    <ge:stageNotes ordered="true">
     <ge:stageNote xml:id="ST1">The first stage</ge:stageNote>
     <ge:stageNotes xml:id="STlast">
      
      <!-- We have no indication which of the following revision campaigns took place first, but we know they followed ST1. -->
      <ge:stageNote xml:id="ST-rev1">A revision of the first stage</ge:stageNote>
      <ge:stageNote xml:id="ST-rev2">Another revision of the first
         stage</ge:stageNote></ge:stageNotes></ge:stageNotes>
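
How a processor might read such a nested structure as a partial order can be sketched in Python. The interpretation coded here is an assumption: within a group marked ordered="true", every stage at or below an earlier child precedes every stage at or below a later child, while a group without the attribute (default false) asserts nothing about its own children. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

GE = "{http://www.tei-c.org/ns/geneticEditions}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

XML = """
<ge:stageNotes xmlns:ge="http://www.tei-c.org/ns/geneticEditions" ordered="true">
  <ge:stageNote xml:id="ST1">The first stage</ge:stageNote>
  <ge:stageNotes xml:id="STlast">
    <ge:stageNote xml:id="ST-rev1">A revision of the first stage</ge:stageNote>
    <ge:stageNote xml:id="ST-rev2">Another revision of the first stage</ge:stageNote>
  </ge:stageNotes>
</ge:stageNotes>
"""


def leaf_ids(element):
    """Identifiers of all stageNote elements at or below this element."""
    if element.tag == GE + "stageNote":
        return [element.get(XML_ID)]
    return [i for child in element for i in leaf_ids(child)]


def precedence_pairs(group):
    """Collect (earlier, later) pairs implied by ordered stageNotes groups."""
    pairs = set()
    children = list(group)
    if group.get("ordered") == "true":
        for position, earlier in enumerate(children):
            for later in children[position + 1:]:
                pairs.update(
                    (a, b) for a in leaf_ids(earlier) for b in leaf_ids(later)
                )
    for child in children:
        if child.tag == GE + "stageNotes":
            pairs |= precedence_pairs(child)
    return pairs


pairs = precedence_pairs(ET.fromstring(XML))
print(sorted(pairs))
```

No pair relates ST-rev1 and ST-rev2 to each other, which is exactly the partial ordering the example is meant to express.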

    In addition to the possibility of ordering text stages in relation to each other, stageNote elements may carry a number of attributes from the att.datable class (period, when, notBefore, notAfter, from, and to) which allow each stage to be dated as exactly or inexactly as necessary, in the same way as is currently possible for the TEI date element.

    <profileDesc>
     <creation>
      <date notAfter="1816-07-18"/>
      <ge:stageNotes ordered="true">
       <ge:stageNote xml:id="mod1" when="1816-07-16">The first draft of
       <title>Persuasion</title> is completed by the <date>July 16
             1816</date> written after the word <q>Finis</q> at <ref target="#pers-30">page 30</ref>.</ge:stageNote>
       <ge:stageNote xml:id="mod2" notBefore="1816-07-16">After the <date>16th
             of July</date> Austen starts revision of the two final chapters, by
           rewriting the end and adding a new block (<ref target="#transp-1">pages 32-35</ref>) to be inserted at <ref target="#insertion-p1">page 19</ref>. This stage is documented by the deletion of the
           date (<date>July 16 1816</date>) at <ref target="#pers-30">page
             30</ref>, and the addition of more text and of a new date
           (<date>July 18. 1816</date>) at <ref target="#pers-31">page
             31</ref></ge:stageNote>
       <ge:stageNote notBefore="1816-07-18">Before publication, after <date>July
             18th, 1816</date> chapters 10-11 were broken into three chapters,
           10, 11, 12, as witnessed by the print.</ge:stageNote></ge:stageNotes>
     </creation>
    </profileDesc>
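
When stages are both ordered and dated, the two kinds of information can be checked against each other. The Python sketch below is a hedged illustration only: it copies the three sets of dating attributes from the example into plain dictionaries, assumes full ISO dates (TEI also permits partial dates, which are not handled), and reports a conflict only when a later stage must have ended before an earlier stage can have begun. The function names are hypothetical.

```python
from datetime import date

# Dating attributes of the three stageNote elements above, in document order.
STAGES = [
    {"id": "mod1", "when": "1816-07-16"},
    {"id": "mod2", "notBefore": "1816-07-16"},
    {"id": None, "notBefore": "1816-07-18"},
]


def interval(stage):
    """Return (earliest, latest) as dates; None means unbounded."""
    parse = date.fromisoformat
    if "when" in stage:
        day = parse(stage["when"])
        return day, day
    earliest = parse(stage["notBefore"]) if "notBefore" in stage else None
    latest = parse(stage["notAfter"]) if "notAfter" in stage else None
    return earliest, latest


def order_conflicts(stages):
    """Index pairs (i, j), i < j, where stage j must end before stage i begins."""
    conflicts = []
    for i, first in enumerate(stages):
        earliest_first, _ = interval(first)
        for j in range(i + 1, len(stages)):
            _, latest_second = interval(stages[j])
            if earliest_first and latest_second and latest_second < earliest_first:
                conflicts.append((i, j))
    return conflicts


print(order_conflicts(STAGES))
```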

    Each stageNote element, apart from declaring a text stage, may also contain references to other annotations contained within the teiHeader or in the document (as shown in the previous example). Such references, along with the textual content, are purely documentary and do not affect the textual stage associated with any element thus referred to. The association of a textual component with a writing stage is always made explicitly, either by pointing from the stageNote element to one or more elements by means of its target attribute, or (for preference) by pointing from the element concerned to the stageNote element by means of its stage attribute:

    <ge:line stage="#firstStage">This is a <subst stage="#secondStage">
      <del>house</del>
      <add>mouse</add>
     </subst>.</ge:line>

    This simple example shows the latter of the two options: The relevant stages are declared in the header; then textual alterations and acts of writing are assigned to them. So the whole sentence was realized in the first stage, while the substitution of “house” with “mouse” happened at the second stage.

    A more complex and complete example:

    <profileDesc>
     <creation>
      <ge:stageNotes ordered="true">
       <ge:stageNote xml:id="firstStage">First stage, written in ink by a
           writer</ge:stageNote>
       <ge:stageNote xml:id="secondStage">Revised by Goethe using
           pencil</ge:stageNote>
       <ge:stageNote xml:id="thirdStage">Fixation of the revised passages and
           further revisions by Goethe using ink</ge:stageNote>
       <ge:stageNote xml:id="fourthStage">Addition of another stanza, probably
           at a later stage</ge:stageNote></ge:stageNotes>
     </creation>
    </profileDesc> [...]
    <div stage="#firstStage">
     <l n="11656">
      <subst>
       <del>Ihr</del>
       <add>
        <ge:rewrite stage="#thirdStage">
         <seg stage="#secondStage">Nun</seg></ge:rewrite>
       </add>
      </subst> wanſtige Schuften mit den Feuerbacken</l>
     <l n="11657">Ihr glüht ſo recht vom Höllen Schwefel <subst
        stage="#secondStage #thirdStage">

       <del>ſatt</del>
       <add>feiſt</add>
      </subst>.</l>
     <l n="11658">
      <delSpan spanTo="#anchor_delSpan_1" stage="#thirdStage"/>Ihr hagren,
       triſten, krummgezog<subst>
       <del>nen</del>
       <add>ener</add>
      </subst> Nacken</l>
     <l>Wenn ihr nur piepſet iſt die Welt ſchon matt.<anchor xml:id="anchor_delSpan_1"/>
     </l>
    </div>

    Note first that a stage, once assigned to an element, is inherited by all descendants of that element, unless overridden by a subsequent assignment. So in the example above the verses are assigned to the first stage initially. The writing of Nun (as part of the substitution in the first verse) takes place in the second stage and is repeated or fixated in the third. The substitution in the second verse is likewise done repeatedly: initially it takes place in the second stage, but is fixated as a whole in the third.
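
The inheritance rule can be sketched as a recursive walk over the element tree. The Python fragment below is an illustration under stated assumptions: the XML is a simplified excerpt of verse 11656 with the ge: prefix dropped so that it parses on its own, and the function name effective_stages is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified excerpt of the example above; the ge: prefix is dropped
# (assumption) so that the fragment is well-formed without a namespace.
XML = """
<div stage="#firstStage">
  <l n="11656">
    <subst>
      <del>Ihr</del>
      <add><rewrite stage="#thirdStage"><seg stage="#secondStage">Nun</seg></rewrite></add>
    </subst> wanſtige Schuften mit den Feuerbacken</l>
</div>
"""


def effective_stages(element, inherited=()):
    """Yield (tag, stages) for each element, applying stage inheritance."""
    own = element.get("stage")
    # An explicit stage attribute overrides whatever was inherited.
    stages = tuple(own.split()) if own else inherited
    yield element.tag, stages
    for child in element:
        yield from effective_stages(child, stages)


# Element names are unique in this small excerpt, so a dict is sufficient.
resolved = dict(effective_stages(ET.fromstring(XML)))
for tag, stages in resolved.items():
    print(tag, stages)
```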

    As one can see, the interpretation of what the stage assignments mean for a particular text passage and the actions upon it can be based on a number of implicit assumptions and constraints which have the effect of minimizing the amount of tagging necessary. If it is desired to make these presuppositions more explicit, one can differentiate between acts of writing and textual alterations and accordingly between their assignment to stages as follows:

    <profileDesc>
     <creation>
      <ge:stageNotes ordered="true">
       <ge:stageNote target="#zone_1 #subst_3">First stage, written in ink by a
           writer</ge:stageNote>
       <ge:stageNote
         target="#zone_2 #mod_1 #line_1 #line_2 #subst_1 #subst_2 #subst_4 #delSpan_1">
    Revised by Goethe using pencil</ge:stageNote>
       <ge:stageNote
         target="#redo_1 #redo_2 #redo_3 #subst_1 #subst_2 #delSpan_1 #add_1">
    Fixation of the revised passages and further revisions by Goethe
           using ink</ge:stageNote></ge:stageNotes>
     </creation>
    </profileDesc> [...]
    <ge:document>
     <surface>
      <zone xml:id="zone_1">
       <ge:line xml:id="line_1">
        <handShift new="#g_bl"/>
        <ge:rewrite hand="#g_t" xml:id="redo_1">Nun</ge:rewrite></ge:line>
       <ge:line>
        <handShift new="#jo_t"/>Ihr wanſtige Schuften mit den
           Feuerbacken</ge:line>
       <ge:line xml:id="line_2">
        <handShift new="#g_bl"/>
        <ge:rewrite hand="#g_t" xml:id="redo_2">feiſt</ge:rewrite></ge:line>
       <ge:line>Ihr glüht ſo recht vom Höllen Schwefel ſatt.</ge:line> [...]
      </zone>
     </surface></ge:document>
    <text>
     <body>
      <l n="11656">
       <subst xml:id="subst_1">
        <del>Ihr</del>
        <add>Nun</add>
       </subst> wanſtige Schuften mit den Feuerbacken</l>
      <l n="11657">Ihr glüht ſo recht vom Höllen Schwefel <subst xml:id="subst_2">
        <del>ſatt</del>
        <add>feiſt</add>
       </subst>.</l>
     </body>
    </text>

    Here two transcriptions of the same passage are given, one from a documentary perspective stressing the writing process, and one from the textual perspective emphasizing textual alterations. The assignment to stages is also done differently here, pointing from the stage notes to the text passages and alterations in question. From the documentary perspective, the stage assignments describe the writing process, in that they specify which segment has been written when and how often. From the textual perspective, the markup concentrates on the order of textual alterations and makes no assumptions about the order of writing.
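
When assignments point from the stage notes to the text, a processor will usually need the inverse mapping, from each element to its stages. A minimal Python sketch follows; the target lists are abbreviated from the example, the ge: prefix is dropped, and the xml:id values s1 to s3 are hypothetical additions, since the stage notes in this variant of the example carry no identifiers.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Abbreviated, namespace-free variant of the header above; the identifiers
# s1, s2 and s3 are hypothetical.
XML = """
<stageNotes ordered="true">
  <stageNote xml:id="s1" target="#zone_1 #subst_3">First stage</stageNote>
  <stageNote xml:id="s2" target="#line_1 #subst_1">Revised in pencil</stageNote>
  <stageNote xml:id="s3" target="#redo_1 #subst_1">Fixation in ink</stageNote>
</stageNotes>
"""


def stages_by_target(xml_text):
    """Map each targeted element id to the stage notes that point at it."""
    index = defaultdict(list)
    for note in ET.fromstring(xml_text).iter("stageNote"):
        for pointer in note.get("target", "").split():
            index[pointer.lstrip("#")].append(note.get(XML_ID))
    return dict(index)


by_target = stages_by_target(XML)
print(by_target["subst_1"])
```

An element listed by more than one stage note, such as subst_1 here, corresponds to an alteration that is carried out in one stage and repeated or fixated in a later one.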

    Appendix A ODD

    TEI Extension for Genetic Editions -- preliminary version

    Schema geneticTEI: changed components

    AnyThing

    AnyThing Matches any element
    Module derived-module-geneticTEI
    Used by
    Declaration
    AnyThing =
       (
          element * { attribute * - (xml:id | xml:lang) { text }*, AnyThing }
        | text
       )*

    <arc>

    <arc> encodes an arc, the connection from one node to another in a graph. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/GD.html#GDGR
    Module nets
    In addition to global attributes
    value provides the value of an arc, which is a feature structure or other analytic element.
    Status Optional
    Datatype xsd:anyURI
    Values A valid identifier.
    Note
    Copied from the node element to support full-featured property graphs, where also arcs may be annotated.
    from gives the identifier of the node which is adjacent from this arc.
    Status Required
    Datatype xsd:anyURI
    Values The identifier of a node.
    to gives the identifier of the node which is adjacent to this arc.
    Status Required
    Datatype xsd:anyURI
    Values The identifier of a node.
    Used by
    May contain
    core: label
    Declaration
    element arc
    {
       attribute value { xsd:anyURI }?,
       attribute from { xsd:anyURI },
       attribute to { xsd:anyURI },
       att.global.attributes,
       ( label, label? )?
    }
    Example
    <arc from="#T3" to="#T3">
     <label>OLD</label>
     <label>VIEUX</label>
    </arc>
    Note
    The arc element must be used if the arcs are labeled. Otherwise, arcs can be encoded using the adj, adjTo and adjFrom attributes on the node tags in the graph. Both arc tags and adjacency attributes can be used, but the resulting encoding would be highly redundant.
    Zero, one, or two child label elements may be present. The first occurrence of label provides a label for the arc; the second provides a second label for the arc, and should be used if a transducer is being encoded.

    att.editLike

    att.editLike provides attributes describing the nature of an encoded scholarly intervention or interpretation of any kind.
    Module tei
    Members att.transcriptional [add addSpan del delSpan mod modSpan redo restore rewrite subst undo] affiliation am climate corr date ex expan gap geneticNote origDate origPlace origin persName placeName reg relation stageNote supplied surplus time unclear
    Attributes att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) att.responsibility (@cert, @resp)
    instant Is this an instant revision?
    Status Optional
    Datatype xsd:boolean | "unknown" | "inapplicable"
    evidence indicates the nature of the evidence supporting the reliability or accuracy of the intervention or interpretation.
    Status Optional
    Datatype xsd:Name
    Suggested values include:
    internal
    there is internal evidence to support the intervention.
    external
    there is external evidence to support the intervention.
    conjecture
    the intervention or interpretation has been made by the editor, cataloguer, or scholar on the basis of their expertise.
    source contains a list of one or more pointers indicating the sources which support the given reading.
    Status Mandatory when applicable
    Datatype 1–∞ occurrences of  xsd:anyURI separated by whitespace
    Values A space-delimited series of sigla; each sigil should correspond to a witness or witness group and occur as the value of the xml:id attribute on a witness or msDesc element elsewhere in the document.

    att.global

    att.global provides attributes common to all elements in the TEI encoding scheme.
    Module tei
    Members document geneticGrp geneticNote line metaMark mod modSpan patch redo rewrite stageNote stageNotes transpose transposeGrp undo used
    Attributes att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select) att.global.analytic (@ana) att.global.facs (@facs) att.staged (@stage)
    xml:id (identifier) provides a unique identifier for the element bearing the attribute.
    Status Optional
    Datatype xsd:ID
    Values any valid XML identifier.
    Note
    The xml:id attribute may be used to specify a canonical reference for an element; see section ??.
    n (number) gives a number (or other label) for an element, which is not necessarily unique within the document.
    Status Optional
    Datatype 1–∞ occurrences of  token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" } separated by whitespace
    Values the value may contain only letters, digits, punctuation characters, or symbols: it may not contain whitespace or word separating characters. It need not be restricted to numbers.
    Note
    The n attribute may be used to specify the numbering of chapters, sections, list items, etc.; it may also be used in the specification of a standard reference system for the text.
    xml:lang (language) indicates the language of the element content using a ‘tag’ generated according to BCP 47
    Status Optional
    Datatype xsd:language
    Values The value must conform to BCP 47. If the value is a private use code (i.e., starts with x- or contains -x-) it should, and if not it may, match the value of an ident attribute of a language element supplied in the TEI Header of the current document.
    Note
    If no value is specified for xml:lang, the xml:lang value for the immediately enclosing element is inherited; for this reason, a value should always be specified on the outermost element (TEI).
    rend (rendition) indicates how the element in question was rendered or presented in the source text.
    Status Optional
    Datatype 1–∞ occurrences of  token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" } separated by whitespace
    Values may contain any number of tokens, each of which may contain letters, punctuation marks, or symbols, but not word-separating characters.
    <head rend="align(center) case(allcaps)">
     <lb/>To The <lb/>Duchesse <lb/>of <lb/>Newcastle,
    <lb/>On Her <lb/>
     <hi rend="case(mixed)">New Blazing-World</hi>.
    </head>
    Note
    These Guidelines make no binding recommendations for the values of the rend attribute; the characteristics of visual presentation vary too much from text to text and the decision to record or ignore individual characteristics varies too much from project to project. Some potentially useful conventions are noted from time to time at appropriate points in the Guidelines.
    rendition points to a description of the rendering or presentation used for this element in the source text.
    Status Optional
    Datatype 1–∞ occurrences of  xsd:anyURI separated by whitespace
    Values one or more URIs, separated by whitespace.
    <head rendition="#ac #sc">
     <lb/>To The <lb/>Duchesse <lb/>of <lb/>Newcastle, <lb/>On Her
    <lb/>
     <hi rendition="#no">New Blazing-World</hi>.
    </head>
    <!-- elsewhere... -->
    <rendition xml:id="sc" scheme="css">font-variant: small-caps</rendition>
    <rendition xml:id="no" scheme="css">font-variant: normal</rendition>
    <rendition xml:id="ac" scheme="css">text-align: center</rendition>
    Note
    The rendition attribute is used in a very similar way to the class attribute defined for XHTML but with the important distinction that its function is to describe the appearance of the source text, not necessarily to determine how that text should be presented on screen or paper.
    Where both rendition and rend are supplied, the latter is understood to override or complement the former.
    Each URI provided should indicate a <rendition> element defining the intended rendition in terms of some appropriate style language, as indicated by the scheme attribute.
    xml:base provides a base URI reference with which applications can resolve relative URI references into absolute URI references.
    Status Optional
    Datatype xsd:anyURI
    Values any syntactically valid URI reference.
    <div type="bibl">
     <head>Bibliography</head>
     <listBibl
       xml:base="http://www.lib.ucdavis.edu/BWRP/Works/">

      <bibl n="1">
       <author>
        <name>Landon, Letitia Elizabeth</name>
       </author>
       <ref target="LandLVowOf.sgm">
        <title>The Vow of the Peacock</title>
       </ref>
      </bibl>
      <bibl n="2">
       <author>
        <name>Compton, Margaret Clephane</name>
       </author>
       <ref target="NortMIrene.sgm">
        <title>Irene, a Poem in Six Cantos</title>
       </ref>
      </bibl>
      <bibl n="3">
       <author>
        <name>Taylor, Jane</name>
       </author>
       <ref target="TaylJEssay.sgm">
        <title>Essays in Rhyme on Morals and Manners</title>
       </ref>
      </bibl>
     </listBibl>
    </div>
    xml:space signals an intention about how white space should be managed by applications.
    Status Optional
    Legal values are:
    default
    the processor should treat white space according to the default XML white space handling rules
    preserve
    the processor should preserve unchanged any and all white space in the source
    Note
    The XML specification provides further guidance on the use of this attribute.

    att.staged

    att.staged groups elements which can be assigned to a specific text stage by means of the attributes it provides.
    Module tei
    Members att.global [document geneticGrp geneticNote line metaMark mod modSpan patch redo rewrite stageNote stageNotes transpose transposeGrp undo used]
    Attributes In addition to global attributes
    stage points to one or more stageNote elements which contain a description of a text-stage to which the editors think the alteration/ text passage marked by the element bearing this attribute (and its children) belongs.
    Status Optional
    Datatype 1–∞ occurrences of  xsd:anyURI separated by whitespace

    <creation>

    <creation> contains information about the creation of a text. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD4C http://www.tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD4
    Module header
    Used by
    May contain
    Declaration
    element creation
    {
       att.global.attributes,
       macro.phraseSeq.limited,
       stageNotes?
    }
    Example
    <creation>
     <date>Before 1987</date>
    </creation>
    Example
    <creation>
     <date when="1988-07-10">10 July 1988</date>
    </creation>
    Note
    Character data and phrase-level elements.
    The creation element may be used to record details of a text's creation, e.g. the date and place it was composed, if these are of interest; it should not be confused with the publicationStmt element, which records date and place of publication.

    <document> [http://www.tei-c.org/ns/geneticEditions]

    <document> contains a document-centric transcription of a primary source, providing topographical information as well as transcription
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage))
    Used by
    May contain
    core: gap
    derived-module-geneticTEI: modSpan
    Declaration
    element document { att.global.attributes, ( surface, model.global.edit* )+ }

    <fallback> [http://www.w3.org/2001/XInclude]

    <fallback> Wrapper for fallback elements if an XInclude fails
    Module derived-module-geneticTEI
    Used by
    May contain Empty element
    Declaration
    element fallback { AnyThing }

    <geneticGrp> [http://www.tei-c.org/ns/geneticEditions]

    <geneticGrp> groups texts and documents which are somehow related in a genetic process
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage))
    Used by
    May contain
    derived-module-geneticTEI: geneticNote
    Declaration
    element geneticGrp { att.global.attributes, geneticNote+ }

    <geneticNote> [http://www.tei-c.org/ns/geneticEditions]

    <geneticNote> describes a particular set of documents or document fragments which are considered to be mutually associated in some way.
    Module derived-module-geneticTEI
    In addition to global attributes att.typed (@type, @subtype) att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage))
    Used by
    May contain
    core: p
    linking: ab linkGrp
    Declaration
    element geneticNote
    {
       att.typed.attributes,
       att.editLike.attributes,
       att.global.attributes,
       linkGrp+,
       model.pLike+
    }

    <include> [http://www.w3.org/2001/XInclude]

    <include> The W3C XInclude element
    Module derived-module-geneticTEI
    In addition to global attributes
    href pointer to the resource being included
    Status Optional
    Datatype xsd:anyURI
    parse
    Status Optional
    Legal values are:
    xml
    [Default]
    text
    xpointer
    Status Optional
    Datatype text
    encoding
    Status Optional
    Datatype text
    accept
    Status Optional
    Datatype text
    accept-charset
    Status Optional
    Datatype text
    accept-language
    Status Optional
    Datatype text
    Used by
    May contain
    derived-module-geneticTEI: fallback
    Declaration
    element include
    {
       attribute href { xsd:anyURI }?,
       attribute parse { "xml" | "text" }?,
       attribute xpointer { text }?,
       attribute encoding { text }?,
       attribute accept { text }?,
       attribute accept-charset { text }?,
       attribute accept-language { text }?,
       fallback?
    }

    <line> [http://www.tei-c.org/ns/geneticEditions]

    <line> contains the transcription of a topographic line in the source document
    Module derived-module-geneticTEI
    In addition to global attributes att.typed (@type, @subtype) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage))
    Used by
    May contain
    Declaration
    element line
    {
       att.typed.attributes,
       att.global.attributes,
       ( text | model.global | model.linePart )*
    }
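    As an illustration only, two topographic lines inside a zone might be transcribed as follows. The ge prefix, the xml:id values, and the sample text are assumptions; the namespace URI is the one given in the heading above.

```xml
<zone xmlns="http://www.tei-c.org/ns/1.0"
      xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  <!-- Each ge:line holds exactly one written line of the source. -->
  <ge:line xml:id="l1">The quick <del>brown</del> fox</ge:line>
  <ge:line xml:id="l2">jumps over the <add place="above">lazy</add> dog</ge:line>
</zone>
```

    The del and add elements are allowed here because they belong to model.pPart.transcriptional, which is a member of model.linePart.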

    <metaMark> [http://www.tei-c.org/ns/geneticEditions]

    <metaMark> contains any textual or graphical mark in a written text intended to signal how the text should be read and not forming part of the text itself.
    Module derived-module-geneticTEI
    In addition to global attributes att.spanning (@spanTo) att.placement (@place) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage))
    function describes the function (e.g. add, delete, alternate) of the mark.
    Status Optional
    Datatype token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" }
    target indicates the element(s) to which the function of the meta-mark refers. Pointers are separated by whitespace.
    Status Optional
    Datatype 1–∞ occurrences of  xsd:anyURI separated by whitespace
    Used by
    May contain
    Declaration
    element metaMark
    {
       att.spanning.attributes,
       att.placement.attributes,
       att.global.attributes,
       attribute function { token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" } }?,
       attribute target { list { xsd:anyURI, xsd:anyURI* } }?,
       macro.specialPara
    }
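    A hedged sketch of a marginal mark that signals deletion of another line. The ge prefix, the identifiers, the place value, and the sample text are assumptions made for the example.

```xml
<zone xmlns="http://www.tei-c.org/ns/1.0"
      xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  <ge:line xml:id="l3">A line the author later decided to drop</ge:line>
  <ge:line>
    <!-- The cross is not part of the text; it only says how to read l3. -->
    <ge:metaMark function="delete" target="#l3" place="margin">X</ge:metaMark>
  </ge:line>
</zone>
```

    Because target takes a whitespace-separated list, a single mark could equally point at several lines, for example target="#l3 #l4".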

    <milestone>

    <milestone> marks a boundary point separating any kind of section of a text, typically but not necessarily indicating a point at which some part of a standard reference system changes, where the change is not represented by a structural element. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#CORS5
    Module core
    In addition to global attributes att.spanning (@spanTo) att.typed (@type, @subtype) att.sourced (@ed)
    unit provides a conventional name for the kind of section changing at this milestone.
    Status Required
    Datatype xsd:Name
    Suggested values include:
    page
    physical page breaks (synonymous with the pb element).
    column
    column breaks.
    line
    line breaks (synonymous with the lb element).
    book
    any units termed book, liber, etc.
    poem
    individual poems in a collection.
    canto
    cantos or other major sections of a poem.
    speaker
    changes of speaker or narrator.
    stanza
    stanzas within a poem, book, or canto.
    act
    acts within a play.
    scene
    scenes within a play or act.
    section
    sections of any kind.
    absent
    passages not present in the reference edition.
    unnumbered
    passages present in the text, but not to be included as part of the reference.
    Note
    If the milestone marks the beginning of a piece of text not present in the reference edition, the special value absent may be used as the value of unit. The normal interpretation is that the reference edition does not contain the text which follows, until the next milestone tag for the edition in question is encountered.
    In addition to the values suggested, other terms may be appropriate (e.g. Stephanus for the Stephanus numbers in Plato).
    Used by
    May contain Empty element
    Declaration
    element milestone
    {
       attribute unit
       {
          "page"
        | "column"
        | "line"
        | "book"
        | "poem"
        | "canto"
        | "speaker"
        | "stanza"
        | "act"
        | "scene"
        | "section"
        | "absent"
        | "unnumbered"
        | xsd:Name
       },
       att.global.attributes,
       att.spanning.attributes,
       att.typed.attributes,
       att.sourced.attributes,
       empty
    }
    Example
    <milestone n="23" ed="La" unit="Dreissiger"/>
    ... <milestone n="24" ed="AV" unit="verse"/> ...
    Note
    For this element, the global n attribute indicates the new number or other value for the unit which changes at this milestone. The special value unnumbered should be used in passages which fall outside the normal numbering scheme, such as chapter or other headings, poem numbers or titles, etc.
    The order in which milestone elements are given at a given point is not normally significant.

    <mod> [http://www.tei-c.org/ns/geneticEditions]

    <mod> represents any kind of modification identified within a text at a documentary level.
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage)) att.transcriptional (@hand, @status, @seq) (att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) ) att.typed (@type, @subtype)
    Used by
    May contain
    Declaration
    element mod
    {
       att.global.attributes,
       att.transcriptional.attributes,
       att.typed.attributes,
       macro.paraContent
    }
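    One possible use, sketched under assumptions: a phrase that has been altered in some way that the encoder does not want to classify more precisely as an addition or deletion. The ge prefix, the type value, and the hand identifier are hypothetical.

```xml
<p xmlns="http://www.tei-c.org/ns/1.0"
   xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  A sentence in which
  <!-- type names the documentary feature; hand points at a declared hand -->
  <ge:mod type="underline" hand="#h2">one phrase</ge:mod>
  has been modified.
</p>
```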

    <modSpan> [http://www.tei-c.org/ns/geneticEditions]

    <modSpan> represents any kind of modification identified within a text at a documentary level, where this extends over other XML markup constructs in the document.
    Module derived-module-geneticTEI
    In addition to global attributes att.spanning (@spanTo) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage)) att.transcriptional (@hand, @status, @seq) (att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) ) att.typed (@type, @subtype)
    Used by
    May contain Empty element
    Declaration
    element modSpan
    {
       att.spanning.attributes,
       att.global.attributes,
       att.transcriptional.attributes,
       att.typed.attributes,
       empty
    }
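    A minimal sketch, assuming that modSpan may appear between paragraphs and that the end of the span is marked with an anchor element. The ge prefix, the type value, and all identifiers are hypothetical.

```xml
<div xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  <!-- Empty element: the modification runs from here to #ms1-end. -->
  <ge:modSpan type="marginal-bar" hand="#h2" spanTo="#ms1-end"/>
  <p>First affected paragraph.</p>
  <p>Second affected paragraph.</p>
  <anchor xml:id="ms1-end"/>
</div>
```

    This form is useful where wrapping the passage in a single mod element would produce overlapping markup.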

    model.gLike

    model.gLike groups elements used to represent individual non-Unicode characters or glyphs.
    Module tei
    Used by
    Members g

    model.hiLike

    model.hiLike groups phrase-level elements which are typographically distinct but to which no specific function can be attributed.
    Module tei
    Used by
    Members hi

    model.linePart

    model.linePart groups elements which can form part of a line
    Module derived-module-geneticTEI
    Used by
    Members model.gLike [g] model.hiLike [hi] model.pPart.editorial [abbr am choice ex expan subst] model.pPart.msdesc [catchwords dimensions handShift heraldry locus locusGrp material origDate origPlace secFol signatures stamp watermark] model.pPart.transcriptional [add app corr damage del metaMark mod orig redo reg restore rewrite sic supplied surplus unclear undo used] model.segLike [m pc phr s seg]

    model.pPart.editorial

    model.pPart.editorial groups phrase-level elements for simple editorial interventions that may be useful both in transcribing and in authoring.
    Module tei
    Used by
    Members abbr am choice ex expan subst

    model.pPart.msdesc

    model.pPart.msdesc groups phrase-level elements used in manuscript description.
    Module tei
    Used by
    Members catchwords dimensions handShift heraldry locus locusGrp material origDate origPlace secFol signatures stamp watermark

    model.pPart.transcriptional

    model.pPart.transcriptional groups phrase-level elements used for editorial transcription of pre-existing source materials.
    Module tei
    Used by
    Members add app corr damage del metaMark mod orig redo reg restore rewrite sic supplied surplus unclear undo used

    model.segLike

    model.segLike groups elements used for arbitrary segmentation.
    Module tei
    Used by
    Members m pc phr s seg

    model.zonePart

    model.zonePart groups elements which can form part of a zone
    Module derived-module-geneticTEI
    Used by
    Members model.gLike [g] model.hiLike [hi] model.pPart.editorial [abbr am choice ex expan subst] model.pPart.msdesc [catchwords dimensions handShift heraldry locus locusGrp material origDate origPlace secFol signatures stamp watermark] model.pPart.transcriptional [add app corr damage del metaMark mod orig redo reg restore rewrite sic supplied surplus unclear undo used] model.segLike [m pc phr s seg] line table zone

    <patch> [http://www.tei-c.org/ns/geneticEditions]

    <patch> contains a part of a written surface which was originally physically distinct but became attached to it at the time that one or more written zones were created.
    Module derived-module-geneticTEI
    In addition to global attributes att.coordinated (@start, @ulx, @uly, @lrx, @lry) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage)) att.typed (@type, @subtype)
    binder describes the method by which a patch is or was connected to the main surface
    Status Optional
    Datatype xsd:Name
    Sample values include:
    glue
    patch is glued in place
    pin
    patch is pinned or stapled in place
    sewn
    patch is sewn in place
    flipping indicates whether the patch is attached and folded in such a way as to provide two writing surfaces
    Status Optional
    Datatype xsd:boolean
    height height of the patch in mm
    Status Optional
    Datatype xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
    width width of the patch in mm
    Status Optional
    Datatype xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
    Used by
    May contain
    Declaration
    element patch
    {
       att.coordinated.attributes,
       att.global.attributes,
       att.typed.attributes,
       attribute binder { xsd:Name }?,
       attribute flipping { xsd:boolean }?,
       attribute height
       {
          xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
       }?,
       attribute width
       {
          xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
       }?,
       ( text | model.global | zone )*
    }
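    The following sketch shows a slip of paper glued over part of a page. All coordinate values, dimensions, and text are invented for illustration, and the ge prefix is an assumption; the example further assumes that the surface coordinates are expressed in millimetres so that they agree with width and height.

```xml
<surface xmlns="http://www.tei-c.org/ns/1.0"
         xmlns:ge="http://www.tei-c.org/ns/geneticEditions"
         ulx="0" uly="0" lrx="200" lry="300">
  <zone ulx="10" uly="10" lrx="190" lry="290">Main text of the page</zone>
  <!-- An 80 x 40 mm slip, glued on, with a single writing surface -->
  <ge:patch binder="glue" flipping="false" width="80" height="40"
            ulx="100" uly="200" lrx="180" lry="240">
    <zone>Replacement text written on the pasted slip</zone>
  </ge:patch>
</surface>
```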

    <redo> [http://www.tei-c.org/ns/geneticEditions]

    <redo> points to a marked-up intervention in a text which has subsequently been marked for a second time in a different way.
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage)) att.repeatable (@cause) att.spanning (@spanTo) att.transcriptional (@hand, @status, @seq) (att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) )
    target points to the element representing the intervention which is to be repeated.
    Status Optional
    Datatype xsd:anyURI
    Used by
    May contain Empty element
    Declaration
    element redo
    {
       att.global.attributes,
       att.repeatable.attributes,
       att.spanning.attributes,
       att.transcriptional.attributes,
       attribute target { xsd:anyURI }?,
       empty
    }
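    A hedged sketch, reading the definition above as: a deletion was made once and later marked again by a second hand. The ge prefix, identifiers, rend value, and sample text are assumptions, as is the placement of redo directly after the passage.

```xml
<p xmlns="http://www.tei-c.org/ns/1.0"
   xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  This is <del xml:id="d1" rend="strikethrough">not</del> the case.
  <!-- Empty pointer element: the deletion d1 was marked a second time. -->
  <ge:redo target="#d1" hand="#h2"/>
</p>
```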

    <rewrite> [http://www.tei-c.org/ns/geneticEditions]

    <rewrite> contains a sequence of text which has been rewritten by the author, for example by over-inking, to clarify or fix it.
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage)) att.repeatable (@cause) att.spanning (@spanTo) att.transcriptional (@hand, @status, @seq) (att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) )
    Used by
    May contain
    Declaration
    element rewrite
    {
       att.global.attributes,
       att.repeatable.attributes,
       att.spanning.attributes,
       att.transcriptional.attributes,
       macro.paraContent
    }
    Note
    Multiple rewritings are indicated by nesting one rewrite within another. In principle, a rewriting differs from a substitution in that second and subsequent rewrites do not materially alter the content of an element. Where minor changes are made during the rewriting, however, these may be marked up using del, add, etc., with an appropriate value for the stage attribute.
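    The nesting described in the note might look like the following sketch, in which the same word has been gone over twice. The ge prefix, hand identifier, and sample text are hypothetical.

```xml
<ge:line xmlns="http://www.tei-c.org/ns/1.0"
         xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  The
  <!-- Outer element = second rewriting; inner element = first rewriting -->
  <ge:rewrite hand="#h1"><ge:rewrite hand="#h1">word</ge:rewrite></ge:rewrite>
  was over-inked twice.
</ge:line>
```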

    <stageNote> [http://www.tei-c.org/ns/geneticEditions]

    <stageNote> documents a particular stage in the genesis of a text.
    Module derived-module-geneticTEI
    In addition to global attributes att.datable (att.datable.w3c (@period, @when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage)) att.typed (@type, @subtype)
    target points to one or more elements that belong to this stage.
    Status Optional
    Datatype 1–∞ occurrences of  xsd:anyURI separated by whitespace
    Used by
    May contain
    Declaration
    element stageNote
    {
       att.datable.attributes,
       att.editLike.attributes,
       att.global.attributes,
       att.typed.attributes,
       attribute target { list { xsd:anyURI, xsd:anyURI* } }?,
       ( macro.specialPara | stageNote )*
    }

    <stageNotes> [http://www.tei-c.org/ns/geneticEditions]

    <stageNotes> contains one or more descriptions of the stages which have been identified in the genesis of a text.
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage)) att.typed (@type, @subtype)
    ordered indicates whether or not the order in which the children of this element are presented is significant
    Status Optional
    Datatype xsd:boolean
    Used by
    May contain
    derived-module-geneticTEI: stageNote stageNotes
    Declaration
    element stageNotes
    {
       att.global.attributes,
       att.typed.attributes,
       attribute ordered { xsd:boolean }?,
       ( stageNotes | stageNote )+
    }
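    As a sketch only, an ordered pair of stages could be documented as follows. The ge prefix, the stage identifiers, the targets, and the descriptions are assumptions introduced for illustration.

```xml
<ge:stageNotes xmlns:ge="http://www.tei-c.org/ns/geneticEditions"
               ordered="true">
  <!-- ordered="true": the first child precedes the second in time -->
  <ge:stageNote xml:id="s1" target="#l1 #l2">First draft, in pencil.</ge:stageNote>
  <ge:stageNote xml:id="s2" target="#a1">Later revision, in ink.</ge:stageNote>
</ge:stageNotes>
```

    Each target is a whitespace-separated list of pointers to the elements that belong to the stage being described.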

    <subst>

    <subst> (substitution) groups one or more deletions with one or more additions when the combination is to be regarded as a single intervention in the text.
    Module transcr
    In addition to global attributes att.transcriptional (@hand, @status, @seq) (att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) )
    type The type of substitution.
    Status Optional
    Datatype xsd:Name
    Used by
    May contain
    derived-module-geneticTEI: metaMark mod redo rewrite undo used
    textcrit: app
    Declaration
    element subst
    {
       attribute type { xsd:Name }?,
       att.global.attributes,
       att.transcriptional.attributes,
       ( ( model.pPart.transcriptional ), ( text | model.pPart.transcriptional )+ )
    }
    Example
    ... are all included. <del hand="#RG">It is</del>
    <subst>
     <add>T</add>
     <del>t</del>
    </subst>he expressed
    Note
    Although a substitution may contain any mixture of additions and deletions, there should be an addition for each deletion bearing the same sequence number. This constraint cannot be modelled in the schema language currently deployed.

    <surface>

    <surface> defines a written surface in terms of a rectangular coordinate space, optionally grouping one or more graphic representations of that space, and rectangular zones of interest within it.
    Module transcr
    In addition to global attributes att.typed (@type, @subtype) att.coordinated (@start, @ulx, @uly, @lrx, @lry) att.declaring (@decls)
    Used by
    May contain
    Declaration
    element surface
    {
       att.global.attributes,
       att.typed.attributes,
       att.coordinated.attributes,
       att.declaring.attributes,
       ( model.global | model.glossLike | model.graphicLike | zone | patch )*
    }
    Example
    <facsimile>
     <surface
       ulx="0"
       uly="0"
       lrx="200"
       lry="300">

      <graphic url="Bovelles-49r.png"/>
     </surface>
    </facsimile>
    Note
    The surface element represents a rectangular area of any physical surface forming part of the source material. This may be a sheet of paper, one face of a monument, a billboard, a papyrus scroll, or indeed any 2-dimensional surface.
    The coordinate space defined by this element may be thought of as a grid lrx - ulx units wide and lry - uly units high. This grid is superimposed on the whole of any image directly contained by the surface element. The coordinate values used by every zone element contained by this surface are to be understood with reference to the same grid.
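    To make the grid concrete, the surface in the example above runs from (0, 0) to (200, 300), giving a grid 200 units wide and 300 units high. The sketch below adds a zone covering the top half of that grid; the zone and its coordinates are assumptions added for illustration.

```xml
<surface xmlns="http://www.tei-c.org/ns/1.0"
         ulx="0" uly="0" lrx="200" lry="300">
  <graphic url="Bovelles-49r.png"/>
  <!-- Top half of the surface: full width, y from 0 to 150 -->
  <zone ulx="0" uly="0" lrx="200" lry="150"/>
</surface>
```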

    <table>

    <table> contains text displayed in tabular form, in rows and columns. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/FT.html#FTTAB1
    Module figures
    In addition to global attributes
    rows indicates the number of rows in the table.
    Status Optional
    Datatype xsd:nonNegativeInteger
    Values If no number is supplied, an application must calculate the number of rows.
    Note
    Rows should be presented from top to bottom.
    cols (columns) indicates the number of columns in each row of the table.
    Status Optional
    Datatype xsd:nonNegativeInteger
    Values If no number is supplied, an application must calculate the number of columns.
    Note
    Within each row, columns should be presented left to right.
    Used by
    May contain
    Declaration
    element table
    {
       attribute rows { xsd:nonNegativeInteger }?,
       attribute cols { xsd:nonNegativeInteger }?,
       att.global.attributes,
       ( ( model.headLike | model.global )*, ( row, model.global* )+ )
    }
    Example
    <table rows="4" cols="4">
     <head>Poor Men's Lodgings in Norfolk (Mayhew, 1843)</head>
     <row role="label">
      <cell role="data"/>
      <cell role="data">Dossing Cribs or Lodging Houses</cell>
      <cell role="data">Beds</cell>
      <cell role="data">Needys or Nightly Lodgers</cell>
     </row>
     <row role="data">
      <cell role="label">Bury St Edmund's</cell>
      <cell role="data">5</cell>
      <cell role="data">8</cell>
      <cell role="data">128</cell>
     </row>
     <row role="data">
      <cell role="label">Thetford</cell>
      <cell role="data">3</cell>
      <cell role="data">6</cell>
      <cell role="data">36</cell>
     </row>
     <row role="data">
      <cell role="label">Attleboro'</cell>
      <cell role="data">3</cell>
      <cell role="data">5</cell>
      <cell role="data">20</cell>
     </row>
     <row role="data">
      <cell role="label">Wymondham</cell>
      <cell role="data">1</cell>
      <cell role="data">11</cell>
      <cell role="data">22</cell>
     </row>
    </table>
    Note
    Contains an optional heading and a series of rows.
    Any rendition information should be supplied using the global rend attribute, at the table, row, or cell level as appropriate.

    <transpose> [http://www.tei-c.org/ns/geneticEditions]

    <transpose> describes a single textual transposition as an ordered list of at least two pointers specifying the order in which the elements indicated should be re-combined.
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage))
    Used by
    May contain
    core: ptr
    Declaration
    element transpose { att.global.attributes, ( ptr, ptr+ ) }

    <transposeGrp> [http://www.tei-c.org/ns/geneticEditions]

    <transposeGrp> supplies a list of transpositions indicated at some point in the text, typically by means of metamarks.
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage))
    Used by
    May contain
    derived-module-geneticTEI: transpose
    Declaration
    element transposeGrp { att.global.attributes, transpose+ }
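    A minimal sketch of a single transposition in which two marked segments are to be read in reverse order. The ge prefix, the identifiers, the sample text, and the placement of transposeGrp inside a div are assumptions.

```xml
<div xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  <p><seg xml:id="t1">to boldly</seg> <seg xml:id="t2">go</seg></p>
  <ge:transposeGrp>
    <!-- Pointer order gives the intended reading order: t2, then t1 -->
    <ge:transpose>
      <ptr target="#t2"/>
      <ptr target="#t1"/>
    </ge:transpose>
  </ge:transposeGrp>
</div>
```

    The transcription itself keeps the order found on the page; only the list of pointers records the order the author intended.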

    <unclear>

    <unclear> contains a word, phrase, or passage which cannot be transcribed with certainty because it is illegible or inaudible in the source. http://www.tei-c.org/release/doc/tei-p5-doc/en/html/PH.html#PHDA http://www.tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#COEDADD
    Module core
    In addition to global attributes att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) att.typed (@type, @subtype)
    reason indicates why the material is hard to transcribe.
    Status Optional
    Datatype 1–∞ occurrences of  token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" } separated by whitespace
    Values one or more words describing the difficulty, e.g. faded, background noise, passing truck, illegible, eccentric ductus.
    <div>
     <head>Rx</head>
     <p>500 mg <unclear reason="illegible">placebo</unclear>
     </p>
    </div>
    hand Where the difficulty in transcription arises from action (partial deletion, etc.) assignable to an identifiable hand, signifies the hand responsible for the action.
    Status Optional
    Datatype xsd:anyURI
    Values must be one of the hand identifiers declared in the document header (see section ??).
    agent Where the difficulty in transcription arises from damage, categorizes the cause of the damage, if it can be identified.
    Status Optional
    Datatype xsd:Name
    Sample values include:
    rubbing
    damage results from rubbing of the leaf edges
    mildew
    damage results from mildew on the leaf surface
    smoke
    damage results from smoke
    Used by
    May contain
    Declaration
    element unclear
    {
       attribute reason
       {
          list
          {
             token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" },
             token { pattern = "(\p{L}|\p{N}|\p{P}|\p{S})+" }*
          }
       }?,
       attribute hand { xsd:anyURI }?,
       attribute agent { xsd:Name }?,
       att.global.attributes,
       att.editLike.attributes,
       att.typed.attributes,
       macro.paraContent
    }
    Note
    The same element is used for all cases of uncertainty in the transcription of element content, whether for written or spoken material. For other aspects of certainty, uncertainty, and reliability of tagging and transcription, see chapter ??.
    The damage, gap, del, unclear and supplied elements may be closely allied in use. See section ?? for discussion of which element is appropriate for which circumstance.

    <undo> [http://www.tei-c.org/ns/geneticEditions]

    <undo> points to any marked-up intervention in a text which has subsequently been marked as to be cancelled or undone.
    Module derived-module-geneticTEI
    In addition to global attributes att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage)) att.spanning (@spanTo) att.transcriptional (@hand, @status, @seq) (att.editLike (@instant, @evidence, @source) (att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max)) ) (att.responsibility (@cert, @resp)) )
    target points to the element representing the intervention to be undone.
    Status Optional
    Datatype xsd:anyURI
    Used by
    May contain Empty element
    Declaration
    element undo
    {
       att.global.attributes,
       att.spanning.attributes,
       att.transcriptional.attributes,
       attribute target { xsd:anyURI }?,
       empty
    }
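    By way of illustration, a deletion that the author later cancelled might be encoded as below. The ge prefix, identifiers, and sample text are hypothetical.

```xml
<p xmlns="http://www.tei-c.org/ns/1.0"
   xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  It was <del xml:id="d7">very</del> late.
  <!-- Empty pointer element: the deletion d7 has been cancelled. -->
  <ge:undo target="#d7" hand="#h1"/>
</p>
```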

    <used> [http://www.tei-c.org/ns/geneticEditions]

    <used> identifies a passage of text which has been marked as used, usually meaning that it has been transcribed to a fair copy.
    Module derived-module-geneticTEI
    In addition to global attributes att.spanning (@spanTo) att.global (@xml:id, @n, @xml:lang, @rend, @rendition, @xml:base, @xml:space) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.staged (@stage))
    Used by
    May contain Empty element
    Declaration
    element used { att.spanning.attributes, att.global.attributes, empty }
    Note
    The mark is often a strikethrough, but can be any author-specific mark.
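    Since the element is empty, the extent of the passage has to be given with spanTo. The sketch below assumes an anchor element as the end point; the ge prefix, the rend value, and the identifiers are hypothetical.

```xml
<div xmlns="http://www.tei-c.org/ns/1.0"
     xmlns:ge="http://www.tei-c.org/ns/geneticEditions">
  <!-- Everything from here to #used1-end is struck through as "used". -->
  <ge:used spanTo="#used1-end" rend="diagonal-stroke"/>
  <p>Draft paragraph later copied to the fair copy.</p>
  <anchor xml:id="used1-end"/>
</div>
```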

    <zone>

    <zone> defines a rectangular area contained within a surface element.
    Module transcr
    In addition to global attributes att.coordinated (@start, @ulx, @uly, @lrx, @lry)
    rotate indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent surface element as implied by the dimensions given in the msDesc section or by the coordinates of the surface itself. The orientation is expressed in arc degrees.
    Status Optional
    Datatype xsd:nonNegativeInteger
    Used by
    May contain
    Declaration
    element zone
    {
       attribute rotate { xsd:nonNegativeInteger }?,
       att.global.attributes,
       att.coordinated.attributes,
       ( text | model.zonePart | model.global )*
    }
    Example
    <facsimile>
     <surface
       ulx="50"
       uly="20"
       lrx="400"
       lry="280">

      <zone
        ulx="0"
        uly="0"
        lrx="500"
        lry="321">

       <graphic url="graphic.png"/>
      </zone>
     </surface>
    </facsimile>
    Note
    The position of every zone for a given surface is always defined by reference to the coordinate system defined for that surface. Any graphic element contained by a zone represents the whole of the zone.
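    A hedged sketch of the rotate attribute: a main text block and a marginal note written at a right angle to it, both located on the grid of the parent surface. All coordinates and text are invented, and the ge prefix is an assumption.

```xml
<surface xmlns="http://www.tei-c.org/ns/1.0"
         xmlns:ge="http://www.tei-c.org/ns/geneticEditions"
         ulx="0" uly="0" lrx="200" lry="300">
  <zone ulx="10" uly="20" lrx="160" lry="250">
    <ge:line>Main text</ge:line>
  </zone>
  <!-- Rotated 90 degrees clockwise relative to the surface -->
  <zone rotate="90" ulx="170" uly="20" lrx="195" lry="250">
    <ge:line>Marginal note written sideways</ge:line>
  </zone>
</surface>
```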

    Schema geneticTEI: unchanged components

    TEI: (TEI document) contains a single TEI-conformant document, comprising a TEI header and a text, either in isolation or as part of a teiCorpus element.
    ab: (anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph.
    abbr: (abbreviation) contains an abbreviation of any sort.
    accMat: (accompanying material) contains details of any significant additional material which may be closely associated with the manuscript being described, such as non-contemporaneous documents or fragments bound in with the manuscript at some earlier historical period.
    acquisition: contains any descriptive or other information concerning the process by which a manuscript or manuscript part entered the holding institution.
    actor: Name of an actor appearing within a cast list.
    add: (addition) contains letters, words, or phrases inserted in the text by an author, scribe, annotator, or corrector.
    addSpan: (added span of text) marks the beginning of a longer sequence of text added by an author, scribe, annotator or corrector (see also add).
    additional: groups additional information, combining bibliographic information about a manuscript, or surrogate copies of it with curatorial or administrative information.
    additions: contains a description of any significant additions found within a manuscript, such as marginalia or other annotations.
    addrLine: (address line) contains one line of a postal address.
    address: contains a postal address, for example of a publisher, an organization, or an individual.
    adminInfo: (administrative information) contains information about the present custody and availability of the manuscript, and also about the record description itself.
    affiliation: (affiliation) contains an informal description of a person's present or past affiliation with some organization, for example an employer or sponsor.
    alt: (alternation) identifies an alternation or a set of choices among elements or passages.
    altGrp: (alternation group) groups a collection of alt elements and possibly pointers.
    altIdentifier: (alternative identifier) contains an alternative or former structured identifier used for a manuscript, such as a former catalogue number.
    am: (abbreviation marker) contains a sequence of letters or signs present in an abbreviation which are omitted or replaced in the expanded form of the abbreviation.
    anchor: (anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element.
    app: (apparatus entry) contains one entry in a critical apparatus, with an optional lemma and at least one reading.
    appInfo: (application information) records information about an application which has edited the TEI file.
    application: provides information about an application which has acted upon the document.
    argument: A formal list or prose description of the topics addressed by a subdivision of a text.
    att.ascribed: provides attributes for elements representing speech or action that can be ascribed to a specific individual.
    att.canonical: provides attributes which can be used to associate a representation such as a name or title with canonical information about the object being named or referenced.
    att.coordinated: elements which can be positioned within a two dimensional coordinate system.
    att.damaged: provides attributes describing the nature of any physical damage affecting a reading.
    att.datable: provides attributes for normalization of elements that contain dates, times, or datable events.
    att.datable.iso: provides attributes for normalization of elements that contain datable events using the ISO 8601 standard.
    att.datable.w3c: provides attributes for normalization of elements that contain datable events using the W3C datatypes.
    att.declarable: provides attributes for those elements in the TEI Header which may be independently selected by means of the special purpose decls attribute.
    att.declaring: provides attributes for elements which may be independently associated with a particular declarable element within the header, thus overriding the inherited default for that element.
    att.dimensions: provides attributes for describing the size of physical objects.
    att.divLike: provides attributes common to all elements which behave in the same way as divisions.
    att.global.analytic: provides additional global attributes for associating specific analyses or interpretations with appropriate portions of a text.
    att.global.facs: groups elements corresponding with all or part of an image, because they contain an alternative representation of it, typically but not necessarily a transcription of it.
    att.global.linking: defines a set of attributes for hypertext and other linking, which are enabled for all elements when the additional tag set for linking is selected.
    att.handFeatures: provides attributes describing aspects of the hand in which a manuscript is written.
    att.internetMedia: provides attributes for specifying the type of a computer resource using a standard taxonomy.
    att.interpLike: provides attributes for elements which represent a formal analysis or interpretation.
    att.measurement: provides attributes to represent a regularized or normalized measurement.
    att.msExcerpt: (manuscript excerpt) provides attributes used to describe excerpts from a manuscript placed in a description thereof.
    att.naming: provides attributes common to elements which refer to named persons, places, organizations etc.
    att.personal: (attributes for components of personal names) common attributes for those elements which form part of a personal name.
    att.placement: provides attributes for describing where on the source page or object a textual element appears.
    att.pointing: defines a set of attributes used by all elements which point to other elements by means of one or more URI references.
    att.pointing.group: defines a set of attributes common to all elements which enclose groups of pointer elements.
    att.ranging: provides attributes for describing numerical ranges.
    att.rdgPart: attributes for elements which mark the beginning or ending of a fragmentary manuscript or other witness.
    att.repeatable:
    att.responsibility: provides attributes indicating who is responsible for something asserted by the markup and the degree of certainty associated with it.
    att.scoping: provides attributes for selecting particular elements within a document by means of XPath.
    att.segLike: provides attributes for elements used for arbitrary segmentation.
    att.sourced: provides attributes identifying the source edition from which some encoded feature derives.
    att.spanning: provides attributes for elements which delimit a span of text by pointing mechanisms rather than by enclosing it.
    att.tableDecoration: provides attributes used to decorate rows or cells of a table.
    att.textCritical: defines a set of attributes common to all elements representing variant readings in text critical work.
    att.transcriptional: provides attributes specific to elements encoding authorial or scribal intervention in a text when transcribing manuscript or similar sources.
    att.translatable: provides attributes used to indicate the status of a translatable portion of an ODD document.
    att.typed: provides attributes which can be used to classify or subclassify elements in any way.
    author: in a bibliographic reference, contains the name(s) of the author(s), personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority.
    authority: (release authority) supplies the name of a person or other agency responsible for making an electronic file available, other than a publisher or distributor.
    availability: supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, etc.
    back: (back matter) contains any appendixes, etc. following the main part of a text.
    bibl: (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.
    biblFull: (fully-structured bibliographic citation) contains a fully-structured bibliographic citation, in which all components of the TEI file description are present.
    biblScope: (scope of citation) defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work.
    binding: contains a description of one binding, i.e. type of covering, boards, etc. applied to a manuscript.
    bindingDesc: (binding description) describes the present and former bindings of a manuscript, either as a series of paragraphs or as a series of distinct binding elements, one for each binding of the manuscript.
    body: (text body) contains the whole body of a single unitary text, excluding any front or back matter.
    byline: contains the primary statement of responsibility given for a work on its title page or at the head or end of the work.
    cRefPattern: (canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI.
    camera: describes a particular camera angle or viewpoint in a screen play.
    caption: contains the text of a caption or other text displayed as part of a film script or screenplay.
    castGroup: (cast list grouping) groups one or more individual castItem elements within a cast list.
    castItem: (cast list item) contains a single entry within a cast list, describing either a single role or a list of non-speaking roles.
    castList: (cast list) contains a single cast list or dramatis personae.
    catDesc: (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal textDesc.
    catRef: (category reference) specifies one or more defined categories within some taxonomy or text typology.
    catchwords: describes the system used to ensure correct ordering of the quires making up a codex or incunable, typically by means of annotations at the foot of the page.
    category: contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy.
    cb: (column break) marks the boundary between one column of a text and the next in a standard reference system.
    cell: contains one cell of a table.
    certainty: indicates the degree of certainty associated with some aspect of the text markup.
    change: summarizes a particular change or correction made to a particular version of an electronic text which is shared between several researchers.
    char: (character) provides descriptive information about a character.
    charDecl: (character declarations) provides information about nonstandard characters and glyphs.
    charName: (character name) contains the name of a character, expressed following Unicode conventions.
    charProp: (character property) provides a name and value for some property of the parent character or glyph.
    choice: groups a number of alternative encodings for the same point in a text.
    cit: (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example.
    classCode: (classification code) contains the classification code used for this text in some standard classification system.
    classDecl: (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.
    climate: (climate) contains information about the physical climate of a place.
    closer: groups together salutations, datelines, and similar phrases appearing as a final group at the end of a division, especially of a letter.
    collation: contains a description of how the leaves or bifolia are physically arranged.
    collection: contains the name of a collection of manuscripts, not necessarily located within a single repository.
    colophon: contains the colophon of a manuscript item: that is, a statement providing information regarding the date, place, agency, or reason for production of the manuscript.
    condition: contains a description of the physical condition of the manuscript.
    corr: (correction) contains the correct form of a passage apparently erroneous in the copy text.
    correction: (correction principles) states how and under what circumstances corrections have been made in the text.
    country: (country) contains the name of a geo-political unit, such as a nation, country, colony, or commonwealth, larger than or administratively superior to a region and smaller than a bloc.
    custEvent: (custodial event) describes a single event during the custodial history of a manuscript.
    custodialHist: (custodial history) contains a description of a manuscript's custodial history, either as running prose or as a series of dated custodial events.
    damage: contains an area of damage to the text witness.
    damageSpan: (damaged span of text) marks the beginning of a longer sequence of text which is damaged in some way but still legible.
    date: contains a date in any format.
    dateline: contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer.
    decoDesc: (decoration description) contains a description of the decoration of a manuscript, either as a sequence of paragraphs, or as a sequence of topically organised decoNote elements.
    decoNote: (note on decoration) contains a note describing either a decorative component of a manuscript, or a fairly homogeneous class of such components.
    del: (deletion) contains a letter, word, or passage deleted, marked as deleted, or otherwise indicated as superfluous or spurious in the copy text by an author, scribe, annotator, or corrector.
    delSpan: (deleted span of text) marks the beginning of a longer sequence of text deleted, marked as deleted, or otherwise signaled as superfluous or spurious by an author, scribe, annotator, or corrector.
    depth: contains a measurement measured across the spine of a book or codex, or (for other text-bearing objects) perpendicular to the measurement given by the ‘width’ element.
    desc: (description) contains a brief description of the object documented by its parent element, including its intended usage, purpose, or application where this is appropriate.
    dim: contains any single measurement forming part of a dimensional specification of some sort.
    dimensions: contains a dimensional specification.
    distinct: identifies any word or phrase which is regarded as linguistically distinct, for example as archaic, technical, dialectal, non-preferred, etc., or as forming part of a sublanguage.
    distributor: supplies the name of a person or other agency responsible for the distribution of a text.
    div: (text division) contains a subdivision of the front, body, or back of a text.
    divGen: (automatically generated text division) indicates the location at which a textual division generated automatically by a text-processing application is to appear.
    docAuthor: (document author) contains the name of the author of the document, as given on the title page (often but not always contained in a byline).
    docDate: (document date) contains the date of a document, as given (usually) on a title page.
    docEdition: (document edition) contains an edition statement as presented on a title page of a document.
    docImprint: (document imprint) contains the imprint statement (place and date of publication, publisher name), as given (usually) at the foot of a title page.
    docTitle: (document title) contains the title of a document, including all its constituents, as given on a title page.
    eLeaf: (leaf or terminal node of an embedding tree) provides explicitly for a leaf of an embedding tree, which may also be encoded with the eTree element.
    eTree: (embedding tree) provides an alternative to the tree element for representing ordered rooted tree structures.
    edition: (edition) describes the particularities of one edition of a text.
    editionStmt: (edition statement) groups information relating to one edition of a text.
    editor: secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution or organization (or of several such) acting as editor, compiler, translator, etc.
    editorialDecl: (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text.
    email: (electronic mail address) contains an e-mail address identifying a location to which e-mail messages can be delivered.
    emph: (emphasized) marks words or phrases which are stressed or emphasized for linguistic or rhetorical effect.
    encodingDesc: (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived.
    epigraph: contains a quotation, anonymous or attributed, appearing at the start of a section or chapter, or on a title page.
    epilogue: contains the epilogue to a drama, typically spoken by an actor out of character, possibly in association with a particular performance or venue.
    ex: (editorial expansion) contains a sequence of letters added by an editor or transcriber when expanding an abbreviation.
    expan: (expansion) contains the expansion of an abbreviation.
    explicit: contains the explicit of a manuscript item, that is, the closing words of the text proper, exclusive of any rubric or colophon which might follow it.
    extent: describes the approximate size of a text as stored on some carrier medium, whether digital or non-digital, specified in any convenient units.
    facsimile: contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text.
    figDesc: (description of figure) contains a brief prose description of the appearance or content of a graphic figure, for use when documenting an image without displaying it.
    figure: groups elements representing or containing graphic information such as an illustration or figure.
    fileDesc: (file description) contains a full bibliographic description of an electronic file.
    filiation: contains information concerning the manuscript's filiation, i.e. its relationship to other surviving manuscripts of the same text, its protographs, antigraphs and apographs.
    finalRubric: contains the string of words that denotes the end of a text division, often with an assertion as to its author and title, usually set off from the text itself by red ink, by a different size or type of script, or by some other such visual device.
    floatingText: contains a single text of any kind, whether unitary or composite, which interrupts the text containing it at any point and after which the surrounding text resumes.
    foliation: describes the numbering system or systems used to count the leaves or pages in a codex.
    foreign: (foreign) identifies a word or phrase as belonging to some language other than that of the surrounding text.
    forest: provides for groups of rooted trees.
    forestGrp: (forest group) provides for groups of forests.
    formula: contains a mathematical or other formula.
    front: (front matter) contains any prefatory matter (headers, title page, prefaces, dedications, etc.) found at the start of a document, before the main body.
    funder: (funding body) specifies the name of an individual, institution, or organization responsible for the funding of a project or text.
    fw: (forme work) contains a running head (e.g. a header, footer), catchword, or similar material appearing on the current page.
    g: (character or glyph) represents a non-standard character or glyph.
    gap: (gap) indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible.
    geoDecl: (geographic coordinates declaration) documents the notation and the datum used for geographic coordinates expressed as content of the <geo> element elsewhere within the document.
    gloss: identifies a phrase or word used to provide a gloss or definition for some other word or phrase.
    glyph: (character glyph) provides descriptive information about a character glyph.
    glyphName: (character glyph name) contains the name of a glyph, expressed following Unicode conventions for character names.
    graph: encodes a graph, which is a collection of nodes, and arcs which connect the nodes.
    graphic: indicates the location of an inline graphic, illustration, or figure.
    group: contains the body of a composite text, grouping together a sequence of distinct texts (or groups of such texts) which are regarded as a unit for some purpose, for example the collected works of an author, a sequence of prose essays, etc.
    handDesc: (description of hands) contains a description of all the different kinds of writing used in a manuscript.
    handNote: (note on hand) describes a particular style or hand distinguished within a manuscript.
    handNotes: contains one or more handNote elements documenting the different hands identified within the source texts.
    handShift: marks the beginning of a sequence of text written in a new hand, or the beginning of a scribal stint.
    height: contains a measurement measured along the axis at right angles to the bottom of the written surface, i.e. parallel to the spine for a codex or book.
    heraldry: contains a heraldic formula or phrase, typically found as part of a blazon, coat of arms, etc.
    hi: (highlighted) marks a word or phrase as graphically distinct from the surrounding text, for reasons concerning which no claim is made.
    history: groups elements describing the full history of a manuscript or manuscript part.
    hyphenation: summarizes the way in which hyphenation in a source text has been treated in an encoded version of it.
    iNode: (intermediate (or internal) node) represents an intermediate (or internal) node of a tree.
    idno: (identifying number) supplies any number or other identifier used to identify a bibliographic item in a standardized way.
    imprimatur: contains a formal statement authorizing the publication of a work, sometimes required to appear on a title page or its verso.
    incipit: contains the incipit of a manuscript item, that is the opening words of the text proper, exclusive of any rubric which might precede it, of sufficient length to identify the work uniquely; such incipits were, in former times, frequently used as a means of reference to a work, in place of a title.
    index: (index entry) marks a location to be indexed for whatever purpose.
    institution: contains the name of an organization such as a university or library, with which a manuscript is identified, generally its holding institution.
    interp: (interpretation) summarizes a specific interpretative annotation which can be linked to a span of text.
    interpGrp: (interpretation group) collects together a set of related interpretations which share responsibility or type.
    interpretation: describes the scope of any analytic or interpretive information added to the text in addition to the transcription.
    item: contains one component of a list.
    join: identifies a possibly fragmented segment of text, by pointing at the possibly discontiguous elements which compose it.
    joinGrp: (join group) groups a collection of join elements and possibly pointers.
    keywords: contains a list of keywords or phrases identifying the topic or nature of a text.
    l: (verse line) contains a single, possibly incomplete, line of verse.
    label: contains the label associated with an item in a list; in glossaries, marks the term being defined.
    lacunaEnd: indicates the end of a lacuna in a mostly complete textual witness.
    lacunaStart: indicates the beginning of a lacuna in the text of a mostly complete textual witness.
    langUsage: (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text.
    language: characterizes a single language or sublanguage used within a text.
    layout: describes how text is laid out on the page, including information about any ruling, pricking, or other evidence of page-preparation techniques.
    layoutDesc: (layout description) collects the set of layout descriptions applicable to a manuscript.
    lb: (line break) marks the start of a new (typographic) line in some edition or version of a text.
    leaf: encodes the leaves (terminal nodes) of a tree.
    lem: (lemma) contains the lemma, or base text, of a textual variation.
    lg: (line group) contains a group of verse lines functioning as a formal unit, e.g. a stanza, refrain, verse paragraph, etc.
    linkGrp: (link group) defines a collection of associations or hypertextual links.
    list: (list) contains any sequence of items organized as a list.
    listBibl: (citation list) contains a list of bibliographic citations of any kind.
    listEvent: (list of events) contains a list of descriptions, each of which provides information about an identifiable event.
    listNym: (list of canonical names) contains a list of nyms, that is, standardized names for any thing.
    listWit: (witness list) lists definitions for all the witnesses referred to by a critical apparatus, optionally grouped hierarchically.
    localName: (locally-defined property name) contains a locally defined name for some property.
    locus: defines a location within a manuscript or manuscript part, usually as a (possibly discontinuous) sequence of folio references.
    locusGrp: groups a number of locations which together form a distinct but discontinuous item within a manuscript or manuscript part, according to a specific foliation.
    m: (morpheme) represents a grammatical morpheme.
    macro.limitedContent: (paragraph content) defines the content of prose elements that are not used for transcription of extant materials.
    macro.paraContent: (paragraph content) defines the content of paragraphs and similar elements.
    macro.phraseSeq: (phrase sequence) defines a sequence of character data and phrase-level elements.
    macro.phraseSeq.limited: (limited phrase sequence) defines a sequence of character data and those phrase-level elements that are not typically used for transcribing extant documents.
    macro.specialPara: ('special' paragraph content) defines the content model of elements such as notes or list items, which either contain a series of component-level elements or else have the same structure as a paragraph, containing a series of phrase-level and inter-level elements.
    macro.xtext: (extended text) defines a sequence of character data and gaiji elements.
    mapping: (character mapping) contains one or more characters which are related to the parent character or glyph in some respect, as specified by the type attribute.
    material: contains a word or phrase describing the material of which a manuscript (or part of a manuscript) is composed.
    measure: contains a word or phrase referring to some quantity of an object or commodity, usually comprising a number, a unit, and a commodity name.
    measureGrp: (measure group) contains a group of dimensional specifications which relate to the same object, for example the height and width of a manuscript page.
    mentioned: marks words or phrases mentioned, not used.
    model.addrPart: groups elements such as names or postal codes which may appear as part of a postal address.
    model.addressLike: groups elements used to represent a postal or e-mail address.
    model.applicationLike: groups elements used to record application-specific information about a document in its header.
    model.biblLike: groups elements containing a bibliographic description.
    model.biblPart: groups elements which represent components of a bibliographic description.
    model.castItemPart: groups component elements of an entry in a cast list, such as dramatic role or actor's name.
    model.catDescPart: groups component elements of the TEI Header Category Description.
    model.choicePart: groups elements (other than choice itself) which can be used within a choice alternation.
    model.common: groups common chunk- and inter-level elements.
    model.dateLike: groups elements containing temporal expressions.
    model.dimLike: groups elements which describe a measurement forming part of the physical dimensions of some object.
    model.div1Like: groups top-level structural divisions.
    model.divBottom: groups elements appearing at the end of a text division.
    model.divBottomPart: groups elements which can occur only at the end of a text division.
    model.divGenLike: groups elements used to represent a structural division which is generated rather than explicitly present in the source.
    model.divLike: groups elements used to represent un-numbered generic structural divisions.
    model.divPart: groups paragraph-level elements appearing directly within divisions.
    model.divTop: groups elements appearing at the beginning of a text division.
    model.divTopPart: groups elements which can occur only at the beginning of a text division.
    model.divWrapper: groups elements which can appear at either top or bottom of a textual division.
    model.editorialDeclPart: groups elements which may be used inside editorialDecl and appear multiple times.
    model.egLike: groups elements containing examples or illustrations.
    model.emphLike: groups phrase-level elements which are typographically distinct and to which a specific function can be attributed.
    model.encodingDescPart: groups elements which may be used inside encodingDesc and appear multiple times.
    model.entryPart: groups elements appearing at any level within a dictionary entry.
    model.entryPart.top: groups high level elements within a structured dictionary entry.
    model.frontPart: groups elements which appear at the level of divisions within front or back matter.
    model.frontPart.drama: groups elements which appear at the level of divisions within front or back matter of performance texts only.
    model.global: groups elements which may appear at any point within a TEI text.
    model.global.edit: groups globally available elements which perform a specifically editorial function.
    model.global.meta: groups globally available elements which describe the status of other elements.
    model.glossLike: groups elements which provide an alternative name, explanation, or description for any markup construct.
    model.graphicLike: groups elements containing images, formulae, and similar objects.
    model.headLike: groups elements used to provide a title or heading at the start of a text division.
    model.highlighted: groups phrase-level elements which are typographically distinct.
    model.imprintPart: groups the bibliographic elements which occur inside imprints.
    model.inter: groups elements which can appear either within or between paragraph-like elements.
    model.lLike: groups elements representing metrical components such as verse lines.
    model.labelLike: groups elements used to gloss or explain other parts of a document.
    model.limitedPhrase: groups phrase-level elements excluding those elements primarily intended for transcription of existing sources.
    model.listLike: groups list-like elements.
    model.measureLike: groups elements which denote a number, a quantity, a measurement, or similar piece of text that conveys some numerical meaning.
    model.milestoneLike: groups milestone-style elements used to represent reference systems.
    model.msItemPart: groups elements which can appear within a manuscript item description.
    model.msQuoteLike: groups elements which represent passages such as titles quoted from a manuscript as a part of its description.
    model.nameLike: groups elements which name or refer to a person, place, or organization.
    model.nameLike.agent: groups elements which contain names of individuals or corporate bodies.
    model.noteLike: groups globally-available note-like elements.
    model.pLike: groups paragraph-like elements.
    model.pLike.front: groups paragraph-like elements which can occur as direct constituents of front matter.
    model.pPart.data: groups phrase-level elements containing names, dates, numbers, measures, and similar data.
    model.pPart.edit: groups phrase-level elements for simple editorial correction and transcription.
    model.persStateLike: groups elements describing changeable characteristics of a person which have a definite duration, for example occupation, residence, or name.
    model.personPart: groups elements which form part of the description of a person.
    model.phrase: groups elements which can occur at the level of individual words or phrases.
    model.physDescPart: groups specialised elements forming part of the physical description of a manuscript or similar written source.
    model.placeNamePart: groups elements which form part of a place name.
    model.placeStateLike: groups elements which describe changing states of a place.
    model.placeTraitLike: groups elements which describe unchanging traits of a place.
    model.profileDescPart: groups elements which may be used inside profileDesc and appear multiple times.
    model.ptrLike: groups elements used for purposes of location and reference.
    model.publicationStmtPart: groups elements which may appear within the publicationStmt element of the TEI Header.
    model.qLike: groups elements related to highlighting which can appear either within or between chunk-level elements.
    model.quoteLike: groups elements used to directly contain quotations.
    model.rdgLike: groups elements which contain a single reading, other than the lemma, within a textual variation.
    model.rdgPart: groups elements which mark the beginning or ending of a fragmentary manuscript or other witness.
    model.resourceLike: groups non-textual elements which may appear together with a header and a text to constitute a TEI document.
    model.respLike: groups elements which are used to indicate intellectual or other significant responsibility, for example within a bibliographic element.
    model.sourceDescPart: groups elements which may be used inside sourceDesc and appear multiple times.
    model.stageLike: groups elements containing stage directions or similar things defined by the module for performance texts.
    model.teiHeaderPart: groups high level elements which may appear more than once in a TEI Header.
    model.titlepagePart: groups elements which can occur as direct constituents of a title page, such as docTitle, docAuthor, docImprint, or epigraph.
    move: (movement) marks the actual entrance or exit of one or more characters on stage.
    msContents: (manuscript contents) describes the intellectual content of a manuscript or manuscript part, either as a series of paragraphs or as a series of structured manuscript items.
    msDesc: (manuscript description) contains a description of a single identifiable manuscript or other text-bearing object.
    msIdentifier: (manuscript identifier) contains the information required to identify the manuscript being described.
    msItem: (manuscript item) describes an individual work or item within the intellectual content of a manuscript or manuscript part.
    msItemStruct: (structured manuscript item) contains a structured description for an individual work or item within the intellectual content of a manuscript or manuscript part.
    msName: (alternative name) contains any form of unstructured alternative name used for a manuscript, such as an ‘ocellus nominum’, or nickname.
    msPart: (manuscript part) contains information about an originally distinct manuscript or part of a manuscript, now forming part of a composite manuscript.
    musicNotation: contains description of type of musical notation.
    name: (name, proper noun) contains a proper noun or noun phrase.
    node: encodes a node, a possibly labeled point in a graph.
    normalization: indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form.
    note: contains a note or annotation.
    notesStmt: (notes statement) collects together any notes providing information about a text additional to that recorded in other parts of the bibliographic description.
    num: (number) contains a number, written in any form.
    objectDesc: contains a description of the physical components making up the object which is being described.
    opener: groups together dateline, byline, salutation, and similar phrases appearing as a preliminary group at the start of a division, especially of a letter.
    orig: (original form) contains a reading which is marked as following the original, rather than being normalized or corrected.
    origDate: (origin date) contains any form of date, used to identify the date of origin for a manuscript or manuscript part.
    origPlace: (origin place) contains any form of place name, used to identify the place of origin for a manuscript or manuscript part.
    origin: contains any descriptive or other information concerning the origin of a manuscript or manuscript part.
    p: (paragraph) marks paragraphs in prose.
    pb: (page break) marks the boundary between one page of a text and the next in a standard reference system.
    pc: (punctuation character) a character or string of characters regarded as constituting a single punctuation mark.
    performance: contains a section of front or back matter describing how a dramatic piece is to be performed in general or how it was performed on some specific occasion.
    persName: (personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including any or all of the person's forenames, surnames, honorifics, added names, etc.
    phr: (phrase) represents a grammatical phrase.
    physDesc: (physical description) contains a full physical description of a manuscript or manuscript part, optionally subdivided using more specialised elements from the model.physDescPart class.
    placeName: contains an absolute or relative place name.
    postscript: contains a postscript, e.g. to a letter.
    precision: indicates the numerical accuracy or precision associated with some aspect of the text markup.
    principal: (principal researcher) supplies the name of the principal researcher responsible for the creation of an electronic text.
    profileDesc: (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting.
    projectDesc: (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.
    prologue: contains the prologue to a drama, typically spoken by an actor out of character, possibly in association with a particular performance or venue.
    provenance: contains any descriptive or other information concerning a single identifiable episode during the history of a manuscript or manuscript part, after its creation but before its acquisition.
    ptr: (pointer) defines a pointer to another location.
    pubPlace: (publication place) contains the name of the place where a bibliographic item was published.
    publicationStmt: (publication statement) groups information concerning the publication or distribution of an electronic or other text.
    publisher: provides the name of the organization responsible for the publication or distribution of a bibliographic item.
    q: (separated from the surrounding text with quotation marks) contains material which is marked as (ostensibly) being somehow different than the surrounding text, for any one of a variety of reasons including, but not limited to: direct speech or thought, technical terms or jargon, authorial distance, quotations from elsewhere, and passages that are mentioned but not used.
    quotation: specifies editorial practice adopted with respect to quotation marks in the original.
    quote: (quotation) contains a phrase or passage attributed by the narrator or author to some agency external to the text.
    rdg: (reading) contains a single reading within a textual variation.
    rdgGrp: (reading group) within a textual variation, groups two or more readings perceived to have a genetic relationship or other affinity.
    recordHist: (recorded history) provides information about the source and revision status of the parent manuscript description itself.
    ref: (reference) defines a reference to another location, possibly modified by additional text or comment.
    refState: (reference state) specifies one component of a canonical reference defined by the milestone method.
    refsDecl: (references declaration) specifies how canonical references are constructed for this text.
    reg: (regularization) contains a reading which has been regularized or normalized in some sense.
    relatedItem: contains or references some other bibliographic item which is related to the present one in some specified manner, for example as a constituent or alternative version of it.
    relation: (relationship) describes any kind of relationship or linkage amongst a specified group of participants.
    relationGrp: (relation group) provides information about relationships identified amongst people, places, and organizations, either informally as prose or as formally expressed relation links.
    repository: contains the name of a repository within which manuscripts are stored, possibly forming part of an institution.
    resp: (responsibility) contains a phrase describing the nature of a person's intellectual responsibility.
    respStmt: (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply.
    respons: (responsibility) identifies the individual(s) responsible for some aspect of the markup of particular element(s).
    restore: indicates restoration of text to an earlier state by cancellation of an editorial or authorial marking or instruction.
    revisionDesc: (revision description) summarizes the revision history for a file.
    role: the name of a dramatic role, as given in a cast list.
    roleDesc: (role description) describes a character's role in a drama.
    root: (root node) represents the root node of a tree.
    row: contains one row of a table.
    rs: (referencing string) contains a general purpose name or referring string.
    rubric: contains the text of any rubric or heading attached to a particular manuscript item, that is, a string of words through which a manuscript signals the beginning of a text division, often with an assertion as to its author and title, which is in some way set off from the text itself, usually in red ink, or by use of different size or type of script, or some other such visual device.
    s: (s-unit) contains a sentence-like division of a text.
    said: (speech or thought) indicates passages thought or spoken aloud, whether explicitly indicated in the source or not, whether directly or indirectly reported, whether by real people or fictional characters.
    salute: (salutation) contains a salutation or greeting prefixed to a foreword, dedicatory epistle, or other division of a text, or the salutation in the closing of a letter, preface, etc.
    samplingDecl: (sampling declaration) contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection.
    seal: contains a description of one seal or similar attachment applied to a manuscript.
    sealDesc: (seal description) describes the seals or other external items attached to a manuscript, either as a series of paragraphs or as a series of distinct seal elements, possibly with additional decoNotes.
    secFol: (second folio) The word or words taken from a fixed point in a codex (typically the beginning of the second leaf) in order to provide a unique identifier for it.
    seg: (arbitrary segment) represents any segmentation of text below the ‘chunk’ level.
    segmentation: describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc.
    series: (series information) contains information about the series in which a book or other bibliographic item has appeared.
    seriesStmt: (series statement) groups information about the series, if any, to which a publication belongs.
    set: (setting) contains a description of the setting, time, locale, appearance, etc., of the action of a play, typically found in the front matter of a printed performance text (not a stage direction).
    settlement: contains the name of a settlement such as a city, town, or village identified as a single geo-political or administrative unit.
    sic: (Latin for thus or so) contains text reproduced although apparently incorrect or inaccurate.
    signatures: contains discussion of the leaf or quire signatures found within a codex.
    signed: (signature) contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text.
    soCalled: contains a word or phrase for which the author or narrator indicates a disclaiming of responsibility, for example by the use of scare quotes or italics.
    sound: describes a sound effect or musical sequence specified within a screen play or radio script.
    source: describes the original source for the information contained within a manuscript description.
    sourceDesc: (source description) describes the source from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as "born digital" for a text which has no previous existence.
    sp: (speech) An individual speech in a performance text, or a passage presented as such in a prose or verse text.
    space: indicates the location of a significant space in the copy text.
    span: associates an interpretative annotation directly with a span of text.
    spanGrp: (span group) collects together span tags.
    speaker: A specialized form of heading or label, giving the name of one or more speakers in a dramatic text or fragment.
    stage: (stage direction) contains any kind of stage direction within a dramatic text or fragment.
    stamp: contains a word or phrase describing a stamp or similar device.
    summary: contains an overview of the available information concerning some aspect of an item (for example, its intellectual content, history, layout, typography etc.) as a complement or alternative to the more detailed information carried by more specific elements.
    supplied: signifies text supplied by the transcriber or editor for any reason, typically because the original cannot be read because of physical damage or loss to the original.
    support: contains a description of the materials etc. which make up the physical support for the written part of a manuscript.
    supportDesc: (support description) groups elements describing the physical support for the written part of a manuscript.
    surplus: (superfluous text) marks text present in the source which the editor believes to be superfluous or redundant.
    surrogates: contains information about any non-digital representations of the manuscript being described which may exist in the holding institution or elsewhere.
    taxonomy: defines a typology used to classify texts either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy.
    tech: (technical stage direction) describes a special-purpose stage direction that is not meant for the actors.
    teiCorpus: contains the whole of a TEI encoded corpus, comprising a single corpus header and one or more TEI elements, each containing a single text header and a text.
    teiHeader: (TEI Header) supplies the descriptive and declarative information making up an electronic title page prefixed to every TEI-conformant text.
    term: contains a single-word, multi-word, or symbolic designation which is regarded as a technical term.
    text: contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus sample.
    textClass: (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.
    textLang: (text language) in a manuscript description, describes the languages and writing systems identified within the manuscript being described.
    time: contains a phrase defining a time of day in any format.
    timeline: (timeline) provides a set of ordered points in time which can be linked to elements of a spoken text to create a temporal alignment of that text.
    title: contains a title for any kind of work.
    titlePage: (title page) contains the title page of a text, appearing within the front or back matter.
    titlePart: contains a subsection or division of the title of a work, as indicated on a title page.
    titleStmt: (title statement) groups information about the title of a work and those responsible for its intellectual content.
    trailer: contains a closing title or footer appearing at the end of a division of a text.
    tree: encodes a tree, which is made up of a root, internal nodes, leaves, and arcs from root to leaves.
    triangle: (underspecified embedding tree, so called because of its characteristic shape when drawn) Provides for an underspecified eTree, that is, an eTree with information left out.
    typeDesc: contains a description of the typefaces or other aspects of the printing of an incunable or other printed source.
    typeNote: describes a particular font or other significant typographic feature distinguished within the description of a printed resource.
    unicodeName: (unicode property name) contains the name of a registered Unicode normative or informative property.
    value: (value) contains a single value for some property, attribute, or other analysis.
    variantEncoding: declares the method used to encode text-critical variants.
    view: describes the visual context of some part of a screen play in terms of what the spectator sees, generally independent of any dialogue.
    watermark: contains a word or phrase describing a watermark or similar device.
    when: indicates a point in time either relative to other elements in the same timeline tag, or absolutely.
    width: contains a measurement measured along the axis parallel to the bottom of the written surface, i.e. perpendicular to the spine of a book or codex.
    wit: contains a list of one or more sigla of witnesses attesting a given reading, in a textual variation.
    witDetail: (witness detail) gives further information about a particular witness, or witnesses, to a particular reading.
    witEnd: (fragmented witness end) indicates the end, or suspension, of the text of a fragmentary witness.
    witStart: (fragmented witness start) indicates the beginning, or resumption, of the text of a fragmentary witness.
    witness: contains either a description of a single witness referred to within the critical apparatus, or a list of witnesses which is to be referred to by a single sigil.
    Notes
    1.
    See also the TEI’s (implicit) position on this point: ‘we define markup, or (synonymously) encoding, as any means of making explicit an interpretation of a text’ (TEI Guidelines: v. A Gentle Introduction to XML). See also Robinson and Solopova 1993/1997: 21: ‘Any primary textual source… has its own semiotic system within it. […] The two semiotic systems are materially distinct, in that text written by hand is not the same as the text on the computer screen’.
    2.
    As in the case where a document describes the ordering of parts of a text contained in another document. This is the case, for instance, in Beckett's That Time, where the speeches of A, B and C are obsessively first subdivided and subsequently shuffled and reshuffled by means of sequences of letters and numbers contained in a number of documents.
    3.
    Manzoni, for instance, used to modify an old draft to see how a new variant fitted with the context before copying it into a new draft.
    4.
    Graph-like data structures have many applications. An early proposal to use such a formalism for handling textual variation is Sperberg-McQueen, C.M. (1989). A directed-graph data structure for text manipulation. In: ICCH/ALLC Conference. The Dynamic Text at the University of Toronto. http://www.w3.org/People/cmsmcq/1989/rhine-delta-abstract.html. For a more recent one, see Desmond Schmidt and Robert Colomb (2009). A data structure for representing multi-version texts online. In: International Journal of Human-Computer Studies, Volume 67, Issue 6 (June 2009), 497-514.

    Lou Burnard, Fotis Jannidis, Elena Pierazzo, Malte Rehbein. Date: Revised Draft