4 Default Text Structure

This chapter describes the default high-level structure for TEI documents. A full TEI document combines metadata describing it, represented by a teiHeader element, with the document itself, represented by one or more text elements or other elements taken from the model.resource class. That is, the TEI element is used to group together metadata about an encoded resource (in teiHeader, specified by the header module, which is fully described in chapter 2 The TEI Header) with an encoded resource. Possible encoded resources are:

a logical transcription of a source document in a text element; the text element is specified along with its high-level constituents in the textstructure module and described in the remainder of the current chapter
a diplomatic transcription of a source document in a sourceDoc element, which is specified in the transcr module and described in chapter 11 Representation of Primary Sources
an encoded representation of a text-bearing object as images in a facsimile element, which is also specified in the transcr module and described in chapter 11 Representation of Primary Sources
a collection of contextual information or annotations that provides more detail about another encoded resource (whether in the same or a different TEI document) in a standOff element, which is specified in the linking module and described in section 16.10 The standOff Container
a feature system declaration, used to declare the fs elements employed in the rest of the document, in an fsdDecl element, which is specified in the iso-fs module and described in section 18.11 Feature System Declaration

Where more than one resource related to the same source document shares the same metadata, the resources may be grouped together in a single TEI element, following a single teiHeader.
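As an illustration of one header shared by several resources, the following minimal sketch groups a facsimile and a text under a single teiHeader. The title, the placeholder prose, and the image file name page1.png are hypothetical; only the element names and the TEI namespace are taken from the Guidelines.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- one header, shared by both resources below -->
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample document (hypothetical title)</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished sketch.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Invented for illustration; no real source.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <!-- page images of the text-bearing object (transcr module) -->
  <facsimile>
    <graphic url="page1.png"/>
  </facsimile>
  <!-- logical transcription of the same source (textstructure module) -->
  <text>
    <body>
      <p>Transcribed text goes here.</p>
    </body>
  </text>
</TEI>
```

A schema that accepts this sketch would need to include the transcr module in addition to the modules for the header and the default text structure.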

Because a TEI element can be a child of another TEI element, a set or collection of documents may be represented by an outermost TEI element containing a teiHeader with metadata applicable to the entire set or collection, followed by a complete TEI element for each document in it. Each of these inner TEI elements contains a teiHeader with metadata applicable to the individual document, and one or more text or other elements taken from the model.resource class.
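The nesting just described might be sketched as follows. Header content is elided with comments, so the fragment is well-formed but would not validate as written; the paragraph text is placeholder material.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- metadata applicable to the whole collection -->
  </teiHeader>
  <TEI>
    <teiHeader>
      <!-- metadata applicable to the first document only -->
    </teiHeader>
    <text>
      <body>
        <p>Text of the first document.</p>
      </body>
    </text>
  </TEI>
  <TEI>
    <teiHeader>
      <!-- metadata applicable to the second document only -->
    </teiHeader>
    <text>
      <body>
        <p>Text of the second document.</p>
      </body>
    </text>
  </TEI>
</TEI>
```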

A variant on this basic form, the teiCorpus, is also defined for the representation of language corpora, or other collections of encoded texts. A teiCorpus consists of its own metadata in a teiHeader, followed by one or more complete TEI elements, each combining a teiHeader with one or more elements from the model.resource class. This permits the encoder to distinguish metadata applicable to the whole collection of encoded texts, which is represented by the outermost teiHeader, from that applicable to each of the individual TEI elements within the corpus. Further information about the organization and encoding of language corpora is given in chapter 15 Language Corpora.
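A teiCorpus of two texts could be laid out along these lines. This is a sketch only: header content is elided with comments and the paragraph text is a placeholder, so the fragment is well-formed rather than schema-valid.

```xml
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- corpus header: metadata for the whole corpus -->
  </teiHeader>
  <TEI>
    <teiHeader>
      <!-- text header: metadata for the first text -->
    </teiHeader>
    <text>
      <body>
        <p>First corpus text.</p>
      </body>
    </text>
  </TEI>
  <TEI>
    <teiHeader>
      <!-- text header: metadata for the second text -->
    </teiHeader>
    <text>
      <body>
        <p>Second corpus text.</p>
      </body>
    </text>
  </TEI>
</teiCorpus>
```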

Alternatively, a corpus may be represented by a TEI element (perhaps with a type attribute whose value is corpus), structured in the same manner as a teiCorpus.

In summary, when the default structure module is included in a schema, the following elements are available for the representation of the outermost structure of a TEI document:

TEI (TEI document) contains a single TEI-conformant document, combining a single TEI header with one or more members of the model.resource class. Multiple TEI elements may be combined within a TEI (or teiCorpus) element.
version specifies the version number of the TEI Guidelines against which this document is valid.
teiCorpus contains the whole of a TEI encoded corpus, comprising a single corpus header and one or more TEI elements, each containing a single text header and a text.
teiHeader (TEI header) supplies descriptive and declarative metadata associated with a digital resource or set of resources.
text contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus sample.

As noted above, the teiHeader element is formally declared in the header module (see chapter 2 The TEI Header). A TEI document may also contain elements from the model.resource class (such as a collection of facsimile images, or a feature system declaration) if the appropriate module is included in a schema (see further 11.1 Digital Facsimiles and 18.11 Feature System Declaration respectively). By default, however, this class is not populated and hence only the elements TEI, text, and teiCorpus are available as major parts of a TEI document. These three elements are provided by the textstructure module described by the present chapter.

TEI texts may be regarded either as unitary, that is, forming an organic whole, or as composite, that is, consisting of several components which are in some important sense independent of each other. The distinction is not always entirely obvious: for example a collection of essays might be regarded as a single item in some circumstances, or as a number of distinct items in others. In such borderline cases, the encoder must choose whether to treat the text as unitary or composite; each may have advantages and disadvantages in a given situation.

Whether unitary or composite, the text is marked with the text tag and may contain front matter, a text body, and back matter. In unitary texts, the text body is tagged body; in composite texts, where the text body consists of a series of subordinate texts or groups, it is tagged group. The overall structure of any text, unitary or composite, is thus defined by the following elements:

front (front matter) contains any prefatory matter (headers, abstracts, title page, prefaces, dedications, etc.) found at the start of a document, before the main body.
body (text body) contains the whole body of a single unitary text, excluding any front or back matter.
group contains the body of a composite text, grouping together a sequence of distinct texts (or groups of such texts) which are regarded as a unit for some purpose, for example the collected works of an author, a sequence of prose essays, etc.
back (back matter) contains any appendixes, etc. following the main part of a text.
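For the composite case, a text whose body is a group of two unitary texts might be sketched as below. The header is elided and all content is placeholder material, so this shows only how the elements nest.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- header content elided -->
  </teiHeader>
  <text>
    <front>
      <!-- front matter for the composite text, if any -->
    </front>
    <group>
      <text>
        <front>
          <!-- front matter of the first unitary text, if any -->
        </front>
        <body>
          <p>Body of the first unitary text.</p>
        </body>
        <back>
          <!-- back matter of the first unitary text, if any -->
        </back>
      </text>
      <text>
        <body>
          <p>Body of the second unitary text.</p>
        </body>
      </text>
    </group>
    <back>
      <!-- back matter for the composite text, if any -->
    </back>
  </text>
</TEI>
```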

The overall structure of a unitary text is:
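A minimal sketch of that structure, with comments standing in for the header, front matter, and back matter, and a placeholder paragraph in the body:

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <!-- header content elided -->
  </teiHeader>
  <text>
    <front>
      <!-- front matter of the text, if any, goes here -->
    </front>
    <body>
      <p>Body of the text goes here.</p>
    </body>
    <back>
      <!-- back matter of the text, if any, goes here -->
    </back>
  </text>
</TEI>
```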

type characterizes the element in some sense, using any convenient classification scheme or typology.
subtype provides a sub-categorization of the element, if needed.

org (organization) specifies how the content of the division is organized.
sample indicates whether this division is a sample of the original source and if so, from which part.

model.divTopPart groups elements which can occur only at the beginning of a text division.
model.divWrapper groups elements which can appear at either top or bottom of a textual division.

model.divBottomPart groups elements which can occur only at the end of a text division.
model.divWrapper groups elements which can appear at either top or bottom of a textual division.

opener groups together dateline, byline, salutation, and similar phrases appearing as a preliminary group at the start of a division, especially of a letter.
signed (signature) contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text.

closer groups together salutations, datelines, and similar phrases appearing as a final group at the end of a division, especially of a letter.
postscript contains a postscript, e.g. to a letter.
signed (signature) contains the closing salutation, etc., appended to a foreword, dedicatory epistle, or other division of a text.
trailer contains a closing title or footer appearing at the end of a division of a text.

argument contains a formal list or prose description of the topics addressed by a subdivision of a text.
byline contains the primary statement of responsibility given for a work on its title page or at the head or end of the work.
dateline contains a brief description of the place, date, time, etc. of production of a letter, newspaper story, or other work, prefixed or suffixed to it as a kind of heading or trailer.
docAuthor (document author) contains the name of the author of the document, as given on the title page (often but not always contained in a byline).
docDate (document date) contains the date of a document, as given on a title page or in a dateline.
epigraph contains a quotation, anonymous or attributed, appearing at the start or end of a section or on a title page.
meeting contains the formalized descriptive title for a meeting or conference, for use in a bibliographic description for an item derived from such a meeting, or as a heading or preamble to publications emanating from it.
salute (salutation) contains a salutation or greeting prefixed to a foreword, dedicatory epistle, or other division of a text, or the salutation in the closing of a letter, preface, etc.

P5: Guidelines for Electronic Text Encoding and Interchange

4 Default Text Structure

4.1 Divisions of the Body
4.1.1 Un-numbered Divisions
4.1.2 Numbered Divisions
4.1.3 Numbered or Un-numbered?
4.1.4 Partial and Composite Divisions
4.2 Elements Common to All Divisions
4.2.1 Headings and Trailers
4.2.2 Openers and Closers
4.2.3 Arguments, Epigraphs, and Postscripts
4.2.4 Content of Textual Divisions
4.3 Grouped and Floating Texts
4.3.1 Grouped Texts
4.3.2 Floating Texts
4.4 Virtual Divisions
4.5 Front Matter
4.6 Title Pages
4.7 Back Matter
4.8 Module for Default Text Structure