5 Overall Structure of a Text

5.1 Front, Body, and Back Matter

A text is divided into front matter, body, and back matter, tagged front, body, and back respectively. Front and back matter differ only in their location: they can contain exactly the same kinds of material. The overall structure of a typical book, for example, would be something like this:

 
<text>
<front> <!-- front matter here:  title
page, dedication, preface, etc. ... -->
</front>
<body>  <!-- main body of edition here ... -->
</body>
<back>  <!-- back matter here:  index,
bibliography, etc. ... -->
</back>
</text>

5.2 Text Divisions

Within the body, or within the front and back matter, a text may be subdivided into text divisions (parts, chapters, sections; acts, scenes; cantos, stanzas; etc.). All such divisions are tagged with the single element div; subdivisions are tagged with nested div elements. The type attribute may be used to indicate the name or type of a division; a later division that does not specify a type takes the same value as the division before it. Within a text division, paragraphs or paragraph-level elements (e.g. note, list) may occur.

 
<div type='Section' n=1>
<p>The eighteenth article of amendment
to the Constitution of the United States
is hereby repealed.</p></div>
<div n=2><p>The transportation or importation
into any State, Territory, or possession of
the United States for delivery or use
therein of intoxicating liquors, in
violation of the laws thereof, is hereby
prohibited.</p></div>
<div n=3><p>This article shall be inoperative
unless it shall have been ratified as an
amendment to the Constitution by conventions
in the several States, as provided in the
Constitution, within seven years from
the date of the submission hereof to the
States by the Congress.</p></div>

When a text division has no heading, or has only a heading consisting of its type value and a number, no heading need be given, as shown above. If desired, however, the heading may be given explicitly with the head element:

 
<div type='Section' n=1>
<head>Section 1.</head>
<p>The eighteenth article of amendment
to the Constitution of the United States
is hereby repealed.</p></div>
<div n=2><head>Section 2.</head>
<p>The transportation ...</p></div>
<div n=3><head>Section 3.</head>
<p>This article shall be inoperative
unless ...</p></div>

The headings in the preceding example are fixed text (the word Section followed by the value of the n attribute), which any moderately intelligent SGML software could generate mechanically. In general, document management is more convenient, and results are more consistent, if such material is not transcribed as part of the text, but is generated by software when the text is displayed or printed. Inconsistency in the source, of course, may be of interest, and if so it should be captured explicitly.
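As a rough illustration of how software might generate such headings, the following Python sketch builds each heading from the type and n values of a division, carrying the type value forward to later divisions that do not specify one. The function name generate_headings and the use of dictionaries to hold attribute values are assumptions made for this sketch; they are not part of the TEI scheme, and real SGML software would read the values from the parsed document.

```python
from typing import Dict, Iterable, List, Optional


def generate_headings(divisions: Iterable[Dict[str, str]]) -> List[str]:
    """Build a heading of the form '<type> <n>.' for each division.

    A division without a type value takes the most recently
    specified one (hypothetical helper, for illustration only).
    """
    headings: List[str] = []
    current_type: Optional[str] = None
    for div in divisions:
        # An explicit type value replaces the inherited one.
        current_type = div.get("type", current_type)
        if current_type is None:
            raise ValueError("no type value given or inherited")
        headings.append(f"{current_type} {div['n']}.")
    return headings


# Attribute values of the three divisions in the preceding example.
divisions = [{"type": "Section", "n": "1"}, {"n": "2"}, {"n": "3"}]
headings = generate_headings(divisions)
print(headings)  # ['Section 1.', 'Section 2.', 'Section 3.']
```

Generating the headings this way keeps the encoded text free of redundant material, at the cost of losing any irregularity in the headings of the source.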

[The full TEI encoding scheme includes specialized elements for anthologies (texts containing other texts), epigraphs, datelines, bylines, salutations, signatures, and groups of headings, datelines, etc. at the beginning or ending of a text division.]

5.3 Title Pages

The TEI encoding scheme defines specialized tags for transcribing title pages, in order to ensure that processing software can easily locate and identify the author, title, and date of the document as given on its title page. The title page itself, and its major component parts, are illustrated in this example:

 
<titlePage>
<docTitle>
<titlePart type='main'>The Public Papers
and Addresses of
Franklin D. Roosevelt</titlePart>
<titlePart type='sub'>With a special introduction
and explanatory notes by
President Roosevelt</titlePart>
<titlePart type='vol number'>Volume Two</titlePart>
<titlePart type='vol title'>
The Year of Crisis
1933</titlePart>
</docTitle>
<docImprint>
  <publisher>Random House</publisher>
  <pubPlace>New York</pubPlace>
  <docDate>1938</docDate>
</docImprint>
</titlePage>

The titlePart element is used both for the different parts of the document title (as shown) and for miscellaneous parts of the title page which are neither document title, nor document author, nor imprint information.
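The following Python sketch suggests how processing software might locate the title and imprint details in a title page tagged as above. It assumes the title page is available as well-formed XML (attribute values quoted, every element closed), which is true of the example but not of SGML in general, and it uses the standard xml.etree.ElementTree module. The names TITLE_PAGE, normalize, and describe_title_page are invented for this sketch, and the title page is abbreviated.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

# Abbreviated copy of the title page example, assumed well-formed XML.
TITLE_PAGE = """<titlePage>
<docTitle>
<titlePart type='main'>The Public Papers
and Addresses of
Franklin D. Roosevelt</titlePart>
<titlePart type='vol number'>Volume Two</titlePart>
</docTitle>
<docImprint>
  <publisher>Random House</publisher>
  <pubPlace>New York</pubPlace>
  <docDate>1938</docDate>
</docImprint>
</titlePage>"""


def normalize(text: str) -> str:
    """Collapse line breaks and runs of spaces into single spaces."""
    return " ".join(text.split())


def describe_title_page(source: str) -> Dict[str, Optional[str]]:
    """Collect the main title and imprint details from a titlePage."""
    root = ET.fromstring(source)
    # Index the title parts by the value of their type attribute.
    parts = {
        part.get("type"): normalize("".join(part.itertext()))
        for part in root.iter("titlePart")
    }
    return {
        "main_title": parts.get("main"),
        "publisher": normalize(root.findtext("docImprint/publisher", default="")),
        "place": normalize(root.findtext("docImprint/pubPlace", default="")),
        "date": normalize(root.findtext("docImprint/docDate", default="")),
    }


info = describe_title_page(TITLE_PAGE)
print(info["main_title"])  # The Public Papers and Addresses of Franklin D. Roosevelt
print(info["date"])        # 1938
```

Because each component has its own element, the software needs no guesswork about layout or typography to tell the title from the imprint.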

[In addition to the tags shown here, the full TEI scheme defines a docEdition element for tagging information like "second revised and expanded edition".]