Text Encoding Initiative
20. The Electronic Title Page
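Every TEI text has a header which provides information analogous to that provided by the title page of a printed text. The header is introduced by the element <teiHeader> and has four major parts: a file description (<fileDesc>), an encoding description (<encodingDesc>), a text profile (<profileDesc>), and a revision history (<revisionDesc>).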
A corpus or collection of texts, which share many characteristics, may have one header for the corpus and individual headers for each component of the corpus. In this case the type attribute indicates the type of header.
<teiHeader type="corpus"> introduces the header for corpus-level information.
Some of the header elements contain running prose which consists of one or more <p>s. Others are grouped:
The <fileDesc> element is mandatory. It contains a full bibliographic description of the file with the following elements:
A minimal header has the following structure:
<teiHeader>
  <fileDesc>
    <titleStmt> ... </titleStmt>
    <publicationStmt> ... </publicationStmt>
    <sourceDesc> ... </sourceDesc>
  </fileDesc>
</teiHeader>
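The required structure above can be checked mechanically. The following is a minimal sketch in Python, assuming the header is available as well-formed XML (not SGML with omitted end tags). The function name `missing_header_parts` and both sample headers are hypothetical illustrations, not part of the TEI scheme.

```python
import xml.etree.ElementTree as ET

# Children of <fileDesc> shown in the minimal header structure.
REQUIRED_IN_FILEDESC = ("titleStmt", "publicationStmt", "sourceDesc")


def missing_header_parts(header_xml):
    """Return the names of mandatory header parts absent from header_xml."""
    root = ET.fromstring(header_xml)
    if root.tag != "teiHeader":
        return ["teiHeader"]
    file_desc = root.find("fileDesc")
    if file_desc is None:
        return ["fileDesc"]
    # find() with a bare tag name looks at direct children only.
    return [name for name in REQUIRED_IN_FILEDESC if file_desc.find(name) is None]


# Hypothetical sample headers, invented for this sketch.
minimal = """
<teiHeader>
  <fileDesc>
    <titleStmt><title>Sample: a machine readable transcription</title></titleStmt>
    <publicationStmt><p>Unpublished sample.</p></publicationStmt>
    <sourceDesc><p>No source: created in machine-readable form.</p></sourceDesc>
  </fileDesc>
</teiHeader>
"""

incomplete = (
    "<teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt>"
    "</fileDesc></teiHeader>"
)

print(missing_header_parts(minimal))
print(missing_header_parts(incomplete))
```

The check covers only the presence of the three elements inside <fileDesc>; it does not validate their content against the TEI DTD.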
The following elements can be used in the <titleStmt>:
It is recommended that the title should distinguish the computer file from the source text, for example:
[title of source]: a machine readable transcription
[title of source]: electronic edition
A machine readable version of: [title of source]
The <respStmt> element contains the following subcomponents:
<titleStmt>
  <title>Two stories by Edgar Allan Poe: a machine readable transcription</title>
  <author>Poe, Edgar Allan (1809-1849)</author>
  <respStmt><resp>compiled by</resp> <name>James D. Benson</name></respStmt>
</titleStmt>
The <editionStmt> groups information relating to one edition of a text (where edition is used as elsewhere in bibliography), and may include the following elements:
<editionStmt> <edition n="U2">Third draft, substantially revised <date>1987</date> </edition> </editionStmt>
The <publicationStmt> is mandatory. It may contain a simple prose description or groups of the elements described below:
At least one of these three elements (<publisher>, <distributor>, or <authority>) must be present, unless the entire publication statement is in prose. The following elements may occur within them:
<publicationStmt>
  <publisher>Oxford University Press</publisher>
  <pubPlace>Oxford</pubPlace>
  <date>1989</date>
  <idno type="ISBN">0-19-254705-5</idno>
  <availability>Copyright 1989, Oxford University Press</availability>
</publicationStmt>
The <notesStmt>, if used, contains one or more <note> elements which contain a note or annotation. Some information found in the notes area in conventional bibliography has been assigned specific elements in the TEI scheme.
The <sourceDesc> is a mandatory element which records details of the source or sources from which the computer file is derived. It may contain simple prose or a bibliographic citation, using one or more of the following elements:
<sourceDesc> <bibl>The first folio of Shakespeare, prepared by Charlton Hinman (The Norton Facsimile, 1968)</bibl> </sourceDesc>
<sourceDesc>
  <scriptStmt id="CNN12">
    <bibl><author>CNN Network News</author> <title>News headlines</title> <date>12 Jun 1989</date></bibl>
  </scriptStmt>
</sourceDesc>
The <encodingDesc> element specifies the methods and editorial principles which governed the transcription of the text. Its use is highly recommended. It may be a prose description or may contain elements from the following list:
<encodingDesc> <projectDesc>Texts collected for use in the Claremont Shakespeare Clinic, June 1990. </projectDesc> </encodingDesc>
<encodingDesc> <samplingDecl>Samples of 2000 words taken from the beginning of the text </samplingDecl> </encodingDesc>
The <editorialDecl> contains a prose description of the practices used when encoding the text. Typically this description should cover such topics as the following, each of which may conveniently be given as a separate paragraph.
<editorialDecl>
  <p>The part of speech analysis applied throughout section 4 was added by hand and has not been checked.</p>
  <p>Errors in transcription controlled by using the WordPerfect spelling checker.</p>
  <p>All words converted to Modern American spelling using Webster's 9th Collegiate dictionary.</p>
  <p>All quotation marks converted to entity references &odq; and &cdq;.</p>
</editorialDecl>
The <tagsDecl> element is used to provide detailed information about the SGML tags actually appearing within a text. It may contain a simple list of elements used, with a count for each, using the following special purpose elements:
The <rendition> element is used to document different ways in which elements are rendered in the source text.
<tagsDecl>
  <tagUsage gi="text" occurs="1"></tagUsage>
  <tagUsage gi="body" occurs="1"></tagUsage>
  <tagUsage gi="p" occurs="12"></tagUsage>
  <tagUsage gi="hi" occurs="6"></tagUsage>
</tagsDecl>
This (imaginary) tags declaration would be appropriate for a text containing twelve paragraphs in its body, within which six <hi> elements have been marked. Note that if the <tagsDecl> element is used, it must contain a <tagUsage> element for every element tagged in the associated <text> element.
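A tags declaration of this kind can be derived by counting the elements that actually occur in the text. The sketch below is one possible way to do so in Python, assuming the text is available as well-formed XML. The helper `tag_usage_lines` and the sample text are hypothetical and are not part of any TEI software.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_usage_lines(text_xml):
    """Count every element in text_xml (root included), one <tagUsage> per tag."""
    root = ET.fromstring(text_xml)
    counts = Counter(element.tag for element in root.iter())
    # Sorted by element name so the output is stable.
    return [
        '<tagUsage gi="%s" occurs="%d"></tagUsage>' % (gi, n)
        for gi, n in sorted(counts.items())
    ]


# Hypothetical sample text, invented for this sketch.
sample = "<text><body><p>One <hi>two</hi></p><p>Three</p></body></text>"

for line in tag_usage_lines(sample):
    print(line)
```

Because every element encountered is counted, the result contains an entry for each element tagged in the text, which is the completeness condition stated above.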
<refsDecl>
  <p>The N attribute on each DIV1 and DIV2 contains the canonical reference for each such division in the form XX.yyy where XX is the book number in roman numerals and yyy is the section number in arabic numerals.</p>
</refsDecl>
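A reference scheme like the one described in this example can be decoded programmatically. The sketch below assumes references are written exactly as XX.yyy, with the book in roman numerals (either case) and the section in arabic digits. The names `parse_reference` and `roman_to_int` are hypothetical, and the roman numeral conversion does not reject malformed numerals such as IIII.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
REFERENCE = re.compile(r"^([IVXLCDM]+)\.(\d+)$")


def roman_to_int(numeral):
    """Convert an upper-case roman numeral to an integer."""
    total = 0
    # Pair each symbol with the one after it; the last is paired with a blank.
    for current, following in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[current]
        # A smaller value placed before a larger one is subtracted (IV, IX, ...).
        if following in ROMAN_VALUES and ROMAN_VALUES[following] > value:
            total -= value
        else:
            total += value
    return total


def parse_reference(ref):
    """Split a reference of the form XX.yyy into (book, section) integers."""
    match = REFERENCE.match(ref.strip().upper())
    if match is None:
        raise ValueError("not a reference of the form XX.yyy: %r" % ref)
    return roman_to_int(match.group(1)), int(match.group(2))


print(parse_reference("IV.012"))
print(parse_reference("xii.3"))
```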
The <classDecl> element groups together definitions or sources for any descriptive classification schemes used by other parts of the header. At least one such scheme must be provided, encoded using the following elements:
In the simplest case, the taxonomy may be defined by a bibliographic reference, as in the following example:
<classDecl> <taxonomy id="LCSH"> <bibl>Library of Congress Subject Headings </bibl> </taxonomy> </classDecl>
<taxonomy id="B">
  <bibl>Brown Corpus</bibl>
  <category id="B.A"><catDesc>Press Reportage</catDesc>
    <category id="B.A1"><catDesc>Daily</catDesc></category>
    <category id="B.A2"><catDesc>Sunday</catDesc></category>
    <category id="B.A3"><catDesc>National</catDesc></category>
    <category id="B.A4"><catDesc>Provincial</catDesc></category>
    <category id="B.A5"><catDesc>Political</catDesc></category>
    <category id="B.A6"><catDesc>Sports</catDesc></category>
    ...
  </category>
  <category id="B.D"><catDesc>Religion</catDesc>
    <category id="B.D1"><catDesc>Books</catDesc></category>
    <category id="B.D2"><catDesc>Periodicals and tracts</catDesc></category>
  </category>
  ...
</taxonomy>
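Nested categories such as these can be flattened into a lookup table from category identifier to description. The following Python sketch assumes a well-formed XML taxonomy in which every <catDesc> has an explicit end tag. The helper `category_descriptions` is hypothetical, and the sample is an abridged copy of the Brown Corpus example.

```python
import xml.etree.ElementTree as ET


def category_descriptions(taxonomy_xml):
    """Map each category id to its <catDesc> text, including nested categories."""
    root = ET.fromstring(taxonomy_xml)
    result = {}
    for category in root.iter("category"):
        # find() looks at direct children only, so a parent category does not
        # pick up the description of one of its sub-categories.
        desc = category.find("catDesc")
        result[category.get("id")] = (desc.text or "").strip() if desc is not None else ""
    return result


# Abridged sample based on the taxonomy example.
sample = """
<taxonomy id="B">
  <bibl>Brown Corpus</bibl>
  <category id="B.A"><catDesc>Press Reportage</catDesc>
    <category id="B.A1"><catDesc>Daily</catDesc></category>
    <category id="B.A2"><catDesc>Sunday</catDesc></category>
  </category>
  <category id="B.D"><catDesc>Religion</catDesc>
    <category id="B.D1"><catDesc>Books</catDesc></category>
  </category>
</taxonomy>
"""

for identifier, description in category_descriptions(sample).items():
    print(identifier, description)
```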
The <profileDesc> element enables information characterizing various descriptive aspects of a text to be recorded within a single framework. It has three optional components:
<creation> <date value="1992-08">August 1992</date> <name type="place">Taos, New Mexico</name> </creation>
The <textClass> element classifies a text by reference to the system or systems defined by the <classDecl> element, and contains one or more of the following elements:
<textClass> <keywords scheme="LCSH"> <list> <item>English literature -- History and criticism -- Data processing.</item> <item>English literature -- History and criticism -- Theory etc.</item> <item>English language -- Style -- Data processing.</item> </list> </keywords> </textClass>
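Since <textClass> classifies a text by reference to schemes defined in <classDecl>, it can be useful to confirm that every scheme named on a <keywords> element has actually been declared. The sketch below assumes that the value of the scheme attribute matches the id of a <taxonomy>, as in the LCSH examples in this chapter. The function `undeclared_schemes`, the sample header, and the "DDC" scheme value are hypothetical.

```python
import xml.etree.ElementTree as ET


def undeclared_schemes(header_xml):
    """Return scheme values used on <keywords> that match no <taxonomy> id."""
    root = ET.fromstring(header_xml)
    declared = {taxonomy.get("id") for taxonomy in root.iter("taxonomy")}
    used = {keywords.get("scheme") for keywords in root.iter("keywords")}
    return sorted(s for s in used if s is not None and s not in declared)


# Hypothetical header: LCSH is declared, DDC is deliberately left undeclared.
header = """
<teiHeader>
  <encodingDesc>
    <classDecl>
      <taxonomy id="LCSH"><bibl>Library of Congress Subject Headings</bibl></taxonomy>
    </classDecl>
  </encodingDesc>
  <profileDesc>
    <textClass>
      <keywords scheme="LCSH">
        <list><item>English language -- Style -- Data processing.</item></list>
      </keywords>
      <keywords scheme="DDC">
        <list><item>Invented entry for an undeclared scheme</item></list>
      </keywords>
    </textClass>
  </profileDesc>
</teiHeader>
"""

print(undeclared_schemes(header))
```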
The <revisionDesc> element provides a change log in which each change made to a text may be recorded. The log may be recorded as a sequence of <change> elements, each of which contains a <date>, a <respStmt> identifying who made the change, and an <item> describing it.
<revisionDesc>
  <change><date>6/3/91:</date> <respStmt><name>EMB</name><resp>ed.</resp></respStmt> <item>File format updated</item></change>
  <change><date>5/25/90:</date> <respStmt><name>EMB</name><resp>ed.</resp></respStmt> <item>Stuart's corrections entered</item></change>
</revisionDesc>