7.1. Kernel / Additions
The guidelines will provide for
- the encoding of the text itself
- the documentation of the text's source
- the documentation of the encoding itself and its peculiarities
The provisions of the guidelines can be divided into a central core
or kernel of principles and tags applicable to all texts or to the
great majority of texts (“general-purpose tags”) and various sets
of tags or encoding conventions applicable to texts in specific
languages or scripts, texts of specific text types, or texts encoded for
specific disciplinary purposes (“special-purpose tags”). Within
each set of tags, distinctions can be made between recommended and
optional practices, but only general-purpose tags will be recommended
for all texts. Each set of tags devised for a specific language,
script, text type or discipline may itself comprise a kernel of common
tags and one or more sets of optional tags extending the kernel. When
these sets of tags are used consistently in groups, the encoding
practice of individual texts can be described by listing the tag sets
used (e.g. “encoded with basic set of general-purpose tags plus level
3 of metrical and level B of lexical tags”).
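The layering described above (a kernel of general-purpose tags, plus special-purpose tag sets that each have their own kernel and optional levels) can be sketched as a small data model. The following Python is only an illustration under stated assumptions, not part of the guidelines: the names TagSet and describe_encoding, the sample tag sets, and the level labels are hypothetical and simply mirror the example description given in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TagSet:
    """A named tag set: a kernel plus optional extension levels (hypothetical model)."""
    name: str                      # e.g. "general-purpose", "metrical", "lexical"
    general_purpose: bool = False  # True only for the core applicable to all texts
    levels: tuple = ()             # labels of optional extensions, e.g. ("1", "2", "3")


def describe_encoding(selection):
    """Build a one-line description from (tag_set, level) pairs.

    A level of None means that only the kernel ("basic set") of that tag set is used.
    """
    parts = []
    for tag_set, level in selection:
        if level is None:
            parts.append(f"basic set of {tag_set.name} tags")
        elif level in tag_set.levels:
            parts.append(f"level {level} of {tag_set.name} tags")
        else:
            raise ValueError(f"{tag_set.name} has no level {level!r}")
    if not parts:
        raise ValueError("at least one tag set must be listed")
    text = "encoded with " + parts[0]
    if len(parts) > 1:
        text += " plus " + " and ".join(parts[1:])
    return text


# Hypothetical tag sets, assumed here purely for illustration.
general = TagSet("general-purpose", general_purpose=True)
metrical = TagSet("metrical", levels=("1", "2", "3"))
lexical = TagSet("lexical", levels=("A", "B"))

# Only general-purpose sets would be recommended for all texts.
recommended_for_all = [s.name for s in (general, metrical, lexical) if s.general_purpose]

description = describe_encoding([(general, None), (metrical, "3"), (lexical, "B")])
print(description)
```

Because the description is derived from the list of tag sets rather than written by hand, two texts that claim the same tag sets and levels would be described identically, which is the point of using the sets consistently in groups.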
7.2. Draft Table of Contents
A draft table of contents for the guidelines follows:
- 1 Principles of Text Encoding
- 1.1 Why Markup Is Necessary at All
(A brief discussion about functions of descriptive markup,
why it is not presentational, etc.)
- 1.2 The Advantages of Standardized Markup
- 2 About These Guidelines
- 2.1 Intended Applications
(Database/retrieval/analysis as well as printing and
formatting. Research community rather than commercial.
Relevance to language industries.)
- 2.2 Design Principles
(How features are defined and described in the Guidelines.)
- 2.3 Structure of the Guidelines
(Base features, optional features, "boxes" and their
base and optional features or "levels of description".
Document prolog and document body.)
- 3 SGML Markup
- 3.1 Principles and Definitions
(Introduction to SGML: tags, elements, content models,
document type declarations. Alternatives to DTDs.
SGML declarations in general.)
- 3.2 SGML Declarations for the TEI Guidelines
(Description of SGML features used in TEI Guidelines,
text of formal SGML Declaration for TEI texts.)
- 3.3 Non-SGML Declarations for TEI Texts
(How to declare what tags you have used. How to declare
use of specific levels of description, substitution of tag
names, etc. Cross-reference to later sections for details
of declarations for pre-defined material; cross-reference
to chapter 9 for full details on declaring modifications
and extensions.)
- 4 Characters and Character Sets
- 4.1 Principles and Definitions
(Characters, character sets, character repertoires.
What is ASCII. Seven-bit and eight-bit ASCII.
Standard ways of extending ASCII. Vendor-specific
non-standard extended ASCIIs. SGML-supported
character sets. EBCDIC. IBM PC character set.
Macintosh, Mac extensions by other vendors. Adobe
PostScript. Transliteration schemes. Entity
references.)
- 4.2 Recommendations
(For character sets, names of ISO sets and EBCDIC code
pages. Possibly include recommended transliterations,
sample USEMAP and CHARSET declarations, etc.? For
entities, simply list recommended name, description and
appearance of various special characters. Possibly
relegate detailed lists and code-pages to appendices.)
- 4.2.1 Recommended Character Sets
- 4.2.2 Recommended Entity Names
- 4.2.3 Declaring New Character Sets or Character Entities
- 5 Bibliographic Control of Electronic Texts
- 5.1 Principles and Definitions
- 5.2 Recommended Features and Tags
(Bibliographic identification of machine-readable text.
Bibliographic identification of source text(s).
Documenting changes to the source text during pre-editing
or data entry. Documenting changes to the machine-readable
text.)
- 5.3 Correspondence between Recommended Tags and MARC fields
- 6 Features Common to All Texts
- 6.1 Principles and Definitions
- 6.2 Recommended Features and Tags
- 6.2.1 Basic Text Structure
(Front matter, body, back matter, chapters, sections,
etc. down to paragraph level.)
- 6.2.2 Non-structural Text Segments
(Features below paragraph level, including highlighting,
emphasis, quotation, index entries, special layout,
language and script, illustrations ...)
- 6.2.3 Figures and Tables
- 6.2.4 Bibliographic References
- 6.2.5 Critical Apparatus
- 6.2.6 Parallel Texts
- 6.2.7 Cross Reference and Textual Links
- 7 Features for Specific Text Types
- 7.1 Principles and Definitions
- 7.2 Recommended Features and Tags
- 7.2.1 Mixed Corpora
- 7.2.2 Literary Texts
- 7.2.3 Technical and Scientific Texts
- 7.2.4 Historical Documents
- 7.2.5 Dictionaries and Lexica
- 7.2.6 Transcripts of Spoken Texts
- 8 Analytic and Interpretive Features
- 8.1 Principles and Definitions
- 8.2 Recommended Features and Tags
- 8.2.1 Syntactic Features
- 8.2.2 Morphological Features
- 8.2.3 Phonological Features
- 8.2.4 Lexical Features
- 9 Extending the Guidelines
- 9.1 Modifying the Guidelines
(Substituting short forms or different names for tags or
attributes. Using a tag with a different meaning.
Changing legal attribute values. Restricting where tags
can occur; allowing tags to occur in new places; doing
away with syntactic restrictions altogether.)
- 9.2 Defining Additional Features
(Adding new tags: defining where they can occur, what
can occur inside them, and what they mean. Defining new
attributes for old or new tags.)
- 9.3 Worked Example
- 10 Full Alphabetical List of Features
(A summary for each recommended tag, giving
name, definition, description, associated features and
brief example of usage)
- 11 Translation Table
(shows equivalent name in each EC language for every
name listed in sections 3.2.2 and 9. Omit in 1990?)
- 12 Use of the Guidelines for Document Interchange
(Portability. Different needs of document capture, storage,
processing, and interchange. Reducing danger of character
set confusions during document interchange.)
- Appendices
- A How the Guidelines were Developed
(brief note about TEI structure and history)
- B Mapping from the Guidelines to Other Encoding Schemes
(Translating into the Guidelines from existing encoding
schemes. Translating back out.)
- C Examples of Tagged Texts
The editors, in consultation with the committee heads and the Steering
Committee, will have primary responsibility for sections 1, 2, 12, and A
(Principles of Text Encoding, About the Guidelines, Document Interchange,
and History of the TEI). They will assemble section 10 from the
work of the committees.
The Committee on Text Documentation will have primary responsibility for
section 5 (Bibliographic Control) and will consult with the Committee
on Metalanguage and Syntax Issues on the overall organization of
section 3.3 (Non-SGML Declarations). They will advise the Text
Representation committee on section 6.2.4 (Bibliographic References).
The Committee on Text Representation will have primary responsibility
for sections 4 (Character Sets), 6 (Features Common to All Texts),
and 7.2.1 through 7.2.4 (Corpora, Literature, Technical and Scientific
Documents, Historical Documents), as well as any other sections inserted
into section 7 on further text types. They should consult with the
Committee on Text Documentation on section 6.2.4 (Bibliographic
References) and will partially determine the content of declarations
relevant to their subject domain (section 3.3).
The Committee on Text Analysis and Interpretation will have primary
responsibility for sections 7.2.5 and 7.2.6 (Dictionaries, Spoken Texts)
and section 8 (Analytic and Interpretive Tags). They will also contribute
to section 3.3 (Non-SGML Declarations).
The Committee on Metalanguage and Syntax Issues will have primary
responsibility for sections 3, 9, and B (SGML, Extensions, and Mapping
to Other Schemes). They will collaborate with the Committee on Text
Documentation on section 3.3 (Non-SGML Declarations).
All four working committees will contribute to sections 10 (Full List
of Features) and C (Examples).
7.3. Prescription and Description
Compliance with the guidelines is necessarily a voluntary matter;
use of the term “requirement” in connection with the guidelines
must therefore not be misconstrued. Within the context of the
guidelines, however, the committees will be able to specify a mix of
requirements, recommended practices, optional features and practices,
required choices among defined alternatives (“electives”), and
possible user-defined extensions. Provision will also be made for
documentation of user-specified deviations from the recommendations of
the guidelines.
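The mix of requirements, electives, and documented extensions described above could be checked mechanically once a text declares which features it uses. The sketch below is a hedged illustration only: the function check_usage, the feature names, and the reading of an elective as "at least one of the defined alternatives must be chosen" are assumptions made for this example, not definitions taken from the guidelines.

```python
def check_usage(used, required, electives, known, documented_extensions):
    """Return a list of problems for one text's declared feature usage.

    used                  -- set of feature names the text actually uses
    required              -- set of features every text must use
    electives             -- dict: group name -> set of defined alternatives
    known                 -- set of all features defined by the guidelines
    documented_extensions -- set of user-defined features the text documents
    """
    problems = []
    # Requirements: every required feature must be present.
    for feature in sorted(required - used):
        problems.append(f"missing required feature: {feature}")
    # Electives: assumed here to mean that at least one alternative is chosen.
    for group, alternatives in sorted(electives.items()):
        if not (alternatives & used):
            problems.append(f"elective '{group}': no alternative chosen")
    # Extensions: anything outside the guidelines must be documented by the user.
    for feature in sorted(used - known - documented_extensions):
        problems.append(f"undocumented extension: {feature}")
    return problems


# Hypothetical feature names, assumed purely for illustration.
known = {"text", "body", "p", "emph", "charset-iso646", "charset-ebcdic"}
required = {"text", "body"}
electives = {"character set": {"charset-iso646", "charset-ebcdic"}}

problems = check_usage(
    used={"text", "p", "stanza"},
    required=required,
    electives=electives,
    known=known,
    documented_extensions=set(),
)
for line in problems:
    print(line)
```

Recommended and optional features need no check in this sketch, since omitting them is permitted. Compliance remains voluntary, so a report of this kind would describe how a text relates to the guidelines rather than reject it.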