22 Documentation Elements


This chapter describes a module which may be used for the documentation of the XML elements and element classes which make up any markup scheme, in particular that described by the TEI Guidelines, and also for the automatic generation of schemas or DTDs conforming to that documentation. It should be used also by those wishing to customize or modify these Guidelines in a conformant manner, as further described in chapters 23.2 Personalization and Customization and 23.3 Conformance and may also be useful in the documentation of any other comparable encoding scheme, even though it contains some aspects which are specific to the TEI and may not be generally applicable.

An overview of the kind of processing environment envisaged for the module described by this chapter may be helpful. In the remainder of this chapter we refer to software which provides such a processing environment as an ODD processor. Like any other piece of XML software, an ODD processor may be instantiated in many ways: the current system uses a number of XSLT stylesheets which are freely available from the TEI, but this specification makes no particular assumptions about the tools which will be used to provide an ODD processing environment.

As the name suggests, an ODD processor uses a single XML document to generate multiple outputs. These outputs will include:

The input required to generate these outputs consists of running prose, and special purpose elements documenting the components (elements, classes, etc.) which are to be declared in the chosen schema language. All of this input is encoded in XML using the module defined by this chapter. In order to support more than one schema language, this module uses a comparatively high-level model which can then be mapped by an ODD processor to the specific constructs appropriate to the schema language in use. Although some modern schema languages such as RELAX NG or W3C Schema natively support self-documentary features of this kind, we have chosen to retain the ODD model, if only for reasons of compatibility with earlier versions of these Guidelines. We do however use the ISO standard XML schema language RELAX NG (http://www.relaxng.org) as a means of declaring content models, rather than inventing a completely new XML-based representation for them.

In the TEI abstract model, a markup scheme (a schema) consists of a number of discrete modules, which can be combined more or less as required. Each major chapter of these Guidelines defines a distinct module. Each module declares a number of elements specific to that module, and may also populate particular classes. All classes are declared globally; particular modules extend the meaning of a class by adding elements or attributes to it. Wherever possible, element content models are defined in terms of classes rather than in terms of specific elements. Modules can also declare particular patterns, which act as short-cuts for commonly used content models or class references.

In the present chapter, we discuss the elements needed to support this system. In addition, section 22.1 Phrase Level Documentary Elements discusses some general purpose elements which may be useful in any kind of technical documentation, wherever there is need to talk about technical features of an XML encoding such as element names and attributes. Section 22.2 Modules and Schemas discusses the elements which are used to document XML modules and their high-level components. Section 22.3 Specification Elements discusses the elements which document XML elements and their attributes, element classes, and generic patterns or macros. Finally, section 22.7 Module for Documentation Elements gives an overview of the whole module.

22.1 Phrase Level Documentary Elements

22.1.1 Phrase Level Terms

In any kind of technical documentation, the following phrase-level elements may be found useful for marking up strings of text which need to be distinguished from the running text because they come from some formal language:
  • code contains literal code from some formal language, for example a programming language
    lang (formal language) a name identifying the formal language in which the code is expressed
  • ident (identifier) contains an identifier or name for an object of some kind in a formal language
Like other phrase-level elements used to indicate the semantics of a typographically distinct string, these are members of the model.emph class. They are available anywhere that running prose is permitted when the module defined by this chapter is included in a schema.
The <code> and <ident> elements are intended for use when citing brief passages in some formal language such as a programming language, as in the following example:
<p>If the variable <ident>z</ident> has a value of zero, a statement
such as <code>x=y/z</code> will usually cause a fatal error.</p>

If the cited phrase is a mathematical or chemical formula, the more specific <formula> element defined by the figures module (14.2 Formulæ and Mathematical Expressions) may be more appropriate.
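
As a small illustration of the lang attribute described above, the following sketch names the formal language from which a cited expression is taken. The value Python is purely illustrative and is an assumption of this example; the text above does not prescribe any particular list of language names.

```xml
<p>The expression <code lang="Python">x = y / z</code> raises
<ident>ZeroDivisionError</ident> when <ident>z</ident> is zero.</p>
```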

A further group of similar phrase-level elements is also defined for the special case of representing parts of an XML document:
  • att (attribute) contains the name of an attribute appearing within running text
  • gi (generic identifier) contains the name (generic identifier) of an element
  • tag contains the full text of a start-tag or end-tag, possibly including attribute specifications, but excluding the opening and closing markup delimiter characters
  • val (value) contains a single attribute value
These elements constitute the model.phrase.xml class, which is also a subclass of model.phrase. They are also available anywhere that running prose is permitted when the module defined by this chapter is included in a schema.
As an example of the recommended use of these elements, we quote from an imaginary TEI working paper:
<p>The <gi>gi</gi> element is used to tag
element names when they appear in the text; the
<gi>tag</gi> element however is used to show how a tag as
such might appear. So one might talk of an occurrence of the
<gi>blort</gi> element which had been tagged
<tag>blort type='runcible'</tag>. The
<att>type</att> attribute may take any name token as
value; the default value is <val>spqr</val>, in memory of
its creator.</p>
Within technical documentation, it is also often necessary to provide more extended examples of usage or to present passages of markup for discussion. The following special elements are provided for these purposes:
  • eg (example) contains any kind of illustrative example
  • egXML (example of XML) contains a single well-formed XML example demonstrating the use of some XML element or attribute

Like the <code> element, the <egXML> element is used to mark strings of formal code, or passages of XML markup. The <eg> element may be used to enclose any kind of example, which will typically be rendered as a distinct block, possibly using particular formatting conventions, when the document is processed. It is a specialised form of the more general <q> element provided by the TEI core module. In documents containing examples of XML markup, the <egXML> element should be used for preference, as further discussed below in 22.4.2 Exemplification of Components, since the content of this element can be checked for well-formedness.

These elements are members of the class att.xmlspace which provides the following attribute:
  • att.xmlspace groups TEI elements for which it is reasonable to specify whitespace management using the W3C-defined xml:space attribute.
    xml:space signals an intention that white space should be preserved by applications.

These elements are added to the class model.egLike when this module is included in a schema. That class is a part of the general model.inter class, thus permitting <eg> or <egXML> elements to appear either within or between paragraph-like elements.
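
A minimal sketch of how the xml:space attribute might be used on an <eg> element whose line breaks and indentation are significant. The code fragment inside the example is invented for illustration; preserve is one of the two values defined by the XML Recommendation for this attribute.

```xml
<eg xml:space="preserve">if (z != 0) {
    x = y / z;
}</eg>
```

An application honouring the attribute would be expected to keep the three lines and their indentation as written rather than normalizing the white space.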

22.1.2 Element and Attribute Descriptions

Within the body of a document using this module, the following elements may be used to reference parts of the specification elements discussed in section 22.3 Specification Elements, in particular the brief prose descriptions these provide for elements and attributes.
  • specList (specification list) marks where a list of descriptions is to be inserted into the prose documentation
  • specDesc/ (specification description) indicates that a description of the specified element or class should be included at this point within a document
TEI practice requires that a <specList> listing the elements under discussion introduce each subsection of a module's documentation. The source for the present section, for example, begins as follows:
<div3>
 <head>Element and attribute descriptions</head>
 <p>Within the body of a document using this module, the following
   elements may be used to reference parts of the specification elements
   discussed in section <ptr target="#TDcrystals"/>, in particular the
   brief prose descriptions these provide for elements and attributes.
 <specList>
   <specDesc key="specList"/>
   <specDesc key="specDesc"/>
  </specList>
 </p>
 <p>TEI practice requires that a <gi>specList</gi> listing the elements
   ...
 </p>
<!-- ... -->
</div3>

When formatting the <ptr> element in this example, an ODD processor might simply generate the section number and title of the section referred to, perhaps additionally inserting a link to the section. In a similar way, when processing the <specDesc> elements, an ODD processor must recover relevant details of the elements being specified (<specList> and <specDesc> in this case) from their associated declaration elements: typically, the details recovered will include a brief description of the element and its attributes. These, and other data, will be stored in a specification element elsewhere within the current document, or they may be supplied by the ODD processor in some other way, for example from a database. For this reason, the link to the required specification element is always made using a TEI-defined key rather than an XML IDREF value. The ODD processor uses this key as a means of accessing the specification element required. There is no requirement that this be performed using the XML ID/IDREF mechanism, but there is an assumption that the identifier be unique.

A <specDesc> generates in the documentation the identifier, and also the contents of the <desc> child of whatever specification element is indicated by its key attribute, as in the example above. Documentation for any attributes specified by the atts attribute will also be generated as an associated attribute list.
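
The following sketch shows how the atts attribute might be combined with key. It assumes that atts takes one or more attribute names separated by white space, and it reuses the <code> element and its lang attribute from section 22.1.1 as the thing being described.

```xml
<specList>
 <specDesc key="code" atts="lang"/>
 <specDesc key="ident"/>
</specList>
```

Under these assumptions, an ODD processor would output the description of <code> followed by an entry for its lang attribute, and then the description of <ident> with no attribute list.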

22.2 Modules and Schemas

As mentioned above, the primary purpose of this module is to facilitate the documentation of an XML schema derived from the TEI Guidelines. The following elements are provided for this purpose:
  • schemaSpec (schema specification) generates a TEI-conformant schema and documentation for it
  • moduleSpec (module specification) documents the structure, content, and purpose of a single module, i.e. a named and externally visible group of declarations
  • moduleRef (module reference) references a module which is to be incorporated into a schema
  • specGrp (specification group) contains any convenient grouping of specifications for use within the current module
  • specGrpRef/ (reference to a specification group) indicates the point at which the declarations contained by the <specGrp> referenced should be inserted
  • attRef/ (attribute pointer) points to the definition of an attribute or group of attributes
A module is a convenient way of grouping together element and other declarations and associating an externally-visible name with the group. A specification group performs essentially the same function, but the resulting group is not accessible outside the scope of the ODD document in which it is defined, whereas a module can be accessed by name from any TEI schema.

Modules, elements, and their attributes, element classes, and patterns are all individually documented using further elements described in section 22.3 Specification Elements below; part of that specification includes the name of a module to which the component belongs. An ODD processor generating XML DTD or schema fragments from a document marked up according to the recommendations of this chapter will generate such fragments for each <moduleSpec> element found. For example, the chapter documenting the TEI module for names and dates contains a module specification like the following:
<moduleSpec xml:id="XDND" ident="namesdates">
 <altIdent type="FPI">Names and Dates</altIdent>
 <desc>Additional elements for names and dates</desc>
</moduleSpec>
together with specifications for all the elements, classes, and patterns which make up that module, expressed using <elementSpec>, <classSpec>, or <macroSpec> elements as appropriate. (These elements are discussed in section 22.3 Specification Elements below.) Each of those specifications carries a module attribute, the value of which is namesdates. An ODD processor encountering the <moduleSpec> element above can thus generate a schema fragment for the TEI namesdates module that includes declarations for all the elements (etc.) which reference it.
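
To make the link between a specification and its module concrete, here is an abbreviated sketch of an element specification that references the namesdates module through its module attribute. The choice of <persName> and the wording of its description are given from memory for illustration only; the remainder of the specification is elided.

```xml
<elementSpec module="namesdates" ident="persName">
 <desc>contains a proper noun or proper-noun phrase referring to a person.</desc>
<!-- ... -->
</elementSpec>
```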

In most realistic applications, it will be desirable to combine more than one module together to form a complete schema. A schema consists of references to one or more modules or specification groups, and may also contain explicit declarations or redeclarations of elements (see further 22.5 Building a Schema). Any combination of modules can be used to create a schema: the distinction between base and additional tagsets in earlier versions of the TEI scheme has not been carried forward into P5.

A schema can combine references to TEI modules with references to other (non-TEI) modules using different namespaces, for example to include mathematical markup expressed using MathML in a TEI document. By default, the effect of combining modules is to allow all of the components declared by the constituent modules to coexist (where this is syntactically possible: where it is not, for example because of name clashes, a schema cannot be generated). It is also possible to over-ride declarations contained by a module, as further discussed in section 22.5 Building a Schema.
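
A schema combining several modules might be sketched as follows. The schema name myTEI is hypothetical, and the use of the ident and start attributes on <schemaSpec> and of the key attribute on <moduleRef> is an assumption here, since those attributes are described only in section 22.5 Building a Schema and not in the present section.

```xml
<schemaSpec ident="myTEI" start="TEI">
 <moduleRef key="tei"/>
 <moduleRef key="header"/>
 <moduleRef key="core"/>
 <moduleRef key="textstructure"/>
 <moduleRef key="namesdates"/>
</schemaSpec>
```

Each <moduleRef> names one module by its identifier, so the namesdates reference here corresponds to the <moduleSpec> shown earlier in this section.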

It is often convenient to describe and operate on sets of declarations smaller than the whole, and to document them in a specific order: such collections are called specGrps (specification groups). Individual <specGrp> elements are identified using the global xml:id attribute, and may then be referenced from any point in an ODD document using the <specGrpRef> element. This is useful if, for example, it is desired to describe particular groups of elements in a specific sequence. Note however that the order in which element declarations appear within the schema code generated from a <moduleSpec> element is not in general affected by the order of declarations within a <specGrp>.

An ODD processor will generate a piece of schema code corresponding with the declarations contained by a <specGrp> element in the documentation being output, and a cross-reference to such a piece of schema code when processing a <specGrpRef>. For example, if the input text reads
<p>This module contains three red elements:
<specGrp xml:id="RED">
  <elementSpec ident="beetroot">
<!-- ... -->
  </elementSpec>
  <elementSpec ident="east">
<!-- ... -->
  </elementSpec>
  <elementSpec ident="rose">
<!-- ... -->
  </elementSpec>
 </specGrp>
and two blue ones:
<specGrp xml:id="BLUE">
  <elementSpec ident="sky">
<!-- ... -->
  </elementSpec>
  <elementSpec ident="bayou">
<!-- ... -->
  </elementSpec>
 </specGrp>
</p>
then the output documentation will replace the two <specGrp> elements above with a representation of the schema code declaring the elements <beetroot>, <east>, and <rose> and that declaring the elements <sky> and <bayou> respectively. Similarly, if the input text contains elsewhere a passage such as
<div>
 <head>An overview of the imaginary module</head>
 <p>The imaginary module contains declarations for coloured things:
 <specGrpRef target="#RED"/>
  <specGrpRef target="#BLUE"/>
 </p>
</div>
then the <specGrpRef> elements may be replaced by an appropriate piece of reference text such as ‘The RED elements were declared in section 4.2 above’, or even by a copy of the relevant declarations. As stated above, the order of declarations within the imaginary module described above will not be affected in any way. Indeed, it is possible that the imaginary module will contain declarations not present in any specification group, or that the specification groups will refer to elements that come from different modules. Specification groups are always local to the document in which they are defined, and cannot be referenced externally (unlike modules).

22.3 Specification Elements

The following elements are used to specify elements, classes, and patterns for inclusion in a given module:
  • elementSpec (element specification) documents the structure, content, and purpose of a single element type
  • classSpec (class specification) contains reference information for a TEI element class, that is, a group of elements which appear together in content models, or which share some common attributes, or both
    generate indicates which alternation and sequence types are to be created for a model class; by default, all variations are provided
  • macroSpec (macro specification) documents the function and implementation of a pattern

Unlike most elements in the TEI scheme, each of these elements has a fairly rigid internal structure consisting of a large number of child elements which are always presented in the same order. For this reason, we refer to them metaphorically as ‘crystals’. Furthermore, since these elements all describe markup objects in broadly similar ways, they have several child elements in common. In the remainder of this chapter, we discuss first the elements which are common to all the specification elements, and then those which are specific to a particular type.

Specification elements may appear at any point in an ODD document, both between and within paragraphs as well as inside a <specGrp> element, but the specification element for any particular component may only appear once (except in the case where a modification is being defined; see further 22.5 Building a Schema). The order in which they appear will not affect the order in which they are presented within any schema module generated from the document. In documentation mode, however, an ODD processor will output the schema declarations corresponding with a specification element at the point in the text where they are encountered, provided that they are contained by a <specGrp> element, as discussed in the previous section. An ODD processor will also associate all declarations found with the nominated module, thus including them within the schema code generated for that module, and it will also generate a full reference description for the object concerned in a catalogue of markup objects. These latter two actions always occur irrespective of whether or not the declaration is included in a <specGrp>.

22.4 Common Elements

This section discusses the child elements common to all of the specification elements. These child elements are used to specify the naming, description, exemplification, and classification of the specification elements.

22.4.1 Description of Components

  • remarks contains any commentary or discussion about the usage of an element, attribute, class, or entity not otherwise documented within the containing element
  • listRef (list of references) supplies a list of significant references to places, in the current document or elsewhere, where the element in question is discussed
One or more <desc> elements defined by the core module may be used to provide a brief characterization of the intended function of the element, class, value etc. being documented, as in the following example:
<elementSpec module="drama" usage="opt" ident="actor">
 <desc>Name of an actor appearing within a cast list.</desc>
 <desc xml:lang="ja"> 登場人物リスト中にある役者名を示す.</desc>
 <desc xml:lang="it">nome di un attore che appare nella lista dei personaggi.</desc>
<!-- ... -->
</elementSpec>
The <remarks> element contains any additional commentary about how the item concerned may be used, details of implementation-related issues, suggestions for other ways of treating related information etc., as in the following example:
<elementSpec module="core" ident="foreign">
<!--... -->
 <remarks>
  <p>This element is intended for use only where no other element
     is available to mark the phrase or words concerned. The global
  <att>xml:lang</att> attribute should be used in preference to this element
     where it is intended to mark the language of the whole of some text
     element.</p>
  <p>The <gi>distinct</gi> element may be used to identify phrases
     belonging to sublanguages or registers not generally regarded as true
     languages.</p>
 </remarks>
<!--... -->
</elementSpec>
A specification element will usually conclude with a list of references, each tagged using the standard <ptr> element, and grouped together into a <listRef> element: in the case of the <foreign> element discussed above, the list is as follows:
<listRef>
 <ptr target="#COHQHF"/>
</listRef>
where the value COHQHF is the identifier of the section in the Guidelines where this element is fully documented.

22.4.2 Exemplification of Components

  • exemplum contains a single example demonstrating the use of an element, together with optional paragraphs of commentary
  • eg (example) contains any kind of illustrative example
  • egXML (example of XML) contains a single well-formed XML example demonstrating the use of some XML element or attribute

The <exemplum> element is used to combine a single illustrative example with an optional paragraph of commentary following or preceding it. The illustrative example itself may be marked up using either the <eg> or the <egXML> element.
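
A minimal sketch of an <exemplum> that pairs a paragraph of commentary with one example. The wording of the commentary is invented; the example itself reuses the <term> illustration given later in this section.

```xml
<exemplum>
 <p>The <gi>term</gi> element may be used to mark any technical term:</p>
 <egXML xmlns="http://www.tei-c.org/ns/Examples">This <term>recursion</term> is
   giving me a headache.</egXML>
</exemplum>
```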

If an example contains XML markup, it should be marked up using the <egXML> element. In such a case, it will clearly be necessary to distinguish the markup within the example from the markup of the document itself. In an XML schema environment, this is easily done by using a different namespace for the <egXML> element. For example:
<p>The <gi>term</gi> element may be used to mark any technical term, thus : <egXML xmlns="http://www.tei-c.org/ns/Examples"> This <term>recursion</term> is giving me a headache.</egXML></p>
Alternatively, the XML tagging within an example may be ‘escaped’, either by using entity references, or by wrapping the whole example in a CDATA marked section:
<p>The <gi>term</gi> element may be used to mark any technical term, thus : <egXML xmlns="http://www.tei-c.org/ns/Examples"> This &lt;term&gt;recursion&lt;/term&gt; is giving me a headache.</egXML></p>
or, equivalently:
<p>The <gi>term</gi> element may be used to mark any technical term, thus : <egXML xmlns="http://www.tei-c.org/ns/Examples"><![CDATA[ This <term>recursion</term> is giving me a headache.]]></egXML></p>
However, escaping the markup in this way will make it impossible to validate, and should therefore generally be avoided.

If the XML contained in an example is not well-formed then it must either be enclosed in a CDATA marked section, or ‘escaped’ as above: this applies whether the <eg> or <egXML> is used. If it is well-formed but not valid, then it should be enclosed in a CDATA marked section within an <egXML>.

An <egXML> element should not be used to tag non-XML examples: the general purpose <eg> or <q> elements should be used for such purposes.

22.4.3 Classification of Components

In the TEI scheme elements are assigned to one or more classes, which may themselves have subclasses. The following elements are used to indicate class membership:
  • classes specifies all the classes of which the documented element or class is a member or subclass
  • memberOf specifies class membership of the parent element or class
    key specifies the identifier of a class of which the documented element or class is a member or subclass

The <classes> element appears within either the <elementSpec> or <classSpec> element. It specifies the classes of which the element or class concerned is a member by means of one or more <memberOf> child elements. Each such element references a class by means of its key attribute. Classes themselves are defined by the <classSpec> element described in section 22.4.6 Element Classes below.

For example, to show that the element <gi> is a member of the class model.phrase.xml, the <elementSpec> which documents this element contains the following <classes> element:
<classes>
 <memberOf key="model.phrase.xml"/>
</classes>
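
The same mechanism covers membership of several classes at once and the declaration of one class as a subclass of another. The two fragments below are sketches based on statements made earlier in this chapter: that <eg> belongs to both model.egLike and att.xmlspace, and that model.phrase.xml is a subclass of model.phrase. The type and module attributes on <classSpec>, the module name tagdocs, and the wording of the description are assumptions of this example.

```xml
<!-- inside the elementSpec for eg: one model class, one attribute class -->
<classes>
 <memberOf key="model.egLike"/>
 <memberOf key="att.xmlspace"/>
</classes>

<!-- a class declaring itself a subclass of another class -->
<classSpec ident="model.phrase.xml" type="model" module="tagdocs">
 <desc>groups phrase-level elements used to describe XML constructs.</desc>
 <classes>
  <memberOf key="model.phrase"/>
 </classes>
</classSpec>
```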

22.4.4 Element Specifications

The <elementSpec> element is used to document an element type, together with its associated attributes. In addition to the elements listed above, it may contain the following subcomponents:
  • content (schema declaration) contains the text of a declaration for the schema documented
  • attList contains documentation for all the attributes associated with this element, as a series of attDef elements
    org (organization) specifies whether all the attributes in the list are available (org="group") or only one of them (org="choice")

The content of the element <content> may be expressed in one of two ways. It may use a schema language of some kind, as defined by a pattern called macro.schemaPattern, which is provided by the module defined in this chapter. Alternatively, the legal content for an element may be fully specified using the <valList> element, described in 22.4.5 Attribute List Specification below.

In the case of the TEI Guidelines, element content models are defined using RELAX NG patterns, but the user may over-ride this by redefining this pattern.

Here is a very simple example:
<content>
 <rng:text/>
</content>
This content model uses the RELAX NG namespace, and will be copied unchanged to the output when RELAX NG schemas are being generated. When an XML DTD is being generated, an equivalent declaration (in this case (#PCDATA)) will be output.
Here is a more complex example:
<content>
 <rng:group>
  <rng:ref name="fileDesc"/>
  <rng:zeroOrMore>
   <rng:ref name="model.headerPart"/>
  </rng:zeroOrMore>
  <rng:optional>
   <rng:ref name="revisionDesc"/>
  </rng:optional>
 </rng:group>
</content>
This is the content model for the <teiHeader> element, expressed in the RELAX NG syntax, which again is copied unchanged to the output during schema generation. The equivalent DTD notation generated from this is (fileDesc, (%model.headerPart;)*, revisionDesc?).

The RELAX NG language does not formally distinguish element names, attribute names, class names, or macro names: all names are patterns which are handled in the same way, as the above example shows. Within the TEI scheme, however, different naming conventions are used to distinguish amongst the objects being named. Unqualified names (fileDesc, revisionDesc) are always element names. Names prefixed with model. or att. (e.g. model.headerPart) are always class names. In DTD language, classes are represented by parameter entities (%model.headerPart; in the above example); see further 1 The TEI Infrastructure.
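
As a further sketch of the naming conventions just described, the following hypothetical content model permits text freely mixed with members of a class. It is not taken from any particular TEI element specification.

```xml
<content>
 <rng:zeroOrMore>
  <rng:choice>
   <rng:text/>
   <rng:ref name="model.phrase"/>
  </rng:choice>
 </rng:zeroOrMore>
</content>
```

By analogy with the examples above, the DTD notation generated from this would be expected to resemble (#PCDATA | %model.phrase;)*, with the class reference again rendered as a parameter entity.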

22.4.5 Attribute List Specification

The <attList> element is used to document information about a collection of attributes, either within an <elementSpec>, or within a <classSpec>. An attribute list can be organized either as a group of attribute definitions, all of which are understood to be available, or as a choice of attribute definitions, of which only one is understood to be available. An attribute list may also contain nested attribute lists.

The <attDef> element is used to document a single attribute, using an appropriate selection from the common elements already mentioned and the following which are specific to attributes:
  • attDef (attribute definition) contains the definition of a single attribute
    usage specifies the optionality of an attribute or element
  • datatype specifies the declared value for an attribute, by referring to any datatype defined by the chosen schema language
  • defaultVal (default value) specifies the default declared value for an attribute
  • valDesc (value description) specifies any semantic or syntactic constraint on the value that an attribute may take, additional to the information carried by the datatype element
  • valList (value list) contains one or more <valItem> elements defining possible values for an attribute
  • valItem contains a single value and gloss pair for an attribute

The <attList> within an <elementSpec> is used to specify only the attributes which are specific to that particular element. Instances of the element may carry other attributes which are declared by the classes of which the element is a member. These extra attributes, which are shared by other elements, or by all elements, are specified by an <attList> contained within a <classSpec> element, as described in section 22.4.6 Element Classes below.
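
The distinction between a group and a choice of attribute definitions might be sketched as follows. The attribute names and descriptions are hypothetical, and the example assumes that the org attribute takes the value choice to signal that only one of the listed attributes may be supplied on any given instance.

```xml
<attList org="choice">
 <attDef ident="target" usage="opt">
  <desc>points to the item concerned by means of a URI.</desc>
 </attDef>
 <attDef ident="key" usage="opt">
  <desc>names the item concerned by means of a locally defined key.</desc>
 </attDef>
</attList>
```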

22.4.5.1 Datatypes
The <datatype> element is used to state what kind of value an attribute may have, using whatever facilities are provided by the underlying schema language. For the TEI scheme, expressed in RELAX NG, elements from the RELAX NG namespace may be used, for example
<datatype>
 <rng:text/>
</datatype>
permits any string of Unicode characters not containing markup, and is thus the equivalent of CDATA in DTD language.
The RELAX NG language also provides support for a number of primitive datatypes which may be specified here, using the <rng:data> element: thus one may write
<datatype>
 <rng:data type="boolean"/>
</datatype>
to specify that an element or attribute's contents should conform to the W3C definition for Boolean.
Although only one child element may be given, this might be a selector such as rng:choice to indicate multiple possibilities:
<datatype>
 <rng:choice>
  <rng:data type="date"/>
  <rng:data type="float"/>
 </rng:choice>
</datatype>
which would permit either a date or a real number. In fact, the child element might be a rng:list element to indicate that a sequence of values is required, a rng:param element to specify a regular expression, or even a list of explicit rng:values. Such usages are permitted by the scheme documented here, but are not recommended when it is desired to remain independent of a particular schema language, since the full generality of one schema language cannot readily be converted to that of another. In the TEI abstract model, datatyping should preferably be carried out either by explicit enumeration of permitted values (using the TEI-specific <valList> element described below), or by definition of an explicit pattern, using the TEI-specific <macroSpec> element discussed further in section 22.4.7 Pattern Documentation.
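For completeness, the discouraged use of rng:param to supply a regular expression might look like the following sketch. It assumes that the W3C Schema datatype library is in use, so that the token type and its pattern facet are available; the pattern itself (two capital letters followed by four digits) is invented for illustration.

```xml
<datatype>
 <rng:data type="token">
  <rng:param name="pattern">[A-Z]{2}[0-9]{4}</rng:param>
 </rng:data>
</datatype>
```

A declaration of this kind is tied to RELAX NG and its datatype library, which is the reason the preceding paragraph recommends a <valList> or a <macroSpec> instead.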
22.4.5.2 Value Specification
The <valDesc> element may be used to describe constraints on data content in an informal way: for example
<valDesc>must point to another <gi>align</gi>
element logically preceding this
one.</valDesc>
<valDesc>Values should be Library of Congress subject
headings.</valDesc>
<valDesc>A bookseller's surname,
taken from the list in <title>Pollard and Redgrave</title>
</valDesc>

As noted above, the <datatype> element constrains the possible values for an attribute. The <valDesc> element can be used to describe further constraints. For example, to specify that an attribute age can take positive integer values less than 100, the datatype data.numeric might be used in combination with a <valDesc> such as ‘values must be positive integers less than 100’.
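
The age example just described might be sketched as the following attribute definition. The description is invented, and the example assumes that the TEI datatype data.numeric is referenced as a named pattern by means of rng:ref.

```xml
<attDef ident="age" usage="opt">
 <desc>gives the age of the person in years.</desc>
 <datatype>
  <rng:ref name="data.numeric"/>
 </datatype>
 <valDesc>values must be positive integers less than 100</valDesc>
</attDef>
```

Here the datatype provides the formal constraint, while the additional restriction to positive integers below 100 is stated only informally in the <valDesc>.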

More usually, however, where constraints on values are explicitly enumerated, the <valList> element is used, as in the following example:
<valList type="closed">
 <valItem ident="req">
  <gloss>required</gloss>
 </valItem>
 <valItem ident="mwa">
  <gloss>mandatory when applicable</gloss>
 </valItem>
 <valItem ident="rec">
  <gloss>recommended</gloss>
 </valItem>
 <valItem ident="rwa">
  <gloss>recommended when applicable</gloss>
 </valItem>
 <valItem ident="opt">
  <gloss>optional</gloss>
 </valItem>
</valList>
Since this value list specifies that it is of type closed, only the values enumerated and glossed above are legal, and an ODD processor will typically enforce these constraints in the schema fragment generated.
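As an illustration of how such enforcement might work, the following minimal Python sketch turns the identifiers of a closed value list into a RELAX NG choice of literal values. The function name closed_val_list_to_rng is hypothetical, and the output is only one plausible rendering; an actual ODD processor may generate something different.

```python
from xml.sax.saxutils import escape


def closed_val_list_to_rng(idents):
    """Return a RELAX NG <choice> of <value> patterns, one per permitted identifier.

    Hypothetical helper: it only illustrates the idea of enforcing a closed
    value list and is not taken from any existing ODD processor.
    """
    lines = ["<rng:choice>"]
    for ident in idents:
        # Escape markup-significant characters so the output stays well-formed.
        lines.append(f" <rng:value>{escape(ident)}</rng:value>")
    lines.append("</rng:choice>")
    return "\n".join(lines)


# The identifiers of the closed value list shown above.
print(closed_val_list_to_rng(["req", "mwa", "rec", "rwa", "opt"]))
```

For an open or semi-open list, a processor would presumably not emit such a restriction, since values other than those listed remain legal.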

The <valList> element is also used to provide illustrative examples of the kinds of values expected. In such cases the type attribute will have the value open and the datatype will usually be data.enumerated.

Note that the <gloss> element is needed to explain the significance of the identifier for an item only when this is not apparent, for example because it is abbreviated, as in the above example. It should not be used to provide a full description of the intended meaning (this is the function of the <desc> element), nor to comment on equivalent values in other schemes (this is the purpose of the <equiv> element).

22.4.5.3 Examples
The following <attList> demonstrates some of the possibilities; for more detailed examples, consult the tagged version of the reference material in these Guidelines.
<attList>
 <attDef ident="type">
  <desc>describes the form of the list.</desc>
  <datatype>
   <rng:text/>
  </datatype>
  <defaultVal>simple</defaultVal>
  <valList type="semi">
   <valItem ident="ordered">
    <desc>list items are numbered or lettered. </desc>
   </valItem>
   <valItem ident="bulleted">
    <desc>list items are marked with a bullet or other
         typographic device. </desc>
   </valItem>
   <valItem ident="simple">
    <desc>list items are not numbered or bulleted.</desc>
   </valItem>
   <valItem ident="gloss">
    <desc>each list item glosses some term or
         concept, which is given by a label element preceding
         the list item.</desc>
   </valItem>
  </valList>
  <remarks>
   <p>The formal syntax of the element declarations allows
   <gi>label</gi> tags to be omitted from lists tagged <tag>list
         type="gloss"</tag>; this is however a semantic error.</p>
  </remarks>
 </attDef>
</attList>
In the following example, the org attribute is used to indicate that instances of the element concerned may bear either a bar attribute or a baz attribute, but not both. The bax attribute is always available:
<attList>
 <attDef ident="bax">
<!-- ... -->
 </attDef>
 <attList org="choice">
  <attDef ident="bar">
<!-- ... -->
  </attDef>
  <attDef ident="baz">
<!-- ... -->
  </attDef>
 </attList>
</attList>

22.4.6 Element Classes

The element <classSpec> is used to document an element class or ‘class’, as defined in section 1.3 The TEI Class System. It has the following components, additional to those already mentioned:
  • classSpec (class specification) contains reference information for a TEI element class, that is, a group of elements which appear together in content models, or which share some attributes, or both
    type indicates whether this is a model class or an attribute class
  • attList contains documentation for the attributes associated with the element in question, as a series of attDef elements
A class specification does not list all of its members. Instead, its members declare that they belong to it by means of a <classes> element contained within the relevant <elementSpec>. This will contain a <memberOf> element for each class of which the relevant element is a member, supplying the name of the relevant class. For example, the <elementSpec> for the element <hi> contains the following:
<classes>
 <memberOf key="model.hiLike"/>
</classes>
This indicates that the <hi> element is a member of the class with identifier model.hiLike. The <classSpec> element that documents this class contains the following declarations:
<classSpec type="model" ident="model.hiLike">
 <desc>groups phrase-level elements related to highlighting that have
   no specific semantics </desc>
 <classes>
  <memberOf key="model.highlighted"/>
 </classes>
</classSpec>
which indicate that the class model.hiLike is actually a member (or subclass) of the class model.highlighted.
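Because classes may themselves be members of other classes, a processor will often need to follow the chain of <memberOf> declarations. The sketch below assumes that membership is treated as transitive and represents the declarations as a simple Python dictionary; both the data structure and the function name superclasses are assumptions made for illustration only.

```python
def superclasses(name, member_of):
    """Return every class that `name` belongs to, directly or via a subclass chain.

    `member_of` maps an element or class identifier to the list of classes
    named by its <memberOf key="..."/> children (hypothetical representation).
    """
    found = []
    queue = list(member_of.get(name, []))
    while queue:
        cls = queue.pop(0)
        if cls not in found:  # also guards against circular declarations
            found.append(cls)
            queue.extend(member_of.get(cls, []))
    return found


# The two declarations quoted above: hi -> model.hiLike -> model.highlighted
member_of = {
    "hi": ["model.hiLike"],
    "model.hiLike": ["model.highlighted"],
}
print(superclasses("hi", member_of))
```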

The attribute type is used to distinguish between ‘model’ and ‘attribute’ classes. In the case of attribute classes, the attributes provided by membership in the class are documented by an <attList> element contained within the <classSpec>. In the case of model classes, no further information is needed to define the class beyond its description, its identifier, and optionally any classes of which it is a member.

When a model class is referenced in the content model of an element (i.e. in the <content> of an <elementSpec>), its meaning will depend on the name used to reference the class.

If the reference simply takes the form of the class name, it is interpreted to mean an alternation of all the current members of the class. For example, suppose that the members of the class model.hiLike are elements <hi>, <it>, and <bo>. Then a content model such as
<content>
 <rng:zeroOrMore>
  <rng:ref name="model.hiLike"/>
 </rng:zeroOrMore>
</content>
would be equivalent to the explicit content model:
<content>
 <rng:zeroOrMore>
  <rng:choice>
   <rng:ref name="hi"/>
   <rng:ref name="it"/>
   <rng:ref name="bo"/>
  </rng:choice>
 </rng:zeroOrMore>
</content>
(or, to use RELAX NG compact syntax, (hi|it|bo)*). However, a content model referencing the class as model.hiLike_sequence would be equivalent to the following explicit content model:
<content>
 <rng:zeroOrMore>
  <rng:ref name="hi"/>
  <rng:ref name="it"/>
  <rng:ref name="bo"/>
 </rng:zeroOrMore>
</content>
(or, in RELAX NG compact syntax, (hi,it,bo)*).
The following suffixes, appended with an underscore, can be given to a class name when it is referenced in a content model:
  • alternation: members of the class are alternatives
  • sequence: members of the class are to be provided in sequence
  • sequenceOptional: members of the class may be provided, in sequence, but are optional
  • sequenceOptionalRepeatable: members of the class may be provided one or more times, in sequence, but are optional
  • sequenceRepeatable: members of the class must be provided one or more times, in sequence
Thus a reference to model.hiLike_sequenceOptional in a content model would be equivalent to:
<rng:zeroOrMore>
 <rng:optional>
  <rng:ref name="hi"/>
 </rng:optional>
 <rng:optional>
  <rng:ref name="it"/>
 </rng:optional>
 <rng:optional>
  <rng:ref name="bo"/>
 </rng:optional>
</rng:zeroOrMore>
A reference to model.hiLike_sequenceRepeatable would however be equivalent to:
<rng:zeroOrMore>
 <rng:oneOrMore>
  <rng:ref name="hi"/>
 </rng:oneOrMore>
 <rng:oneOrMore>
  <rng:ref name="it"/>
 </rng:oneOrMore>
 <rng:oneOrMore>
  <rng:ref name="bo"/>
 </rng:oneOrMore>
</rng:zeroOrMore>
and a reference to model.hiLike_sequenceOptionalRepeatable would be equivalent to:
<rng:zeroOrMore>
 <rng:zeroOrMore>
  <rng:ref name="hi"/>
 </rng:zeroOrMore>
 <rng:zeroOrMore>
  <rng:ref name="it"/>
 </rng:zeroOrMore>
 <rng:zeroOrMore>
  <rng:ref name="bo"/>
 </rng:zeroOrMore>
</rng:zeroOrMore>

The ‘sequence’ in which members of a class appear in a content model when one of the sequence options is used is that in which the elements are declared.
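The expansions described above can be summarized in a short Python sketch, which maps a class reference to its equivalent in RELAX NG compact syntax. It covers only the expansion of the reference itself, not any surrounding repetition in the content model. The function expand_class_ref, the table SUFFIXES, and the dictionary of class members are illustrative assumptions, not part of any particular ODD processor.

```python
# Each suffix maps to (separator between members, occurrence indicator per member).
SUFFIXES = {
    "alternation": ("|", ""),
    "sequence": (",", ""),
    "sequenceOptional": (",", "?"),
    "sequenceRepeatable": (",", "+"),
    "sequenceOptionalRepeatable": (",", "*"),
}


def expand_class_ref(ref, classes):
    """Expand a reference such as 'model.hiLike_sequence' into compact syntax.

    `classes` maps a class name to its members in declaration order.
    A bare class name is treated as an alternation of its members.
    """
    if ref in classes:
        name, suffix = ref, "alternation"
    else:
        name, _, suffix = ref.rpartition("_")
    if name not in classes or suffix not in SUFFIXES:
        raise ValueError(f"unknown class reference: {ref}")
    separator, occurrence = SUFFIXES[suffix]
    return "(" + separator.join(member + occurrence for member in classes[name]) + ")"


classes = {"model.hiLike": ["hi", "it", "bo"]}
for ref in ("model.hiLike", "model.hiLike_sequence", "model.hiLike_sequenceOptional"):
    print(ref, "=>", expand_class_ref(ref, classes))
```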

In principle, all these possibilities are available to any element making reference to any class. The <classSpec> element defining the class may however limit the possibilities by means of its generate attribute, which can be used to say that this particular model class may only be referenced in a content model with the suffixes it specifies. For example, if the <classSpec> for model.hiLike took the form <classSpec ident="model.hiLike" generate="sequence sequenceOptional">, then a content model referring to (say) model.hiLike_sequenceRepeatable would be regarded as invalid by an ODD processor.
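A check of this kind might be sketched as follows. The function reference_allowed is hypothetical, and the sketch assumes that a bare class name counts as the alternation form and is therefore rejected when the restriction does not list alternation; the text above does not settle that point.

```python
KNOWN_SUFFIXES = {
    "alternation",
    "sequence",
    "sequenceOptional",
    "sequenceRepeatable",
    "sequenceOptionalRepeatable",
}


def reference_allowed(ref, class_name, generate=None):
    """Return True if `ref` refers to `class_name` in a permitted form.

    `generate` is the whitespace-separated list of permitted suffixes taken
    from the class specification, or None when no restriction is declared.
    """
    if ref == class_name:
        suffix = "alternation"  # assumption: bare name means alternation
    elif ref.startswith(class_name + "_"):
        suffix = ref[len(class_name) + 1:]
    else:
        return False
    if suffix not in KNOWN_SUFFIXES:
        return False
    if generate is None:
        return True
    return suffix in generate.split()


restriction = "sequence sequenceOptional"
print(reference_allowed("model.hiLike_sequence", "model.hiLike", restriction))
print(reference_allowed("model.hiLike_sequenceRepeatable", "model.hiLike", restriction))
```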

When a <classSpec> contains an <attList> element, all the members of that class inherit the attributes specified by it. For example, the class att.interpLike defines a small set of attributes common to all elements which are members of that class: those attributes are listed by the <attList> element contained by the <classSpec> for att.interpLike. When processing the documentation elements for elements which are members of that class, an ODD processor is required to extend the <attList> (or equivalent) for such elements to include any attributes defined by the <classSpec> elements concerned. There is a single global attribute class, att.global, the membership of which may be expanded by some modules.
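The extension of an element's attribute list with inherited attributes might be sketched as below. The attribute names attrA and attrB are placeholders rather than the real members of att.interpLike, and the rule that an element's own definition takes precedence over an inherited one of the same name is an assumption of this sketch.

```python
def effective_attributes(own, classes, class_attributes):
    """Return an element's attributes: its own first, then those inherited.

    `own` lists the attributes defined directly on the element, `classes`
    lists the attribute classes it belongs to, and `class_attributes` maps
    each class to the attributes its <attList> defines. All three are
    hypothetical stand-ins for data read from the specification elements.
    """
    result = list(own)
    for cls in classes:
        for attribute in class_attributes.get(cls, []):
            if attribute not in result:  # assumption: first definition wins
                result.append(attribute)
    return result


class_attributes = {
    "att.global": ["n"],                    # placeholder content
    "att.interpLike": ["attrA", "attrB"],   # placeholder names
}
print(effective_attributes(["value"], ["att.global", "att.interpLike"], class_attributes))
```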

22.4.7 Pattern Documentation

The <macroSpec> element is used to document predefined strings or patterns not otherwise documented by the elements described in this chapter. Its chief uses are to provide systematic documentation of the parameter entities used within TEI DTD fragments and to describe common content models, but it may be used for any purpose. It has the following components additional to those already introduced:
  • macroSpec (macro specification) documents the function and application of a pattern
    type indicates which type of entity should be generated when an ODD processor is generating a module using SGML syntax
  • remarks contains any commentary or discussion about the usage of elements, attributes, classes, or entities not otherwise documented within the containing element
  • stringVal contains the expansion assigned to the entity documented by a <macroSpec> element, enclosed in quotation marks

22.5 Building a Schema

The specification elements, and several of their children, are all members of the att.identified class, from which they inherit the following attributes:
  • att.identified elements which can be referenced by means of the key attribute
    ident supplies the identifier used to refer to the element
    predeclare declares whether the class should be treated as global and therefore defined in the core module
    module indicates the name of the module in which the object is to be defined
    mode indicates the effect of the declaration on the module from which it originates

These attributes are used by an ODD processor to determine how declarations are to be combined to form a schema or DTD, as further discussed in this section.

As noted above, a TEI schema is defined by a <schemaSpec> element containing an arbitrary mixture of explicit declarations for objects (i.e. elements, classes, patterns, or macro specifications) and references to other objects containing such declarations (i.e. references to specification groups, or to modules). A major purpose of this mechanism is to simplify the process of defining user customizations, by providing a formal method for the user to combine new declarations with existing ones, or to modify particular parts of existing declarations.

In the simplest case, a user-defined schema might simply combine all the declarations from two nominated modules:
<schemaSpec ident="example">
 <moduleRef key="teistructure"/>
 <moduleRef key="linking"/>
</schemaSpec>
An ODD processor, given such a document, would combine the declarations which belong to the named modules, and deliver the result as a schema of the requested type. It might also generate documentation for all and only the elements declared by those modules.
A schema might also include declarations for new elements, as in the following example:
<schemaSpec ident="example">
 <moduleRef key="teiheader"/>
 <moduleRef key="verse"/>
 <elementSpec ident="soundClip">
  <classes>
   <memberOf key="model.pPart.data"/>
  </classes>
 </elementSpec>
</schemaSpec>
A declaration for the element <soundClip>, which is not defined in the TEI scheme, will be added to the output schema. This element will also be added to the existing TEI class model.pPart.data, and will thus be available in TEI conformant documents.
A schema might also include re-declarations of existing elements, as in the following example:
<schemaSpec ident="example">
 <moduleRef key="teiheader"/>
 <moduleRef key="teistructure"/>
 <elementSpec ident="head" mode="change">
  <content>
   <rng:ref name="macro.xtext"/>
  </content>
 </elementSpec>
</schemaSpec>
The effect of this is to redefine the content model for the element <head> as plain text, by overriding the <content> child of the selected <elementSpec>. The attribute specification mode="change" has the effect of overriding only those child elements of the <elementSpec> which appear both in the original specification and in the new specification supplied above: <content> in this example. Note that if the value for mode were replace, the effect would be to replace all child elements of the original specification with the child elements of the new specification, and thus (in this example) to delete all of them except <content>.
A schema may not contain more than two declarations for any given component. The value of the mode attribute is used to determine exactly how the second declaration (and its constituents) should be combined with the first. The following table summarizes how a processor should resolve duplicate declarations; the term identifiable refers to those elements which can have a mode attribute:
mode value | existing declaration | effect
add | no | add new declaration to schema; process its children in add mode
add | yes | raise error
replace | no | raise error
replace | yes | retain existing declaration; process new children in replace mode; ignore existing children
change | no | raise error
change | yes | process identifiable children according to their modes; process unidentifiable children in replace mode; retain existing children where no replacement or change is provided
delete | no | raise error
delete | yes | ignore existing declaration and its children
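The rules in this table can be expressed as a small Python sketch. It is deliberately simplified: a schema is modelled as a dictionary mapping each identifier to a dictionary of its children, and in change mode every supplied child simply replaces the existing child of the same name, so the distinction between identifiable and unidentifiable children is not modelled. The function name apply_declaration and the data structure are assumptions made for illustration.

```python
def apply_declaration(schema, ident, mode, children=None):
    """Return a new schema after combining one declaration according to `mode`.

    Raises ValueError in the cases the table marks as errors.
    """
    result = {key: dict(value) for key, value in schema.items()}
    new_children = dict(children or {})
    exists = ident in result

    if mode not in ("add", "replace", "change", "delete"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "add":
        if exists:
            raise ValueError(f"{ident} is already declared")
        result[ident] = new_children
        return result
    if not exists:
        raise ValueError(f"{ident} has no existing declaration")
    if mode == "replace":
        result[ident] = new_children          # existing children are ignored
    elif mode == "change":
        result[ident].update(new_children)    # untouched children are retained
    else:  # delete
        del result[ident]
    return result


# Placeholder values stand in for the real children of the <head> specification.
schema = {"head": {"content": "original content model", "classes": "original classes"}}
print(apply_declaration(schema, "head", "change", {"content": "macro.xtext"}))
print(apply_declaration(schema, "head", "replace", {"content": "macro.xtext"}))
```

With these placeholder values, change mode keeps the original classes child while replace mode drops it, which mirrors the contrast drawn above for the <head> example.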

22.6 Combining TEI and Non-TEI Modules

In the simplest case, all that is needed to include a non-TEI module in a schema is to reference its RELAX NG source using the url attribute on <moduleRef>. The following specification, for example, creates a schema in which declarations from the non-TEI module svg11.rng (defining Scalable Vector Graphics) are included. To avoid any risk of name clashes, the schema specifies that all TEI patterns generated should be prefixed by the string "TEI_".
<schemaSpec prefix="TEI_" ident="testsvg" start="TEI svg">
 <moduleRef key="header"/>
 <moduleRef key="core"/>
 <moduleRef key="tei"/>
 <moduleRef key="textstructure"/>
 <moduleRef url="svg11.rng"/>
</schemaSpec>
This specification generates a single schema which might be used to validate either a TEI document (with the root element <TEI>), or an SVG document (with a root element <svg:svg>), but would not validate a TEI document containing <svg:svg> or other elements from the SVG language. For that to be possible, the <svg:svg> element must become a member of a TEI model class (1.3 The TEI Class System), so that it may be referenced by other TEI elements. To achieve this, we modify the last <moduleRef> in the above example as follows:
<moduleRef url="svg11.rng">
 <content>
  <rng:define name="TEI_model.graphicLike" combine="choice">
   <rng:ref name="svg"/>
  </rng:define>
 </content>
</moduleRef>

This states that when the declarations from the svg11.rng module are combined with those from the other modules, the declaration for the model class model.graphicLike in the TEI module should be extended to include the element <svg:svg> as an alternative. This has the effect that elements in the TEI scheme which define their content model in terms of that element class (notably <figure>) can now include it. A RELAX NG schema generated from such a specification can be used to validate documents in which the TEI <figure> element contains any valid SVG representation of a graphic, embedded within an <svg:svg> element.

22.7 Module for Documentation Elements

The module described in this chapter makes available the following components:
Module tagdocs: Documentation of TEI modules
The selection and combination of modules to form a TEI schema is described in 1.2 Defining a TEI Schema.

The elements described in this chapter are all members of one of three classes: model.oddDecl, model.oddRef, or model.phrase.xml, with the exceptions of <schemaSpec> (a member of model.divPart) and both <eg> and <egXML> (members of model.common and model.egLike). All of these classes are declared along with the other general TEI classes, in the basic structure module documented in 1 The TEI Infrastructure.

In addition, some elements are members of the att.identified class, which is documented in 22.5 Building a Schema above, and make use of the macro.schemaPattern pattern, which is documented in 22.4.4 Element Specifications above.


Note
79.
ODD is short for ‘One Document Does it all’, and was the name invented by the original TEI Editors for the predecessor of the system currently used for this purpose. See further Burnard and Sperberg-McQueen (1995) and Burnard and Rahtz (2004).




Copyright TEI Consortium 2007 Licensed under the GPL. Copying and redistribution is permitted and encouraged.
Version 1.0.1. Last updated on 3rd February 2008. This page generated on 2008-02-03T17:55:08Z