This document describes issues involved in creating an XML version of the SGML document type definition (DTD) created by the Text Encoding Initiative, and proposes solutions. It defines a TEI extensions file which incorporates those solutions, in order to allow experimentation.
The discussion of inclusion exceptions defines a method of rewriting SGML content models so as to achieve effects similar to those provided by inclusion exceptions. To make an SGML document type definition compatible with XML, inclusion exceptions must be eliminated. The simplest method of ensuring that this change does not invalidate existing documents is to modify the content model of every element which can occur as a descendant of any element with inclusion exceptions in its content model, in the manner described here. That will ensure that elements named in inclusion exceptions remain legal in all the locations where they are currently legal.
The methods of changing content models described in this paper are believed to preserve determinism (what ISO 8879 calls lack of ambiguity) and to simulate the effects of inclusion exceptions properly. At this point, however, no proof of either conjecture is offered.
The Extensible Markup Language (XML) defines a syntax for document type definitions similar to that provided by the Standard Generalized Markup Language (SGML), but more restrictive. In particular, XML allows neither inclusion nor exclusion exceptions, and prohibits the ampersand connector.
Modifying an existing SGML document type definition (DTD), such as the TEI DTD, to conform to XML thus involves:
- removing & connectors
- normalizing mixed-content models (#PCDATA must come first, the list of sub-elements must be flat, and the occurrence indicator must be a star)
This document describes in detail the changes necessary to perform these modifications on the TEI DTD. The changes take the form of TEI modifications files suitable for use as the entities TEI.extensions.ent and TEI.extensions.dtd.
The modifications have different degrees of difficulty. Some affect the technical content of the TEI DTD in serious ways, and therefore require review by the TEI's Technical Review Committee before being formally integrated into TEI P3, while others do not affect the technical content of the TEI at all, or affect it only in minor ways. Changes of this latter type may be regarded as corrections of obvious simple errors, and may be performed by the editors under their authority to correct corrigible errors in the text of the Guidelines. (The concept of corrigible error is defined in document TEI ED W46 (?); in brief, a corrigible error is one which both editors agree is an error, which has an obvious fix, and the fix for which will not affect any existing data.) Each change proposed in this paper is identified as either a correction to a corrigible error, which the editors expect to fix in the course of preparing a revised and corrected reprint of TEI P3, or else a substantive change requiring review by the Technical Review Committee.
Not all of the changes to the DTD are handled by this document. [1] Those that are, are summarized in the following overviews of the extensions files.
< 1 teixml.ent >(teixml.ent) =
<!--* teixml.ent: XML version of TEI (1999-07-07) *-->
<!--* This is the TEI.extensions.ent file of an experimental
* version of the TEI P3 DTD, adapted to be XML conformant.
* N.B. using this extensions file with the standard TEI DTD
* will not make the DTD completely XML compliant. Some
* post-processing is needed. Use the pizza chef at
* http://www.uic.edu/orgs/tei/pizza.html or
* http://firth.natcorp.ox.ac.uk/TEI/nupizza.html
*
* This version: 1999-07-07b
*
* Send comments to tei-l@listserv.uic.edu or to
* teitech@listserv.uic.edu
* Thank you for beta testing!
*-->
< Provide default tagset declarations 152 >
< Define TEI keywords 153 >
< Fix placePart class 154 >
< Reproduce class declarations for phrases 22 >
< Reproduce inclusion classes 42 >
< Reproduce classes used by specPara 51 >
< Embed tag-set-specific ent files 151 >
< Element class m.Incl 41 >
< New specialPara 50 >
< New declaration for phrase and phrase.seq 45 >
< New declaration for paraContent 49 >
< New declaration for component and component.seq 47 >
< Suppress definitions of elements with ampersand 3 >
< Suppress element declarations with exclusions 40 >
< Suppress some mixed content elements 11 >
< Suppress users of phrase.seq 24 >
< Suppress standard definitions of PCDATA elements 43 >
< Suppress definitions in core tag set 54 >
< Suppress definitions in text-structure tag set 67 >
< Suppress definitions in front-matter tag set 82 >
< Suppress definitions in header tag set 86 >
< Suppress definitions in verse tag set 98 >
< Suppress definitions in drama tag set 104 >
< Suppress definitions in spoken-text tag set 110 >
< Suppress definitions in terminology tag set 112 >
< Suppress definitions in segmentation and alignment tag set 117 >
< Suppress definitions in analysis tag set 122 >
< Suppress definitions in feature-structures tag set 128 >
< Suppress definitions in text-criticism tag set 136 >
< Suppress definitions in graphs tag set 140 >
< Suppress definitions in tables tag set 146 >
< 2 teixml.dtd >(teixml.dtd) =
<!--* teixml.dtd: XML version of TEI (1999-07-07) *-->
<!--* This is the TEI.extensions.dtd file of an experimental
* version of the TEI P3 DTD, adapted to be XML conformant.
* N.B. using this extensions file with the standard TEI DTD
* will not make the DTD completely XML compliant. Some
* post-processing is needed. Use the pizza chef at
* http://www.uic.edu/orgs/tei/pizza.html or
* http://firth.natcorp.ox.ac.uk/TEI/nupizza.html
*
* This version: 1999-07-07b
*
* Send comments to tei-l@listserv.uic.edu or to
* teitech@listserv.uic.edu
* Thank you for beta testing!
*-->
< New definitions of elements with ampersand 4 >
< Redeclare elements with mixed content elements 12 >
< New declarations for users of phrase.seq 25 >
< New declarations for exclusion exceptions 37 >
< New definitions for PCDATA elements 44 >
<!--* handle specialPara *-->
< New definition of set element 53 >
< New definitions for core tag set 55 >
< New definitions for text-structure tag set 68 >
< New definitions for front-matter tag set 83 >
< New definitions for header tag set 87 >
< New definitions for verse tag set 99 >
< New definitions for drama tag set 105 >
< New definitions for spoken-text tag set 111 >
< New definitions for terminology tag set 113 >
< New definitions for flat terminology tag set 116 >
< New definitions for segmentation and alignment tag set 118 >
< New definitions for analysis tag set 123 >
< New definitions for feature-structures tag set 129 >
< New definitions for text-criticism tag set 137 >
< New definitions for graphs tag set 141 >
< New definitions for tables tag set 147 >
The immediate goal of this document is to allow experimentation with the TEI DTD and XML processors, by providing the extensions files needed to make the full TEI P3 DTD work with XML processors. To use the extensions files created by this document with other extensions files (e.g. those of TEI Lite), manual merger of the extensions files is required. The editors plan to automate this merger as soon as possible; the following stages of development are anticipated:
<!ENTITY % xml.e 'IGNORE'>
so as to
suppress the XML version of that element. (Strictly speaking, this
is unnecessary for elements not declared here, but working out whether
such a declaration is needed looks like more work than we want to
put into a short-term system.)
A list of open questions is included at the end of the document.
Removing tag omissibility information is a trivial task which can be accomplished by a DTD pretty printer, or even a simple editor script. The strings - -, - O, O -, and O O are legal in a DTD only as tag omissibility information, within comments, or within literals. In the TEI DTDs, they do not occur within literals or comments, so a global change in an editor would handle the problem.
To enable the necessary changes to be made with a minimum of manual intervention, however, it is probably better to add a run-time option to a DTD pretty printer, to make it suppress this information, or replace it with a reference to one of the parameter entities om.RR, om.RO, om.OR, or om.OO. If the run-time flag is set, the following entities will be added to the beginning of the DTD:
<!ENTITY % om.RR '- -'>
<!ENTITY % om.RO '- O'>
<!ENTITY % om.OR 'O -'>
<!ENTITY % om.OO 'O O'>
The program carthago has accordingly been outfitted with two run-time options to suppress the omissibility markers, or to replace them with entity references.
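As a rough illustration of the editor-script approach, the following Python sketch either deletes the omissibility markers from element declarations or replaces them with references to the om.* entities. It is not the carthago implementation: the function name strip_omissibility and the regular expression are assumptions made for this example, and it handles only declarations whose element name is a single token (such as %n.s;) and whose markers use an upper-case O.

```python
import re

# Assumption: the element name is one whitespace-free token, followed by
# the two omissibility markers, each either "-" or "O".
OMIT = re.compile(r'(<!ELEMENT\s+\S+\s+)([-O])\s+([-O])(\s+)')

def strip_omissibility(dtd, use_entities=False):
    """Remove tag omissibility markers, or replace them with %om.XX; references."""
    def repl(match):
        if use_entities:
            # "-" (tag required) becomes R, "O" (tag omissible) stays O.
            code = ''.join('R' if c == '-' else 'O'
                           for c in (match.group(2), match.group(3)))
            return f"{match.group(1)}%om.{code};{match.group(4)}"
        # Suppress the markers entirely, keeping one run of whitespace.
        return match.group(1)
    return OMIT.sub(repl, dtd)

print(strip_omissibility("<!ELEMENT %n.s; - - (%phrase.seq;) >"))
print(strip_omissibility("<!ELEMENT %n.re; - O (#PCDATA)* >", use_entities=True))
```

Anchoring the pattern on <!ELEMENT keeps the substitution away from exclusion exceptions such as -(s), which a bare global change on the marker strings would not distinguish.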
In the short term, we will normalize parameter-entity references using the pretty printer mentioned above (or else eliminate them entirely, by running the test DTD through a pre-processor like Carthage, which expands all parameter-entity references).
In the long run, we will systematically normalize all content models in the tagdocs of TEI P3 by adding semicolons to parameter-entity references which currently do not have them. N.B. the editors regard this as a correction of a corrigible error, and this normalization will be performed in the text of TEI P3 as soon as possible.
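The semicolon normalization could be done mechanically along the following lines. This is a minimal sketch, not the pretty printer mentioned above; the name add_semicolons and the pattern are assumptions of the example, and the pattern treats letters, digits, full stops, and hyphens as name characters.

```python
import re

# A parameter-entity reference: "%" immediately followed by a name.
# The lookahead refuses to match when the name continues or when a
# semicolon is already present, so existing "%name;" forms are untouched.
PE_REF = re.compile(r'%([A-Za-z][A-Za-z0-9.\-]*)(?![A-Za-z0-9.\-;])')

def add_semicolons(dtd):
    """Terminate every parameter-entity reference with a semicolon."""
    return PE_REF.sub(r'%\1;', dtd)

model = "(%n.sense; | %m.dictionaryTopLevel | %m.phrase | #PCDATA)*"
print(add_semicolons(model))
```

Entity declarations of the form <!ENTITY % name '...'> are left alone, because there the percent sign is followed by white space rather than by a name.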
Removing ampersand connectors involves either rewriting the content model as a set of alternative sequence groups (thus retaining strict equivalence with the existing model) or revising the content model entirely. In the case of the TEI, the editors both agree that most uses of & have proven to be design errors, so we propose simply to revise the content models.
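For comparison, the first option (alternative sequence groups) can be sketched as follows. The function name expand_ampersand is hypothetical, and the sketch simply enumerates every ordering of the operands.

```python
from itertools import permutations

def expand_ampersand(operands):
    """Rewrite (A & B & ...) as an alternation of all orderings of the operands."""
    alternatives = ["(" + ", ".join(order) + ")" for order in permutations(operands)]
    return "(" + " | ".join(alternatives) + ")"

print(expand_ampersand(["a", "b"]))
```

The number of alternatives grows factorially with the number of operands, and with three or more operands the naive expansion is not deterministic (several alternatives begin with the same token), so it would need further factoring before it could be used in a DTD. That cost is one practical reason to prefer revising the models.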
The following content models use ampersand connectors in TEI P3:
In this section, we provide alternate declarations for each of them. In the entity extensions file we must first suppress all of them:
< 3 Suppress definitions of elements with ampersand > =
<!ENTITY % cit 'IGNORE' >
<!ENTITY % respStmt 'IGNORE' >
<!ENTITY % publicationStmt 'IGNORE' >
<!ENTITY % graph 'IGNORE' >
< 4 New definitions of elements with ampersand > =
< New cit declaration 5 >
< Define new respStmt 8 >
< New publicationStmt 9 >
< New graph element 10 >
N.B. All the ampersand-eliminating content-model changes in this section are regarded by the editors as corrections of corrigible errors, and will be integrated into the text of TEI P3 as soon as possible.
The standard declaration for <cit> is as follows:
<!ELEMENT %n.cit; - - ((%n.q; | %n.quote;) & (%m.bibl; | %m.loc;)) >
We will redefine it with a slightly more general content model (well, almost -- see below):
< 5 New cit declaration > =
<!ENTITY % XML.cit "INCLUDE" >
<![%XML.cit;[
<!ELEMENT %n.cit; - - ((%n.q; | %n.quote; | %m.bibl; |
%m.loc; | %m.Incl;)+) >
<!ATTLIST %n.cit; %a.global;
TEIform CDATA 'cit' >
]]>
<!ELEMENT %n.cit; - - (((%n.q; | %n.quote;), (%m.bibl; | %m.loc;)) | ((%m.bibl; | %m.loc;), (%n.q; | %n.quote;))) >
As it turns out, however, the declaration proposed above is ambiguous, since <link> is a member of both the loc and Incl classes. We'll have to unroll one or the other of these two classes; a coin toss decides that we should unroll loc.
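Overlaps of this kind can be found by comparing class memberships before the classes are combined in one alternation. The sketch below is illustrative only: the function shared_members is hypothetical, the loc list reflects the class as it stands before <link> is removed, and the Incl list is an assumed subset chosen for the example, not the actual definition of m.Incl.

```python
def shared_members(classes, first, second):
    """Return the element names that belong to both classes."""
    return sorted(set(classes[first]) & set(classes[second]))

classes = {
    "loc": ["ptr", "ref", "xptr", "xref", "link"],
    # Assumption: an illustrative subset only, not the real m.Incl class.
    "Incl": ["anchor", "link", "linkGrp"],
}

# Any name reported here would occur twice in (... | %m.loc; | %m.Incl;)
print(shared_members(classes, "loc", "Incl"))
```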
< 6 New cit declaration (alternate) > =
<!ENTITY % XML.cit "INCLUDE" >
<![%XML.cit;[
<!ELEMENT %n.cit; - - ((%n.q; | %n.quote; | %m.bibl;
| %n.ptr; | %n.ref;
| %n.xptr; | %n.xref;
| %m.Incl;)+) >
<!ATTLIST %n.cit; %a.global;
TEIform CDATA 'cit' >
]]>
After further investigation (i.e. further attempts to use the DTD produced by a draft of this paper), however, it becomes clear that loc is a subclass of phrase, so that every content model which uses both the phrase class and the Incl class is going to have troubles. So instead of unrolling each case individually, we take a harsher approach, and remove <link> from the loc class.
< 7 New loc class > =
<!--* remove link from loc class to avoid ambiguity *-->
<!ENTITY % x.loc '' >
<!ENTITY % m.loc '%x.loc; %n.ptr; | %n.ref; |
%n.xptr; | %n.xref;' >
Similarly, we could replicate the original definition of <respStmt> if we wished, but it's probably better regarded as a design error to be fixed:
<!ELEMENT %n.respStmt; - O ((%n.resp; & %n.name;), (%n.resp; | %n.name;)*) >
We give it a simpler and looser declaration instead:
< 8 Define new respStmt > =
<!ENTITY % XML.respStmt "INCLUDE" >
<![%XML.respStmt;[
<!ELEMENT %n.respStmt; - O (%n.resp; | %n.name;
| %m.Incl;)+ >
<!ATTLIST %n.respStmt; %a.global;
TEIform CDATA 'respStmt' >
]]>
<!ELEMENT %n.respStmt; - O (((%n.resp;)+, (%n.name;, (%n.resp; | %n.name;)*)) | ((%n.name;)+, (%n.resp;, (%n.resp; | %n.name;)*))) >
The content model for <publicationStmt> includes an editorial error I am glad to have the occasion to fix. (In normal bibliographic practice, when place and publisher are both given, the place is given first. I don't know what got into me that morning.)
<!ELEMENT %n.publicationStmt; - O ((%n.p;)+ | ( (%n.publisher; | %n.distributor; | %n.authority;) & ((%n.pubPlace)?, (%n.address)?, (%n.idno)*, (%n.availability)?, (%n.date)?)+ )+ ) >
Rather than simply replace the current content model with an equivalent ampersand-less expression, we'll change it. For compatibility with existing data, we'll make the new expression loose rather than tight.
< 9 New publicationStmt > =
<!ENTITY % XML.publicationStmt "INCLUDE" >
<![%XML.publicationStmt;[
<!ELEMENT %n.publicationStmt;
- O ( (%n.p;, (%m.Incl;)*)+
| ((%n.publisher; | %n.distributor;
| %n.authority; | %n.pubPlace;
| %n.address; | %n.idno;
| %n.availability; | %n.date;),
(%m.Incl;)*)+ ) >
<!ATTLIST %n.publicationStmt; %a.global;
TEIform CDATA 'publicationStmt'
>
]]>
The <graph> element uses the content model to require that graphs be encoded nodes-first or arcs-first, but not mixed hugger-mugger. We'll retain that characteristic. The old declaration is this:
<!ELEMENT %n.graph; - - ((%n.node;)+ & (%n.arc;)*) >
We could require arbitrarily that all nodes come first; it's not clear whether any legacy data using <graph> actually exists. But in the interests of backward compatibility, the new content model might as well allow precisely what the old one did, even if that now seems like a design error:
< 10 New graph element > =
<![%TEI.nets;[
<!ENTITY % XML.graph "INCLUDE" >
<![%XML.graph;[
<!ELEMENT %n.graph; - - (((%n.node;, (%m.Incl;)*)+,
(%n.arc;, (%m.Incl;)*)*)
| ((%n.arc;, (%m.Incl;)*)+,
(%n.node;, (%m.Incl;)*)+)) >
<!ATTLIST %n.graph; %a.global;
type CDATA #IMPLIED
label CDATA #IMPLIED
order NUMBER #IMPLIED
size NUMBER #IMPLIED
TEIform CDATA 'graph' >
]]>
]]>
The following elements use the keyword #PCDATA in ways that must be changed to be legal in XML:
In most of them, the #PCDATA keyword is given last, not first, in the content model; in one or two, it's neither first nor last. For example:
<!ELEMENT %n.sense; - - (%n.sense; | %m.dictionaryTopLevel | %m.phrase | #PCDATA)* >
In one or two cases, the group also has a plus operator instead of a star operator.
<!ELEMENT %n.timeStruct; - - ((%m.temporalExpr; | #PCDATA)+) >
We must redeclare each of them, which means first of all that we must suppress their standard declarations:
< 11 Suppress some mixed content elements > =
<!ENTITY % sense 'IGNORE' >
<!ENTITY % re 'IGNORE' >
<!ENTITY % persName 'IGNORE' >
<!ENTITY % placeName 'IGNORE' >
<!ENTITY % geogName 'IGNORE' >
<!ENTITY % dateStruct 'IGNORE' >
<!ENTITY % timeStruct 'IGNORE' >
<!ENTITY % dateline 'IGNORE' >
< 12 Redeclare elements with mixed content elements > =
<![%TEI.dictionaries;[
< New mixed content elements for dictionaries 13 >
]]>
<![%TEI.names.dates;[
< New mixed content elements for names and dates 15 >
]]>
< New mixed content elements for structure 20 >
Since the normalization is purely mechanical, there seems to be no need to reproduce the original declarations here. The new declarations are given below.
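The mechanical part of the normalization can be sketched as below, assuming the members of the original group have already been split into a flat list. The function name normalize_mixed is hypothetical. It moves #PCDATA to the front, keeps the remaining members in order without repeats, and forces the occurrence indicator to a star; it does not add %m.Incl;, which the new declarations include for a separate reason (the simulation of inclusion exceptions).

```python
def normalize_mixed(members):
    """Build an XML-legal mixed-content model from a flat list of group members."""
    ordered = ["#PCDATA"]
    for member in members:
        # Skip the keyword itself and any literal repeats.
        if member != "#PCDATA" and member not in ordered:
            ordered.append(member)
    return "(" + " | ".join(ordered) + ")*"

# The timeStruct model quoted above, with its plus operator replaced by a star.
print(normalize_mixed(["%m.temporalExpr;", "#PCDATA"]))
```

The check for repeats works only on the literal tokens; two different class entities that expand to the same element name would still have to be caught after entity expansion.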
N.B. All the mixed-content normalization changes in this section are regarded by the editors as corrections of corrigible errors, and will be integrated into the text of TEI P3 as soon as possible.
Two elements in this group are from the dictionary tag set:
< 13 New mixed content elements for dictionaries > =
<!ENTITY % XML.sense "INCLUDE" >
<![%XML.sense;[
<!ELEMENT %n.sense; - - (#PCDATA | %n.sense;
| %m.dictionaryTopLevel;
| %m.phrase; | %m.Incl;)* >
<!ATTLIST %n.sense; %a.global;
%a.dictionaries;
level NUMBER #IMPLIED
TEIform CDATA 'sense' >
]]>
< 14 New mixed content elements for dictionaries 13 (cont'd) > =
<!ENTITY % XML.re "INCLUDE" >
<![%XML.re;[
<!ELEMENT %n.re; - O (#PCDATA | %n.sense;
| %m.dictionaryTopLevel;
| %m.phrase; | %m.Incl;)* >
<!ATTLIST %n.re; %a.global;
%a.dictionaries;
type CDATA #IMPLIED
TEIform CDATA 're' >
]]>
<!ELEMENT %n.re; - O (#PCDATA | %n.sense; | %m.dictionaryTopLevel; | %m.phrase;)* -(%n.re;) >
The other elements in this group are from the tag set for names and dates.
< 15 New mixed content elements for names and dates > =
<!ENTITY % XML.persName "INCLUDE" >
<![%XML.persName;[
<!ELEMENT %n.persName; - - (#PCDATA | %m.personPart;
| %m.phrase; | %m.Incl;)* >
<!ATTLIST %n.persName; %a.global;
%a.names;
type CDATA #IMPLIED
TEIform CDATA 'persName' >
]]>
< 16 New mixed content elements for names and dates 15 (cont'd) > =
<!ENTITY % XML.placeName "INCLUDE" >
<![%XML.placeName;[
<!ELEMENT %n.placeName; - - (#PCDATA | %m.placePart;
| %m.phrase; | %m.Incl;)* >
<!ATTLIST %n.placeName; %a.global;
type CDATA #IMPLIED
full (yes | abb | init) yes
%a.names;
TEIform CDATA 'placeName' >
]]>
< 17 New mixed content elements for names and dates 15 (cont'd) > =
<!ENTITY % XML.geogName "INCLUDE" >
<![%XML.geogName;[
<!ELEMENT %n.geogName; - - (#PCDATA | %n.geog; | %n.name;
| %m.Incl;)* >
<!ATTLIST %n.geogName; %a.global;
%a.placePart;
TEIform CDATA 'geogName' >
]]>
< 18 New mixed content elements for names and dates 15 (cont'd) > =
<!ENTITY % XML.dateStruct "INCLUDE" >
<![%XML.dateStruct;[
<!ELEMENT %n.dateStruct;
- - (#PCDATA | %m.temporalExpr;
| %m.Incl;)* >
<!ATTLIST %n.dateStruct; %a.global;
%a.temporalExpr;
calendar CDATA #IMPLIED
exact CDATA #IMPLIED
TEIform CDATA 'dateStruct' >
]]>
< 19 New mixed content elements for names and dates 15 (cont'd) > =
<!ENTITY % XML.timeStruct "INCLUDE" >
<![%XML.timeStruct;[
<!ELEMENT %n.timeStruct;
- - (#PCDATA | %m.temporalExpr;
| %m.Incl;)* >
<!ATTLIST %n.timeStruct; %a.global;
%a.temporalExpr;
zone CDATA #IMPLIED
TEIform CDATA 'timeStruct' >
]]>
The <dateline> element (from the default text-structure tag set) is the last one needing a mixed-content fix:
< 20 New mixed content elements for structure > =
<!ENTITY % XML.dateline "INCLUDE" >
<![%XML.dateline;[
<!ELEMENT %n.dateline; - O (#PCDATA | %n.date; | %n.time;
| %n.name; | %n.address;
| %m.Incl;)* >
<!ATTLIST %n.dateline; %a.global;
TEIform CDATA 'dateline' >
]]>
The XML rules for mixed-content models also require that the declarations for phrase and phrase.seq be changed slightly. The current definitions are:
<!ENTITY % phrase '(#PCDATA | %m.phrase)' >
<!ENTITY % phrase.seq '(%phrase;)*' >
These give us one level too many of parentheses; we need to remove the parentheses from the entity phrase:
< 21 New declaration for phrase and phrase.seq > =
<!ENTITY % phrase '#PCDATA | %m.phrase;' >
<!ENTITY % phrase.seq '(%phrase;)*' >
N.B. This change to the declaration of phrase is regarded by the editors as the correction of a corrigible error, and will be integrated into the text of TEI P3 as soon as possible.
Unfortunately, integrating this particular fix into the XML modifications file for testing will require that we either hard-code the effective value of m.phrase, or that we recreate the entire sequence of class declarations for phrase in the modifications file. (Sigh.) While we are here, we will introduce some fixes to the declarations of some classes:
< 22 Reproduce class declarations for phrases > =
< Declare new GIs 23 >
<!ENTITY % x.hqphrase '' >
<!ENTITY % m.hqphrase '%x.hqphrase; %n.distinct; | %n.emph; |
%n.foreign; | %n.gloss; | %n.hi; | %n.mentioned; |
%n.soCalled; | %n.term; | %n.title;' >
<!ENTITY % x.data '' >
<!ENTITY % m.data '%x.data; %n.abbr; | %n.address; | %n.date;
| %n.dateRange; | %n.dateStruct; | %n.expan;
| %n.geogName;
| %n.lang; | %n.measure; | %n.name; | %n.num;
| %n.orgName; | %n.persName; | %n.placeName;
| %n.rs; | %n.time; | %n.timeRange;
| %n.timeStruct;' >
<!ENTITY % x.edit '' >
<!ENTITY % m.edit '%x.edit; %n.add; | %n.app; |
%n.corr; | %n.damage; | %n.del; |
%n.orig; | %n.reg; | %n.restore; | %n.sic;
| %n.space; | %n.supplied; | %n.unclear;' >
<!ENTITY % x.editIncl '' >
<!ENTITY % m.editIncl '%x.editIncl; %n.addSpan; | %n.delSpan; |
%n.gap;' >
< New loc class 7 >
<!ENTITY % x.seg '' >
<!ENTITY % m.seg '%x.seg; %n.c; | %n.cl; | %n.m; |
%n.phr; | %n.s; | %n.seg; | %n.w;' >
<!ENTITY % x.sgmlKeywords '' >
<!ENTITY % m.sgmlKeywords '%x.sgmlKeywords; %n.att; | %n.gi; |
%n.tag; | %n.val;' >
<!ENTITY % x.phrase.verse '' >
<!ENTITY % m.phrase.verse '%x.phrase.verse; %n.caesura;' >
<!ENTITY % x.formPointers '' >
<!ENTITY % m.formPointers '%x.formPointers; %n.oRef; | %n.oVar;
| %n.pRef; | %n.pVar;' >
<!ENTITY % x.phrase '' >
<!ENTITY % m.phrase '%x.phrase; %m.data; | %m.edit; |
%m.formPointers; | %m.hqphrase; | %m.loc; |
%m.phrase.verse; | %m.seg; | %m.sgmlKeywords; |
%n.dictAnomaly; |
%n.formula; | %n.fw; | %n.handShift;' >
<!ENTITY % x.fmchunk '' >
<!ENTITY % m.fmchunk '%x.fmchunk; %n.argument; | %n.byline; |
%n.docAuthor; | %n.docDate; | %n.docEdition; |
%n.docImprint; | %n.docTitle; | %n.epigraph; |
%n.head; | %n.titlePart;' >
The element <dictAnomaly> is new; for a description, see below, section The problem of the dictionary chapter.
We need to declare the name of <dictAnomaly>.
< 23 Declare new GIs > =
<!ENTITY % n.dictAnomaly 'dictAnomaly' >
Note that neither phrase.seq nor paraContent may be combined with other elements in a content model, in XML, because of the XML requirement that mixed content models not have nested groups. This affects the declarations for
These must be suppressed, in order to be redeclared:
< 24 Suppress users of phrase.seq > =
<!ENTITY % castItem 'IGNORE' >
<!ENTITY % docImprint 'IGNORE' >
<!ENTITY % catDesc 'IGNORE' >
<!ENTITY % byline 'IGNORE' >
<!ENTITY % opener 'IGNORE' >
<!ENTITY % closer 'IGNORE' >
<!ENTITY % form 'IGNORE' >
<!ENTITY % gramGrp 'IGNORE' >
<!ENTITY % trans 'IGNORE' >
<!ENTITY % etym 'IGNORE' >
<!ENTITY % xr 'IGNORE' >
And they need to be redefined, tag set by tag set. (We put elements from each tag set into separate scraps to simplify production of specialized modification files.)
< 25 New declarations for users of phrase.seq > =
< New castItem 26 >
< New docImprint 27 >
< New catDesc 28 >
< New opener and closer 29 >
< New phrase.seq elements for dictionaries 32 >
First, the base tag set for drama:
< 26 New castItem > =
<![%TEI.drama;[
<!ENTITY % XML.castItem "INCLUDE" >
<![%XML.castItem;[
<!ELEMENT %n.castItem; - O (#PCDATA | %n.role; | %n.roleDesc;
| %n.actor; | %m.phrase;
| %m.Incl;)* >
<!ATTLIST %n.castItem; %a.global;
type (role | list) role
TEIform CDATA 'castItem' >
]]>
]]>
Next the tag set for front matter:
< 27 New docImprint > =
<!ENTITY % XML.docImprint "INCLUDE" >
<![%XML.docImprint;[
<!ELEMENT %n.docImprint;
- O (#PCDATA | %m.phrase; | %n.pubPlace;
| %n.docDate; | %n.publisher;
| %m.Incl;)* >
<!ATTLIST %n.docImprint; %a.global;
TEIform CDATA 'docImprint' >
]]>
< 28 New catDesc > =
<!ENTITY % XML.catDesc "INCLUDE" >
<![%XML.catDesc;[
<!ELEMENT %n.catDesc; - O (#PCDATA | %m.phrase;
| %n.textDesc;)* >
<!ATTLIST %n.catDesc; %a.global;
TEIform CDATA 'catDesc' >
]]>
< 29 New opener and closer > =
<!ENTITY % XML.byline "INCLUDE" >
<![%XML.byline;[
<!ELEMENT %n.byline; - O (#PCDATA | %m.phrase;
| %n.docAuthor; | %m.Incl;)* >
<!ATTLIST %n.byline; %a.global;
TEIform CDATA 'byline' >
]]>
< 30 New opener and closer 29 (cont'd) > =
<!ENTITY % XML.opener "INCLUDE" >
<![%XML.opener;[
<!ELEMENT %n.opener; - O (#PCDATA | %m.phrase;
| %n.argument; | %n.byline;
| %n.epigraph;
| %n.signed; | %n.dateline;
| %n.salute; | %m.Incl;)* >
<!ATTLIST %n.opener; %a.global;
TEIform CDATA 'opener' >
]]>
< 31 New opener and closer 29 (cont'd) > =
<!ENTITY % XML.closer "INCLUDE" >
<![%XML.closer;[
<!ELEMENT %n.closer; - O (#PCDATA | %m.phrase;
| %n.signed; | %n.dateline;
| %n.salute; | %m.Incl;)* >
<!ATTLIST %n.closer; %a.global;
TEIform CDATA 'closer' >
]]>
And finally the base tag set for dictionaries; unlike the preceding elements, these all use paraContent, not phrase.seq. N.B. these content models will require further changes before publication. See below, The problem of the dictionary chapter.
< 32 New phrase.seq elements for dictionaries > =
<![%TEI.dictionaries;[
<!ENTITY % XML.form "INCLUDE" >
<![%XML.form;[
<!ELEMENT %n.form; - - (#PCDATA | %m.phrase; | %m.inter;
| %m.formInfo; | %m.Incl;)* >
<!ATTLIST %n.form; %a.global;
%a.dictionaries;
type CDATA #IMPLIED
TEIform CDATA 'form' >
]]>
< 33 New phrase.seq elements for dictionaries 32 (cont'd) > =
<!ENTITY % XML.gramGrp "INCLUDE" >
<![%XML.gramGrp;[
<!ELEMENT %n.gramGrp; - - (#PCDATA | %m.phrase; | %m.inter;
| %m.gramInfo; | %m.Incl;)* >
<!ATTLIST %n.gramGrp; %a.global;
%a.dictionaries;
TEIform CDATA 'gramGrp' >
]]>
< 34 New phrase.seq elements for dictionaries 32 (cont'd) > =
<!ENTITY % XML.trans "INCLUDE" >
<![%XML.trans;[
<!ELEMENT %n.trans; - O (#PCDATA | %m.phrase; | %m.inter;
| %m.dictionaryParts; | %m.Incl;)* >
<!ATTLIST %n.trans; %a.global;
%a.dictionaries;
TEIform CDATA 'trans' >
]]>
< 35 New phrase.seq elements for dictionaries 32 (cont'd) > =
<!ENTITY % XML.etym "INCLUDE" >
<![%XML.etym;[
<!ELEMENT %n.etym; - O (#PCDATA | %m.phrase; | %m.inter;
| %n.usg; | %n.lbl; | %n.def;
| %n.trans; | %n.tr;
| %m.morphInfo; | %n.eg;
| %n.xr; | %m.Incl;)* >
<!ATTLIST %n.etym; %a.global;
%a.dictionaries;
TEIform CDATA 'etym' >
]]>
< 36 New phrase.seq elements for dictionaries 32 (cont'd) > =
<!ENTITY % XML.xr "INCLUDE" >
<![%XML.xr;[
<!ELEMENT %n.xr; - O (#PCDATA | %m.phrase; | %m.inter;
| %n.usg; | %n.lbl; | %m.Incl;)* >
<!ATTLIST %n.xr; %a.global;
%a.dictionaries;
type CDATA #IMPLIED
TEIform CDATA 'xr' >
]]>
]]>
Since paraContent also occurs in the definition of specialPara, in a form not legal in XML, the specialPara entity must also be redefined; see below, The problem of specialPara elements.
Removing inclusion and exclusion exceptions typically involves changing the set of documents accepted by the DTD.[2] In the discussion which follows, I assume that our goal is to ensure that every document legal in the original DTD remains legal in the modified DTD. The changes will cause the modified DTD to accept some other documents which are not valid instances of the original DTD. That is, if the original DTD is taken as an absolutely correct definition of a language, the revised DTD will overgenerate.[3] We will wish to keep the overgeneration to a minimum, but in general we cannot eliminate it entirely, since inclusion and exclusion exceptions do extend the expressive power of the DTD notation.[4]
Rewriting declarations without exclusion exceptions involves simply removing the exception, and adding an application-specific constraint to be checked outside the SGML parser, that says the excluded element types must not occur within the element type which excluded them. Thus, for example, the TEI <s> element (for end-to-end segmentation on the level of the orthographic sentence) is currently declared thus:
<!ELEMENT s - - (%phrase.seq) -(s) >
An XML-compatible TEI DTD would replace this with:
<!ELEMENT s %phrase.seq; >
<!--* CONSTRAINT: <s> must not occur within
    * an <s>, i.e. Ancestor(1,s) = NIL *-->
The important change here, for present purposes, is the removal of the exclusion exception. In addition, we have removed the tag omissibility indicators and the parentheses around phrase.seq, for reasons that should be clear from other portions of this document.
It would be possible to simulate the effect of exclusion exceptions by modifying the content models of possible descendants of <s>, so as to remove <s> from their content model; for elements which can occur both as parents and as descendants of <s>, however, this change would render some existing documents illegal; it is thus not pursued further here.
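The application-specific constraint on <s> described above could be checked outside the parser along these lines. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name nested_violations and the sample document are assumptions of the example.

```python
import xml.etree.ElementTree as ET

def nested_violations(root, name):
    """Return every element called `name` that has an ancestor called `name`."""
    found = []

    def walk(elem, inside):
        if elem.tag == name and inside:
            found.append(elem)
        for child in elem:
            # Once inside a `name` element, all descendants are "inside".
            walk(child, inside or elem.tag == name)

    walk(root, False)
    return found

doc = ET.fromstring("<p><s>one <s>two</s></s><s>three</s></p>")
print(len(nested_violations(doc, "s")))
```

The same function serves for the other element types that lose an exclusion exception, by passing a different name.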
The following elements have exclusion exceptions in TEI P3:
The new declarations are precisely the same as the old declarations, only without the exclusions:
< 37 New declarations for exclusion exceptions > =
<![ %TEI.analysis; [
<!ENTITY % XML.s "INCLUDE" >
<![%XML.s;[
<!ELEMENT %n.s; - - %phrase.seq; >
<!ATTLIST %n.s; %a.global;
%a.seg;
TEIform CDATA 's' >
]]>
]]>
< 38 New declarations for exclusion exceptions 37 (cont'd) > =
<!ENTITY % XML.speaker "INCLUDE" >
<![%XML.speaker;[
<!ELEMENT %n.speaker; - O %phrase.seq; >
<!ATTLIST %n.speaker; %a.global;
TEIform CDATA 'speaker' >
]]>
< 39 New declarations for exclusion exceptions 37 (cont'd) > =
<!ENTITY % XML.stage "INCLUDE" >
<![%XML.stage;[
<!ELEMENT %n.stage; - - %specialPara; >
<!ATTLIST %n.stage; %a.global;
type CDATA mix
TEIform CDATA 'stage' >
]]>
And they have to be excluded from the base DTD:
< 40 Suppress element declarations with exclusions > =
<!ENTITY % s 'IGNORE' >
<!ENTITY % speaker 'IGNORE' >
<!ENTITY % stage 'IGNORE' >
A new definition of <re> has already been given above, in the context of normalizing mixed-content models. The new definition of <hom> would be as follows:
<!ELEMENT %n.hom; - O (%n.sense; | %m.dictionaryTopLevel)* >
The actual form to be used for <hom> in an XML DTD, however, varies from this, as described below in The problem of the dictionary chapter.
Removing inclusion exceptions requires simulating their effect in the content model of each element type which can occur as a descendant of the element type bearing the inclusions. This section discusses
Inclusions make included elements legal at any location in a content model, without however changing the requirements of the basic content model, which must still be fulfilled. (For now, I make the simplifying assumption that the set of included elements and the set of elements named in the content model are disjoint. When they are not, special considerations will apply, because of SGML's requirement that content models be deterministic.)
We can summarize the effect of inclusions very simply if we think of an FSA recognizing a content model: included elements do not change the state of the FSA. So to change an FSA without inclusions to an FSA that accepts the same language, except that it also allows the inclusion of any element i in the set of inclusions I,
for each state s in the FSA {
  for each element i in I {
    add a transition from s to s, on i
  }
}
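The construction can be written out as a short Python sketch. The representation of the FSA as a dictionary keyed by (state, symbol) pairs, and the names add_inclusions and accepts, are assumptions of the example; as in the text, the included elements are assumed to be disjoint from the elements named in the content model.

```python
def add_inclusions(transitions, states, inclusions):
    """Add a self-loop on every included element to every state."""
    extended = dict(transitions)
    for s in states:
        for i in inclusions:
            extended[(s, i)] = s
    return extended

def accepts(transitions, start, finals, symbols):
    """Run a deterministic FSA over a sequence of element names."""
    state = start
    for sym in symbols:
        if (state, sym) not in transitions:
            return False
        state = transitions[(state, sym)]
    return state in finals

# FSA for the content model (a, b): state 0 --a--> 1 --b--> 2 (final).
base = {(0, "a"): 1, (1, "b"): 2}
with_incl = add_inclusions(base, [0, 1, 2], ["i"])

print(accepts(base, 0, {2}, ["a", "i", "b"]))
print(accepts(with_incl, 0, {2}, ["i", "a", "i", "b", "i"]))
```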
We can characterize the language recognized using inclusion exceptions this way. Let us construct a function imf(E,I) which maps from a regular expression E and a set of inclusions I to a new regular expression E'. Ideally we want the following to be true:
In general, for sequences of terminals x, y in Sigma*:
My best cut so far at defining such a function relies in some places on a couple of auxiliary functions. So let us define functions imf(E), mf(E), and m(E) (where i is for `initial', m for `medial', f for `final').[5] imf(E) makes the claim about xiy true for all x, y in Sigma*. mf(E) makes it true for x in Sigma+ and y in Sigma*. m(E) makes it true for x, y in Sigma+. Equivalently, we can say that any element i in I can appear initially, medially, or finally in imf(E), medially or finally (but not initially) in mf(E), and medially (but not initially or finally) in m(E).
The care we have to take with initial and final positions results from the SGML rules about determinism, but also helps keep the resulting expressions simpler than they'd be if we just slapped (I*) in everywhere in the content model.
Here is a first cut at defining the functions. In a number of circumstances, they are undefined; it might perhaps be useful, therefore, to define a simple normalization on (ampersand-free) content models, which would ensure that the functions are always defined.
If E is the empty set, then the content model in question cannot be satisfied; this would be the case if a DTD which lacked any element called <nonesuch> nevertheless included an element which required it as a subelement:
<!ELEMENT impossible - - (nonesuch) >
Given that we want L(E) to be a subset of L(E'), we must define imf etc. thus for this case:
An element may accept the empty string as its content in either of two ways. First, the element may be declared EMPTY: in this case, inclusions are not legal inside the element. Second, it may be declared #PCDATA: in this case, inclusions are legal within the element.
I*
If E is an atomic symbol, e.g. a, then
m(E) = E
mf(E) = (m(E), I*)
imf(E) = (I*, mf(E))
If E has the form F?, and F is not nullable (does not accept the empty string), then
m(E) = m(F)?
mf(E) = (m(F), I*)?
imf(E) = I*, mf(E) = I*, (m(F), I*)?
If E has the form F?, and F is nullable, then
m(E) = m(F)
mf(E) = mf(F)
imf(E) = imf(F)
The ? is redundant and may be stripped without loss of information.
If E has the form F+, and F is not nullable, then
m(E) = (m(F), (I*, m(F))*)
mf(E) = (m(F), I*)+
imf(E) = I*, mf(E) = I*, (m(F), I*)+
If E has the form F+, and F is nullable, then
m(E) = m(F*)
mf(E) = mf(F*)
imf(E) = imf(F*)
If E has the form F*, and F is not nullable, then
m(E) = (m(F), (I*, m(F))*)?
mf(E) = (m(F), I*)*
or mf(E) = (mf(F))*
imf(E) = (m(F) | I)*
If E has the form F*, and F is nullable, then
(m(F) | I)*
If E has the form (F,G), then
m(E) = mf(F), m(G), if and only if G is not nullable, else undefined
mf(E) = mf(F), mf(G)
imf(E) = imf(F), mf(G)
or imf(E) = I*, mf(E) = I*, mf(F), mf(G)
If E has the form (F|G), then
m(E) = (m(F)|m(G))
mf(E) = (m(F)|m(G)), I*
or mf(E) = (mf(F)|mf(G))
imf(E) = I*, mf(E) = I*, (m(F)|m(G)), I* or (I*, (mf(F)|mf(G)))
If E has the form (F&G), then
m(E) = m(F,G)|m(G,F)
mf(E) = m(F&G), I*
imf(E) = I*, m(F&G), I*
Let's do some simple examples, abstracted from the TEI.
(a,b) ==> (I*, a, I*, b, I*) (<TEI.2> has this structure.)
(a,b+) ==> (I*, a, I*, (b, I*)+) (<teiCorpus.2> has this structure.)
(a*) ==> (a | I)* (<spanGrp> and many other elements have this structure.)
(#PCDATA | a | b | c | d)* (%paraContent et al.)
==> (m(#PCDATA | a | b | c | d) | I)*
= ((m(#PCDATA) | m(a) | m(b) | m(c) | m(d)) | I)*
= ((#PCDATA | a | b | c | d) | I)*
= (#PCDATA | a | b | c | d | I)*
a+ ==> (I*, (a, I*)+)
(a|b)+ ==> (I*, ((a|b), I*)+)
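As a way of experimenting with these definitions, here is a rough Python sketch that builds imf, mf, and m for ampersand-free content models and emits a Python regular expression instead of an SGML content model. Several things in it are assumptions made for the illustration: element names are single characters, I is passed in as a regex character class, and the helper names (sym, opt, plus, star, seq, alt, accepts) are invented. For F?, F+, F*, and alternation, mf is built from mf(F) rather than from m(F), since m is not always defined. Because a backtracking regex engine does not care about determinism, the sketch tests only the language accepted, not whether the result would be a legal content model.

```python
import re

# Content models as nested tuples; each element name is a single character.
def sym(c): return ("sym", c)
def opt(f): return ("opt", f)
def plus(f): return ("plus", f)
def star(f): return ("star", f)
def seq(f, g): return ("seq", f, g)
def alt(f, g): return ("alt", f, g)


def nullable(e):
    """True if the content model e accepts the empty string."""
    kind = e[0]
    if kind == "sym":
        return False
    if kind in ("opt", "star"):
        return True
    if kind == "plus":
        return nullable(e[1])
    if kind == "seq":
        return nullable(e[1]) and nullable(e[2])
    return nullable(e[1]) or nullable(e[2])  # alt


def m(e, I):
    """Inclusions allowed medially only (not initially or finally)."""
    kind = e[0]
    if kind == "sym":
        return re.escape(e[1])
    if kind == "alt":
        return f"(?:{m(e[1], I)}|{m(e[2], I)})"
    if kind == "seq":
        if nullable(e[2]):
            raise ValueError("m(F,G) is undefined when G is nullable")
        return f"(?:{mf(e[1], I)}{m(e[2], I)})"
    f = e[1]
    if nullable(f):
        if kind == "star":
            raise ValueError("m(F*) is not defined here for nullable F")
        return m(f if kind == "opt" else star(f), I)
    if kind == "opt":
        return f"(?:{m(f, I)})?"
    body = f"(?:{m(f, I)}(?:{I}*{m(f, I)})*)"
    return body if kind == "plus" else body + "?"


def mf(e, I):
    """Inclusions allowed medially and finally (not initially)."""
    kind = e[0]
    if kind == "sym":
        return f"{re.escape(e[1])}{I}*"
    if kind == "seq":
        return f"{mf(e[1], I)}{mf(e[2], I)}"
    if kind == "alt":
        return f"(?:{mf(e[1], I)}|{mf(e[2], I)})"
    suffix = {"opt": "?", "plus": "+", "star": "*"}[kind]
    return f"(?:{mf(e[1], I)}){suffix}"


def imf(e, I):
    """Inclusions allowed initially, medially, and finally."""
    kind = e[0]
    if kind == "star":
        return f"(?:{m(e[1], I)}|{I})*"
    if kind == "seq":
        return f"{imf(e[1], I)}{mf(e[2], I)}"
    if kind in ("opt", "plus") and nullable(e[1]):
        return imf(e[1] if kind == "opt" else star(e[1]), I)
    return f"{I}*{mf(e, I)}"


def accepts(e, I, s):
    """True if the string of element names s is valid against imf(e)."""
    return re.fullmatch(imf(e, I), s) is not None


a, b = sym("a"), sym("b")
I = "[xy]"  # the included elements, here x and y
print(imf(seq(a, b), I))
print(accepts(seq(a, plus(b)), I, "axbbx"))
```

The first print shows the regex counterpart of (I*, a, I*, b, I*); the second checks a string against the rewritten form of (a,b+).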
The element <back> is defined thus:
<!ELEMENT %n.back; - O ( (%m.front)*, ( ( (%m.divtop), (%m.divtop | %n.titlePage;)* ) | ( (%n.div;), (%n.div; | (%m.front))* ) | ( (%n.div1;), (%n.div1; | (%m.front))* ) )? ) >
Removing the parameter entities and using single-letter identifiers, we can rewrite the content model this way to show its structure a little more clearly:
( (a | b | c)*, ( ( (d | e | f), (d | e | f | g)* ) | ( (h), (h | (a | b | c))* ) | ( (i), (i | (a | b | c))* ) )? )
Or more compactly:
( (a | b | c)*, ( ( (d | e | f), (d | e | f | g)* ) | ( h, (h | a | b | c)* ) | ( i, (i | a | b | c)* ) )? )
i.e. E has the form F,G where F = (a|b|c)* and G = (((d|e|f) ... (i|a|b|c)*))?. So imf(E) = imf(F), mf(G).
Now, F is simple:
imf((a|b|c)*) = (a | b | c | I)*
But mf(G) requires more work. G = H? where H =
( ( (d | e | f), (d | e | f | g)* ) | ( h, (h | a | b | c)* ) | ( i, (i | a | b | c)* ) )
So mf(G) = (m(H), I*)?
H in turn is an alternation of three sequences, each of the form (x, (y|z)*). This leads to a problem, because the final term in each sequence is nullable; we will have a determinism conflict with the trailing I*. So we add a new definition of mf(E) where E = F?.
mf(F?) = mf(F)?
Applied to G, we have: mf(G) = (mf(H))?, with H = (J | K | L). So
mf(H) = ((m(J) | m(K) | m(L)), I*)
But J, K, and L don't have m() forms, since their final term is nullable. So we use the alternate definition:
mf(H) = (mf(J) | mf(K) | mf(L))
We have the following:
J = ( (d | e | f), (d | e | f | g)* )
mf(J) = ( (d | e | f), I*, (d | e | f | g | I)*)
K = ( h, (h | a | b | c)* )
mf(K) = ( h, I*, (h | a | b | c | I)* )
L = ( i, (i | a | b | c)* )
mf(L) = ( i, I*, (i | a | b | c | I)* )
So mf(H) =
( ( (d | e | f), I*, (d | e | f | g | I)*) | ( h, I*, (h | a | b | c | I)* ) | ( i, I*, (i | a | b | c | I)* ) )
Recall that mf(G) = (mf(H))?. So mf(G) =
( ( (d | e | f), I*, (d | e | f | g | I)*) | ( h, I*, (h | a | b | c | I)* ) | ( i, I*, (i | a | b | c | I)* ) )?
and imf(E) = imf(F), mf(G) =
( (a | b | c | I)*, ( ( (d | e | f), I*, (d | e | f | g | I)*) | ( h, I*, (h | a | b | c | I)* ) | ( i, I*, (i | a | b | c | I)* ) )? )
Or, in content model terms (using the usual TEI conventions for names of element classes):
<!ELEMENT %n.back; - O ( (%m.front; | %m.I;)*, ( ( (%m.divtop;), (%Istar;), (%m.divtop; | %n.titlePage; | %m.I;)* ) | ( (%n.div;), (%Istar;), (%n.div; | %m.front; | %m.I;)* ) | ( (%n.div1;), (%Istar;), (%n.div1; | %m.front; | %m.I;)* ) )? ) >
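Since no proof is offered that the rewritten model simulates inclusions correctly, a brute-force spot check on short inputs may be reassuring. The following Python sketch compares the single-letter form of the original <back> model with the rewritten one. It rests on assumptions made for the illustration: each element class is one letter (as in the compact form above), the letter x stands for any included element, and a string is taken to be valid under inclusions exactly when deleting every x leaves a string valid against the original model. It checks only the language accepted, not determinism.

```python
import re
from itertools import product

# Compact form of the original <back> content model (no inclusions).
ORIGINAL = re.compile(r"[abc]*(?:[def][defg]*|h[habc]*|i[iabc]*)?")

# The rewritten model, with x playing the role of I.
REWRITTEN = re.compile(r"[abcx]*(?:[def]x*[defgx]*|hx*[habcx]*|ix*[iabcx]*)?")


def first_mismatch(max_len=4, alphabet="adghix"):
    """Return the first short string on which the two models disagree,
    or None if they agree on every string up to max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            s = "".join(letters)
            expected = ORIGINAL.fullmatch(s.replace("x", "")) is not None
            actual = REWRITTEN.fullmatch(s) is not None
            if expected != actual:
                return s
    return None


print(first_mismatch())
```

A result of None means no counterexample was found within the length bound; it is evidence, not a proof.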
I think we've got a system we can use manually, though I don't know for sure how to make it a program, given the problems we have defining some of the functions.
The following elements have inclusion exceptions in TEI P3 (as of September 1994):
(%m.dictionaryParts; | %m.phrase; | %m.inter;)
(%m.dictionaryParts; | %m.formPointers;)
(%m.globincl;, i.e. <alt>, <altGrp>, <cb>, <certainty>, <fLib>, <fs>, <fsLib>, <fvLib>, <index>, <interp>, <interpGrp>, <join>, <joinGrp>, <lb>, <link>, <linkGrp>, <milestone>, <pb>, <respons>, <span>, <spanGrp>, and <timeline>)
(%m.fragmentary;, i.e. <lacunaEnd>, <lacunaStart>, <witEnd>, and <witStart>)
(%m.fragmentary;)
(%m.terminologyInclusions;, i.e. <date>, <dateStruct>, <note>, <ptr>, <ref>, <xptr>, and <xref>)
The inclusions on <entry>, <entryFree>, and <eg> will be taken care of separately, in the section on the dictionary chapter.
The inclusions on <orgName> were dropped in October 1994 (though this change has not been propagated to any public version of the DTD), and so we will ignore them.
The inclusions on <text> must be propagated to all potential descendants of <text>.
The inclusions on <lem> and <rdg> must be propagated to all potential descendants; it might be possible to do without these, but it's probably not worth the effort.
Note that in the case of terminologyInclusions, the set of inclusions is not disjoint from the set of children named directly in content models.
Study of the full TEI DTD shows that the sets of possible descendants of <text>, <lem>, <rdg>, and <termEntry> are all identical. This is not surprising given that <text> is recursive.
The 263 elements in this set fall into the following groups:
EMPTY: addSpan, alt, anchor, any, arc, caesura, cb, certainty, delSpan, dft, divGen, eLeaf, event, gap, handShift, index, iNode, interp, join, kinesic, lacunaEnd, lacunaStart, lb, leaf, link, milestone, minus, move, msr, nbr, node, none, null, oRef, pause, pb, plus, pRef, ptr, rate, respons, root, shift, space, span, sym, uncertain, vocal, when, witEnd, witStart, and xptr
(#PCDATA): att, day, gi, hour, idno, minute, month, offset, postBox, postCode, second, str, tag, val, week, and year. (Of these, note that <att>, <gi>, <tag>, and <val> aren't actually in the main DTD, so they won't be handled here. Perhaps all these lists need to be checked once more in a calm moment.)
(%phrase.seq;): abbr, actor, addrLine, author, authority, biblScope, cl, date, dateRange, del, distance, distinct, distributor, docAuthor, docDate, edition, editor, expan, extent, funder, fw, gloss, headItem, headLabel, label, measure, mentioned, name, num, occasion, orgDivn, orgName, orgTitle, orgType, orig, phr, principal, publisher, pubplace, reg, resp, restore, role, roleDesc, rs, s, salute, signed, soCalled, speaker, sponsor, street, term, time, timeRange, trailer, and wit
[Supernumerary in ND: surname, forename, genName, nameLink, addName, roleName, settlement, bloc.]
[Supernumerary in Corpus: channel, constitution, derivation, domain, factuality, interaction, preparedness, purpose, birth, firstLang, langKnown, residence, education, affiliation, occupation, socecstatus, locale, activity.]
[Supernumerary in Header: symbol, creation, language, classCode] [6]
(%component.seq;): epigraph
(%paraContent;): admin, camera, caption, cell, country, damage, descrip, docEdition, emph, figDesc, foreign, gram, head, hi, imprimatur, l, lang, x lem, meeting, otherForm, p, x rdg, ref, region, seg, sound, supplied, tech, title, titlePart, unclear, witDetail, witness, writing, and xref. N.B. this list does not include elements from the dictionary tag set, the feature system declaration, or the tag set declaration.[7] The dictionary tag set presents problems of its own, and the others are not part of the main TEI DTD.
(%specialPara;): add, corr, item, note, q, quote, sic, stage, and view
Empty elements need no changes.
The other groups of elements do require changes to the DTD, which are described in the following sections.
In order to simplify the process of adding inclusions to the content models of the DTD, we define a new class for use in content models, namely m.Incl. This consists of:
< 41 Element class m.Incl > =
<!ENTITY % x.Incl ''>
<![%TEI.textcrit;[
<!--* If text criticism tag set is selected, include m.fragmentary
* in the class m.Incl.
*-->
<!ENTITY % m.Incl '%x.Incl; %m.globincl; | %m.editIncl;
| %m.fragmentary; | %n.anchor;' >
]]>
<!--* Otherwise, don't. *-->
<!ENTITY % m.Incl '%x.Incl; %m.globincl; | %m.editIncl;
| %n.anchor;' >
We have to reproduce the standard declarations for the inclusion classes:
< 42 Reproduce inclusion classes > =
<!ENTITY % x.metadata '' >
<!ENTITY % m.metadata '%x.metadata; %n.alt; | %n.altGrp; |
%n.certainty; | %n.fLib; | %n.fs; | %n.fsLib; |
%n.fvLib; | %n.index; | %n.interp; | %n.interpGrp; |
%n.join; | %n.joinGrp; | %n.link; | %n.linkGrp; |
%n.respons; | %n.span; | %n.spanGrp; | %n.timeline;' >
<!ENTITY % x.refsys '' >
<!ENTITY % m.refsys '%x.refsys; %n.cb; | %n.lb; | %n.milestone;
| %n.pb;' >
<!ENTITY % x.globincl '' >
<!ENTITY % m.globincl '%x.globincl; %m.metadata; | %m.refsys;' >
#PCDATA elements
Each element which now has a content model of #PCDATA should, for compatibility, be revised to have a content model of (#PCDATA | %m.Incl;)*.
In some cases, it might be preferable to leave the content model alone: it's not clear that it's really useful to allow index entries, feature structure libraries, and joins to occur within attribute names, generic identifiers, and the components of structured times and dates. Even within generic identifiers and so on, there might be line breaks, page breaks, or other milestones, but perhaps we should define at least some of these elements as (#PCDATA | %m.refsys;)*.
For now, for purposes of the experimental XML DTD, I propose to use the first form given.
First, we suppress all of these elements:
< 43 Suppress standard definitions of PCDATA elements > =
<!ENTITY % day 'IGNORE' >
<!ENTITY % hour 'IGNORE' >
<!ENTITY % minute 'IGNORE' >
<!ENTITY % month 'IGNORE' >
<!ENTITY % offset 'IGNORE' >
<!ENTITY % second 'IGNORE' >
<!ENTITY % week 'IGNORE' >
<!ENTITY % year 'IGNORE' >
<!ENTITY % idno 'IGNORE' >
<!ENTITY % postBox 'IGNORE' >
<!ENTITY % postCode 'IGNORE' >
<!ENTITY % str 'IGNORE' >
Then we supply the new declarations:
< 44 New definitions for PCDATA elements > =
<![%TEI.names.dates;[
<!ENTITY % XML.day "INCLUDE" >
<![%XML.day;[
<!ELEMENT %n.day; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.day; %a.global;
%a.temporalExpr;
TEIform CDATA 'day' >
]]>
<!ENTITY % XML.hour "INCLUDE" >
<![%XML.hour;[
<!ELEMENT %n.hour; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.hour; %a.global;
%a.temporalExpr;
TEIform CDATA 'hour' >
]]>
<!ENTITY % XML.minute "INCLUDE" >
<![%XML.minute;[
<!ELEMENT %n.minute; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.minute; %a.global;
%a.temporalExpr;
TEIform CDATA 'minute' >
]]>
<!ENTITY % XML.month "INCLUDE" >
<![%XML.month;[
<!ELEMENT %n.month; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.month; %a.global;
%a.temporalExpr;
TEIform CDATA 'month' >
]]>
<!ENTITY % XML.offset "INCLUDE" >
<![%XML.offset;[
<!ELEMENT %n.offset; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.offset; %a.global;
value CDATA #IMPLIED
%a.placePart;
TEIform CDATA 'offset' >
]]>
<!ENTITY % XML.second "INCLUDE" >
<![%XML.second;[
<!ELEMENT %n.second; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.second; %a.global;
%a.temporalExpr;
TEIform CDATA 'second' >
]]>
<!ENTITY % XML.week "INCLUDE" >
<![%XML.week;[
<!ELEMENT %n.week; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.week; %a.global;
%a.temporalExpr;
TEIform CDATA 'week' >
]]>
<!ENTITY % XML.year "INCLUDE" >
<![%XML.year;[
<!ELEMENT %n.year; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.year; %a.global;
%a.temporalExpr;
TEIform CDATA 'year' >
]]>
]]>
<!ENTITY % XML.idno "INCLUDE" >
<![%XML.idno;[
<!ELEMENT %n.idno; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.idno; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'idno' >
]]>
<!ENTITY % XML.postBox "INCLUDE" >
<![%XML.postBox;[
<!ELEMENT %n.postBox; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.postBox; %a.global;
TEIform CDATA 'postBox' >
]]>
<!ENTITY % XML.postCode "INCLUDE" >
<![%XML.postCode;[
<!ELEMENT %n.postCode; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.postCode; %a.global;
TEIform CDATA 'postCode' >
]]>
<![%TEI.fs;[
<!ENTITY % XML.str "INCLUDE" >
<![%XML.str;[
<!ELEMENT %n.str; - - (#PCDATA | %m.Incl;)* >
<!ATTLIST %n.str; %a.global;
rel (eq | ne | sb | ns | lt | le | gt
| ge) eq
TEIform CDATA 'str' >
]]>
]]>
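The suppress-and-redeclare pattern used in these scraps depends on the rule that the first declaration of a parameter entity is binding and later ones are ignored. The Python sketch below models that rule only; it is not a DTD parser. It assumes, as these scraps appear to, that the main DTD wraps each element declaration in a marked section keyed on an entity named after the element and defaulting to INCLUDE, and that the extensions file is read first. The function names are invented for the illustration.

```python
def declare(entities, name, value):
    """Declare a parameter entity; a later redeclaration has no effect."""
    entities.setdefault(name, value)


def section_is_included(entities, name):
    """True if a marked section keyed on %name; would be processed."""
    return entities[name] == "INCLUDE"


entities = {}

# Read first: the extensions file suppresses the standard declaration
# and switches on the replacement.
declare(entities, "day", "IGNORE")
declare(entities, "XML.day", "INCLUDE")

# Read later: the default in the main DTD (assumed), which now loses.
declare(entities, "day", "INCLUDE")

print(section_is_included(entities, "day"))      # standard declaration
print(section_is_included(entities, "XML.day"))  # new declaration
```

Under these assumptions the standard declaration of <day> is skipped and only the new one, with the widened content model, takes effect.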
The parameter entity phrase.seq should be redefined as follows:
< 45 New declaration for phrase and phrase.seq > =
<!ENTITY % phrase '#PCDATA | %m.phrase; | %m.Incl;' >
<!ENTITY % phrase.seq '(%phrase;)*' >
No changes to the actual content models are needed. (Ah, the joys of indirection.)
(Note, 14 May 1999.) No, wait, actually, that's not true. Many of these declarations read
<!ELEMENT %n.foo; - O (%phrase.seq;) >
which, expanded, would be
<!ELEMENT %n.foo; - O ((#PCDATA | %m.phrase; | %m.Incl;)*)>
which is illegal in XML. The content models do need to be changed, to
<!ELEMENT %n.foo; - O %phrase.seq; >
This is only required if we wish to allow the extensions file to work with the current (1994-09) production DTDs. Since those are what I currently have on this laptop, I do wish. But since we will shortly be releasing corrected versions, we want to make this part of the extensions file optional. We'll do so using a conditional inclusion on the parameter entity base9409, which by default will be defined IGNORE.
The same logic applies to paraContent and (for now) specialPara.
(Note, 30 May 1999.) No, no, wait. Doesn't carthage already normalize these correctly by omitting extra parentheses? I've already spent several hours making the scraps below, and now realize we may not need them after all. (17 June 1999.) I've removed them, since carthage actually does produce legal XML.
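The parenthesization problem discussed in these notes can be checked mechanically. The sketch below tests whether an expanded content model has the shape XML requires for mixed content: #PCDATA first, a flat list of names, and a closing star directly after the single group. It is a simplified illustration: the NAME pattern covers only ASCII names, parameter-entity references are assumed to have been expanded already, and the function name is_xml_mixed is invented for the example.

```python
import re

# Simplified XML name (ASCII only); real XML names allow more characters.
NAME = r"[A-Za-z_:][\w.:-]*"

# Shape of XML mixed content: (#PCDATA | a | b)*  or  (#PCDATA)
MIXED = re.compile(
    rf"\(\s*#PCDATA(?:\s*\|\s*{NAME})*\s*\)\*|\(\s*#PCDATA\s*\)"
)


def is_xml_mixed(model):
    """True if `model` is a legal XML mixed-content model."""
    return MIXED.fullmatch(model.strip()) is not None


print(is_xml_mixed("(#PCDATA | hi | emph)*"))
print(is_xml_mixed("((#PCDATA | hi | emph)*)"))
```

The second call is rejected because of the extra pair of parentheses, which is exactly the form that results from expanding (%phrase.seq;).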
The entity component.seq must be redefined to allow inclusions between any two components. In the long run, the changes should be made directly within the various declarations which go into component.seq, but those declarations are among the most complicated of the entire TEI DTD, since there are variant versions for each of the two hundred or so possible combinations of base tag sets.
The quick and dirty approach most suitable for use in the experimental XML DTD is to include the Incl class as a subclass of common, thus:
< 46 New declaration for x.common > =
<!ENTITY % x.common '%m.Incl; |'>
Experiment shows that it does indeed introduce ambiguity in content models, notably those for <body> and text divisions. Rather than hack at those content models, I am going to take the longer and slower approach.
< 47 New declaration for component and component.seq > =
<!ENTITY % x.common '' >
<!ENTITY % m.common '%x.common; %m.bibl; | %m.chunk; |
%m.hqinter; | %m.lists; | %m.notes; | %n.stage;' >
< Reproduce standard component declarations 48 >
<!-- The entity component.seq is always a starred sequence -->
<!-- of component elements. Its definition does not vary -->
<!-- with the base (unless we are using the general base, in -->
<!-- which case it has already been defined above), but the -->
<!-- meaning of the definition does. -->
<!ENTITY % component.seq '((%component;), (%m.Incl;)*)*' >
< 48 Reproduce standard component declarations > =
<!ENTITY % mix.verse '' >
<!ENTITY % mix.drama '' >
<!ENTITY % mix.spoken '' >
<!ENTITY % mix.dictionaries '' >
<!ENTITY % mix.terminology '' >
<![ %TEI.mixed; [
<!ENTITY % TEI.singleBase 'IGNORE' >
<!ENTITY % component '(%m.common; %mix.verse; %mix.drama;
%mix.spoken; %mix.dictionaries; %mix.terminology;)' >
]]>
<![ %TEI.general; [
<!ENTITY % TEI.singleBase 'IGNORE' >
<!ENTITY % component '(%m.common; %mix.verse; %mix.drama;
%mix.spoken; %mix.dictionaries; %mix.terminology;)' >
<![ %TEI.verse; [
<!ENTITY % gen.verse '((%m.comp.verse;), (%m.common; |
%m.comp.verse; | %m.Incl;)*) |' >
]]>
<![ %TEI.drama; [
<!ENTITY % gen.drama '((%m.comp.drama;), (%m.common; |
%m.comp.drama; | %m.Incl;)*) |' >
]]>
<![ %TEI.spoken; [
<!ENTITY % gen.spoken '((%m.comp.spoken;), (%m.common; |
%m.comp.spoken; | %m.Incl;)*) |' >
]]>
<![ %TEI.dictionaries; [
<!ENTITY % gen.dictionaries '((%m.comp.dictionaries;),
(%m.common; | %m.comp.dictionaries; | %m.Incl;)*) |' >
]]>
<![ %TEI.terminology; [
<!ENTITY % gen.terminology '((%m.comp.terminology;), (%m.common;
| %m.comp.terminology; | %m.Incl;)*) |' >
]]>
<!-- Default declarations for all the entities gen.verse, -->
<!-- etc. -->
<!ENTITY % gen.verse '' >
<!ENTITY % gen.drama '' >
<!ENTITY % gen.spoken '' >
<!ENTITY % gen.dictionaries '' >
<!ENTITY % gen.terminology '' >
<!ENTITY % component.seq '((%m.common;), (%m.Incl;)*)*,
(%gen.verse; %gen.drama; %gen.spoken; %gen.dictionaries;
%gen.terminology; TEI...end)?' >
<!ENTITY % component.plus '(%gen.verse; %gen.drama; %gen.spoken;
%gen.dictionaries; %gen.terminology; TEI...end)
|
( ((%m.common;), (%m.Incl;)*)+,
(%gen.verse; %gen.drama; %gen.spoken;
%gen.dictionaries; %gen.terminology; TEI...end)?)' >
<!-- (End of marked section for general base.) -->
]]>
<![ %TEI.prose; [
<!ENTITY % component '(%m.common;)' >
<!ENTITY % TEI.singleBase 'INCLUDE' >
]]>
<![ %TEI.verse; [
<!ENTITY % component '(%m.common; | %m.comp.verse;)' >
<!ENTITY % TEI.singleBase 'INCLUDE' >
]]>
<![ %TEI.drama; [
<!ENTITY % component '(%m.common; | %m.comp.drama;)' >
<!ENTITY % TEI.singleBase 'INCLUDE' >
]]>
<![ %TEI.spoken; [
<!ENTITY % component '(%m.common; | %m.comp.spoken;)' >
<!ENTITY % TEI.singleBase 'INCLUDE' >
]]>
<![ %TEI.dictionaries; [
<!ENTITY % component '(%m.common; | %m.comp.dictionaries;)' >
<!ENTITY % TEI.singleBase 'INCLUDE' >
]]>
<![ %TEI.terminology; [
<!ENTITY % component '(%m.common; | %m.comp.terminology;)' >
<!ENTITY % TEI.singleBase 'INCLUDE' >
]]>
<!-- Default declaration. -->
<!ENTITY % component '(%m.common;)' >
<!ENTITY % TEI.singleBase 'INCLUDE' >
The parameter entity paraContent must be changed as follows:
< 49 New declaration for paraContent > =
<!ENTITY % paraContent '(#PCDATA | %m.phrase; | %m.inter;
| %m.Incl;)*' >
No change to actual content models is needed.
(Note, 14 May 1999.) No, wait, actually, that's not true. Many of these declarations read
<!ELEMENT %n.p; - O (%paraContent;) >
which, expanded, would be
<!ELEMENT %n.p; - O ((#PCDATA | %m.phrase; | %m.inter; | %m.Incl;)*) >
which is illegal. The content models do need to be changed, to
<!ELEMENT %n.p; - O %paraContent; >
For now, though, we can rely on carthage to do the job, so I've deleted the long boring scraps that used to be here.
In TEI P3, the entity specialPara is defined thus:
<!ENTITY % specialPara '(((%m.chunk), (%component.seq)) | (%paraContent))' >
It allows an element to contain either a series of chunks or the same content as a paragraph. It is intended for elements like notes and list items: the normal case, in which the item consists of a single paragraph, can be tagged simply (<item> ... </item>) and the multi-paragraph case can be accommodated using nested paragraphs or other chunk-level elements (<item><p> ... </p><p> ... </p></item>). In practice, the multi-paragraph form has proven very disconcerting to users, since it is not intuitively obvious that no white space may appear between the paragraphs.[8] The current definition and use of specialPara are thus acknowledged by the editors to be an error. Since there is no obvious solution, however, it is not a corrigible error.
In changing specialPara to meet the requirements of XML, there are three obvious possible solutions. We can overgenerate, so as to allow all existing data to remain valid:
<!ENTITY % specialPara '(#PCDATA | %m.phrase; | %m.inter; | %m.chunk;)*' >
This has the drawback of allowing paragraphs and other chunk-level elements to float within character data, thus violating one of the few consistently followed rules of the TEI DTD.
Alternatively, we can bite the bullet and require that list items and notes which consist of a single paragraph be marked as such:
<!ENTITY % specialPara '%component.seq;' >
This has the advantage of being relatively clean, but it has the major disadvantage of requiring retagging for almost all current list items and notes. What is now tagged <item> ... </item> would have to be retagged <item><p> ... </p></item>. The best that can be said is that such retagging could in principle be automated.
A third approach would be to have distinct element types for simple list items and notes, and compound ones. The simple form could be defined as containing paraContent, and the compound ones as containing component.seq. This would also require retagging (of all compound list items and notes), but not as much as the previous approach.
For purposes of the experimental XML DTD, we take the first approach.
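The trade-off among these options can be made concrete with a toy model. In the Python sketch below each child of an element is reduced to one letter: T for character data, h for a phrase-level element, i for an inter-level element, and c for a chunk such as <p>. The three regular expressions are deliberately simplified stand-ins for the P3 definition of specialPara, the overgenerating definition, and the component.seq-only definition; they are assumptions made for the illustration, not expansions of the real entities.

```python
import re

# One character per child: T = character data, h = phrase-level element,
# i = inter-level element, c = chunk (e.g. <p>).
MODELS = {
    "P3 specialPara": r"c[ci]*|[Thi]*",
    "overgenerate": r"[Thic]*",
    "component.seq only": r"[ci]*",
}


def valid_under(content):
    """Report which of the simplified models accepts the given content."""
    return {
        name: re.fullmatch(rx, content) is not None
        for name, rx in MODELS.items()
    }


print(valid_under("T"))    # <item> ... </item>
print(valid_under("cc"))   # <item><p> ... </p><p> ... </p></item>
print(valid_under("TcT"))  # a paragraph floating in character data
```

Existing single-paragraph items survive only the overgenerating model among the two XML-compatible choices, at the price of also admitting the third, previously invalid, input.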
The following element types are defined as containing specialPara:
All but one of these can be fixed simply by redefining specialPara thus:
< 50 New specialPara > =
<!ENTITY % specialPara '(#PCDATA | %m.phrase; | %m.inter;
| %m.chunk; | %m.Incl;)*' >
< 51 Reproduce classes used by specPara > =
<!ENTITY % x.hqinter '' >
<!ENTITY % m.hqinter '%x.hqinter; %n.cit; | %n.q; | %n.quote;' >
<!ENTITY % x.bibl '' >
<!ENTITY % m.bibl '%x.bibl; %n.bibl; | %n.biblFull; |
%n.biblStruct;' >
<!ENTITY % x.lists '' >
<!ENTITY % m.lists '%x.lists; %n.label; | %n.list; |
%n.listBibl;' >
<!ENTITY % x.notes '' >
<!ENTITY % m.notes '%x.notes; %n.note; | %n.witDetail;' >
<!ENTITY % x.stageDirection '' >
<!ENTITY % m.stageDirection '%x.stageDirection; %n.camera; |
%n.caption; | %n.move; | %n.sound; | %n.tech; |
%n.view;' >
<!ENTITY % x.inter '' >
<!ENTITY % m.inter '%x.inter; %m.bibl; | %m.hqinter; | %m.lists;
| %m.notes; | %m.stageDirection; | %n.castList; |
%n.figure; | %n.stage; | %n.table; | %n.text;' >
<!ENTITY % x.chunk '' >
<!ENTITY % m.chunk '%x.chunk; %n.ab; | %n.eTree; | %n.graph; |
%n.l; |
%n.lg; | %n.p; | %n.sp; | %n.tree; | %n.witList;' >
The <ab> element is new, and we need to declare the parameter entity for its generic identifier:
< 52 Declare new GIs 23 (cont'd) > =
<!ENTITY % n.ab 'ab' >
Only one content model must be redefined by hand, to flatten the group: that of <set> in the drama tag set. The current definition is this:
<!ELEMENT set - - ((head)?, %specialPara;) >
<!ATTLIST set %a.global; TEIform CDATA 'set' >
If we flatten this in the expected way, we get this:
<!ELEMENT %n.set; - - (#PCDATA | %m.phrase; | %m.inter; | %m.chunk; | %m.Incl; | %n.head;)* >
<!ATTLIST %n.set; %a.global; TEIform CDATA 'set' >
This has the unfortunate result of allowing <head> elements at random locations; it might be better, in this case, to tighten the content model instead.[9] Version 2 of the new model is this:
< 53 New definition of set element > =
<![%TEI.drama;[
<!ENTITY % XML.set "INCLUDE" >
<![%XML.set;[
<!ELEMENT %n.set; - - ((%n.head;)?, %component.seq;) >
<!ATTLIST %n.set; %a.global;
TEIform CDATA 'set' >
]]>
]]>
<!ELEMENT %n.set; - - ((%m.Incl;)*, (%n.head;)?, %component.seq;) >
For now, the experimental XML version of the DTD will use Version 2 of this declaration.
(Scraps suppressing and redeclaring the remaining elements to be supplied here.)
The elements to be treated here are: address, altgrp, analytic, app, argument, availability, back, bibl, biblfull, biblstruct, body, broadcast, byline, c, castgroup, castitem, castlist, cit, closer, dateline, datestruct, div, div0, div1, div2, div3, div4, div5, div6, div7, docimprint, doctitle, editionStmt, epilogue, equipment, etree, f, falt, figure, flib, formula, front, fs, fslib, fvlib, graph, group, imprint, interpgrp, joingrp, lg, lg1, lg2, lg3, lg4, lg5, linkgrp, list, listbibl, m, monogr, notesStmt, ofig, opener, ovar, performance, prologue, publicationStmt, pvar, rdggrp, recording, recordingStmt, respStmt, row, scriptStmt, series, seriesStmt, set, sourcedesc, sp, spangrp, table, termentry, text, tig, timeline, timestruct, titlepage, titleStmt, tree, triangle, u, valt, w, and witlist.
The following sections provide the DTD fragments necessary for suppressing the existing declarations for these elements and declaring them with new content models.
< 54 Suppress definitions in core tag set > =
<!ENTITY % address 'IGNORE' >
<!ENTITY % analytic 'IGNORE' >
<!ENTITY % bibl 'IGNORE' >
<!ENTITY % biblFull 'IGNORE' >
<!ENTITY % biblStruct 'IGNORE' >
<!ENTITY % cit 'IGNORE' >
<!ENTITY % imprint 'IGNORE' >
<!ENTITY % lg 'IGNORE' >
<!ENTITY % list 'IGNORE' >
<!ENTITY % listBibl 'IGNORE' >
<!ENTITY % monogr 'IGNORE' >
<!ENTITY % respStmt 'IGNORE' >
<!ENTITY % series 'IGNORE' >
<!ENTITY % sp 'IGNORE' >
The existing declarations are these:
<!ELEMENT %n.address; - O ((%n.addrLine)+ | (%m.addrPart)*) >
<!ELEMENT %n.analytic; - O (%n.author; | %n.editor; | %n.respStmt; | %n.title;)* >
<!ELEMENT %n.bibl; - O (#PCDATA | %m.phrase; | %m.biblPart;)* >
<!ELEMENT %n.biblFull; - O (%n.titleStmt;, (%n.editionStmt)?, (%n.extent)?, %n.publicationStmt;, (%n.seriesStmt)?, (%n.notesStmt)?, (%n.sourceDesc)*) >
<!ELEMENT %n.biblStruct; - O ((%n.analytic)?, (%n.monogr;, (%n.series)*)+, (%n.note; | %n.idno;)*) >
<!ELEMENT %n.cit; - - ((%n.q; | %n.quote;) & (%m.bibl; | %m.loc;)) >
<!ELEMENT %n.imprint; - O (%n.pubPlace; | %n.publisher; | %n.date; | %n.biblScope;)* >
<!ELEMENT %n.lg; - O ((%m.divtop)*, (%n.l; | %n.lg;)+, (%m.divbot)*) >
<!ELEMENT %n.list; - - ( (%n.head)?, ( ( (%n.item)+ ) | ( (%n.headLabel)?, (%n.headItem)?, (%n.label;, %n.item;)+))) >
<!ELEMENT %n.listBibl; - - ((%n.head)?, (%n.bibl; | %n.biblStruct; | %n.biblFull;)+, (%n.trailer)?) >
<!ELEMENT %n.monogr; - O ( ( ( (%n.author; | %n.editor; | %n.respStmt;)+, (%n.title)+, (%n.editor; | %n.respStmt;)*) | ( (%n.title)+, (%n.author; | %n.editor; | %n.respStmt;)*))?, (%n.note; | %n.meeting;)*, (%n.edition;, (%n.editor; | %n.respStmt;)*)*, %n.imprint;, (%n.imprint; | %n.extent; | %n.biblScope;)* ) >
<!ELEMENT %n.respStmt; - O ((%n.resp; & %n.name;), (%n.resp; | %n.name;)*) >
<!ELEMENT %n.series; - O (%n.title; | %n.editor; | %n.respStmt; | %n.biblScope;)* >
<!ELEMENT %n.sp; - O ((%n.speaker)?, (%n.p; | %n.l; | %n.lg; | %n.seg; | %n.stage;)+) >
The new definitions are these; note that <cit> and <respStmt> have already been declared above.
< 55 New definitions for core tag set > =
<!ENTITY % XML.address "INCLUDE" >
<![%XML.address;[
<!ELEMENT %n.address; - O ((%m.Incl;)*,
( (%n.addrLine;, (%m.Incl;)*)+
| ((%m.addrPart;), (%m.Incl;)*)*)) >
<!ATTLIST %n.address; %a.global;
TEIform CDATA 'address' >
]]>
< 56 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.analytic "INCLUDE" >
<![%XML.analytic;[
<!ELEMENT %n.analytic; - O (%n.author; | %n.editor;
| %n.respStmt; | %n.title;
| %m.Incl;)* >
<!ATTLIST %n.analytic; %a.global;
TEIform CDATA 'analytic' >
]]>
< 57 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.bibl "INCLUDE" >
<![%XML.bibl;[
<!ELEMENT %n.bibl; - O (#PCDATA | %m.phrase; |
%m.biblPart; | %m.Incl;)* >
<!ATTLIST %n.bibl; %a.global;
%a.declarable;
TEIform CDATA 'bibl' >
]]>
< 58 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.biblFull "INCLUDE" >
<![%XML.biblFull;[
<!ELEMENT %n.biblFull; - O ((%m.Incl;)*,
(%n.titleStmt;, (%m.Incl;)*),
(%n.editionStmt;, (%m.Incl;)*)?,
(%n.extent;, (%m.Incl;)*)?,
(%n.publicationStmt;, (%m.Incl;)*),
(%n.seriesStmt;, (%m.Incl;)*)?,
(%n.notesStmt;, (%m.Incl;)*)?,
(%n.sourceDesc;, (%m.Incl;)*)*
) >
<!ATTLIST %n.biblFull; %a.global;
%a.declarable;
TEIform CDATA 'biblFull' >
]]>
< 59 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.biblStruct "INCLUDE" >
<![%XML.biblStruct;[
<!ELEMENT %n.biblStruct;
- O ((%m.Incl;)*,
(%n.analytic;, (%m.Incl;)*)?,
( (%n.monogr;, (%m.Incl;)*),
(%n.series;, (%m.Incl;)*)* )+,
( (%n.note; | %n.idno;),
(%m.Incl;)*)*) >
<!ATTLIST %n.biblStruct; %a.global;
%a.declarable;
TEIform CDATA 'biblStruct' >
]]>
<!-- cit has already been declared. -->
< 60 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.imprint "INCLUDE" >
<![%XML.imprint;[
<!ELEMENT %n.imprint; - O (%n.pubPlace; | %n.publisher;
| %n.date; | %n.biblScope;
| %m.Incl;)* >
<!ATTLIST %n.imprint; %a.global;
TEIform CDATA 'imprint' >
]]>
< 61 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.lg "INCLUDE" >
<![%XML.lg;[
<!ELEMENT %n.lg; - O ((%m.divtop; | %m.Incl;)*,
(%n.l; | %n.lg;),
(%n.l; | %n.lg; | %m.Incl;)*,
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.lg; %a.global;
%a.divn;
%a.metrical;
TEIform CDATA 'lg' >
]]>
< 62 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.list "INCLUDE" >
<![%XML.list;[
<!ELEMENT %n.list; - - ((%m.Incl;)*,
(%n.head;, (%m.Incl;)*)?,
( ((%n.item;, (%m.Incl;)*)*) |
( (%n.headLabel;, (%m.Incl;)*)?,
(%n.headItem;, (%m.Incl;)*)?,
(%n.label;, (%m.Incl;)*,
%n.item;, (%m.Incl;)*)+))) >
<!ATTLIST %n.list; %a.global;
type CDATA simple
TEIform CDATA 'list' >
]]>
< 63 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.listBibl "INCLUDE" >
<![%XML.listBibl;[
<!ELEMENT %n.listBibl; - - ((%m.Incl;)*,
(%n.head;, (%m.Incl;)*)?,
(%n.bibl; | %n.biblStruct;
| %n.biblFull;),
(%n.bibl; | %n.biblStruct;
| %n.biblFull; | %m.Incl;)*,
(%n.trailer;, (%m.Incl;)*)?) >
<!ATTLIST %n.listBibl; %a.global;
%a.declarable;
TEIform CDATA 'listBibl' >
]]>
< 64 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.monogr "INCLUDE" >
<![%XML.monogr;[
<!ELEMENT %n.monogr; - O (
((%m.Incl;)*,
((
(%n.author; | %n.editor; | %n.respStmt;),
(%n.author; | %n.editor;
| %n.respStmt; | %m.Incl;)*,
(%n.title;, (%m.Incl;)*)+,
((%n.editor; | %n.respStmt;), (%m.Incl;)*)*
)
|
(
(%n.title;, (%m.Incl;)*)+,
(
(%n.author; | %n.editor; | %n.respStmt;),
(%m.Incl;)*
)*
))
)?,
((%n.note; | %n.meeting;), (%m.Incl;)*)*,
(%n.edition;,
(%n.editor; | %n.respStmt; | %m.Incl;)*)*,
%n.imprint;,
(%n.imprint; | %n.extent; |
%n.biblScope; | %m.Incl;)*
) >
<!ATTLIST %n.monogr; %a.global;
TEIform CDATA 'monogr' >
]]>
<!-- respStmt has already been declared -->
< 65 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.series "INCLUDE" >
<![%XML.series;[
<!ELEMENT %n.series; - O (%n.title; | %n.editor; |
%n.respStmt; | %n.biblScope;
| %m.Incl;)* >
<!ATTLIST %n.series; %a.global;
TEIform CDATA 'series' >
]]>
< 66 New definitions for core tag set 55 (cont'd) > =
<!ENTITY % XML.sp "INCLUDE" >
<![%XML.sp;[
<!ELEMENT %n.sp; - O ((%m.Incl;)*,
(%n.speaker;, (%m.Incl;)*)?,
((%n.p; | %n.l; | %n.lg; | %n.seg; | %n.ab;
| %n.stage;), (%m.Incl;)*)+) >
<!ATTLIST %n.sp; %a.global;
who IDREFS #IMPLIED
TEIform CDATA 'sp' >
]]>
< 67 Suppress definitions in text-structure tag set > =
<!ENTITY % argument 'IGNORE' >
<!ENTITY % back 'IGNORE' >
<!ENTITY % body 'IGNORE' >
<!ENTITY % byline 'IGNORE' >
<!ENTITY % closer 'IGNORE' >
<!ENTITY % dateline 'IGNORE' >
<!ENTITY % div 'IGNORE' >
<!ENTITY % div0 'IGNORE' >
<!ENTITY % div1 'IGNORE' >
<!ENTITY % div2 'IGNORE' >
<!ENTITY % div3 'IGNORE' >
<!ENTITY % div4 'IGNORE' >
<!ENTITY % div5 'IGNORE' >
<!ENTITY % div6 'IGNORE' >
<!ENTITY % div7 'IGNORE' >
<!ENTITY % group 'IGNORE' >
<!ENTITY % opener 'IGNORE' >
<!ENTITY % text 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.argument; - - ((%n.head)?, %component.seq;) >
<!ELEMENT %n.back; - O ( (%m.front)*, ( ( (%m.divtop), (%m.divtop | %n.titlePage;)*) | ( (%n.div;), (%n.div; | (%m.front))*) | ( (%n.div1;), (%n.div1; | (%m.front))*) )? ) >
<!ELEMENT %n.body; - O ((%m.divtop;)*, ( ( (%n.divGen)*, ( (%n.div;, (%n.div; | %n.divGen;)*) | (%n.div0;, (%n.div0; | %n.divGen;)*) | (%n.div1;, (%n.div1; | %n.divGen;)*) ) ) | ( (%component)+, ((%n.divGen)*, ( (%n.div;, (%n.div; | %n.divGen;)*) | (%n.div0;, (%n.div0; | %n.divGen;)*) | (%n.div1;, (%n.div1; | %n.divGen;)*) )? ))), (%m.divbot;)*) >
<!ELEMENT %n.byline; - O (%phrase.seq; | %n.docAuthor;)* >
<!ELEMENT %n.closer; - O (%n.signed; | %n.dateline; | %n.salute; | %phrase.seq;)* >
<!ELEMENT %n.dateline; - O (%n.date; | %n.time; | %n.name; | #PCDATA | %n.address;)* >
<!ELEMENT %n.div; - O ((%m.divtop;)*, ((%n.div; | %n.divGen;)+ | ((%component;)+, (%n.div; | %n.divGen;)*)), (%m.divbot;)*) >
<!ELEMENT %n.div0; - O ((%m.divtop;)*, ( (%n.div1; | %n.divGen;)+ | ( (%component;)+, (%n.div1; | %n.divGen;)*)), (%m.divbot;)*) >
<!ELEMENT %n.div1; - O ((%m.divtop;)*, ( (%n.div2; | %n.divGen;)+ | ((%component;)+, (%n.div2; | %n.divGen;)*)), (%m.divbot;)*) >
<!ELEMENT %n.div2; - O ((%m.divtop;)*, ( (%n.div3; | %n.divGen;)+ | ((%component;)+, (%n.div3; | %n.divGen;)*)), (%m.divbot;)*) >
<!ELEMENT %n.div3; - O ((%m.divtop;)*, ( (%n.div4; | %n.divGen;)+ | ((%component;)+, (%n.div4; | %n.divGen;)*)), (%m.divbot;)*) >
<!ELEMENT %n.div4; - O ((%m.divtop;)*, ( (%n.div5; | %n.divGen;)+ | ((%component;)+, (%n.div5; | %n.divGen;)*)), (%m.divbot;)*) >
<!ELEMENT %n.div5; - O ((%m.divtop;)*, ( (%n.div6; | %n.divGen;)+ | ((%component;)+, (%n.div6; | %n.divGen;)*)), (%m.divbot;)*) >
<!ELEMENT %n.div6; - O ((%m.divtop;)*, ((%n.div7; | %n.divGen;)+ | ((%component;)+, (%n.div7; | %n.divGen;)*)), (%m.divbot;)*) >
<!ELEMENT %n.div7; - O ((%m.divtop;)*, (%component;)+, (%m.divbot;)*) >
<!ELEMENT %n.group; - O ((%m.divtop;)*, (%n.text; | %n.group;)+, (%m.divbot;)*) >
<!ELEMENT %n.opener; - O (%n.signed; | %n.dateline; | %n.salute; | %phrase.seq;)* >
<!ELEMENT %n.text; - - ((%n.front)?, (%n.body; | %n.group;), (%n.back)?) +(%m.globincl;) >
The new definitions are as follows:
< 68 New definitions for text-structure tag set > =
<!ENTITY % XML.argument "INCLUDE" >
<![%XML.argument;[
<!ELEMENT %n.argument; - - ((%m.Incl;)*, (%n.head;, (%m.Incl;)*)?,
%component.seq;) >
<!ATTLIST %n.argument; %a.global;
TEIform CDATA 'argument' >
]]>
< 69 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.back "INCLUDE" >
<![%XML.back;[
<!ELEMENT %n.back; - O
( (%m.front; | %m.Incl;)*,
( ( (%m.divtop;),
(%m.divtop; | %n.titlePage;
| %m.Incl;)*)
|
( (%n.div;),
(%n.div; | %m.front; | %m.Incl;)*)
|
( (%n.div1;),
(%n.div1; | %m.front; | %m.Incl;)*)
)?
) >
<!ATTLIST %n.back; %a.global;
%a.declaring;
TEIform CDATA 'back' >
]]>
< 70 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.body "INCLUDE" >
<![%XML.body;[
<!ELEMENT %n.body; - O (
(%m.divtop; | %m.Incl;)*,
(
(
((%component;), (%m.Incl;)*)+,
((%n.divGen;, (%m.Incl;)*)*,
( (%n.div;,
(%n.div; | %n.divGen; | %m.Incl;)*)
|
(%n.div0;,
(%n.div0; | %n.divGen; | %m.Incl;)*)
|
(%n.div1;,
(%n.div1; | %n.divGen; | %m.Incl;)*)
)?
)
)
|
( (%n.divGen;, (%m.Incl;)*)*,
( (%n.div;,
(%n.div; | %n.divGen; | %m.Incl;)*)
|
(%n.div0;,
(%n.div0; | %n.divGen; | %m.Incl;)*)
|
(%n.div1;,
(%n.div1; | %n.divGen; | %m.Incl;)*)
)
)
),
((%m.divbot;), (%m.Incl;)*)*
) >
<!ATTLIST %n.body; %a.global;
%a.declaring;
TEIform CDATA 'body' >
]]>
< 71 New definitions for text-structure tag set 68 (cont'd) > =
<!--* byline, closer, and dateline have already been done *-->
<!ENTITY % XML.div "INCLUDE" >
<![%XML.div;[
<!ELEMENT %n.div; - O (
(%m.divtop; | %m.Incl;)*,
( ((%n.div; | %n.divGen;), (%m.Incl;)*)+
|
( (%component;, (%m.Incl;)*)+,
((%n.div; | %n.divGen;), (%m.Incl;)*)*)
),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div' >
]]>
< 72 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.div0 "INCLUDE" >
<![%XML.div0;[
<!ELEMENT %n.div0; - O ((%m.divtop; | %m.Incl;)*, ( ((%n.div1; |
%n.divGen;), (%m.Incl;)*)+ | ( (%component;, (%m.Incl;)*)+,
((%n.div1; | %n.divGen;), (%m.Incl;)*)*)),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div0; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div0' >
]]>
< 73 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.div1 "INCLUDE" >
<![%XML.div1;[
<!ELEMENT %n.div1; - O ((%m.divtop; | %m.Incl;)*, ( ((%n.div2; |
%n.divGen;), (%m.Incl;)*)+ | ((%component;, (%m.Incl;)*)+,
((%n.div2; | %n.divGen;), (%m.Incl;)*)*)),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div1; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div1' >
]]>
< 74 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.div2 "INCLUDE" >
<![%XML.div2;[
<!ELEMENT %n.div2; - O ((%m.divtop; | %m.Incl;)*, ( ((%n.div3; |
%n.divGen;), (%m.Incl;)*)+ | ((%component;, (%m.Incl;)*)+,
((%n.div3; | %n.divGen;), (%m.Incl;)*)*)),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div2; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div2' >
]]>
< 75 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.div3 "INCLUDE" >
<![%XML.div3;[
<!ELEMENT %n.div3; - O ((%m.divtop; | %m.Incl;)*, ( ((%n.div4; |
%n.divGen;), (%m.Incl;)*)+ | ((%component;, (%m.Incl;)*)+,
((%n.div4; | %n.divGen;), (%m.Incl;)*)*)),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div3; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div3' >
]]>
< 76 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.div4 "INCLUDE" >
<![%XML.div4;[
<!ELEMENT %n.div4; - O ((%m.divtop; | %m.Incl;)*, ( ((%n.div5; |
%n.divGen;), (%m.Incl;)*)+ | ((%component;, (%m.Incl;)*)+,
((%n.div5; | %n.divGen;), (%m.Incl;)*)*)),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div4; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div4' >
]]>
< 77 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.div5 "INCLUDE" >
<![%XML.div5;[
<!ELEMENT %n.div5; - O ((%m.divtop; | %m.Incl;)*, ( ((%n.div6; |
%n.divGen;), (%m.Incl;)*)+ | ((%component;, (%m.Incl;)*)+,
((%n.div6; | %n.divGen;), (%m.Incl;)*)*)),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div5; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div5' >
]]>
< 78 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.div6 "INCLUDE" >
<![%XML.div6;[
<!ELEMENT %n.div6; - O ((%m.divtop; | %m.Incl;)*, ( ((%n.div7; |
%n.divGen;), (%m.Incl;)*)+ | ((%component;, (%m.Incl;)*)+,
((%n.div7; | %n.divGen;), (%m.Incl;)*)*)),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div6; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div6' >
]]>
< 79 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.div7 "INCLUDE" >
<![%XML.div7;[
<!ELEMENT %n.div7; - O ((%m.divtop; | %m.Incl;)*, (%component;, (%m.Incl;)*)+,
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.div7; %a.global;
%a.declaring;
%a.divn;
TEIform CDATA 'div7' >
]]>
< 80 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.group "INCLUDE" >
<![%XML.group;[
<!ELEMENT %n.group; - O ((%m.divtop; | %m.Incl;)*,
((%n.text; | %n.group;),
(%n.text; | %n.group; | %m.Incl;)*),
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.group; %a.global;
%a.declaring;
TEIform CDATA 'group' >
]]>
<!--* opener has already been done *-->
< 81 New definitions for text-structure tag set 68 (cont'd) > =
<!ENTITY % XML.text "INCLUDE" >
<![%XML.text;[
<!ELEMENT %n.text; - - ((%m.Incl;)*,
(%n.front;, (%m.Incl;)*)?,
(%n.body; | %n.group;),
(%m.Incl;)*,
(%n.back;, (%m.Incl;)*)?)
>
<!ATTLIST %n.text; %a.global;
%a.declaring;
TEIform CDATA 'text' >
]]>
< 82 Suppress definitions in front-matter tag set > =
<!--* docimprint has already been suppressed and redefined *-->
<!ENTITY % docTitle 'IGNORE' >
<!ENTITY % front 'IGNORE' >
<!ENTITY % titlePage 'IGNORE' >
The existing declarations are these:
<!ELEMENT %n.front; - O ( (%m.front;)*, ( ( (%m.divtop;), (%m.divtop; | %n.titlePage;)*) | ( (%n.div;), (%n.div; | (%m.front;) )*) | ( (%n.div1;), (%n.div1; | (%m.front;) )*) )? ) >
<!ELEMENT %n.titlePage; - O (%m.tpParts;)+ >
<!ELEMENT %n.docTitle; - O ((%n.titlePart)+) >
The new definitions are these. The definition for <front> has been changed to use fmchunk instead of divtop.
< 83 New definitions for front-matter tag set > =
<!ENTITY % XML.front "INCLUDE" >
<![%XML.front;[
<!ELEMENT %n.front; - O
( (%m.front; | %m.Incl;)*,
( ( (%m.fmchunk;),
(%m.fmchunk; | %n.titlePage; | %m.Incl;)*)
| ( (%n.div;),
(%n.div; | %m.front; | %m.Incl;)*)
| ( (%n.div1;),
(%n.div1; | %m.front; | %m.Incl;)*)
)?
) >
<!ATTLIST %n.front; %a.global;
%a.declaring;
TEIform CDATA 'front' >
]]>
< 84 New definitions for front-matter tag set 83 (cont'd) > =
<!ENTITY % XML.titlePage "INCLUDE" >
<![%XML.titlePage;[
<!ELEMENT %n.titlePage; - O ((%m.Incl;)*,
(%m.tpParts;),
(%m.tpParts; | %m.Incl;)*) >
<!ATTLIST %n.titlePage; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'titlePage' >
]]>
< 85 New definitions for front-matter tag set 83 (cont'd) > =
<!ENTITY % XML.docTitle "INCLUDE" >
<![%XML.docTitle;[
<!ELEMENT %n.docTitle; - O ((%m.Incl;)*,
(%n.titlePart;, (%m.Incl;)*)+) >
<!ATTLIST %n.docTitle; %a.global;
TEIform CDATA 'docTitle' >
]]>
< 86 Suppress definitions in header tag set > =
<!ENTITY % availability 'IGNORE' >
<!ENTITY % broadcast 'IGNORE' >
<!ENTITY % editionStmt 'IGNORE' >
<!ENTITY % equipment 'IGNORE' >
<!ENTITY % notesStmt 'IGNORE' >
<!-- % publicationStmt is already replaced -->
<!ENTITY % recording 'IGNORE' >
<!ENTITY % recordingStmt 'IGNORE' >
<!ENTITY % scriptStmt 'IGNORE' >
<!ENTITY % seriesStmt 'IGNORE' >
<!ENTITY % sourceDesc 'IGNORE' >
<!ENTITY % titleStmt 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.availability; - O ((%n.p;)+) >
<!ELEMENT %n.broadcast; - - ((%n.p)+ | %n.bibl; | %n.biblStruct; | %n.biblFull; | %n.recording;) >
<!ELEMENT %n.editionStmt; - O ( (%n.edition;, (%n.respStmt)*) | (%n.p;)+ ) >
<!ELEMENT %n.equipment; - O ((%n.p;)+) >
<!ELEMENT %n.notesStmt; - O ((%n.note)+) >
<!ELEMENT %n.recording; - - ((%n.p)+ | (%n.respStmt; | %n.equipment; | %n.broadcast; | %n.date;)*) >
<!ELEMENT %n.recordingStmt; - - ((%n.p)+ | (%n.recording)+ ) >
<!ELEMENT %n.scriptStmt; - - ((%n.p)+ | %n.bibl; | %n.biblFull; | %n.biblStruct;) >
<!ELEMENT %n.seriesStmt; - O ( (%n.title;, (%n.idno; | %n.respStmt;)*) | (%n.p)+ ) >
<!ELEMENT %n.sourceDesc; - - (%n.p; | %n.bibl; | %n.biblFull; | %n.biblStruct; | %n.listBibl; | %n.scriptStmt; | %n.recordingStmt;)+ >
<!ELEMENT %n.titleStmt; - O (((%n.title)+, (%n.author; | %n.editor; | %n.sponsor; | %n.funder; | %n.principal; | %n.respStmt;)*)) >
The new definitions are as follows. We've changed the language for some element types, in parallel with changes to TEI P3:
< 87 New definitions for header tag set > =
<!ENTITY % XML.availability "INCLUDE" >
<![%XML.availability;[
<!ELEMENT %n.availability;
- O (%n.p; | %m.Incl;)* >
<!ATTLIST %n.availability; %a.global;
status (free | unknown | restricted)
#IMPLIED
TEIform CDATA 'availability' >
]]>
< 88 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.broadcast "INCLUDE" >
<![%XML.broadcast;[
<!ELEMENT %n.broadcast; - - ((%m.Incl;)*, ((%n.p;, (%m.Incl;)*)+
| ((%n.bibl; |
%n.biblStruct; | %n.biblFull; |
%n.recording;), (%m.Incl;)*))) >
<!ATTLIST %n.broadcast; %a.global;
%a.declarable;
TEIform CDATA 'broadcast' >
]]>
< 89 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.editionStmt "INCLUDE" >
<![%XML.editionStmt;[
<!ELEMENT %n.editionStmt;
- O ((%m.Incl;)*, ((%n.edition;,
(%n.respStmt; | %m.Incl;)*)
| (%n.p;, (%m.Incl;)*)+)
) >
<!ATTLIST %n.editionStmt; %a.global;
TEIform CDATA 'editionStmt' >
]]>
< 90 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.equipment "INCLUDE" >
<![%XML.equipment;[
<!ELEMENT %n.equipment; - O ((%m.Incl;)*,
(%n.p;, (%m.Incl;)*)+) >
<!ATTLIST %n.equipment; %a.global;
%a.declarable;
TEIform CDATA 'equipment' >
]]>
< 91 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.notesStmt "INCLUDE" >
<![%XML.notesStmt;[
<!ELEMENT %n.notesStmt; - O ((%m.Incl;)*,
(%n.note;, (%m.Incl;)*)+) >
<!ATTLIST %n.notesStmt; %a.global;
TEIform CDATA 'notesStmt' >
]]>
< 92 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.recording "INCLUDE" >
<![%XML.recording;[
<!ELEMENT %n.recording; - - (((%m.Incl;)*,
(%n.p;, (%m.Incl;)*)+)
| ((%n.respStmt; |
%n.equipment; | %n.broadcast; |
%n.date;), (%m.Incl;)*)*) >
<!ATTLIST %n.recording; %a.global;
%a.declarable;
type (audio | video) audio
dur CDATA #IMPLIED
TEIform CDATA 'recording' >
]]>
< 93 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.recordingStmt "INCLUDE" >
<![%XML.recordingStmt;[
<!ELEMENT %n.recordingStmt;
- - ((%m.Incl;)*, ((%n.p;, (%m.Incl;)*)+
| (%n.recording;, (%m.Incl;)*)+ ))>
<!ATTLIST %n.recordingStmt; %a.global;
TEIform CDATA 'recordingStmt'>
]]>
< 94 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.scriptStmt "INCLUDE" >
<![%XML.scriptStmt;[
<!ELEMENT %n.scriptStmt;
- - ((%m.Incl;)*, ((%n.p;, (%m.Incl;)*)+
| ((%n.bibl; |
%n.biblStruct; | %n.biblFull;),
(%m.Incl;)*))) >
<!ATTLIST %n.scriptStmt; %a.global;
%a.declarable;
TEIform CDATA 'scriptStmt' >
]]>
< 95 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.seriesStmt "INCLUDE" >
<![%XML.seriesStmt;[
<!ELEMENT %n.seriesStmt;
- O ((%m.Incl;)*,
((%n.title;,
(%n.idno; | %n.respStmt; | %m.Incl;)*
)
|
(%n.p;, (%m.Incl;)*)+)
) >
<!ATTLIST %n.seriesStmt; %a.global;
TEIform CDATA 'seriesStmt' >
]]>
< 96 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.sourceDesc "INCLUDE" >
<![%XML.sourceDesc;[
<!ELEMENT %n.sourceDesc;
- - ((%m.Incl;)*, ((%n.p; | %n.bibl;
| %n.biblFull; |
%n.biblStruct; | %n.listBibl; |
%n.scriptStmt; |
%n.recordingStmt;), (%m.Incl;)*)+) >
<!ATTLIST %n.sourceDesc; %a.global;
%a.declarable;
TEIform CDATA 'sourceDesc' >
]]>
< 97 New definitions for header tag set 87 (cont'd) > =
<!ENTITY % XML.titleStmt "INCLUDE" >
<![%XML.titleStmt;[
<!ELEMENT %n.titleStmt; - O ( (%m.Incl;)*,
(%n.title;, (%m.Incl;)*)+,
( (%n.author;
| %n.editor;
| %n.sponsor;
| %n.funder;
| %n.principal;
| %n.respStmt;),
(%m.Incl;)*)*
) >
<!ATTLIST %n.titleStmt; %a.global;
TEIform CDATA 'titleStmt' >
]]>
< 98 Suppress definitions in verse tag set > =
<!ENTITY % lg1 'IGNORE' >
<!ENTITY % lg2 'IGNORE' >
<!ENTITY % lg3 'IGNORE' >
<!ENTITY % lg4 'IGNORE' >
<!ENTITY % lg5 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.lg1; - O ((%n.head)?, (%n.l; | %n.lg2;)+) >
<!ELEMENT %n.lg2; - O ((%n.head)?, (%n.l; | %n.lg3;)+) >
<!ELEMENT %n.lg3; - O ((%n.head)?, (%n.l; | %n.lg4;)+) >
<!ELEMENT %n.lg4; - O ((%n.head)?, (%n.l; | %n.lg5;)+) >
<!ELEMENT %n.lg5; - O ((%n.head)?, (%n.l)+) >
The new definitions are as follows:
< 99 New definitions for verse tag set > =
<![%TEI.verse;[
<!ENTITY % XML.lg1 "INCLUDE" >
<![%XML.lg1;[
<!ELEMENT %n.lg1; - O ((%m.Incl;)*,
(%n.head;, (%m.Incl;)*)?,
((%n.l; | %n.lg2;),
(%m.Incl;)*)+) >
<!ATTLIST %n.lg1; %a.global;
%a.divn;
%a.metrical;
TEIform CDATA 'lg1' >
]]>
< 100 New definitions for verse tag set 99 (cont'd) > =
<!ENTITY % XML.lg2 "INCLUDE" >
<![%XML.lg2;[
<!ELEMENT %n.lg2; - O ((%m.Incl;)*,
(%n.head;, (%m.Incl;)*)?,
((%n.l; | %n.lg3;), (%m.Incl;)*)+) >
<!ATTLIST %n.lg2; %a.global;
%a.divn;
%a.metrical;
TEIform CDATA 'lg2' >
]]>
< 101 New definitions for verse tag set 99 (cont'd) > =
<!ENTITY % XML.lg3 "INCLUDE" >
<![%XML.lg3;[
<!ELEMENT %n.lg3; - O ((%m.Incl;)*,
(%n.head;, (%m.Incl;)*)?,
((%n.l; | %n.lg4;), (%m.Incl;)*)+) >
<!ATTLIST %n.lg3; %a.global;
%a.divn;
%a.metrical;
TEIform CDATA 'lg3' >
]]>
< 102 New definitions for verse tag set 99 (cont'd) > =
<!ENTITY % XML.lg4 "INCLUDE" >
<![%XML.lg4;[
<!ELEMENT %n.lg4; - O ((%m.Incl;)*,
(%n.head;, (%m.Incl;)*)?,
((%n.l; | %n.lg5;), (%m.Incl;)*)+) >
<!ATTLIST %n.lg4; %a.global;
%a.divn;
%a.metrical;
TEIform CDATA 'lg4' >
]]>
< 103 New definitions for verse tag set 99 (cont'd) > =
<!ENTITY % XML.lg5 "INCLUDE" >
<![%XML.lg5;[
<!ELEMENT %n.lg5; - O ((%m.Incl;)*,
(%n.head;, (%m.Incl;)*)?,
(%n.l;, (%m.Incl;)*)+) >
<!ATTLIST %n.lg5; %a.global;
%a.divn;
%a.metrical;
TEIform CDATA 'lg5' >
]]>
]]>
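The pattern applied to the line-group models above can be sketched mechanically. The following Python fragment is only an illustration and is not part of the extensions file: the function name propagate_inclusions and its (token, occurrence) input format are our own inventions, and it handles only a flat sequence of tokens, not nested groups or alternations.

```python
# Illustrative sketch only: rewrite a flat SGML sequence model so that
# members of m.Incl may appear before the first token and after every token.
# The helper name and input format are hypothetical, not part of the TEI DTD.

INCL = "(%m.Incl;)*"


def propagate_inclusions(parts):
    """parts is a list of (token, occurrence) pairs, where occurrence is
    one of '', '?', '*' or '+'.  Returns the rewritten content model."""
    pieces = [INCL]  # inclusions are legal before the first token
    for token, occurrence in parts:
        if occurrence not in ("", "?", "*", "+"):
            raise ValueError(f"bad occurrence indicator: {occurrence!r}")
        # each token is followed by its own run of inclusions, and the
        # occurrence indicator moves to the enclosing group
        pieces.append(f"({token}, {INCL}){occurrence}")
    return "(" + ", ".join(pieces) + ")"


# The old model for lg5 is ((%n.head)?, (%n.l)+).
lg5_model = propagate_inclusions([("%n.head;", "?"), ("%n.l;", "+")])
print(lg5_model)
```

Apart from line breaks, the string produced for lg5 is the content model given in the new declaration of that element above. Attaching the inclusion run to each token, rather than adding a separate (%m.Incl;)* between groups, is what keeps an inclusion from being matched by two different places in the model.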
< 104 Suppress definitions in drama tag set > =
<!ENTITY % castGroup 'IGNORE' >
<!-- castitem has been done already -->
<!ENTITY % castList 'IGNORE' >
<!ENTITY % epilogue 'IGNORE' >
<!ENTITY % performance 'IGNORE' >
<!ENTITY % prologue 'IGNORE' >
<!ENTITY % set 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.castGroup; - - ( (%n.head;)?, (%n.castItem; | %n.castGroup;)+, (%n.trailer;)?) >
<!ELEMENT %n.castItem; - O (%n.role; | %n.roleDesc; | %n.actor; | (%phrase.seq))* >
<!ELEMENT %n.castList; - - ( (%m.divtop;)*, (%component;)*, (%n.castItem; | %n.castGroup;)+, (%component;)*) >
<!ELEMENT %n.epilogue; - - ((%m.divtop)*, (%component)+, (%m.divbot)*) >
<!ELEMENT %n.performance; - - ((%m.divtop)*, (%component)+, (%m.divbot)*) >
<!ELEMENT %n.prologue; - - ((%m.divtop)*, (%component)+, (%m.divbot)*) >
<!ELEMENT %n.set; - - ((%n.head)?, %specialPara) >
The new definitions are as follows:
< 105 New definitions for drama tag set > =
<![%TEI.drama;[
<!ENTITY % XML.castGroup "INCLUDE" >
<![%XML.castGroup;[
<!ELEMENT %n.castGroup; - - ((%m.Incl;)*, (%n.head;, (%m.Incl;)*)?,
((%n.castItem; |
%n.castGroup;), (%m.Incl;)*)+,
(%n.trailer;, (%m.Incl;)*)?) >
<!ATTLIST %n.castGroup; %a.global;
TEIform CDATA 'castGroup' >
]]>
<!--* castItem has been done elsewhere *-->
< 106 New definitions for drama tag set 105 (cont'd) > =
<!ENTITY % XML.castList "INCLUDE" >
<![%XML.castList;[
<!ELEMENT %n.castList; - - (
(%m.divtop; | %m.Incl;)*,
((%component;), (%m.Incl;)*)*,
((%n.castItem; | %n.castGroup;),
(%m.Incl;)*)+,
((%component;), (%m.Incl;)*)*) >
<!ATTLIST %n.castList; %a.global;
TEIform CDATA 'castList' >
]]>
< 107 New definitions for drama tag set 105 (cont'd) > =
<!ENTITY % XML.epilogue "INCLUDE" >
<![%XML.epilogue;[
<!ELEMENT %n.epilogue; - - ((%m.divtop; | %m.Incl;)*,
((%component;), (%m.Incl;)*)+,
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.epilogue; %a.global;
TEIform CDATA 'epilogue' >
]]>
< 108 New definitions for drama tag set 105 (cont'd) > =
<!ENTITY % XML.performance "INCLUDE" >
<![%XML.performance;[
<!ELEMENT %n.performance;
- - ((%m.divtop; | %m.Incl;)*,
((%component;), (%m.Incl;)*)+,
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.performance; %a.global;
TEIform CDATA 'performance' >
]]>
< 109 New definitions for drama tag set 105 (cont'd) > =
<!ENTITY % XML.prologue "INCLUDE" >
<![%XML.prologue;[
<!ELEMENT %n.prologue; - - ((%m.divtop; | %m.Incl;)*,
((%component;), (%m.Incl;)*)+,
((%m.divbot;), (%m.Incl;)*)*) >
<!ATTLIST %n.prologue; %a.global;
TEIform CDATA 'prologue' >
]]>
<!-- set is already done -->
]]>
< 110 Suppress definitions in spoken-text tag set > =
<!ENTITY % u 'IGNORE' >
The current definition is this:
<!ELEMENT %n.u; - - ((%phrase | %m.comp.spoken)+) >
The new definitions are as follows:
< 111 New definitions for spoken-text tag set > =
<![%TEI.spoken;[
<!ENTITY % XML.u "INCLUDE" >
<![%XML.u;[
<!ELEMENT %n.u; - - (#PCDATA | %m.phrase; | %m.comp.spoken;
| %m.Incl;)* >
<!ATTLIST %n.u; %a.global;
%a.timed;
%a.declaring;
trans (smooth | latching | overlap |
pause) smooth
who IDREF %INHERITED;
TEIform CDATA 'u' >
]]>
]]>
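The new model for the utterance element follows the XML rule for mixed content: #PCDATA comes first, the alternatives form one flat list, and the occurrence indicator is a star. A minimal sketch of that normalization follows; the function name mixed_content is hypothetical, and the sketch assumes the alternatives have already been reduced to a flat list of names or parameter-entity references.

```python
# Illustrative sketch only: build an XML-conformant mixed-content model
# from a flat list of alternatives.  The helper name is hypothetical.


def mixed_content(alternatives):
    """Return '(#PCDATA | a | b ...)*' with #PCDATA first, duplicates
    dropped, and the original order of the other alternatives kept."""
    flat = []
    for name in alternatives:
        if name == "#PCDATA":
            continue  # always emitted exactly once, in first position
        if name not in flat:
            flat.append(name)
    return "(" + " | ".join(["#PCDATA"] + flat) + ")*"


u_model = mixed_content(["%m.phrase;", "%m.comp.spoken;", "%m.Incl;"])
print(u_model)
```

Apart from line breaks, the result for these three alternatives is the model used in the new declaration above. Note that the star makes the content optional, so a model that previously required at least one item (a plus sign) becomes slightly more permissive.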
We handle the dictionary tag set below, not here. (The list above does contain <oVar> and <pVar>, but that must be a mistake.)
< 112 Suppress definitions in terminology tag set > =
<!ENTITY % ofig 'IGNORE' >
<!ENTITY % termEntry 'IGNORE' >
<!ENTITY % tig 'IGNORE' >
The current definitions in the nested tag set are these:
<!ELEMENT %n.ofig; - O ((%m.terminologyMisc)*, (%n.otherForm;, (%n.gram)*), (%m.terminologyMisc)*) >
<!ELEMENT %n.termEntry; - O ((%m.terminologyMisc)*, (%n.tig)+) +(%m.terminologyInclusions) >
<!ELEMENT %n.tig; - O ((%m.terminologyMisc)*, (%n.term;, (%n.gram)*), (%m.terminologyMisc)*, (%n.ofig)*) >
Note that <termEntry> has inclusions of its own. These do not require special treatment in our propagation of inclusions, since the set of legal descendants of <termEntry> is the same as the set of legal descendants of <text>. The set of terminology inclusions, however, does need to be revised for future versions of the DTD, since it's not disjoint from elements named in content models. It includes elements normally included in any phrase-level content model; we don't want to include them in m.Incl, since that would cause ambiguity. So all terminological content models should be rewritten for TEI P4, or even P3.5.
The new definitions are as follows:
< 113 New definitions for terminology tag set > =
<![%TEI.terminology;[
<!ENTITY % XML.ofig "INCLUDE" >
<![%XML.ofig;[
<!ELEMENT %n.ofig; - O ((%m.terminologyMisc; | %m.Incl;)*,
(%n.otherForm;, (%n.gram; | %m.Incl;)*),
((%m.terminologyMisc;), (%m.Incl;)*)*) >
<!ATTLIST %n.ofig; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'ofig' >
]]>
< 114 New definitions for terminology tag set 113 (cont'd) > =
<!ENTITY % XML.termEntry "INCLUDE" >
<![%XML.termEntry;[
<!ELEMENT %n.termEntry; - O ((%m.terminologyMisc;
| %m.terminologyInclusions; | %m.Incl;)*,
(%n.tig;,
(%m.Incl; | %m.terminologyInclusions;)*)+)
>
<!ATTLIST %n.termEntry; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'termEntry' >
]]>
< 115 New definitions for terminology tag set 113 (cont'd) > =
<!ENTITY % XML.tig "INCLUDE" >
<![%XML.tig;[
<!ELEMENT %n.tig; - O ((%m.terminologyMisc;
| %m.terminologyInclusions; | %m.Incl;)*,
(%n.term;,
(%n.gram; | %m.terminologyInclusions;
| %m.Incl;)*),
((%m.terminologyMisc;),
(%m.terminologyInclusions; | %m.Incl;)*)*,
(%n.ofig;,
(%m.terminologyInclusions; | %m.Incl;)*)*)
>
<!ATTLIST %n.tig; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'tig' >
]]>
]]>
In the flat version of the terminology tag set, there is no <ofig> and no <tig> element. The current definition of <termEntry> is this one:
<!ELEMENT %n.termEntry; - O ( (%m.terminologyMisc | %n.otherForm; | %n.gram; | %m.terminologyInclusions)*, (%n.term;, (%m.terminologyMisc | %n.otherForm; | %n.gram; | %m.terminologyInclusions)* )+ ) >
The new definition is as follows. Since we need both versions in the extensions file, we invent a new parameter entity (TEI.terminology.flat) to signal the difference between the nested and flat terminology element sets.
< 116 New definitions for flat terminology tag set > =
<![%TEI.terminology;[
<!ENTITY % TEI.terminology.flat 'IGNORE'>
<![%TEI.terminology.flat;[
<!ENTITY % XML.termEntry "INCLUDE" >
<![%XML.termEntry;[
<!ELEMENT %n.termEntry; - O ( (%m.terminologyMisc; |
%n.otherForm; | %n.gram; |
%m.terminologyInclusions; | %m.Incl;)*,
(%n.term;,
(%m.terminologyMisc; |
%n.otherForm; | %n.gram; |
%m.terminologyInclusions; | %m.Incl;)*
)+
) >
<!ATTLIST %n.termEntry; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'termEntry' >
]]>
]]>
]]>
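The switch between the nested and flat versions relies on two SGML rules: the first declaration of a parameter entity is the binding one, and a marked section nested inside an ignored section is itself ignored. The following Python sketch simulates both rules. It is an illustration under stated assumptions: the helper names are our own, and we assume that the user's declarations (for example in the document type declaration subset) are read before those in the extensions file.

```python
# Illustrative sketch only: simulate how parameter entities select
# marked sections.  Helper names are hypothetical.


def bind_entities(declarations):
    """declarations: (name, value) pairs in the order they are read.
    In SGML the first declaration of an entity wins; later ones are ignored."""
    bound = {}
    for name, value in declarations:
        bound.setdefault(name, value)
    return bound


def section_active(enclosing_keywords, entities):
    """A nested marked section is processed only if every enclosing
    section, and the section itself, resolves to INCLUDE."""
    return all(entities.get(k) == "INCLUDE" for k in enclosing_keywords)


# Declarations made by the extensions file itself (read second).
extensions = [("TEI.terminology.flat", "IGNORE"), ("XML.termEntry", "INCLUDE")]
flat_sections = ["TEI.terminology", "TEI.terminology.flat", "XML.termEntry"]

# A user who wants the flat version overrides the default beforehand.
wants_flat = bind_entities(
    [("TEI.terminology", "INCLUDE"), ("TEI.terminology.flat", "INCLUDE")] + extensions)
# A user who only selects the terminology tag set gets the default.
wants_nested = bind_entities([("TEI.terminology", "INCLUDE")] + extensions)

print(section_active(flat_sections, wants_flat))    # flat termEntry is declared
print(section_active(flat_sections, wants_nested))  # flat termEntry is skipped
```

Because the default value of TEI.terminology.flat is IGNORE, users of the nested version need do nothing; only users of the flat version must declare the entity themselves.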
< 117 Suppress definitions in segmentation and alignment tag set > =
<!ENTITY % altGrp 'IGNORE' >
<!ENTITY % joinGrp 'IGNORE' >
<!ENTITY % linkGrp 'IGNORE' >
<!ENTITY % timeline 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.altGrp; - - ((%n.alt; | %n.ptr; | %n.xptr;)*) >
<!ELEMENT %n.joinGrp; - - ((%n.join; | %n.ptr; | %n.xptr;)*) >
<!ELEMENT %n.linkGrp; - - (%n.link; | %n.ptr; | %n.xptr;)+ >
<!ELEMENT %n.timeline; - - ((%n.when;)+) >
The new definitions are as follows. We take the opportunity to level the declarations of the three group elements by using stars, instead of plus signs, on all of them; <timeline> keeps its plus sign. This has the drawback of allowing a link group to contain no links (only members of m.Incl), but the advantage of dramatically simplifying the content model.
< 118 New definitions for segmentation and alignment tag set > =
<![%TEI.linking;[
<!ENTITY % XML.altGrp "INCLUDE" >
<![%XML.altGrp;[
<!ELEMENT %n.altGrp; - - ((%n.ptr; | %n.xptr; | %m.Incl;)*) >
<!ATTLIST %n.altGrp; %a.global;
%a.pointerGroup;
mode (excl | incl) excl
wScale (perc | real) perc
TEIform CDATA 'altGrp' >
]]>
< 119 New definitions for segmentation and alignment tag set 118 (cont'd) > =
<!ENTITY % XML.joinGrp "INCLUDE" >
<![%XML.joinGrp;[
<!ELEMENT %n.joinGrp; - - ((%n.ptr; | %n.xptr; | %m.Incl;)*)
>
<!ATTLIST %n.joinGrp; %a.global;
%a.pointerGroup;
result CDATA #IMPLIED
desc CDATA #IMPLIED
TEIform CDATA 'joinGrp' >
]]>
< 120 New definitions for segmentation and alignment tag set 118 (cont'd) > =
<!ENTITY % XML.linkGrp "INCLUDE" >
<![%XML.linkGrp;[
<!ELEMENT %n.linkGrp; - - (%n.ptr; | %n.xptr; | %m.Incl;)* >
<!ATTLIST %n.linkGrp; %a.global;
%a.pointerGroup;
TEIform CDATA 'linkGrp' >
]]>
< 121 New definitions for segmentation and alignment tag set 118 (cont'd) > =
<!ENTITY % XML.timeline "INCLUDE" >
<![%XML.timeline;[
<!ELEMENT %n.timeline; - - ((%n.when;), (%m.Incl;)*)+ >
<!ATTLIST %n.timeline; %a.global;
origin IDREF #REQUIRED
unit NMTOKEN #IMPLIED
interval NUTOKEN #IMPLIED
TEIform CDATA 'timeline' >
]]>
]]>
We have included m.Incl within these content models in the interests of consistency: this document is intended to provide an XML-compatible DTD which accepts all valid TEI P3 documents, and does not change the language unnecessarily. In the long run, however, it seems unlikely that we need to allow any m.Incl elements within any of these content models. Page breaks really and truly do not occur within link groups. Allowing timelines to nest within timelines is daft. And as we have seen, adding m.Incl to the original content models introduces ambiguity, since some members of that class were already named in the models. Removing the explicit mention avoids the ambiguity, but renders the content model misleading.
It is the editors' view that in P4, the m.Incl class should not appear in these models; they should revert to the form given in P3.
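The ambiguity mentioned above arises whenever an element is named both explicitly in an alternation and again as a member of m.Incl. A minimal check for that overlap can be sketched as follows. The membership list given for m.Incl here is an assumption made for illustration (the authoritative list is the declaration of the class itself), and the function name doubly_named is hypothetical.

```python
# Illustrative sketch only: find elements that a flat alternation would
# name twice once m.Incl is added to it.  The class membership below is
# an ASSUMED sample, not the real definition of m.Incl.

ASSUMED_INCL_MEMBERS = {"alt", "altGrp", "join", "joinGrp",
                        "link", "linkGrp", "timeline", "pb"}


def doubly_named(model_names, incl_members=ASSUMED_INCL_MEMBERS):
    """Return, sorted, the names that occur both in the model and in the
    inclusion class; each one makes (model | %m.Incl;)* ambiguous."""
    return sorted(set(model_names) & set(incl_members))


# P3 model for altGrp: (alt | ptr | xptr)*
print(doubly_named(["alt", "ptr", "xptr"]))
# Model used in this document, with the explicit mention of alt removed
print(doubly_named(["ptr", "xptr"]))
```

An empty result means that adding the class to the alternation does not name any element twice; it says nothing about other sources of ambiguity in more complex models.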
< 122 Suppress definitions in analysis tag set > =
<!ENTITY % c 'IGNORE' >
<!ENTITY % interpGrp 'IGNORE' >
<!ENTITY % m 'IGNORE' >
<!ENTITY % spanGrp 'IGNORE' >
<!ENTITY % w 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.c; - - (#PCDATA) >
<!ELEMENT %n.interpGrp; - - ((%n.interp;)*) >
<!ELEMENT %n.m; - - ((#PCDATA | %n.seg; | %n.c;)*) >
<!ELEMENT %n.spanGrp; - - ((%n.span;)*) >
<!ELEMENT %n.w; - - ((#PCDATA | %n.seg; | %n.w; | %n.m; | %n.c;)*) >
The new definitions are as follows:
< 123 New definitions for analysis tag set > =
<![%TEI.analysis;[
<!ENTITY % XML.c "INCLUDE" >
<![%XML.c;[
<!ELEMENT %n.c; - - (#PCDATA) >
<!ATTLIST %n.c; %a.global;
%a.seg;
TEIform CDATA 'c' >
]]>
< 124 New definitions for analysis tag set 123 (cont'd) > =
<!ENTITY % XML.interpGrp "INCLUDE" >
<![%XML.interpGrp;[
<!--* We should really have: (%n.interp; | %m.Incl;)* -->
<!ELEMENT %n.interpGrp; - - (%m.Incl;)* >
<!ATTLIST %n.interpGrp; %a.global;
%a.interpret;
TEIform CDATA 'interpGrp' >
]]>
< 125 New definitions for analysis tag set 123 (cont'd) > =
<!ENTITY % XML.m "INCLUDE" >
<![%XML.m;[
<!ELEMENT %n.m; - - (#PCDATA | %n.seg; | %n.c; | %m.Incl;)* >
<!ATTLIST %n.m; %a.global;
%a.seg;
baseform CDATA #IMPLIED
TEIform CDATA 'm' >
]]>
< 126 New definitions for analysis tag set 123 (cont'd) > =
<!ENTITY % XML.spanGrp "INCLUDE" >
<![%XML.spanGrp;[
<!--* We should really have: (%n.span; | %m.Incl;)* -->
<!ELEMENT %n.spanGrp; - - (%m.Incl;)* >
<!ATTLIST %n.spanGrp; %a.global;
%a.interpret;
TEIform CDATA 'spanGrp' >
]]>
< 127 New definitions for analysis tag set 123 (cont'd) > =
<!ENTITY % XML.w "INCLUDE" >
<![%XML.w;[
<!ELEMENT %n.w; - - (#PCDATA | %n.seg; | %n.w; |
%n.m; | %n.c; | %m.Incl;)* >
<!ATTLIST %n.w; %a.global;
%a.seg;
lemma CDATA #IMPLIED
TEIform CDATA 'w' >
]]>
]]>
The arguments given above against propagating global inclusions to the segmentation and alignment element types apply with equal or greater force to the feature-structures element types. But we resist the siren song of common sense and press on doggedly toward our goal of an upward-compatible experimental XML DTD.
< 128 Suppress definitions in feature-structures tag set > =
<!ENTITY % f 'IGNORE' >
<!ENTITY % fAlt 'IGNORE' >
<!ENTITY % fLib 'IGNORE' >
<!ENTITY % fs 'IGNORE' >
<!ENTITY % fsLib 'IGNORE' >
<!ENTITY % fvLib 'IGNORE' >
<!ENTITY % vAlt 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.f; - O (%n.null; | (%n.plus; | %n.minus; | any | %n.none; | %n.dft; | %n.uncertain; | %n.sym; | %n.nbr; | %n.msr; | %n.rate; | %n.str; | %n.vAlt; | %n.alt; | %n.fs;)*) >
<!ELEMENT %n.fAlt; - - ((%n.f; | %n.fs; | %n.fAlt;), (%n.f; | %n.fs; | %n.fAlt;)+) >
<!ELEMENT %n.fLib; - - ((%n.f; | %n.fAlt;)*) >
<!ELEMENT %n.fs; - - ((%n.f; | %n.fAlt; | %n.alt;)*) >
<!ELEMENT %n.fsLib; - - ((%n.fs; | %n.vAlt;)*) >
<!ELEMENT %n.fvLib; - - ((%n.plus; | %n.minus; | any | %n.none; | %n.dft; | %n.uncertain; | %n.null; | %n.sym; | %n.nbr; | %n.msr; | %n.rate; | %n.str; | %n.vAlt;)*) >
<!ELEMENT %n.vAlt; - - ((%n.plus; | %n.minus; | any | %n.none; | %n.dft; | %n.uncertain; | %n.null; | %n.sym; | %n.nbr; | %n.msr; | %n.rate; | %n.str; | %n.vAlt; | %n.fs;), (%n.plus; | %n.minus; | any | %n.none; | %n.dft; | %n.uncertain; | %n.null; | %n.sym; | %n.nbr; | %n.msr; | %n.rate; | %n.str; | %n.vAlt; | %n.fs;)+) >
The new definitions are as follows:
< 129 New definitions for feature-structures tag set > =
<![%TEI.fs;[
<!ENTITY % XML.f "INCLUDE" >
<![%XML.f;[
<!ELEMENT %n.f; - O (%n.null; | (%n.plus; | %n.minus;
| any | %n.none; | %n.dft; |
%n.uncertain; | %n.sym; | %n.nbr;
| %n.msr; | %n.rate; | %n.str; |
%n.vAlt; | %n.alt; | %n.fs;)*) >
<!ATTLIST %n.f; %a.global;
name NMTOKEN #REQUIRED
org (single | set | bag | list)
#IMPLIED
rel (eq | ne | sb | ns) eq
fVal IDREFS #IMPLIED
TEIform CDATA 'f' >
]]>
< 130 New definitions for feature-structures tag set 129 (cont'd) > =
<!ENTITY % XML.fAlt "INCLUDE" >
<![%XML.fAlt;[
<!ELEMENT %n.fAlt; - - ((%n.f; | %n.fs; | %n.fAlt;),
(%n.f; | %n.fs; | %n.fAlt;)+) >
<!ATTLIST %n.fAlt; %a.global;
mutExcl (Y | N) #IMPLIED
TEIform CDATA 'fAlt' >
]]>
< 131 New definitions for feature-structures tag set 129 (cont'd) > =
<!ENTITY % XML.fLib "INCLUDE" >
<![%XML.fLib;[
<!ELEMENT %n.fLib; - - ((%n.f; | %n.fAlt;)*) >
<!ATTLIST %n.fLib; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'fLib' >
]]>
< 132 New definitions for feature-structures tag set 129 (cont'd) > =
<!ENTITY % XML.fs "INCLUDE" >
<![%XML.fs;[
<!ELEMENT %n.fs; - - ((%n.f; | %n.fAlt; | %n.alt;)*) >
<!ATTLIST %n.fs; %a.global;
type CDATA #IMPLIED
feats IDREFS #IMPLIED
rel (eq | ne | sb | ns) sb
TEIform CDATA 'fs' >
]]>
< 133 New definitions for feature-structures tag set 129 (cont'd) > =
<!ENTITY % XML.fsLib "INCLUDE" >
<![%XML.fsLib;[
<!ELEMENT %n.fsLib; - - ((%n.fs; | %n.vAlt;)*) >
<!ATTLIST %n.fsLib; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'fsLib' >
]]>
< 134 New definitions for feature-structures tag set 129 (cont'd) > =
<!ENTITY % XML.fvLib "INCLUDE" >
<![%XML.fvLib;[
<!ELEMENT %n.fvLib; - - ((%n.plus; | %n.minus; | any |
%n.none; | %n.dft; | %n.uncertain;
| %n.null; | %n.sym; | %n.nbr; |
%n.msr; | %n.rate; | %n.str; |
%n.vAlt;)*) >
<!ATTLIST %n.fvLib; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'fvLib' >
]]>
< 135 New definitions for feature-structures tag set 129 (cont'd) > =
<!ENTITY % XML.vAlt "INCLUDE" >
<![%XML.vAlt;[
<!ELEMENT %n.vAlt; - - ((%n.plus; | %n.minus; | any |
%n.none; | %n.dft; | %n.uncertain;
| %n.null; | %n.sym; | %n.nbr; |
%n.msr; | %n.rate; | %n.str; |
%n.vAlt; | %n.fs;), (%n.plus; |
%n.minus; | any | %n.none; |
%n.dft; | %n.uncertain; | %n.null;
| %n.sym; | %n.nbr; | %n.msr; |
%n.rate; | %n.str; | %n.vAlt; |
%n.fs;)+) >
<!ATTLIST %n.vAlt; %a.global;
mutExcl (Y | N) #IMPLIED
TEIform CDATA 'vAlt' >
]]>
]]>
It will be noted that the new versions are identical to the old versions. Common sense has won out, and in this experimental XML version of the TEI DTD, global inclusions are not propagated into these feature-structure element types.
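The claim that old and new versions are identical can be checked mechanically by comparing the declarations after collapsing white space, since the new versions differ from the old only in their line breaks. A minimal sketch, using the declaration of the f element as the example; the function name normalize is our own.

```python
# Illustrative sketch only: compare two declarations while ignoring
# differences in line breaks and spacing.  The helper name is hypothetical.
import re


def normalize(declaration):
    """Collapse every run of white space to a single blank."""
    return re.sub(r"\s+", " ", declaration).strip()


old_f = ("<!ELEMENT %n.f; - O (%n.null; | (%n.plus; | %n.minus; | any | %n.none; "
         "| %n.dft; | %n.uncertain; | %n.sym; | %n.nbr; | %n.msr; | %n.rate; "
         "| %n.str; | %n.vAlt; | %n.alt; | %n.fs;)*) >")

new_f = """<!ELEMENT %n.f; - O (%n.null; | (%n.plus; | %n.minus;
| any | %n.none; | %n.dft; |
%n.uncertain; | %n.sym; | %n.nbr;
| %n.msr; | %n.rate; | %n.str; |
%n.vAlt; | %n.alt; | %n.fs;)*) >"""

same = normalize(old_f) == normalize(new_f)
print(same)
```

A comparison of this kind detects only textual differences; two content models written differently may still accept the same language, and this sketch would report them as different.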
The <dateStruct> and <timeStruct> element types have already been rewritten above.
< 136 Suppress definitions in text-criticism tag set > =
<!ENTITY % app 'IGNORE' >
<!ENTITY % rdgGrp 'IGNORE' >
<!ENTITY % witList 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.app; - O ((%n.lem)?, ((%n.rdg;, (%n.wit)?) | (%n.rdgGrp;, (%n.wit)?))+) >
<!ELEMENT %n.rdgGrp; - O (%n.rdgGrp; | (%n.rdg;, (%n.wit)?))+ >
<!ELEMENT %n.witList; - O ((%n.witness)+) >
The new definitions are as follows. We take the opportunity to address one of Peter Robinson's long-standing concerns, and allow witnesses to the lemma to be listed. Note that the model for <rdgGrp> seems bizarre. Why are readings and reading groups treated similarly in <app> entries and not in <rdgGrp> elements?
< 137 New definitions for text-criticism tag set > =
<![%TEI.textcrit;[
<!ENTITY % XML.app "INCLUDE" >
<![%XML.app;[
<!ELEMENT %n.app; - O ( (%m.Incl;)*,
(%n.lem;, (%m.Incl;)*,
(%n.wit;, (%m.Incl;)*)?)?,
( (%n.rdg;, (%m.Incl;)*,
(%n.wit;, (%m.Incl;)*)?)
|
(%n.rdgGrp;, (%m.Incl;)*,
(%n.wit;, (%m.Incl;)*)?)
)+
) >
<!ATTLIST %n.app; %a.global;
type CDATA #IMPLIED
from IDREF #IMPLIED
to IDREF #IMPLIED
loc CDATA #IMPLIED
TEIform CDATA 'app' >
]]>
< 138 New definitions for text-criticism tag set 137 (cont'd) > =
<!ENTITY % XML.rdgGrp "INCLUDE" >
<![%XML.rdgGrp;[
<!ELEMENT %n.rdgGrp; - O ((%m.Incl;)*,
(((%n.rdgGrp;, (%m.Incl;)*) |
(%n.rdg;, (%m.Incl;)*,
(%n.wit;, (%m.Incl;)*)?)))+) >
<!ATTLIST %n.rdgGrp; %a.global;
%a.readings;
TEIform CDATA 'rdgGrp' >
]]>
< 139 New definitions for text-criticism tag set 137 (cont'd) > =
<!ENTITY % XML.witList "INCLUDE" >
<![%XML.witList;[
<!ELEMENT %n.witList; - O ((%m.Incl;)*,
(%n.witness;, (%m.Incl;)*)+) >
<!ATTLIST %n.witList; %a.global;
TEIform CDATA 'witList' >
]]>
]]>
< 140 Suppress definitions in graphs tag set > =
<!ENTITY % eTree 'IGNORE' >
<!ENTITY % forest 'IGNORE' >
<!ENTITY % forestGrp 'IGNORE' >
<!ENTITY % graph 'IGNORE' >
<!ENTITY % tree 'IGNORE' >
<!ENTITY % triangle 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.graph; - - ((%n.node;)+ & (%n.arc;)*) >
<!ELEMENT %n.tree; - - ((%n.leaf; | %n.iNode;)*, %n.root;, (%n.leaf; | %n.iNode;)*) >
<!ELEMENT %n.eTree; - - ((%n.eTree; | %n.triangle; | %n.eLeaf; )*) >
<!ELEMENT %n.triangle; - - ((%n.eTree; | %n.triangle; | %n.eLeaf;)*) >
<!ELEMENT %n.forest; - - ((%n.tree; | %n.eTree; | %n.triangle;)+) >
<!ELEMENT %n.forestGrp; - - ((%n.forest;)+) >
The new definitions are as follows:
< 141 New definitions for graphs tag set > =
<![%TEI.nets;[
<!ENTITY % XML.tree "INCLUDE" >
<![%XML.tree;[
<!ELEMENT %n.tree; - - ((%n.leaf; | %n.iNode; | %m.Incl;)*,
%n.root;,
(%n.leaf; | %n.iNode; | %m.Incl;)*)
>
<!ATTLIST %n.tree; %a.global;
label CDATA #IMPLIED
arity NUMBER #IMPLIED
ord (Y | N | partial) Y
order NUMBER #IMPLIED
TEIform CDATA 'tree' >
]]>
< 142 New definitions for graphs tag set 141 (cont'd) > =
<!ENTITY % XML.eTree "INCLUDE" >
<![%XML.eTree;[
<!ELEMENT %n.eTree; - - ((%n.eTree; | %n.triangle; |
%n.eLeaf; | %m.Incl;)*) >
<!ATTLIST %n.eTree; %a.global;
label CDATA #IMPLIED
value IDREF #IMPLIED
TEIform CDATA 'eTree' >
]]>
< 143 New definitions for graphs tag set 141 (cont'd) > =
<!ENTITY % XML.triangle "INCLUDE" >
<![%XML.triangle;[
<!ELEMENT %n.triangle; - - ((%n.eTree; | %n.triangle; |
%n.eLeaf; | %m.Incl;)*) >
<!ATTLIST %n.triangle; %a.global;
label CDATA #IMPLIED
value IDREF #IMPLIED
TEIform CDATA 'triangle' >
]]>
< 144 New definitions for graphs tag set 141 (cont'd) > =
<!ENTITY % XML.forest "INCLUDE" >
<![%XML.forest;[
<!ELEMENT %n.forest; - - ((%n.tree; | %n.eTree; |
%n.triangle; | %m.Incl;)*) >
<!ATTLIST %n.forest; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'forest' >
]]>
< 145 New definitions for graphs tag set 141 (cont'd) > =
<!ENTITY % XML.forestGrp "INCLUDE" >
<![%XML.forestGrp;[
<!ELEMENT %n.forestGrp; - - ((%n.forest;, (%m.Incl;)*)+) >
<!ATTLIST %n.forestGrp; %a.global;
type CDATA #IMPLIED
TEIform CDATA 'forestGrp' >
]]>
]]>
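The requirement that a revised model accept everything the original model accepted can be spot-checked by brute force. The sketch below is an illustration only: it renders the old and new <tree> models as regular expressions, enumerates every child sequence up to a small length, and reports whether any sequence legal under the old model is rejected by the new one. The token Incl again stands for any member of %m.Incl;, and the function name and length bound are arbitrary choices. A bounded enumeration of this kind is a sanity check, not a proof.

```python
import re
from itertools import product

# Old: ((leaf | iNode)*, root, (leaf | iNode)*)
OLD_TREE = re.compile(r"(?:(?:leaf|iNode) )*root (?:(?:leaf|iNode) )*")
# New: ((leaf | iNode | Incl)*, root, (leaf | iNode | Incl)*)
NEW_TREE = re.compile(
    r"(?:(?:leaf|iNode|Incl) )*root (?:(?:leaf|iNode|Incl) )*"
)

ALPHABET = ("leaf", "iNode", "root", "Incl")


def encode(children):
    """Join child element names, each followed by one space."""
    return "".join(name + " " for name in children)


def old_contained_in_new(max_len=5):
    """Check all sequences of up to max_len children.

    Returns False as soon as a sequence is found that the old model
    accepts and the new model rejects; True if there is none.
    """
    for length in range(max_len + 1):
        for seq in product(ALPHABET, repeat=length):
            text = encode(seq)
            if OLD_TREE.fullmatch(text) and not NEW_TREE.fullmatch(text):
                return False
    return True


if __name__ == "__main__":
    print(old_contained_in_new())
```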
< 146 Suppress definitions in tables tag set > =
<!ENTITY % figure 'IGNORE' >
<!ENTITY % formula 'IGNORE' >
<!ENTITY % row 'IGNORE' >
<!ENTITY % table 'IGNORE' >
The current definitions are these:
<!ELEMENT %n.table; - - ((%n.head)*, (%n.row)+) >
<!ELEMENT %n.row; - O ((%n.cell; | %n.table;)+) >
<!ELEMENT %n.figure; - - ((%n.head)?, (%n.p)*, (%n.figDesc)?, (%n.text)?) >
<!ELEMENT %n.formula; - - %formulaContent; >
The new definitions are as follows:
< 147 New definitions for tables tag set > =
<![%TEI.figures;[
<!ENTITY % XML.table "INCLUDE" >
<![%XML.table;[
<!ELEMENT %n.table; - - ((%n.head; | %m.Incl;)*,
(%n.row;, (%m.Incl;)*)+) >
<!ATTLIST %n.table; %a.global;
rows NUMBER #IMPLIED
cols NUMBER #IMPLIED
TEIform CDATA 'table' >
]]>
< 148 New definitions for tables tag set 147 (cont'd) > =
<!ENTITY % XML.row "INCLUDE" >
<![%XML.row;[
<!ELEMENT %n.row; - O ((%n.cell; | %n.table;),
(%m.Incl;)*)+ >
<!ATTLIST %n.row; %a.global;
role CDATA data
TEIform CDATA 'row' >
]]>
< 149 New definitions for tables tag set 147 (cont'd) > =
<!ENTITY % XML.figure "INCLUDE" >
<![%XML.figure;[
<!ELEMENT %n.figure; - - ((%m.Incl;)*,
(%n.head;, (%m.Incl;)*)?,
(%n.p;, (%m.Incl;)*)*,
(%n.figDesc;, (%m.Incl;)*)?,
(%n.text;, (%m.Incl;)*)?) >
<!ATTLIST %n.figure; %a.global;
entity ENTITY #IMPLIED
TEIform CDATA 'figure' >
]]>
< 150 New definitions for tables tag set 147 (cont'd) > =
<!ENTITY % XML.formula "INCLUDE" >
<![%XML.formula;[
<!ELEMENT %n.formula; - - %formulaContent; >
<!ATTLIST %n.formula; %a.global;
notation %formulaNotations; #REQUIRED
TEIform CDATA 'formula' >
]]>
]]>
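The <figure> declaration above follows a regular pattern: an initial (%m.Incl;)*, then each item of the original sequence paired with a trailing (%m.Incl;)* and given the original occurrence indicator. As a rough sketch of that pattern (the function name and its argument format are invented here, and the function handles only flat sequences of single elements, not models such as the one for <table>), the rewritten model can be generated as a string:

```python
def interleave_incl(items, incl="%m.Incl;"):
    """Build a content model that allows `incl` between sequence items.

    items: list of (element_name, occurrence) pairs, where occurrence
    is one of "", "?", "*" or "+".
    """
    star = f"({incl})*"
    parts = [star]
    for name, occurrence in items:
        # Each item carries its own trailing run of included elements,
        # so the occurrence indicator applies to the pair as a whole.
        parts.append(f"(%n.{name};, {star}){occurrence}")
    return "(" + ", ".join(parts) + ")"


if __name__ == "__main__":
    figure = [("head", "?"), ("p", "*"), ("figDesc", "?"), ("text", "?")]
    print(interleave_incl(figure))
    print(interleave_incl([("witness", "+")]))
```

Apart from line breaks, the two strings printed should correspond to the content models given above for <figure> and <witList>.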
The TEI base tag set for dictionaries cannot be made XML conformant using the methods described here. That tag set distinguishes two top-level elements for dictionary entries: <entry>, which has a relatively well-defined structure, and <entryFree>, which has no prescribed structure at all: any element used in tagging dictionary entries may appear, within any other element, at any level of nesting. The desired freedom for <entryFree> entries is guaranteed by the inclusion exception on <entryFree>. The standard declaration for the element is this:
<!ELEMENT %n.entryFree; - O (#PCDATA) +(%m.dictionaryParts | %m.phrase | %m.inter) >
If we use the techniques described above, all of the members of the classes dictionaryParts, phrase, and inter will be made legal at every point within any members of any of those classes. Apart from the havoc that would wreak on the core tag set, it would wholly erase the distinction between <entry> and <entryFree> elements.
So some other method of handling anomalous dictionary entries is needed in an XML version of the TEI DTD. Borrowing ideas from B. Tommie Usdin and Deborah A. Lapeyre, and with thanks also to David J. Birnbaum, I propose a new approach to the problem.
The basic idea is to define an element for anomalous structures in dictionary entries. In this discussion, I'll assume this element is called <dictAnomaly> (for `dictionary anomaly'). For every element in the normal structure of a dictionary, the existing content model is changed by adding <dictAnomaly> as an alternative to it. Thus the element <superentry> currently has the following declaration:
<!ELEMENT %n.superentry; - O ((%n.form)?, (%n.entry)+) >
After the change, it will have the declaration:
<!ELEMENT %n.superentry; - O (((%n.form;)?, (%n.entry;)+) | %n.dictAnomaly;) >
That is, a superentry is either normal (an optional <form> element followed by one or more <entry> elements), or else it is anomalous. The <dictAnomaly> element itself is defined as allowing any sequence of character data, dictionary elements, inter-level elements, or phrase-level elements:
<!ELEMENT %n.dictAnomaly; - O (#PCDATA | %m.dictionaryParts; | %m.phrase; | %m.inter;)* >
An anomalous superentry contains a single <dictAnomaly> element, and nothing else.
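The rewriting step for element-content models is mechanical. The sketch below shows one way it might be done on the content-model string, together with a small check of which child sequences the revised <superentry> model admits. The function names are invented for the illustration, and the regular expression is a hand transcription of the revised model, not something derived from the DTD.

```python
import re


def add_anomaly(model, anomaly="%n.dictAnomaly;"):
    """Wrap a content model so that it is either the normal structure
    or a single anomaly element."""
    return f"({model} | {anomaly})"


# Hand-written rendering of (((form)?, (entry)+) | dictAnomaly),
# with each child name followed by one space.
SUPERENTRY = re.compile(r"(?:(?:form )?(?:entry )+|dictAnomaly )")


def valid_superentry(children):
    """True if the child sequence matches the revised model."""
    text = "".join(name + " " for name in children)
    return SUPERENTRY.fullmatch(text) is not None


if __name__ == "__main__":
    print(add_anomaly("((%n.form;)?, (%n.entry;)+)"))
    print(valid_superentry(["form", "entry", "entry"]))
    print(valid_superentry(["dictAnomaly"]))
    print(valid_superentry(["form", "dictAnomaly"]))
```

Note that a <dictAnomaly> element cannot be mixed with normal children at this level; the anomaly replaces the whole of the normal content.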
For elements which are currently defined with mixed content, <dictAnomaly> is simply added to the list of elements which can occur within them. This allows us to evade the mixed-content problem. The simplest way to do this is to define <dictAnomaly> as a phrase-level element in the dictionary tag set. It also allows anomalies to occur within generic phrase-level and inter-level elements which are used in dictionary entries.
In principle, the extensions file should handle this thus:
<!ENTITY % x.phrase 'dictAnomaly |' >
But since we have to include new declarations for the entire phrase-level class system in the extensions file anyway (to fix the problems with phrase.seq), we can simply add <dictAnomaly> to phrase, as was done above.
This list brings together in one place a number of open questions mentioned above.
#PCDATA.
Corrigible errors identified in this document are:
#PCDATA not as prescribed in XML 1.0
A few scraps necessary for housekeeping have no obvious home in this document; I'll put them here.
Before we define component, we need to embed all the entity files for the selected tag sets:
< 151 Embed tag-set-specific ent files > =
<!-- 3.7.6: Embedding tag-set-specific entity definitions -->
<![ %TEI.verse; [
<!ENTITY % TEI.verse.ent system 'teivers2.ent' >
%TEI.verse.ent;
]]>
<![ %TEI.drama; [
<!ENTITY % TEI.drama.ent system 'teidram2.ent' >
%TEI.drama.ent;
]]>
<![ %TEI.spoken; [
<!ENTITY % TEI.spoken.ent system 'teispok2.ent' >
%TEI.spoken.ent;
]]>
<![ %TEI.dictionaries; [
<!ENTITY % TEI.dictionaries.ent system 'teidict2.ent' >
%TEI.dictionaries.ent;
]]>
<![ %TEI.terminology; [
<!ENTITY % x.common '' >
<!ENTITY % m.common '%x.common %m.bibl; | %m.chunk; |
%m.hqinter; | %m.lists; | %m.notes; | %n.stage;' >
<!ENTITY % TEI.terminology.ent system 'teiterm2.ent' >
%TEI.terminology.ent;
]]>
<![ %TEI.linking; [
<!ENTITY % TEI.linking.ent system 'teilink2.ent' >
%TEI.linking.ent;
]]>
<![ %TEI.analysis; [
<!ENTITY % TEI.analysis.ent system 'teiana2.ent' >
%TEI.analysis.ent;
]]>
<![ %TEI.transcr; [
<!ENTITY % TEI.transcr.ent system 'teitran2.ent' >
%TEI.transcr.ent;
]]>
<![ %TEI.textcrit; [
<!ENTITY % TEI.textcrit.ent system 'teitc2.ent' >
%TEI.textcrit.ent;
]]>
<![ %TEI.names.dates; [
<!ENTITY % TEI.names.dates.ent system 'teind2.ent' >
%TEI.names.dates.ent;
]]>
<![ %TEI.figures; [
<!ENTITY % TEI.figures.ent system 'teifig2.ent' >
%TEI.figures.ent;
]]>
Before we do that, we have to provide default values for all the tagset entities:
< 152 Provide default tagset declarations > =
<!ENTITY % TEI.prose 'IGNORE' >
<!ENTITY % TEI.verse 'IGNORE' >
<!ENTITY % TEI.drama 'IGNORE' >
<!ENTITY % TEI.spoken 'IGNORE' >
<!ENTITY % TEI.dictionaries 'IGNORE' >
<!ENTITY % TEI.terminology 'IGNORE' >
<!ENTITY % TEI.general 'IGNORE' >
<!ENTITY % TEI.mixed 'IGNORE' >
<!ENTITY % TEI.linking 'IGNORE' >
<!ENTITY % TEI.analysis 'IGNORE' >
<!ENTITY % TEI.fs 'IGNORE' >
<!ENTITY % TEI.certainty 'IGNORE' >
<!ENTITY % TEI.transcr 'IGNORE' >
<!ENTITY % TEI.textcrit 'IGNORE' >
<!ENTITY % TEI.names.dates 'IGNORE' >
<!ENTITY % TEI.nets 'IGNORE' >
<!ENTITY % TEI.figures 'IGNORE' >
<!ENTITY % TEI.corpus 'IGNORE' >
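These defaults work because, in SGML as in XML, when the same entity is declared more than once, the first declaration is the one that counts. A tag set selected with 'INCLUDE' in the document's DTD subset therefore overrides the 'IGNORE' given here, and the corresponding marked sections are processed. The following sketch models that rule; the function names and the sample selection of tag sets are hypothetical.

```python
def bind_entities(declarations):
    """Return the effective value of each parameter entity.

    declarations: (name, value) pairs in the order a parser meets
    them. Only the first declaration of a name is kept.
    """
    bound = {}
    for name, value in declarations:
        bound.setdefault(name, value)
    return bound


def is_processed(keyword_entity, bound):
    """A marked section <![%name;[ ... ]]> is read only when the
    entity expands to INCLUDE."""
    return bound[keyword_entity] == "INCLUDE"


# Hypothetical document: the DTD subset selects two tag sets ...
user = [("TEI.prose", "INCLUDE"), ("TEI.textcrit", "INCLUDE")]
# ... and the default declarations are met afterwards.
defaults = [(name, "IGNORE") for name in
            ("TEI.prose", "TEI.textcrit", "TEI.nets", "TEI.figures")]
BOUND = bind_entities(user + defaults)

if __name__ == "__main__":
    for name in ("TEI.textcrit", "TEI.nets"):
        print(name, BOUND[name], is_processed(name, BOUND))
```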
And we need to define the TEI keywords and default generic identifiers:
< 153 Define TEI keywords > =
<!ENTITY % INHERITED '#IMPLIED' >
<!ENTITY % ISO-date 'CDATA' >
<!ENTITY % extPtr 'CDATA' >
<!ENTITY % TEI.elementNames system 'teigis2.ent' >
%TEI.elementNames;
< 154 Fix placePart class > =
<!ENTITY % x.placePart '' >
<!ENTITY % m.placePart '%x.placePart %n.bloc; | %n.country; |
%n.distance; | %n.geog; | %n.offset;
| %n.region; | %n.settlement;' >
The notation in this paper is fairly simple:
I* will normally be written %Istar; or (%m.I;)*, where the parameter entities are declared along these lines:
<!ENTITY % x.I ''>
<!ENTITY % m.I '%x.I; %m.globincl;'>
<!ENTITY % Istar '(%m.I;)*'>
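To see what such a reference stands for, the parameter entities can be expanded by hand or with a few lines of code. The sketch below does plain textual substitution, which is a simplification of what an SGML parser does. The value given for m.globincl is a placeholder (the real class is defined elsewhere), and the function name and recursion limit are arbitrary.

```python
import re

# Matches a parameter entity reference such as %m.I; or %Istar;
PE_REF = re.compile(r"%([A-Za-z][\w.\-]*);")

ENTITIES = {
    "x.I": "",
    "m.I": "%x.I; %m.globincl;",
    "Istar": "(%m.I;)*",
    # Placeholder value, for illustration only.
    "m.globincl": "incl1 | incl2",
}


def expand(text, entities, depth=0):
    """Replace parameter entity references by their replacement text,
    recursively. An undeclared name raises KeyError."""
    if depth > 20:
        raise ValueError("entity references nested too deeply")

    def replace(match):
        return expand(entities[match.group(1)], entities, depth + 1)

    return PE_REF.sub(replace, text)


if __name__ == "__main__":
    print(expand("%Istar;", ENTITIES))
```

Because x.I is empty by default, the expansion is the class itself wrapped in a starred group; a user who declares x.I as, say, 'myElement |' ahead of these declarations adds an element to the class without touching them.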
[1]
In particular, this document does not suppress the
tag-omissibility indicators in the TEI DTD; that job is left to
special-purpose software. In its current form, this document also
does not completely normalize all mixed content models to the form
required by XML. I started to make it do so, and have just realized
that carthage may already do what is necessary. I need
to find out for sure whether carthage does the job, and
either complete or remove the partial sets of changes described for
the mass redeclaration of all phrase.seq and
paraContent elements.
[2]
If the set of inclusions and the set of exclusions on the exception stack are always the same for every possible occurrence of every element type in the DTD, then an exception-free DTD can be created which accepts exactly the same set of documents as the original DTD. A DTD which had exceptions only on the root element type, for example, could be replicated without changing the language it accepts. I am not aware of any production DTDs which fall into this class.
[3]
One could take the converse goal of ensuring that the revised DTD be at least as selective as the original DTD, i.e. that it undergenerate with respect to the original language. This would be interesting as an exercise, but if applied to the TEI DTD it would invalidate existing TEI data, which makes it unacceptable as an approach to creating an XML-conformant version of the TEI DTD.
[4]
This is clearly established by Wood and Kilpeläinen,
though they inexplicably claim to have proven the opposite.
[5]
Strictly speaking, these ought perhaps to be imf(E,I), mf(E,I), and m(E,I), but for purposes of this paper we will never need different sets of inclusions I. So if it matters, we can define imf(E) formally as imf(E,I), etc.
[6]
What is wrong with these lists, and why are they not complete?
The Names and Dates tag set may not have been selected, or the DTD
I used may -- almost surely did -- have the bug that makes much of
that tag set unreachable. The Corpus tags are for the header, and
may in fact not be descendants of <text>.
[7]
The dictionary tag set includes orth, pron, hyph, syll, stress, gram,
gen, number, case, per, tns, mood, itype, pos, subc, colloc, def, tr,
lang, usg, lbl.
[8]
This is a classic
example of what is known in DTD design circles as the Mixed-Content
Gotcha; the problems associated with it led the XML design group to
restrict the form of mixed-content models in order to forbid content
models which are subject to the problem. This restriction, in turn,
makes it essential to revise specialPara in an XML
version of the TEI DTD.
[9]
An inquiry on TEI-L might
usefully reveal whether anyone is actually using <set> and
whether they would be inconvenienced by this tighter model.