21 Certainty, Precision, and Responsibility
- the note element defined in section 3.8 Notes, Annotation, and Indexing may be used with a value of certainty for its type attribute.
- the certainty element defined in this chapter may be used to record the nature and degree of the uncertainty in a more structured way.
- the precision element defined in this chapter may be used to record the accuracy with which some numerical value (such as a date or quantity) is provided by some other element or attribute.
- the alt element defined in the module for linking and segmentation may be used to provide alternative encodings for parts of a text, as described in section 16.8 Alternation.
- the TEI header records who is responsible for an electronic text by means of the respStmt element and other more specific elements (author, sponsor, funder, principal, etc.) used within the titleStmt, editionStmt, and revisionDesc elements.
- the note element may be used with a value of resp or responsibility in its type attribute.
- the respons element defined in this chapter may be used to record fine-grained structured information about responsibility for individual tags in the text.
21.1 Levels of Certainty
- a given tag may or may not correctly apply (e.g. a given word may be a personal name, or perhaps not)
- the precise point at which an element begins or ends is uncertain
- the value given for an attribute is uncertain
- the content of an element is uncertain, perhaps because it is hard to read or hard to hear, or for some other reason.
- the numerical precision associated with a number or date (for this, use the precision element discussed in 21.2 Indications of Precision)
- the document being transcribed may be read in different ways (for this, use the transcriptional elements such as unclear, discussed in chapter 11 Representation of Primary Sources)
- a transcriber, editor, or author wishes to indicate a level of confidence in a factual assertion made in the text (for this, use the interpretative mechanisms discussed in 17 Simple Analytic Mechanisms and 18 Feature Structures)
21.1.1 Using Notes to Record Uncertainty
Using note, the uncertainty here may be recorded quite simply:
Elizabeth went to Essex. She had always liked Essex.
<note type="uncertainty" resp="#MSM">It is not
clear here whether <mentioned>Essex</mentioned>
refers to the place or to the nobleman. -MSM</note>
She had always liked <placeName xml:id="CE-p1b">Essex</placeName>.
<note type="uncertainty" resp="#MSM" target="#CE-p1a #CE-p1b">It
is not clear here whether <mentioned>Essex</mentioned>
refers to the place or to the nobleman. If the latter,
it should be tagged as a personal name. -<name xml:id="MSM">Michael</name>
</note>
The advantage of this technique is its relative simplicity. Its disadvantage is that the nature and degree of uncertainty are not conveyed in any systematic way and thus are not susceptible to any sort of automatic processing.
21.1.2 Structured Indications of Uncertainty
- certainty indicates the degree of certainty or uncertainty associated with some aspect of the text markup.
  - locus indicates more exactly the aspect concerning which uncertainty is being expressed: for example, whether the markup is correctly located, whether the right element or attribute name has been used, whether the content of the element or attribute is correct, etc.
  - degree indicates the degree of confidence assigned to the aspect of the markup named by the locus attribute.
  - match supplies an XSLT 2.0 pattern which may be used to select those portions of the document the certainty of which is to be specified.
  - target points at the element or elements whose markup is uncertain.
<placeName xml:id="CE-pl1">Essex</placeName>.
<!-- ... elsewhere in the document ... -->
<certainty target="#CE-pl1" locus="name">
<desc>possibly not a placename</desc>
</certainty>
<!-- ... -->
<certainty target="#CE-pl1" locus="name" degree="0.6"/>
<!-- ... -->
<certainty target="#CE-pl1" locus="name" degree="0.6">
<desc>probably a placename, but possibly not</desc>
</certainty>
<certainty
target="#CE-pl1"
locus="name"
degree="0.4"
assertedValue="persName">
<desc>may refer to the Earl of Essex</desc>
</certainty>
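A processor can gather the competing assertions made about one element and rank them by degree. The following is a minimal sketch in Python, not a reference implementation. It assumes the elements carry no namespace, that a wrapper element surrounds the fragment, and that a certainty element without assertedValue refers to the name the element already has. The function name rank_name_readings is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment; the <p> wrapper is an assumption so the snippet parses.
SAMPLE = """
<p>
  <placeName xml:id="CE-pl1">Essex</placeName>
  <certainty target="#CE-pl1" locus="name" degree="0.6"/>
  <certainty target="#CE-pl1" locus="name" degree="0.4" assertedValue="persName"/>
</p>
"""

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def rank_name_readings(root, element_id):
    """Return (element name, degree) pairs for one element, most likely first."""
    tagged = next(el for el in root.iter() if el.get(XML_ID) == element_id)
    readings = []
    for cert in root.iter("certainty"):
        if cert.get("target") != "#" + element_id or cert.get("locus") != "name":
            continue
        # Assumption: without assertedValue the assertion concerns the existing tag.
        name = cert.get("assertedValue", tagged.tag)
        readings.append((name, float(cert.get("degree"))))
    return sorted(readings, key=lambda pair: pair[1], reverse=True)


readings = rank_name_readings(ET.fromstring(SAMPLE), "CE-pl1")
print(readings)
```

Sorting by degree is only one possible policy; a project might instead keep document order or discard readings below some threshold.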
She had always liked <placeName xml:id="CE-PL2">Essex</placeName>.
<!-- ... -->
<!-- 60% chance that P1 is a placename, 40% chance a personal name. -->
<certainty
xml:id="cert-1"
target="#CE-PL1"
locus="name"
degree="0.6">
<desc>probably a placename, but possibly not</desc>
</certainty>
<certainty
xml:id="cert-2"
target="#CE-PL1"
locus="name"
assertedValue="persName"
degree="0.4">
<desc>may refer to the Earl of Essex</desc>
</certainty>
<!-- 60% chance that P2 is a placename, 40% chance a personal name. 100% chance that it agrees with P1. -->
<certainty
target="#CE-PL2"
locus="name"
given="#cert-1"
degree="1.0">
<desc>if CE-PL1 is a placename, CE-PL2 certainly is</desc>
</certainty>
<certainty
target="#CE-PL2"
locus="name"
assertedValue="persName"
degree="1.0"
given="#cert-2">
<desc>if CE-PL1 is a personal name, then so is CE-PL2</desc>
</certainty>
<certainty
xml:id="cert1"
target="#CE-p2"
locus="name"
degree="0.6"/>
<certainty
target="#CE-p2"
locus="start"
given="#cert1"
degree="0.9"/>
<certainty
xml:id="cert2"
target="#CE-p2"
locus="name"
assertedValue="placeName"
degree="0.4"/>
<certainty
target="#CE-p2"
locus="start"
given="#cert2"
degree="0.5"/>
<certainty
xml:id="cert3"
target="#CE-p2"
locus="start"
assertedValue="#CE-a1"
given="#cert1"
degree="0.1"/>
<certainty
xml:id="cert4"
target="#CE-p2"
locus="start"
assertedValue="#CE-a1"
given="#cert2"
degree="0.5"/>
Ernest went to old <placeName>Saybrook</placeName>. (0.4 * 0.5, or 0.20)
Ernest went to <placeName>old Saybrook</placeName>. (0.4 * 0.5, or 0.20)
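The products shown in parentheses combine a conditional degree with the degree of the assertion named by given. The sketch below, in Python, reproduces that arithmetic for the six certainty elements of this example. It assumes that degree values behave like probabilities that multiply, that each given attribute holds a single pointer, and that the elements carry no namespace. The wrapper element, the function name joint_degrees and the "as tagged" labels are hypothetical.

```python
import xml.etree.ElementTree as ET

# The six certainty elements from the example, wrapped so that they parse.
SAMPLE = """
<wrapper>
  <certainty xml:id="cert1" target="#CE-p2" locus="name" degree="0.6"/>
  <certainty target="#CE-p2" locus="start" given="#cert1" degree="0.9"/>
  <certainty xml:id="cert2" target="#CE-p2" locus="name"
             assertedValue="placeName" degree="0.4"/>
  <certainty target="#CE-p2" locus="start" given="#cert2" degree="0.5"/>
  <certainty xml:id="cert3" target="#CE-p2" locus="start"
             assertedValue="#CE-a1" given="#cert1" degree="0.1"/>
  <certainty xml:id="cert4" target="#CE-p2" locus="start"
             assertedValue="#CE-a1" given="#cert2" degree="0.5"/>
</wrapper>
"""

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def joint_degrees(root):
    """Multiply each conditional degree by the degree of the assertion it depends on."""
    by_id = {c.get(XML_ID): c for c in root.iter("certainty") if c.get(XML_ID)}
    results = []
    for cert in root.iter("certainty"):
        given = cert.get("given")
        if given is None:
            continue  # unconditional assertion, nothing to combine
        condition = by_id[given.lstrip("#")]  # assumes a single pointer in given
        label = (
            condition.get("assertedValue", "name as tagged"),
            cert.get("assertedValue", "start as tagged"),
        )
        joint = float(condition.get("degree")) * float(cert.get("degree"))
        results.append((label, round(joint, 2)))
    return results


results = joint_degrees(ET.fromstring(SAMPLE))
for label, joint in results:
    print(label, joint)
```

Under these assumptions the four combinations come to 0.54, 0.2, 0.06 and 0.2, which together account for the whole of the probability.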
As noted in 16 Linking, Segmentation, and Alignment, the target attribute may also provide an XPath expression. This is not, however, supported by all processors, and it is not recommended TEI practice. There are, however, some simple cases in which the more general data.pointer values supplied by the target attribute are to be preferred, notably those in which the xml:id attribute is used to identify a single element occurrence. The usage #A (to indicate the element whose xml:id attribute has the value A) is syntactically much simpler than the equivalent XPath 2.0 expression //*[@xml:id='A'] and is hence preferred throughout these Guidelines.
For similar reasons, the certainty element may specify both a target value (expressed as a URI) and a match value (expressed as an XPath expression). The former defines the context within which the latter is to be found. If no target is specified, the match pattern is evaluated in the context of the document root. If the pattern does not match, the certainty assertion is considered to be null.
<certainty
target="#CE-u1"
match="@who"
locus="value"
degree="0.5"/>
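The interplay of target and match described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a conforming processor: it handles only a match of the form @name and the limited path syntax accepted by ElementTree, it assumes elements without a namespace, and the utterance with its who value is invented for the example. The function name resolve is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <u> element and its who value are invented.
SAMPLE = """
<div>
  <u xml:id="CE-u1" who="#speakerA">Hello</u>
  <certainty target="#CE-u1" match="@who" locus="value" degree="0.5"/>
</div>
"""

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def resolve(root, cert):
    """Return what a certainty element applies to, or None if nothing matches.

    Only two match forms are handled: '@name' (an attribute of the context)
    and the limited path syntax that ElementTree's findall() accepts.
    """
    target, match = cert.get("target"), cert.get("match")
    contexts = [root]  # no target: evaluate match from the document root
    if target is not None:
        wanted = {pointer.lstrip("#") for pointer in target.split()}
        contexts = [el for el in root.iter() if el.get(XML_ID) in wanted]
    if match is None:
        return contexts or None
    hits = []
    for ctx in contexts:
        if match.startswith("@"):
            if match[1:] in ctx.attrib:
                hits.append((ctx, match[1:]))  # (owning element, attribute name)
        else:
            hits.extend(ctx.findall(match))
    return hits or None  # an unmatched pattern makes the assertion null


root = ET.fromstring(SAMPLE)
cert = root.find("certainty")
hits = resolve(root, cert)
print(hits[0][1], "=", hits[0][0].get(hits[0][1]))
```

A full implementation would need a real XPath or XSLT pattern engine; the attribute shortcut here exists only because ElementTree cannot return attribute nodes.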
<certainty target="#CE-p3" locus="value" degree="0.5"/>
<choice>
<expan xml:id="CE-e1">Standard
Generalized Markup Language</expan>
<expan xml:id="CE-e40">Some Grandiose Methodology for Losers</expan>
<abbr>SGML</abbr>
</choice> ...
<!-- ... -->
<certainty target="#CE-e1" locus="value" degree="0.9"/>
<certainty target="#CE-e40" locus="value" degree="0.5"/>
<certainty
target="#CE-P3"
locus="value"
assertedValue="gun"
degree="0.8">
<desc>a gun makes more sense in a holdup</desc>
</certainty>
<certainty
target="#dd001"
match="@resp"
locus="value"
degree="0.2"/>
<!-- ... -->
<certainty match=".//my:*" locus="value" degree="0.9"/>
</div>
The certainty element and the other TEI mechanisms for indicating uncertainty provide a range of methods of graduated complexity. Simple expressions of uncertainty may be made by using the note element. This is simple and convenient, and can accommodate either a discursive and unstructured indication of uncertainty, or a complex and structured but probably project-specific expression of uncertainty. In general, however, unless special steps are taken, the note element does not provide as much expressive power as the certainty element, and in cases where highly structured certainty information must be given, it is recommended that the certainty element be used.
The certainty element may be used for simple unqualified indications of uncertainty, in which case only the locus, match and target attributes might be specified. In more complex cases, the other attributes may be used to provide fuller information. While these attributes may take any string of characters as value, the recommended values should be used wherever possible; if they are not appropriate in a given situation, encoders should provide their own controlled vocabulary and document it in the encodingDesc or tagUsage elements of the TEI header.
21.2 Indications of Precision
As noted above, certainty about the accuracy of an encoding or its content is not the same thing as the precision with which a value is specified. In the case of a date or a quantity, for example, we might be certain that the value given is imprecise, or uncertain about whether or not the value given is correct. The latter possibility would be represented by the certainty element discussed in the previous section; the former by the precision element discussed in this section.
atLeast="10"
atMost="30"
unit="cm"
scope="all"/>
Suppose however that the precision with which the value of such an attribute can be specified is variable. For example, suppose an event is dated ‘about fifty years after the death of Augustus’. In this case, the precision of one end of the range (the death of Augustus) is higher than the other, assuming we know when Augustus died. We can say that the latest possible date is probably 50 years after that, but with less confidence than we can attach to the earliest possible date.
years after the death of Augustus</date>
<precision target="#d001" match="@notAfter" degree="0.3"/>
<precision target="#d001" match="@notBefore" degree="0.9"/>
some time in April of 1857.</residence>
<precision target="#res01" match="@notAfter" degree="0.5"/>
xml:id="w00t"
atLeast="10"
atMost="30"
unit="cm"
scope="all"/>
<precision target="#w00t" match="@atMost" degree="0.3"/>
xml:id="dim1"
type="avgLineLength"
unit="chars"
quantity="62.4"/>
<precision target="#dim1" stdDeviation="4"/>
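As a worked illustration of stdDeviation, the Python sketch below derives the interval of one standard deviation either side of the recorded quantity. Several things here are assumptions: the measured element is called dim (the element name is not shown in the example above), the elements carry no namespace, and reading the value as "quantity plus or minus one standard deviation" is one plausible use rather than prescribed behaviour. The function name one_sigma_range is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: <dim> is a placeholder name for the element carrying the quantity.
SAMPLE = """
<wrapper>
  <dim xml:id="dim1" type="avgLineLength" unit="chars" quantity="62.4"/>
  <precision target="#dim1" stdDeviation="4"/>
</wrapper>
"""

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def one_sigma_range(root, element_id):
    """Return (low, high): the quantity minus and plus one standard deviation."""
    measured = next(el for el in root.iter() if el.get(XML_ID) == element_id)
    prec = next(
        p for p in root.iter("precision") if p.get("target") == "#" + element_id
    )
    quantity = float(measured.get("quantity"))
    deviation = float(prec.get("stdDeviation"))
    return quantity - deviation, quantity + deviation


low, high = one_sigma_range(ET.fromstring(SAMPLE), "dim1")
print(f"{low:.1f} to {high:.1f} chars")  # prints "58.4 to 66.4 chars"
```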
21.3 Attribution of Responsibility
In general, attribution of responsibility for the transcription and markup of an electronic text is made by respStmt elements within the header: specifically, within the title statement, the edition statement(s), and the revision history.
- respons (responsibility) identifies the individual(s) responsible for some aspect of the markup of particular element(s).
  - resp (responsible party) identifies the individual or agency responsible for the indicated aspect of the electronic text.
<!-- ... -->
<respons target="#CE-p5" locus="value" resp="#RC"/>
<respons target="#CE-p5" locus="name location" resp="#PMWR"/>
<list type="encoders">
<item xml:id="PMWR"/>
<item xml:id="RC"/>
</list>
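Because locus may hold several whitespace-separated values, a processor will usually want one entry per aspect. The Python sketch below builds such a lookup from the two respons elements above. It assumes elements without a namespace and a wrapper element so the fragment parses. The function name responsibility_table is hypothetical.

```python
import xml.etree.ElementTree as ET

# The two respons elements from the example, wrapped so that they parse.
SAMPLE = """
<wrapper>
  <respons target="#CE-p5" locus="value" resp="#RC"/>
  <respons target="#CE-p5" locus="name location" resp="#PMWR"/>
</wrapper>
"""


def responsibility_table(root):
    """Map (target id, aspect) pairs to the identifiers of the responsible parties."""
    table = {}
    for respons in root.iter("respons"):
        parties = [pointer.lstrip("#") for pointer in respons.get("resp", "").split()]
        for target in respons.get("target", "").split():
            for aspect in respons.get("locus", "").split():
                key = (target.lstrip("#"), aspect)
                table.setdefault(key, []).extend(parties)
    return table


table = responsibility_table(ET.fromstring(SAMPLE))
for key, parties in table.items():
    print(key, parties)
```

The identifiers returned (RC, PMWR) can then be looked up against the xml:id values in the list of encoders.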
Some elements bear specialized resp or agent attributes, which have specific meanings that vary from element to element; the respons element should be reserved for the general aspects of responsibility common to all text transcription and markup, and should not be confused with the more specific attributes on individual elements.
21.4 The Certainty Module