21 Certainty, Precision, and Responsibility
Encoders of text often find it useful to indicate that some aspects of the encoded text are problematic or uncertain, and to indicate who is responsible for various aspects of the markup of the electronic text. These Guidelines provide several methods of recording uncertainty about the text or its markup:
- the note element defined in section 3.9 Notes, Annotation, and Indexing may be used with a value of certainty for its type attribute.
- the certainty element defined in this chapter may be used to record the nature and degree of the uncertainty in a more structured way.
- the precision element defined in this chapter may be used to record the accuracy with which some numerical value (such as a date or quantity) is provided by some other element or attribute.
- the alt element defined in the module for linking and segmentation may be used to provide alternative encodings for parts of a text, as described in section 16.8 Alternation.
There are three methods of indicating responsibility for different aspects of the electronic text:
- the TEI header records who is responsible for an electronic text by means of the respStmt element and other more specific elements (author, sponsor, funder, principal, etc.) used within the titleStmt, editionStmt, and revisionDesc elements.
- the note element may be used with a value of resp or responsibility in its type attribute.
- the respons element defined in this chapter may be used to record fine-grained structured information about responsibility for individual tags in the text.
No special steps are needed to use the note and respStmt elements, since they are defined in the core module and header respectively. The alt element is only available when the module for linking has been selected, as described in chapter 16 Linking, Segmentation, and Alignment. To use the certainty, precision or respons elements, the module for certainty and responsibility should be selected.
These three elements are all members of an attribute class called att.scoping from which they inherit the following attributes:
- att.scoping provides attributes for selecting particular elements within a document.
target points at one or more sets of zero or more elements each.
match supplies an XPath selection pattern using the syntax defined in Kay (ed.) (2017) which identifies a set of nodes, selected within the context identified by the target attribute if this is supplied, or within the context of the parent element if it is not.
These attributes enable statements about certainty, precision, or responsibility to be made with respect to the whole of a document, or any part or parts of it which can be identified using standard XML location methods. Several examples are given in the discussion of the certainty element below; the same mechanisms are available for all three elements discussed in this chapter.
21.1 Levels of Certainty
Many types of uncertainty may be distinguished. The certainty element is designed to encode the following sorts:
- a given tag may or may not correctly apply (e.g. a given word may be a personal name, or perhaps not)
- the precise point at which an element begins or ends is uncertain
- the value given for an attribute is uncertain
- the content given for an element is unreliable for any reason.
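As a rough sketch of how these four cases might be expressed with the locus attribute: the identifiers, the sample sentence, and the degree values below are hypothetical, and the mapping of each case to the locus values name, start, and value is an assumption made for illustration.

```xml
<!-- Hypothetical text being annotated -->
<p>Elizabeth went to <placeName xml:id="ex-pl1">Essex</placeName> in
  <date xml:id="ex-d1" when="1599">1599</date>.</p>

<!-- 1. The tag itself may or may not apply -->
<certainty target="#ex-pl1" locus="name" degree="0.6"/>
<!-- 2. The point at which the element begins is uncertain -->
<certainty target="#ex-pl1" locus="start" degree="0.8"/>
<!-- 3. The value given for an attribute is uncertain -->
<certainty target="#ex-d1" match="@when" locus="value" degree="0.7"/>
<!-- 4. The content given for the element is unreliable -->
<certainty target="#ex-pl1" locus="value" degree="0.5"/>
```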
The following types of uncertainty are not indicated with the certainty element:
- the numerical precision associated with a number or date (for this use the precision element discussed in 21.2 Indications of Precision)
- the content of the document being transcribed is identifiable, but may be read or understood in different ways (for this use the transcriptional elements such as unclear, discussed in chapter 11 Representation of Primary Sources)
- a transcriber, editor, or author wishes to indicate a level of confidence in a factual assertion made in the text (for this use the interpretative mechanisms discussed in 17 Simple Analytic Mechanisms and 18 Feature Structures)
21.1.1 Using Notes to Record Uncertainty
<note type="certainty" resp="#MSM">It is not
clear here whether <mentioned>Essex</mentioned>
refers to the place or to the nobleman. -MSM</note>
She had always liked <placeName xml:id="CE-p1b">Essex</placeName>.
<note type="certainty" resp="#MSM"
target="#CE-p1a #CE-p1b">It
is not clear here whether <mentioned>Essex</mentioned>
refers to the place or to the nobleman. If the latter,
it should be tagged as a personal name. -<name xml:id="MSM">Michael</name>
</note>
The advantage of this technique is its relative simplicity. Its disadvantage is that the nature and degree of uncertainty are not conveyed in any systematic way and thus are not susceptible to any sort of automatic processing.
21.1.2 Structured Indications of Uncertainty
To record uncertainty in a more structured way, susceptible of at least simple automatic processing, the certainty element may be used:
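- certainty indicates the degree of certainty or uncertainty associated with some aspect of the text markup.
locus indicates the precise aspect of the markup that is uncertain: the applicability of the element, the exact position of its start-tag or end-tag, the value of a particular attribute, etc.
degree indicates the degree of certainty associated with the aspect of the markup named by the locus attribute.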
<placeName xml:id="CE-pl1">Essex</placeName>.
<!-- ... elsewhere in the document ... -->
<certainty target="#CE-pl1" locus="name">
<desc>possibly not a placename</desc>
</certainty>
<!-- ... -->
<certainty target="#CE-pl1" locus="name"
degree="0.6">
<desc>probably a placename, but possibly not</desc>
</certainty>
<certainty target="#CE-pl1" locus="name"
degree="0.4" assertedValue="persName">
<desc>may refer to the Earl of Essex</desc>
</certainty>
21.1.2.1 Contingent Conditions
She had always liked <placeName xml:id="CE-PL2">Essex</placeName>.
<!-- ... -->
<!-- 60% chance that P1 is a placename, 40% chance a personal name. -->
<certainty xml:id="cert-1" target="#CE-PL1"
locus="name" degree="0.6">
<desc>probably a placename, but possibly not</desc>
</certainty>
<certainty xml:id="cert-2" target="#CE-PL1"
locus="name" assertedValue="persName" degree="0.4">
<desc>may refer to the Earl of Essex</desc>
</certainty>
<!-- 60% chance that P2 is a placename, 40% chance a personal name. 100% chance that it agrees with P1. -->
<certainty target="#CE-PL2" locus="name"
given="#cert-1" degree="1.0">
<desc>if CE-PL1 is a placename, CE-PL2 certainly is</desc>
</certainty>
<certainty target="#CE-PL2" locus="name"
assertedValue="persName" degree="1.0" given="#cert-2">
<desc>if CE-PL1 is a personal name, then so is CE-PL2</desc>
</certainty>
<certainty xml:id="cert1" target="#CE-p2"
locus="name" degree="0.6"/>
<certainty target="#CE-p2" locus="start"
given="#cert1" degree="0.9"/>
<certainty xml:id="cert2" target="#CE-p2"
locus="name" assertedValue="placeName" degree="0.4"/>
<certainty target="#CE-p2" locus="start"
given="#cert2" degree="0.5"/>
<certainty xml:id="cert3" target="#CE-p2"
locus="start" assertedValue="#CE-a1" given="#cert1"
degree="0.1"/>
<certainty xml:id="cert4" target="#CE-p2"
locus="start" assertedValue="#CE-a1" given="#cert2"
degree="0.5"/>
Ernest went to old <placeName>Saybrook</placeName>. (0.4 * 0.5, or 0.20)
Ernest went to <placeName>old Saybrook</placeName>. (0.4 * 0.5, or 0.20)
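The figures in parentheses follow from reading each degree as a probability and each given reference as a condition, so that the likelihood of a complete reading is the product of the two values. As a worked sketch, assuming that cert1 and cert2 are the only two possibilities for the element name (their degrees sum to 1), writing name_0 for the element name as originally tagged and a_1 for the anchor CE-a1 as an alternative starting point:

```latex
\begin{align*}
P(\text{name}_0,\ \text{start as tagged})        &= 0.6 \times 0.9 = 0.54 \\
P(\text{name}_0,\ \text{start at } a_1)          &= 0.6 \times 0.1 = 0.06 \\
P(\texttt{placeName},\ \text{start as tagged})   &= 0.4 \times 0.5 = 0.20 \\
P(\texttt{placeName},\ \text{start at } a_1)     &= 0.4 \times 0.5 = 0.20 \\
\text{total}                                     &= 0.54 + 0.06 + 0.20 + 0.20 = 1.00
\end{align*}
```

Under this reading the four combinations exhaust the possibilities, which is why their likelihoods sum to 1.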
21.1.2.2 Pervasive Conditions
checked
:
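Purely as an illustration of two match expressions of differing specificity (the patterns and degree values here are assumptions chosen for this sketch), a general statement about every persName and a narrower one about those inside a checked division might be written:

```xml
<!-- General pattern: every persName in the document -->
<certainty match="//persName" locus="name" degree="0.6"/>
<!-- More specific pattern: only persName elements inside a checked division -->
<certainty match="//div[@type='checked']//persName" locus="name" degree="0.9"/>
```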
If an element in a document is matched by more than one match expression, then the most specific pattern applies. As a simple case, if both the preceding certainty elements were present in the same document, a persName occurring within a <div type="checked"> element would potentially match both pattern expressions. However, because the second pattern is more specific than the first, it is the only one that would apply. If multiple patterns match and have the same priority, then the first one (in document order) is applied. Only those statements of certainty which have matched in this sense are available for conditional application using the given attribute mentioned above.
my. This namespace prefix must be associated with an appropriate namespace definition, either on the certainty element itself, or on one of its ancestor elements.
21.1.2.3 Content Uncertainty
<choice>
<expan xml:id="CE-e1">Standard
Generalized Markup Language</expan>
<expan xml:id="CE-e40">Some Grandiose Methodology for Losers</expan>
<abbr>SGML</abbr>
</choice> ...
<!-- ... -->
<certainty target="#CE-e1" locus="value"
degree="0.9"/>
<certainty target="#CE-e40" locus="value"
degree="0.5"/>
<certainty target="#CE-P3" locus="value"
assertedValue="gun" degree="0.8">
<desc>a gun makes more sense in a holdup</desc>
</certainty>
21.1.2.4 Target or Match?
As noted in 16 Linking, Segmentation, and Alignment, the target attribute may take any general teidata.pointer as its value and may thus also contain an XPath expression of arbitrary complexity. Because full support for such expressions in pointers is not provided by current processors, this is not generally recommended TEI practice. There are, however, simple cases in which the plain pointer syntax is to be preferred to XPath, notably those in which the xml:id attribute is used to identify a single element occurrence. The usage #A (to indicate the element whose xml:id attribute has the value A) is syntactically much simpler than the equivalent XPath expression //*[@xml:id='A'] and is hence preferred throughout these Guidelines.
For similar reasons, the certainty element may specify both a target value (expressed as a URI) and a match value (expressed as an XPath). The former defines the context within which the latter is to be evaluated. As previously noted, if no value is supplied for target, the context within which the value of match should be evaluated is the parent element of the certainty element itself.
<certainty target="#CE-u1" match="@who"
locus="value" degree="0.5"/>
degree="0.5"/>
</u>
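The two addressing styles described above can be sketched side by side. This is a minimal illustration rather than the Guidelines' own example: the xml:id value, the who value, and the utterance text are hypothetical.

```xml
<!-- Form 1: certainty held elsewhere; target supplies the context,
     and match selects the who attribute within that context -->
<u xml:id="ex-u1" who="#ex-spk1">Hypothetical utterance.</u>
<certainty target="#ex-u1" match="@who" locus="value" degree="0.5"/>

<!-- Form 2: certainty nested inside the element; no target is given,
     so the parent u element is the context for match -->
<u who="#ex-spk1">Hypothetical utterance.
  <certainty match="@who" locus="value" degree="0.5"/>
</u>
```

Both forms make the same statement: the value of the who attribute is held with a confidence of 0.5.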
The certainty element and the other TEI mechanisms for indicating uncertainty provide a range of methods of graduated complexity. Simple expressions of uncertainty may be made by using the note element. This is simple and convenient, and can accommodate either a discursive and unstructured indication of uncertainty, or a complex and structured but probably project-specific expression of uncertainty. In general, however, unless special steps are taken, the note element does not provide as much expressive power as the certainty element, and in cases where highly structured certainty information is needed, it is recommended that the certainty element be preferred.
21.2 Indications of Precision
As noted above, certainty about the accuracy of an encoding or its content is not the same thing as the precision with which a value is specified. In the case of a date or a quantity, for example, we might be certain that the value given is imprecise, or uncertain about whether or not the value given is correct. The latter possibility would be represented by the certainty element discussed in the previous section; the former by the precision element discussed in this section.
The elements concerning which statements of precision are to be made are identified using the same target and match attributes inherited from the att.scoping class discussed in the previous section and in the same way. Other aspects are provided by other attributes as further discussed below.
- precision indicates the numerical accuracy or precision associated with some aspect of the text markup.
precision characterizes the precision of the element or attribute pointed to by the precision element.
stdDeviation supplies a standard deviation associated with the value in question.
Suppose however that the precision with which the value of such an attribute can be specified is variable. For example, suppose an event is dated ‘about fifty years after the death of Augustus’. In this case, the precision of one end of the range (the death of Augustus) is higher than the other, assuming we know when Augustus died. We can say that the latest possible date is probably 50 years after that, but with less confidence than we can attach to the earliest possible date.
notAfter="0064">About 50
years after the death of Augustus</date>
<precision target="#d001" match="@notAfter"
precision="low"/>
<precision target="#d001"
match="@notBefore" precision="high"/>
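A self-contained sketch of the dating case just described may help. It assumes that Augustus died in AD 14, which gives 0014 as the lower bound and, fifty years on, 0064 as the upper bound; the xml:id value is hypothetical.

```xml
<!-- Lower bound assumed from the death of Augustus in AD 14 -->
<date xml:id="ex-d001" notBefore="0014" notAfter="0064">About 50
 years after the death of Augustus</date>
<!-- The upper bound rests on "about 50 years", so it is the less precise end -->
<precision target="#ex-d001" match="@notAfter" precision="low"/>
<!-- The lower bound rests on a known event, so it is the more precise end -->
<precision target="#ex-d001" match="@notBefore" precision="high"/>
```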
notAfter="1857-04-30">From the 1st of March to
some time in April of 1857.
<precision match="@notAfter"
precision="medium"/>
</residence>
21.3 Attribution of Responsibility
In general, attribution of responsibility for the transcription and markup of an electronic text is made by respStmt elements within the header: specifically, within the title statement, the edition statement(s), and the revision history.
In some cases, however, more detailed element-by-element information may be desired. For example, an encoder may wish to distinguish between the individuals responsible for transcribing the content and those responsible for determining that a given word or phrase constitutes a proper noun. Where such fine-grained attribution of responsibility is required, the respons element can be used.
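- respons (responsibility) identifies the person responsible for some aspect of the markup of one or more particular elements.
locus indicates the specific aspect of the markup for which responsibility is being assigned.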
This element allows one or more aspects of the markup to be attributed to a given individual. This element inherits the target and match attributes from the att.scoping class, in the same way as the certainty and precision elements. Its locus attribute functions in the same way as that on the certainty element (see 21.1 Levels of Certainty). It inherits the resp and cert attributes from the att.global.responsibility class.
<persName xml:id="CE-p5" rend="it">Saybrook</persName>.
<!-- ... -->
<respons target="#CE-p5" locus="value"
resp="#RC"/>
<respons target="#CE-p5"
locus="name location" resp="#PMWR"/>
<list type="encoders">
<item xml:id="PMWR"/>
<item xml:id="RC"/>
</list>
21.4 The Certainty Module
The module described in this chapter makes available the following additional elements: certainty, precision, and respons.
The selection and combination of modules to form a TEI schema is described in 1.2 Defining a TEI Schema.