3 Elements Available in All TEI Documents

This chapter describes elements which may appear in any kind of text and the tags used to mark them in all TEI documents. Most of these elements are freely floating phrases, which can appear at any point within the textual structure, although they must generally be contained by a higher-level element of some kind (such as a paragraph). A few of the elements described in this chapter (for example, bibliographic citations and lists) have a comparatively well-defined internal structure, but most of them have no consistent inner structure of their own. In the general case, they contain only a few words, and are often identifiable in a conventionally printed text by the use of typographic conventions such as shifts of font, use of quotation or other punctuation marks, or other changes in layout.

This chapter begins by describing the p tag used to mark paragraphs, the prototypical formal unit for running text in many TEI modules. This is followed, in section 3.2 Treatment of Punctuation, by a discussion of some specific problems associated with the interpretation of conventional punctuation, and the methods proposed by the Guidelines for resolving ambiguities therein.

The next section (section 3.3 Highlighting and Quotation) describes a number of phrase-level elements commonly marked by typographic features (and thus well-represented in conventional markup languages). These include features commonly marked by font shifts (section 3.3.2 Emphasis, Foreign Words, and Unusual Language) and features commonly marked by quotation marks (section 3.3.3 Quotation) as well as such features as terms, cited words, and glosses (section 3.3.4 Terms, Glosses, Equivalents, and Descriptions).

Section 3.4 Simple Editorial Changes introduces some phrase-level elements which may be used to record simple editorial interventions, such as emendation or correction of the encoded text. The elements described here constitute a simple subset of the full mechanisms for encoding such information (described in full in chapter 11 Representation of Primary Sources), which should be adequate to most commonly encountered situations.
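As a minimal sketch of the kind of simple correction this section refers to, the fragment below pairs an apparent error with its emendation. The sentence, the `#ed1` pointer, and the certainty value are invented for illustration and are not taken from the Guidelines.

```xml
<p>I have been a
  <choice>
    <!-- the reading as it stands in the source -->
    <sic>pasenger</sic>
    <!-- the encoder's correction; "#ed1" is a hypothetical editor identifier -->
    <corr resp="#ed1" cert="high">passenger</corr>
  </choice>
  on this line for many years.</p>
```

Keeping both readings inside a single choice lets a processor display either the original or the corrected text.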

The next section (section 3.5 Names, Numbers, Dates, Abbreviations, and Addresses) describes several phrase-level and inter-level elements which, although often of interest for analysis or processing, are rarely explicitly identified in conventional printing. These include names (section 3.5.1 Referring Strings), numbers and measures (section 3.5.3 Numbers and Measures), dates and times (section 3.5.4 Dates and Times), abbreviations (section 3.5.5 Abbreviations and Their Expansions), and addresses (section 3.5.2 Addresses).

In the same way, the following section (section 3.6 Simple Links and Cross-References) presents only a subset of the facilities available for the encoding of cross-references or text-linkage. The full story may be found in chapter 16 Linking, Segmentation, and Alignment; the tags presented here are intended to be usable for a wide variety of simple applications.

Sections 3.7 Lists and 3.8 Notes, Annotation, and Indexing describe two kinds of quasi-structural elements: lists and notes. These may appear either within chunk-level elements such as paragraphs, or between them. Several kinds of lists are catered for, of arbitrary complexity. The section on notes discusses both notes found in the source and simple mechanisms for adding annotations of an interpretive nature during the encoding; again, only a subset of the facilities described in full elsewhere (specifically, in chapter 17 Simple Analytic Mechanisms) is discussed.
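The two placements mentioned here, inside a paragraph and between paragraphs, might look roughly like the following. The text is invented, and the `place` and `type` values are illustrative assumptions rather than required choices.

```xml
<div>
  <p>The shipment included the following items:
    <!-- a list inside a paragraph -->
    <list>
      <item>two crates of tea</item>
      <item>one barrel of salt<note place="bottom">The barrel was later
        reported damaged.</note></item>
    </list>
  </p>
  <!-- a note standing between two paragraphs -->
  <note type="editorial">The next paragraph is missing from the source.</note>
  <p>The remaining goods were sold at auction.</p>
</div>
```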

Section 3.9 Graphics and Other Non-textual Components introduces some simple ways of representing graphic or other non-textual content found in a text. A fuller discussion of the multimedia facilities supported by these Guidelines may be found in chapters 14 Tables, Formulæ, Graphics and Notated Music and 16 Linking, Segmentation, and Alignment.

Next, section 3.10 Reference Systems, describes methods of encoding within a text the conventional system or systems used when making references to the text. Some reference systems have attained canonical authority and must be recorded to make the text useable in normal work; in other cases, a convenient reference system must be created by the creator or analyst of an electronic text.

Like lists and notes, the bibliographic citations discussed in section 3.11 Bibliographic Citations and References, may be regarded as structural elements in their own right. A range of possibilities is presented for the encoding of bibliographic citations or references, which may be treated as simple phrases within a running text, or as highly-structured components suitable for inclusion in a bibliographic database.

Additional elements for the encoding of passages of verse or drama (whether prose or verse) are discussed in section 3.12 Passages of Verse or Drama.

The chapter concludes with a technical overview of the structure and organization of the module described here. This should be read in conjunction with chapter 1 The TEI Infrastructure, describing the structure of the TEI document type definition.

3.1 Paragraphs

The paragraph is the fundamental organizational unit for all prose texts, being the smallest regular unit into which prose can be divided. Prose can appear in all TEI texts, even those that are primarily of another genre (e.g., verse); thus the paragraph is described here, as an element which can appear in any kind of text.

Paragraphs can contain any of the other elements described within this chapter, as well as some other elements which are specific to individual text types. We distinguish phrase-level elements, which must be entirely contained within a paragraph and cannot appear except within one, from chunks, which can appear between, but not within, paragraphs, and from inter-level elements, which can appear either within a single paragraph or between paragraphs. The class of phrases includes emphasized or quoted phrases, names, dates, etc. The class of inter-level elements includes bibliographic citations, notes, lists, etc. The class of chunks includes the paragraph itself and other elements which have similar structural properties, notably the ab (anonymous block) element (described in section 16.3 Blocks, Segments, and Anchors), which may be used as an alternative to the paragraph in some kinds of texts.
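A small sketch may make the three classes easier to tell apart. The wording of the text is invented for illustration; only the element names come from this chapter.

```xml
<div>
  <!-- p is a chunk: it appears between other chunks, never inside one -->
  <p>On <date when="1990-03-01">1 March 1990</date>, <name>Ada</name>
    said the draft was <emph>not</emph> ready.
    <!-- date, name and emph are phrase-level: they appear only inside a chunk -->
    <note>A note is inter-level: here it sits inside a paragraph.</note></p>
  <!-- a list is also inter-level: here it sits between two paragraphs -->
  <list>
    <item>first revision</item>
    <item>second revision</item>
  </list>
  <p>A second paragraph follows the list.</p>
</div>
```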

Because paragraphs may appear in different base or additional tag sets, their possible contents may differ in different kinds of documents. In particular, additional elements not listed in this chapter may appear in paragraphs in certain kinds of text. However, the elements described in this chapter are always by default available in all kinds of text.

The paragraph is marked using the p element:

p (paragraph) marks paragraphs in prose.

If a consistent internal subdivision of paragraphs is desired, the s or seg (‘segment’) elements may be used, as discussed in chapters 17 Simple Analytic Mechanisms and 16 Linking, Segmentation, and Alignment respectively. More usually, however, paragraphs have no firm internal structure, but contain prose encoded as a mix of characters, entity references, phrases marked as described in the rest of this chapter, and embedded elements like lists, figures, or tables.
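By way of illustration, a paragraph subdivided consistently might be encoded along the following lines. This is a sketch only: it assumes that the modules defining s and seg are included in the schema, and the `n` and `type` values are invented.

```xml
<div>
  <!-- end-to-end division into sentence-like units with s -->
  <p>
    <s n="1">The first sentence of the paragraph.</s>
    <s n="2">The second sentence follows it.</s>
  </p>
  <!-- arbitrary segments with seg; "clause" is a hypothetical type value -->
  <p>
    <seg type="clause">When the rain stopped,</seg>
    <seg type="clause">we went outside.</seg>
  </p>
</div>
```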

Since paragraphs are usually explicitly marked in Western texts, typically by indentation, the application of the p tag usually presents few problems.

In some cases, the body of a text may comprise but a single paragraph:

<body>
 <p>I fully appreciate Gen. Pope's splendid achievements with their
   invaluable results; but you must know that Major Generalships in the
   Regular Army, are not as plenty as blackberries.</p>
</body>

direct	may be used to indicate whether the quoted matter is regarded as direct or indirect speech.
aloud	may be used to indicate whether the quoted matter is regarded as having been spoken aloud or signed.
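The list above does not name the element these two attributes belong to; the sketch below assumes it is said, which carries both in TEI P5. The sentences are invented.

```xml
<p>
  <!-- spoken aloud, reported directly -->
  <said aloud="true" direct="true">Come in</said>, she called.
  <!-- thought rather than spoken, reported directly -->
  <said aloud="false" direct="true">I hope he has not forgotten</said>,
  she thought.
  <!-- spoken aloud, reported indirectly -->
  She told him <said aloud="true" direct="false">that the train had
  left</said>.
</p>
```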

uri	(uniform resource identifier) references the underlying concept of which the parent is a representation, by means of some external identifier.
filter	references an external script which contains a method to transform instances of this element to canonical TEI.
name	names the underlying concept of which the parent is a representation.

cert	(certainty) gives the degree of certainty associated with the intervention or interpretation.
resp	(responsible party) indicates the agent responsible for the intervention or interpretation, for example an editor or transcriber.

unit	names the unit used for the measurement. Suggested values include: cm, mm, in, lines, chars.
quantity	specifies the length in the units indicated.
extent	indicates the size of the object using a project-specific vocabulary that combines quantity and unit in a single string of words.
precision	characterizes the precision of the values specified by the other attributes.
scope	specifies the applicability of this measurement where more than one object is measured. Sample values include: all, most, range.

type	characterizes the element using any appropriate classification scheme or typology.
subtype	(subtype) provides a sub-categorization of the element, if needed.

key	provides an externally defined means of identifying the entity (or entities) being named, using a coded value of some kind.
ref	(reference) provides an explicit means of locating a full definition of the entity being named, by means of one or more URIs.

model.nameLike.agent	groups elements which contain names of individuals or corporate bodies.
model.offsetLike	groups elements which can appear only as part of a place name.
model.persNamePart	groups elements which form part of a personal name.
model.placeStateLike	groups elements which describe changing states of a place.

idno	(identifier) supplies a standard or non-standard number which can be used to identify a bibliographic item.
lang	(language name) contains the name of a language mentioned in etymological or other linguistic discussion.
rs	(referencing string) contains a general-purpose name or referring string.

addName	(additional name) contains an additional name component, such as a nickname, epithet, or alias, or any other descriptive phrase used within a personal name.
forename	(forename) contains a forename, whether a given or a baptismal name.
genName	(generational name component) contains a name component used to distinguish otherwise similar names on the basis of the relative ages or generations of the persons named.
nameLink	(name link) contains a connecting particle or phrase used within a name but not regarded as part of it, such as "van der" or "de".
roleName	(role name) contains a name component which indicates that the person has a particular role or position in society, such as an official title or rank.
surname	(surname) contains a family (inherited) name, as opposed to a given name, baptismal name, or nickname.
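The name components listed above might be combined as in the following sketch. It assumes that the enclosing persName element is available in the schema; the person and the `type` value are invented.

```xml
<persName>
  <roleName>Dr</roleName>
  <forename>Maria</forename>
  <!-- a linking particle, not regarded as part of the surname -->
  <nameLink>van der</nameLink>
  <surname>Berg</surname>
  <genName>the Younger</genName>
  <!-- "nickname" is a hypothetical type value -->
  <addName type="nickname">the Tall</addName>
</persName>
```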

bloc	(bloc) contains the name of a geo-political unit consisting of two or more states or countries.
country	(country) contains the name of a geo-political unit, such as a nation, country, colony, or commonwealth, larger than or administratively superior to a region and smaller than a bloc.
district	(district) contains the name of any kind of subdivision of a settlement, such as a parish, ward, or other administrative or geographic unit.
geogName	(geographical name) contains a name associated with some geographical feature, such as Windrush Valley or Mount Sinai.
placeName	(place name) contains an absolute or relative place name.
region	(region) contains the name of an administrative unit such as a state, province, or county, larger than a settlement but smaller than a country.
settlement	(settlement) contains the name of a settlement such as a city, town, or village, identified as a single geo-political or administrative unit.

quantity	(quantity) specifies the number of the indicated units that make up the measurement.
unit	(unit) indicates the units of measurement used, usually as the standard symbol for the units required. Suggested values include: m, kg, s, Hz, Pa, Ω, L, t, ha, Å, mL, cm, dB, kbit, Kibit, kB, KiB, MB, MiB.
commodity	(commodity) indicates what is being measured.
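As a rough sketch, these three attributes could be used together on the measure element (an assumption, since the list above does not name its element). The quantities and commodities are invented.

```xml
<p>The estate comprised
  <measure quantity="40" unit="ha" commodity="farmland">forty hectares
  of farmland</measure>, and the stores held
  <measure quantity="2.5" unit="kg" commodity="flour">two and a half
  kilos of flour</measure>.</p>
```

The attributes carry a normalized form of the measurement, while the element content keeps the wording of the source.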

target	specifies the destination of the reference by supplying one or more URI references.
evaluate	(evaluate) specifies the intended meaning when the target of a pointer is itself a pointer.
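The following sketch shows one way these two attributes might be used on ref and ptr. The identifiers and the example.org address are invented for illustration.

```xml
<div>
  <p xml:id="intro">An introductory paragraph.</p>
  <p xml:id="summary">A closing paragraph.</p>
  <!-- one target: a fragment identifier within this document -->
  <p>As noted <ref target="#intro">in the introduction</ref>, the
    reading is uncertain.</p>
  <!-- several targets are given as whitespace-separated URI references -->
  <p>Compare <ref target="#intro #summary">both passages</ref> and the
    external notes at <ptr target="https://example.org/notes.xml"/>.</p>
  <!-- the target is itself a pointer; evaluate="all" asks for the
       chain of pointers to be followed to its end -->
  <p><ptr xml:id="hop" target="#summary"/>
    <ref target="#hop" evaluate="all">follow the chain</ref></p>
</div>
```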

width	Where the media are displayed, indicates the display width
height	Where the media are displayed, indicates the display height
scale	Where the media are displayed, indicates a scale factor to be applied when generating the desired display size
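A minimal sketch of these display attributes, assuming they are carried by the graphic element; the file paths and sizes are invented.

```xml
<p>
  <!-- explicit display size: a number followed by a unit -->
  <graphic url="images/frontispiece.png" width="40mm" height="60mm"/>
  <!-- alternatively, scale the image's own size by a factor -->
  <graphic url="images/map.png" scale="0.5"/>
</p>
```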

bibl	(bibliographic citation) contains a loosely structured bibliographic citation, of which the sub-components may or may not be explicitly tagged.
biblFull	(fully structured bibliographic citation) contains a fully structured bibliographic citation, in which all components of the TEI file description are present.
biblStruct	(structured bibliographic citation) contains a structured bibliographic citation, in which only bibliographic sub-elements appear, and in a specified order.
listBibl	(citation list) contains a list of bibliographic citations of any kind.
msDesc	(manuscript description) contains a description of a single identifiable manuscript.
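To make the contrast between the loosely structured and the structured form concrete, the sketch below encodes the same reference both ways inside a listBibl. The work, author, place, and publisher are fictitious.

```xml
<listBibl>
  <!-- loosely structured: free text, with only the title tagged -->
  <bibl>Sample Author, <title>An Invented Title</title>
    (Exampletown: Sample Press, 2001).</bibl>
  <!-- structured: only bibliographic sub-elements, in a fixed order -->
  <biblStruct>
    <monogr>
      <author>Sample Author</author>
      <title>An Invented Title</title>
      <imprint>
        <pubPlace>Exampletown</pubPlace>
        <publisher>Sample Press</publisher>
        <date when="2001">2001</date>
      </imprint>
    </monogr>
  </biblStruct>
</listBibl>
```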

type	indicates the type of numeric value. Suggested values include: cardinal, ordinal, fraction, percentage.
value	supplies the value of the number in a normalized form.
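Assuming these attributes belong to the num element, the four suggested types might be encoded as follows. The sentence is invented.

```xml
<p>She paid <num type="cardinal" value="21">twenty-one</num> shillings
  for the <num type="ordinal" value="3">third</num> volume, which was
  <num type="fraction" value="0.5">one half</num> of its price, or
  <num type="percentage" value="50">fifty per cent</num>.</p>
```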

atLeast	gives an estimate of the minimum value for the measurement.
atMost	gives an estimate of the maximum value for the measurement.

biblScope	(scope of bibliographic reference) defines the scope of a bibliographic reference, for example as a list of page numbers or the name of a subdivision of a larger work.
distributor	(distributor) supplies the name of a person or organization responsible for the distribution of a text.
publisher	(publisher) supplies the name of the organization responsible for the publication or distribution of a bibliographic item.
pubPlace	(place of publication) contains the name of the place where a bibliographic item was published.

date	(date) contains a date in any format.
time	(time) contains a phrase specifying a time of day in any form.
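A brief sketch of the two elements in running text. The `when` attribute, which is not described in the list above, is the usual TEI way of supplying a normalized value; the date and time themselves are invented.

```xml
<p>The letter is dated <date when="1862-03-19">19 March 1862</date>
  and was posted at <time when="15:30:00">half past three</time>.</p>
```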

mainLang	(main language) contains a code identifying the main language of the manuscript.
otherLangs	(other languages) contains one or more codes identifying any other languages used in the manuscript.

P5: Guidelines for Electronic Text Encoding and Interchange

Table of contents

3 Elements Available in All TEI Documents
3.1 Paragraphs
3.2 Treatment of Punctuation
3.2.1 Functions of Punctuation
3.2.2 Hyphenation
3.3 Highlighting and Quotation
3.3.1 What Is Highlighting?
3.3.2 Emphasis, Foreign Words, and Unusual Language
3.3.2.1 Foreign Words or Expressions
3.3.2.2 Emphatic Words and Phrases
3.3.2.3 Other Linguistically Distinct Material
3.3.3 Quotation
3.3.4 Terms, Glosses, Equivalents, and Descriptions
3.3.5 Some Further Examples
3.4 Simple Editorial Changes
3.4.1 Apparent Errors
3.4.2 Regularization and Normalization
3.4.3 Additions, Deletions, and Omissions
3.5 Names, Numbers, Dates, Abbreviations, and Addresses
3.5.1 Referring Strings
3.5.2 Addresses
3.5.3 Numbers and Measures
3.5.4 Dates and Times
3.5.5 Abbreviations and Their Expansions
3.6 Simple Links and Cross-References
3.7 Lists
3.8 Notes, Annotation, and Indexing
3.8.1 Notes and Simple Annotation
3.8.2 Index Entries
3.8.2.1 Pre-existing indexes
3.8.2.2 Auto-generated indexes
3.9 Graphics and Other Non-textual Components
3.10 Reference Systems
3.10.1 Using the xml:id and n Attributes
3.10.2 Creating New Reference Systems
3.10.3 Milestone Elements
3.10.4 Declaring Reference Systems
3.11 Bibliographic Citations and References
3.11.1 Methods of Encoding Bibliographic References and Lists of References
3.11.2 Components of Bibliographic References
3.11.2.1 Analytic, Monographic, and Series Levels
3.11.2.2 Titles, Authors, and Editors
3.11.2.3 Document Identifiers
3.11.2.4 Imprint, Size of a Document, and Reprint Information
3.11.2.5 Scopes and Ranges in Bibliographic Citations
3.11.2.6 Series Information
3.11.2.7 Related Items
3.11.2.8 Notes and Statement of Language
3.11.2.9 Order of Components within References
3.11.3 Bibliographic Pointers
3.11.4 Relationship to Other Bibliographic Schemes
3.12 Passages of Verse or Drama
3.12.1 Core Tags for Verse
3.12.2 Core Tags for Drama
3.13 Overview of the Core Module