2 The TEI Header
This chapter addresses the problems of describing an encoded work so that the text itself, its source, its encoding, and its revisions are all thoroughly documented. Such documentation is equally necessary for scholars using the texts, for software processing them, and for cataloguers in libraries and archives. Together these descriptions and declarations provide an electronic analogue to the title page attached to a printed work. They also constitute an equivalent for the content of the code books or introductory manuals customarily accompanying electronic data sets.
Every TEI-conformant text must carry such a set of descriptions, prefixed to it and encoded as described in this chapter. The set is known as the TEI header, tagged teiHeader, and has five major parts:
- a file description, tagged fileDesc, containing a full bibliographical description of the computer file itself, from which a user of the text could derive a proper bibliographic citation, or which a librarian or archivist could use in creating a catalogue entry recording its presence within a library or archive. The term computer file here is to be understood as referring to the whole entity or document described by the header, even when this is stored in several distinct operating system files. The file description also includes information about the source or sources from which the electronic document was derived. The TEI elements used to encode the file description are described in section 2.2 The File Description below.
- an encoding description, tagged encodingDesc, which describes the relationship between an electronic text and its source or sources. It allows for detailed description of whether (or how) the text was normalized during transcription, how the encoder resolved ambiguities in the source, what levels of encoding or analysis were applied, and similar matters. The TEI elements used to encode the encoding description are described in section 2.3 The Encoding Description below.
- a text profile, tagged profileDesc, containing classificatory and contextual information about the text, such as its subject matter, the situation in which it was produced, the individuals described by or participating in producing it, and so forth. Such a text profile is of particular use in highly structured composite texts such as corpora or language collections, where it is often highly desirable to enforce a controlled descriptive vocabulary or to perform retrievals from a body of text in terms of text type or origin. The text profile may however be of use in any form of automatic text processing. The TEI elements used to encode the profile description are described in section 2.4 The Profile Description below.
- a container element, tagged xenoData, which allows easy inclusion of metadata from non-TEI schemes (i.e., other than elements in the TEI namespace). For example, the MARC record for the encoded document might be included using MARCXML or MODS. A simple set of metadata for harvesting might be included encoded in Dublin Core.
- a revision history, tagged revisionDesc, which allows the encoder to provide a history of changes made during the development of the electronic text. The revision history is important for version control and for resolving questions about the history of a file. The TEI elements used to encode the revision description are described in section 2.6 The Revision Description below.
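As a rough illustration of how these five parts fit together, the following skeleton shows one of each inside a single header. It is a sketch rather than a recommended header: the title, the language, the Dublin Core record, and the revision date are placeholder values invented for this example, and the RDF and Dublin Core namespaces are only one assumed way of embedding harvesting metadata.

```xml
<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <!-- 1. file description: bibliographic description of the electronic file -->
  <fileDesc>
    <titleStmt>
      <title>An Example Text: an electronic transcription</title>
    </titleStmt>
    <publicationStmt>
      <p>Unpublished working file (placeholder).</p>
    </publicationStmt>
    <sourceDesc>
      <p>Transcribed from a printed source (placeholder).</p>
    </sourceDesc>
  </fileDesc>
  <!-- 2. encoding description: relationship between the text and its source -->
  <encodingDesc>
    <p>Spelling has not been normalized (placeholder).</p>
  </encodingDesc>
  <!-- 3. text profile: classificatory and contextual information -->
  <profileDesc>
    <langUsage>
      <language ident="en">English</language>
    </langUsage>
  </profileDesc>
  <!-- 4. non-TEI metadata: here a hypothetical Dublin Core record -->
  <xenoData>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
             xmlns:dc="http://purl.org/dc/elements/1.1/">
      <rdf:Description rdf:about="">
        <dc:title>An Example Text</dc:title>
      </rdf:Description>
    </rdf:RDF>
  </xenoData>
  <!-- 5. revision history: changes made to the electronic text -->
  <revisionDesc>
    <change when="2020-01-01">Header created (placeholder date).</change>
  </revisionDesc>
</teiHeader>
```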
A TEI header can be a very large and complex object, or it may be a very simple one. Some application areas (for example, the construction of language corpora and the transcription of spoken texts) may require more specialized and detailed information than others. The present proposals therefore define both a core set of elements (all of which may be used without formality in any TEI header) and some additional elements which become available within the header as the result of including additional specialized modules within the schema. When the module for language corpora (described in chapter 15 Language Corpora) is in use, for example, several additional elements are available, as further detailed in that chapter.
The next section of the present chapter briefly introduces the overall structure of the header and the kinds of data it may contain. This is followed by a detailed description of all the constituent elements which may be used in the core header. Section 2.7 Minimal and Recommended Headers, at the end of the present chapter, discusses the recommended content of a minimal TEI header and its relation to standard library cataloguing practices.
2.1 Organization of the TEI Header
2.1.1 The TEI Header and Its Components
The teiHeader element should be clearly distinguished from the front matter of the text itself (for which see section 4.5 Front Matter). A composite text, such as a corpus or collection, may contain several headers, as further discussed below. In the general case, however, a TEI-conformant text will contain a single teiHeader element, followed by a single text or facsimile element, or both.
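The general case described above, with one header followed by both a facsimile and a text, might be sketched as follows. The image file name and all prose content are hypothetical, and the sketch assumes that the schema in use makes the facsimile element available.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- exactly one header, placed before everything else -->
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>An Example Text: a digital edition</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished working file (placeholder).</p>
      </publicationStmt>
      <sourceDesc>
        <p>Transcribed from page images (placeholder).</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <!-- page images of the source; the file name is hypothetical -->
  <facsimile>
    <surface>
      <graphic url="page-001.jpg"/>
    </surface>
  </facsimile>
  <!-- the transcribed text itself, including any front matter -->
  <text>
    <body>
      <p>Transcribed text goes here.</p>
    </body>
  </text>
</TEI>
```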
The header element has the following description:
- teiHeader (TEI header) supplies the descriptive and declarative information making up an electronic title page prefixed to every TEI-conformant text.
As discussed above, the teiHeader element has five principal components:
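- fileDesc (file description) contains a full bibliographic description of an electronic file.
- encodingDesc (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived.
- profileDesc (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting.
- xenoData (non-TEI metadata) provides a container element into which metadata in non-TEI formats may be placed.
- revisionDesc (revision description) summarizes the revision history for a file.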
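Of these, only the fileDesc element is mandatory. A minimal header therefore has the following structure:
<teiHeader>
<fileDesc>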
<titleStmt>
<title>
<!-- title of the resource -->
</title>
</titleStmt>
<publicationStmt>
<p>
<!-- Information about distribution of the resource -->
</p>
</publicationStmt>
<sourceDesc>
<p>
<!-- Information about source from which the resource derives -->
</p>
</sourceDesc>
</fileDesc>
</teiHeader>
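The header and the text may be in different languages, each indicated by an xml:lang attribute. Here the header is in French and the text in English:
<TEI xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader xml:lang="fr">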
<!-- ... -->
</teiHeader>
<text xml:lang="en">
<!-- ... -->
</text>
</TEI>
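In a corpus or collection, a corpus-level header is followed by the individual texts, each of which carries its own header:
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
<teiHeader>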
<!-- corpus-level metadata here -->
</teiHeader>
<TEI>
<teiHeader>
<!-- metadata specific to this text here -->
</teiHeader>
<text>
<!-- ... -->
</text>
</TEI>
<TEI>
<teiHeader>
<!-- metadata specific to this text here -->
</teiHeader>
<text>
<!-- ... -->
</text>
</TEI>
</teiCorpus>
2.1.2 Types of Content in the TEI Header
The elements occurring within the TEI header may contain several types of content; the following list indicates how these types of content are described in the following sections:
- free prose
- Most elements contain simple running prose at some level. Many elements may contain either prose (possibly organized into paragraphs) or more specific elements, which themselves contain prose. In this chapter's descriptions of element content, the phrase prose description should be understood to imply a series of paragraphs, each marked as a p element. The word phrase, by contrast, should be understood to imply character data, interspersed as need be with phrase-level elements, but not organized into paragraphs. For more information on paragraphs, highlighted phrases, lists, etc., see section 3.1 Paragraphs.
- grouping elements
- Elements whose names end with the suffix Stmt (e.g. editionStmt, titleStmt) and the xenoData element enclose a group of specialized elements recording some structured information. In the case of the bibliographic elements, the suffix Stmt is used in names of elements corresponding to the ‘areas’ of the International Standard Bibliographic Description.5 In the case of the xenoData element, the specialized elements are not TEI elements, but rather come from some other metadata scheme. In most cases grouping elements may contain prose descriptions as an alternative to the set of specialized elements, thus allowing the encoder to choose whether or not the information concerned should be presented in a structured form or in prose.
- declarations
- Elements whose names end with the suffix Decl (e.g. tagsDecl, refsDecl) enclose information about specific encoding practices applied in the electronic text; often these practices are described in coded form. Typically, such information takes the form of a series of declarations, identifying a code with some more complex structure or description. A declaration which applies to more than one text or division of a text need not be repeated in the header of each such text or subdivision. Instead, the decls attribute of each text (or subdivision of the text) to which the declaration applies may be used to supply a cross-reference to it, as further described in section 15.3 Associating Contextual Information with a Text.
- descriptions
- Elements whose names end with the suffix Desc (e.g. settingDesc, projectDesc) contain a prose description, possibly, but not necessarily, organized under some specific headings by suggested sub-elements.
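The cross-referencing mechanism described for declarations can be sketched as follows. The identifiers ED-normalized and ED-diplomatic and all prose content are hypothetical. The sketch assumes two alternative editorial declarations, one of them marked as the default, with a single division pointing at the other through its decls attribute.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>An Example Text: an electronic transcription</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished working file (placeholder).</p>
      </publicationStmt>
      <sourceDesc>
        <p>Transcribed from a printed source (placeholder).</p>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <!-- applies wherever no decls attribute says otherwise -->
      <editorialDecl xml:id="ED-normalized" default="true">
        <p>Spelling has been normalized.</p>
      </editorialDecl>
      <!-- declared once here, referenced from the text below -->
      <editorialDecl xml:id="ED-diplomatic">
        <p>Original spelling has been retained.</p>
      </editorialDecl>
    </encodingDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <p>This division follows the default declaration.</p>
      </div>
      <!-- decls points at the declaration that applies to this division -->
      <div decls="#ED-diplomatic">
        <p>This division follows the diplomatic declaration.</p>
      </div>
    </body>
  </text>
</TEI>
```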
2.1.3 Model Classes in the TEI Header
The TEI header provides a very rich collection of metadata categories, but makes no claim to be exhaustive. It is certainly the case that individual projects may wish to record specialized metadata which either does not fit within one of the predefined categories identified by the TEI header or requires a more specialized element structure than is proposed here. To overcome this problem, the encoder may elect to define additional elements using the customization methods discussed in 23.3 Customization. The TEI class system makes such customizations simpler to effect and easier to use in interchange.
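A customization of this kind might look roughly like the following ODD fragment. This is a sketch under stated assumptions: the schema name exampleProject, the element name digitizationNote, and the namespace http://example.org/ns/project are all invented for illustration. The point it shows is that declaring membership of an existing header class is enough to make the new element available wherever that class is allowed.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
            ident="exampleProject" start="TEI">
  <!-- modules assumed for a basic schema that includes the header -->
  <moduleRef key="tei"/>
  <moduleRef key="core"/>
  <moduleRef key="header"/>
  <moduleRef key="textstructure"/>
  <!-- hypothetical project-specific element, kept in its own namespace -->
  <elementSpec ident="digitizationNote"
               ns="http://example.org/ns/project" mode="add">
    <desc>records a project-specific note on how the source was digitized.</desc>
    <classes>
      <!-- class membership makes the element usable inside encodingDesc -->
      <memberOf key="model.encodingDescPart"/>
    </classes>
    <content>
      <macroRef key="macro.phraseSeq"/>
    </content>
  </elementSpec>
</schemaSpec>
```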
These classes are specific to parts of the header:
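- model.applicationLike groups elements used to record application-specific information about a document in its header.
  - application provides information about an application which has acted upon the document.
- model.availabilityPart groups elements such as licences and paragraphs of text which may appear as part of an availability statement.
  - licence contains information about a licence or other legal agreement applicable to the text.
- model.catDescPart groups component elements of the TEI header category description.
  - textDesc (text description) provides a description of a text in terms of its situational parameters.
- model.editorialDeclPart groups elements which may be used inside editorialDecl and appear multiple times.
  - correction (correction principles) states how and under what circumstances corrections have been made in the text.
  - hyphenation summarizes the way in which hyphenation in a source text has been treated in an encoded version of it.
  - interpretation describes the scope of any analytic or interpretive information added to the text in addition to the transcription.
  - normalization indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form.
  - punctuation specifies editorial practice adopted with respect to punctuation marks in the original.
  - quotation specifies editorial practice adopted with respect to quotation marks in the original.
  - segmentation describes the principles according to which the text has been segmented, for example into sentences, tone-units, or graphemic strata.
  - stdVals (standard values) specifies the format used when standardized date or number values are supplied.
- model.encodingDescPart groups elements which may be used inside encodingDesc and appear multiple times.
  - appInfo (application information) records information about an application which has edited the TEI file.
  - charDecl (character declarations) provides information about nonstandard characters and glyphs.
  - classDecl (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.
  - editorialDecl (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text.
  - fsdDecl (feature system declaration) provides a feature system declaration comprising one or more feature structure declarations or feature structure declaration links.
  - geoDecl (geographic coordinates declaration) documents the notation and the datum used for geographic coordinates expressed as content of a geo element elsewhere within the document.
  - listPrefixDef (list of prefix definitions) contains a list of definitions of prefixing schemes used in teidata.pointer values, showing how abbreviated URIs using each scheme may be expanded into full URIs.
  - metDecl (metrical notation declaration) documents the notation employed to represent a metrical pattern when this is specified as the value of a met, real, or rhyme attribute on any structural element of a metrical text (e.g. lg, l, or seg).
  - projectDesc (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.
  - refsDecl (references declaration) specifies how canonical references are constructed for this text.
  - samplingDecl (sampling declaration) contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection.
  - schemaRef (schema reference) describes or points to a related customization or schema file.
  - schemaSpec (schema specification) generates a TEI-conformant schema and documentation for it.
  - styleDefDecl (style definition language declaration) specifies the name of the formal language in which style or renditional information is supplied elsewhere in the document. The specific version of the scheme may also be supplied.
  - tagsDecl (tagging declaration) provides detailed information about the tagging applied to a document.
  - transcriptionDesc describes the set of transcription conventions used, particularly for spoken material.
  - unitDecl (unit declarations) provides information about units of measurement that are not members of the International System of Units.
  - variantEncoding declares the method used to encode text-critical variants.
- model.profileDescPart groups elements which may be used inside profileDesc and appear multiple times.
  - abstract contains a summary or formal abstract prefixed to an existing source document by the encoder.
  - calendarDesc (calendar description) contains a description of the calendar system used in any dating expression found in the text.
  - correspDesc (correspondence description) contains a description of the actions related to one act of correspondence.
  - creation contains information about the creation of a text.
  - handNotes contains one or more handNote elements documenting the different hands identified within the source texts.
  - langUsage (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text.
  - listTranspose supplies a list of transpositions, each of which is indicated at some point in a document typically by means of metamarks.
  - particDesc (participation description) describes the identifiable speakers, voices, or other participants in a linguistic interaction.
  - settingDesc (setting description) describes the setting or settings within which a language interaction takes place, either as a prose description or as a series of setting elements.
  - textClass (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.
  - textDesc (text description) provides a description of a text in terms of its situational parameters.
- model.teiHeaderPart groups high-level elements which may appear more than once in a TEI header.
  - encodingDesc (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived.
  - profileDesc (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting.
  - xenoData (non-TEI metadata) provides a container element into which metadata in non-TEI formats may be placed.
- model.sourceDescPart groups elements which may be used inside sourceDesc and appear multiple times.
  - recordingStmt (recording statement) describes a set of recordings used as the basis for transcription of a spoken text.
  - scriptStmt (script statement) contains a citation giving details of the script used for a spoken text.
- model.textDescPart groups elements used to categorize a text, for example in terms of its situational parameters.
  - channel (primary channel) describes the medium or channel by which a text is delivered or experienced. For a written text, this might be print, manuscript, or email; for a spoken one, radio, telephone, or face-to-face conversation.
  - constitution describes the internal composition of a text or text sample, for example as fragmentary or complete.
  - derivation describes the nature and extent of originality of the text.
  - domain (domain of use) describes the most important social context in which the text was realized or for which it is intended, for example private vs. public, education, or religion.
  - factuality describes the extent to which the text may be regarded as imaginative or non-imaginative, that is, as describing a fictional or a non-fictional world.
  - interaction describes the extent, cardinality, and nature of any interaction among those producing and experiencing the text, for example in the form of response, interjection, or commentary.
  - preparedness describes the extent to which a text may be regarded as prepared or spontaneous.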
2.2 The File Description
This section describes the fileDesc element, which is the first component of the teiHeader element.
The bibliographic description of a machine-readable or digital text resembles in structure that of a book, an article, or any other kind of textual object. The file description element of the TEI header has therefore been closely modelled on existing standards in library cataloguing; it should thus provide enough information to allow users to give standard bibliographic references to the electronic text, and to allow cataloguers to catalogue it. Bibliographic citations occurring elsewhere in the header, and also in the text itself, are derived from the same model (on bibliographic citations in general, see further section 3.11 Bibliographic Citations and References). See further section 2.8 Note for Library Cataloguers.
The bibliographic description of an electronic text should be supplied by the mandatory fileDesc element:
- fileDesc (file description) contains a full bibliographic description of an electronic file.
The fileDesc element contains three mandatory elements and four optional elements, each of which is described in more detail in sections 2.2.1 The Title Statement to 2.2.6 The Notes Statement below. These elements are listed below in the order in which they must be given within the fileDesc element.
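- titleStmt (title statement) groups information about the title of a work and those responsible for its intellectual content.
- editionStmt (edition statement) groups information relating to one edition of a text.
- extent describes the approximate size of a text stored on some carrier medium, whether digital or non-digital, specified in any convenient units.
- publicationStmt (publication statement) groups information concerning the publication or distribution of an electronic or other text.
- seriesStmt (series statement) groups information about the series, if any, to which a publication belongs.
- notesStmt (notes statement) collects together any notes providing information about a text additional to that recorded in other parts of the bibliographic description.
- sourceDesc (source description) describes the source text from which an electronic file was derived or generated.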
A file description containing all seven of these elements has the following structure:
<teiHeader>
<fileDesc>
<titleStmt>
<title>
<!-- title of the resource -->
</title>
</titleStmt>
<editionStmt>
<p>
<!-- information about the edition of the resource -->
</p>
</editionStmt>
<extent>
<!-- description of the size of the resource -->
</extent>
<publicationStmt>
<p>
<!-- information about the distribution of the resource -->
</p>
</publicationStmt>
<seriesStmt>
<p>
<!-- information about any series to which the resource belongs -->
</p>
</seriesStmt>
<notesStmt>
<note>
<!-- notes on other aspects of the resource -->
</note>
</notesStmt>
<sourceDesc>
<p>
<!-- information about the source from which the resource was derived -->
</p>
</sourceDesc>
</fileDesc>
</teiHeader>
2.2.1 The Title Statement
The titleStmt element is the first component of the fileDesc element, and is mandatory:
- titleStmt (title statement) groups information about the title of a work and those responsible for its intellectual content.
It contains the title given to the electronic work, together with one or more optional statements of responsibility which identify the encoder, editor, author, compiler, or other parties responsible for it:
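- title contains a title for any kind of work.
- author in a bibliographic reference, contains the name of the author(s), personal or corporate, of a work; the primary statement of responsibility for any bibliographic item.
- editor contains a secondary statement of responsibility for a bibliographic item, for example the name of an individual, institution, or organization acting as editor, compiler, translator, etc.
- sponsor specifies the name of a sponsoring organization or institution.
- funder (funding body) specifies the name of an individual, institution, or organization responsible for the funding of a project or text.
- principal (principal researcher) supplies the name of the principal researcher responsible for the creation of an electronic text.
- respStmt (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply.
- resp (responsibility) contains a phrase describing the nature of a person's intellectual responsibility.
- name (name, proper noun) contains a proper noun or noun phrase.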
The title element contains the chief name of the electronic work, including any alternative title or subtitles it may have. It may be repeated, if the work has more than one title (perhaps in different languages) and takes whatever form is considered appropriate by its creator. Where the electronic work is derived from an existing source text, it is strongly recommended that the title for the former should be derived from the latter, but clearly distinguishable from it, for example by the addition of a phrase such as ‘: an electronic transcription’ or ‘a digital edition’. This will distinguish the electronic work from the source text in citations and in catalogues which contain descriptions of both types of material.
The electronic work will also have an external name (its ‘filename’ or ‘data set name’) or reference number on the computer system where it resides at any time. This name is likely to change frequently, as new copies of the file are made on the computer system. Its form is entirely dependent on the particular computer system in use and thus cannot always easily be transferred from one system to another. Moreover, a given work may be composed of many files. For these reasons, these Guidelines strongly recommend that such names should not be used as the title for any electronic work.
Helpful guidance on the formulation of useful descriptive titles in difficult cases may be found in chapter 25 of the Anglo-American Cataloguing Rules (2002–2005) or in another national cataloguing code.
The elements author, editor, sponsor, funder, and principal are specializations of the more general respStmt element. These elements are used to provide the statements of responsibility which identify the person(s) responsible for the intellectual or artistic content of an item and any corporate bodies from which it emanates.
Any number of such statements may occur within the title statement. At a minimum, identify the author of the text and (where appropriate) the creator of the file. If the bibliographic description is for a corpus, identify the creator of the corpus. Optionally include also names of others involved in the transcription or elaboration of the text, sponsors, and funding agencies. The name of the person responsible for physical data input need not normally be recorded, unless that person is also intellectually responsible for some aspect of the creation of the file.
Where the person whose responsibility is to be documented is not an author, sponsor, funding body, or principal researcher, the respStmt element should be used. This has two subcomponents: a name element identifying a responsible individual or organization, and a resp element indicating the nature of the responsibility. No specific recommendations are made at this time as to appropriate content for the resp: it should make clear the nature of the responsibility concerned, as in the examples below.
Names given may be personal names or corporate names. Give all names in the form in which the persons or bodies wish to be publicly cited. This would usually be the fullest form of the name, including first names.6
<titleStmt>
<title>Capgrave's Life of St. John Norbert: a
machine-readable transcription</title>
<respStmt>
<resp>compiled by</resp>
<name>P.J. Lucas</name>
</respStmt>
</titleStmt>
<titleStmt>
<title>Two stories by Edgar Allan Poe: electronic version</title>
<author>Poe, Edgar Allan (1809-1849)</author>
<respStmt>
<resp>compiled by</resp>
<name>James D. Benson</name>
</respStmt>
</titleStmt>
<titleStmt>
<title>Yogadarśanam (arthāt
yogasūtrapūṭhaḥ):
a digital edition.</title>
<title>The Yogasūtras of Patañjali:
a digital edition.</title>
<funder>Wellcome Institute for the History of Medicine</funder>
<principal>Dominik Wujastyk</principal>
<respStmt>
<name>Wieslaw Mical</name>
<resp>data entry and proof correction</resp>
</respStmt>
<respStmt>
<name>Jan Hajic</name>
<resp>conversion to TEI-conformant markup</resp>
</respStmt>
</titleStmt>
2.2.2 The Edition Statement
The editionStmt element is the second component of the fileDesc element. It is optional but recommended.
- editionStmt (edition statement) groups information relating to one edition of a text.
It contains either phrases or more specialized elements identifying the edition and those responsible for it:
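- edition describes the particularities of one edition of a text.
- respStmt (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply.
- name (name, proper noun) contains a proper noun or noun phrase.
- resp (responsibility) contains a phrase describing the nature of a person's intellectual responsibility.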
For printed texts, the word edition applies to the set of all the identical copies of an item produced from one master copy and issued by a particular publishing agency or a group of such agencies. A change in the identity of the distributing body or bodies does not normally constitute a change of edition, while a change in the master copy does.
For electronic texts, the notion of a ‘master copy’ is not entirely appropriate, since they are far more easily copied and modified than printed ones; nonetheless the term edition may be used for a particular state of a machine-readable text at which substantive changes are made and fixed. Synonymous terms used in these Guidelines are version, level, and release. The words revision and update, by contrast, are used for minor changes to a file which do not amount to a new edition.
No simple rule can specify how ‘substantive’ changes have to be before they are regarded as producing a new edition, rather than a simple update. The general principle proposed here is that the production of a new edition entails a significant change in the intellectual content of the file, rather than its encoding or appearance. The addition of analytic coding to a text would thus constitute a new edition, while automatic conversion from one coded representation to another would not. Changes relating to the character code or physical storage details, corrections of misspellings, simple changes in the arrangement of the contents and changes in the output format do not normally constitute a new edition, whereas the addition of new information (e.g. a linguistic analysis expressed in part-of-speech tagging, sound or graphics, referential links to external data sets) almost always does.
Clearly, there will always be borderline cases and the matter is somewhat arbitrary. The simplest rule is: if you think that your file is a new edition, then call it such. An edition statement is optional for the first release of a computer file; it is mandatory for each later release, though this requirement cannot be enforced by the parser.
Note that all changes in a file considered significant, whether or not they are regarded as constituting a new edition or simply a new revision, should be independently noted in the revision description section of the file header (see section 2.6 The Revision Description).
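A revision description recording both kinds of change might be sketched as follows. The dates, the pointer #ed1 (assumed to identify a person declared elsewhere in the header), and the descriptions are all hypothetical.

```xml
<revisionDesc xmlns="http://www.tei-c.org/ns/1.0">
  <!-- a substantive change, treated by this project as a new edition -->
  <change when="2021-03-10" who="#ed1">Added part-of-speech tagging;
    released as the second edition.</change>
  <!-- a minor revision, not regarded as a new edition -->
  <change when="2020-11-02" who="#ed1">Corrected misspellings in
    chapter 2.</change>
</revisionDesc>
```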
The edition element should contain phrases describing the edition or version, including the word edition, version, or equivalent, together with a number or date, or terms indicating difference from other editions such as new edition, revised edition etc. Any dates that occur within the edition statement should be marked with the date element. The n attribute of the edition element may be used as elsewhere to supply any formal identification (such as a version number) for the edition.
One or more respStmt elements may also be used to supply statements of responsibility for the edition in question. These may refer to individuals or corporate bodies and can indicate functions such as that of a reviser, or can name the person or body responsible for the provision of supplementary matter, of appendices, etc., in a new edition. For further detail on the respStmt element, see section 3.11 Bibliographic Citations and References.
<editionStmt>
<edition n="P2">Second draft, substantially
extended, revised, and corrected.</edition>
</editionStmt>
<editionStmt>
<edition>Student's edition, <date>June 1987</date>
</edition>
<respStmt>
<resp>New annotations by</resp>
<name>George Brown</name>
</respStmt>
</editionStmt>
2.2.3 Type and Extent of File
The extent element is the third component of the fileDesc element. It is optional.
- extent describes the approximate size of a text stored on some carrier medium, whether digital or non-digital, specified in any convenient units.
For printed books, information about the carrier, such as the kind of medium used and its size, is of great importance in cataloguing procedures. The print-oriented rules for bibliographic description of an item's medium and extent need some re-interpretation when applied to electronic media. An electronic file exists as a distinct entity quite independently of its carrier and remains the same intellectual object whether it is stored on a magnetic tape, a CD-ROM, a set of floppy disks, or as a file on a mainframe computer. Since, moreover, these Guidelines are specifically aimed at facilitating transparent document storage and interchange, any purely machine-dependent information should be irrelevant as far as the file header is concerned.
This is particularly true of information about file type, although library-oriented rules for cataloguing often distinguish two types of computer file: ‘data’ and ‘programs’. This distinction is quite difficult to draw in some cases, for example with hypermedia or with texts that have built-in search and retrieval software.
Although it is equally system-dependent, some measure of the size of the computer file may be of use for cataloguing and other practical purposes. Because the measurement and expression of file size is fraught with difficulties, only very general recommendations are possible; the element extent is provided for this purpose. It contains a phrase indicating the size or approximate size of the computer file in one of the following ways:
- in bytes of a specified length (e.g. ‘4000 16-bit bytes’)
- as falling within a range of categories, for example:
  - less than 1 Mb
  - between 1 Mb and 5 Mb
  - between 6 Mb and 10 Mb
  - over 10 Mb
- in terms of any convenient logical units (for example, words or sentences, citations, paragraphs)
- in terms of any convenient physical units (for example, blocks, disks, tapes)
The use of standard abbreviations for units of quantity is recommended where applicable, here as elsewhere (see http://physics.nist.gov/cuu/Units/binary.html).
<extent>2 Mb</extent>
<extent>4.2 MiB</extent>
<extent>4532 bytes</extent>
<extent>3200 sentences</extent>
<extent>Five 90 mm High Density Diskettes</extent>
<extent>
<measure unit="MiB" quantity="4.2">About four megabytes</measure>
<measure unit="pages" quantity="245">245 pages of source
material</measure>
</extent>
2.2.4 Publication, Distribution, Licensing, etc.
The publicationStmt element is the fourth component of the fileDesc element and is mandatory. Its function is to name the agency by which a resource is made available (for example, a publisher or distributor) and to supply any additional information about the way in which it is made available such as licensing conditions, identifying numbers, etc.
- publicationStmt (publication statement) groups information concerning the publication or distribution of an electronic or other text.
It may contain either a simple prose description organized as one or more paragraphs, or the more specialised elements described below.
A structured publication statement must begin with one of the following elements:
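- publisher provides the name of the organization responsible for the publication or distribution of a bibliographic item.
- distributor supplies the name of a person or other agency responsible for the distribution of a text.
- authority (release authority) supplies the name of a person or other agency responsible for making an electronic file available, other than a publisher or distributor.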
These elements form the model.publicationStmtPart.agency class; if the agency making the resource available is unknown, but other structured information about it is available, an explicit statement such as ‘publisher unknown’ should be used.
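One plausible way of following this advice is to place the explicit statement in the agency element itself, so that the structured details can still follow it. In the sketch below the place and date are hypothetical, and the choice of publisher rather than distributor or authority is an assumption.

```xml
<publicationStmt xmlns="http://www.tei-c.org/ns/1.0">
  <!-- the agency is unknown, so that is stated explicitly -->
  <publisher>publisher unknown</publisher>
  <!-- other structured information that is known -->
  <pubPlace>Oxford</pubPlace>
  <date>1989</date>
</publicationStmt>
```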
The publisher is the person or institution by whose authority a given edition of the file is made public. The distributor is the person or institution from whom copies of the text may be obtained. Where a text is not considered formally published, but is nevertheless made available for circulation by some individual or organization, this person or institution is termed the release authority.
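Whichever of these elements is chosen, it may be followed by one or more of the following elements, which together form the model.publicationStmtPart.detail class:
- pubPlace (publication place) contains the name of the place where a bibliographic item was published.
- address contains a postal address, for example of a publisher, an organization, or an individual.
- idno (identifying number) supplies any standard or non-standard number used to identify a bibliographic item.
  - type categorizes the number, for example as an ISBN or other standard series. Suggested values include: ISBN, ISSN, DOI, URI, VIAF, ESTC, and OCLC.
- availability supplies information about the availability of a text, for example any restrictions on its use or distribution, or its copyright status.
  - status supplies a code identifying the current availability of the text.
- date contains a date in any format.
- licence contains information about a licence or other legal agreement applicable to the text.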
<publicationStmt>
<publisher>Oxford University Press</publisher>
<pubPlace>Oxford</pubPlace>
<date>1989</date>
<idno type="ISBN">0-19-254705-4</idno>
<availability>
<p>Copyright 1989, Oxford University Press</p>
</availability>
</publicationStmt>
<publicationStmt>
<authority>James D. Benson</authority>
<pubPlace>London</pubPlace>
<date>1994</date>
</publicationStmt>
A resource may have (for example) both a publisher and a distributor, or more than one publisher each using different identifiers for the same resource, and so on. For this reason, the sequence of at least one model.publicationStmtPart.agency element followed by zero or more model.publicationStmtPart.detail elements may be repeated as often as necessary.
<publicationStmt>
<publisher>Sigma Press</publisher>
<address>
<addrLine>21 High Street,</addrLine>
<addrLine>Wilmslow,</addrLine>
<addrLine>Cheshire M24 3DF</addrLine>
</address>
<date>1991</date>
<distributor>Oxford Text Archive</distributor>
<idno type="OTA">1256</idno>
<availability>
<p>Available with prior consent of depositor for
purposes of academic research and teaching only.</p>
</availability>
<date>1994</date>
</publicationStmt>
The date element used within publicationStmt always refers to the date of publication, first distribution, or initial release. If the text was created at some other date, this may be recorded using the creation element within the profileDesc element. Other useful dates (such as dates of collection of data) may be given using a note in the notesStmt element.
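The three places for dates mentioned above can be sketched in one header. The publisher name, all three dates, and the prose content are invented for illustration only.

```xml
<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt>
      <title>An Example Corpus: electronic version</title>
    </titleStmt>
    <publicationStmt>
      <publisher>Example Press</publisher>
      <!-- date of publication, first distribution, or initial release -->
      <date when="1994">1994</date>
    </publicationStmt>
    <notesStmt>
      <!-- other useful dates are given as a note -->
      <note>Data collected between 1989 and 1991.</note>
    </notesStmt>
    <sourceDesc>
      <p>Compiled from field recordings (placeholder).</p>
    </sourceDesc>
  </fileDesc>
  <profileDesc>
    <!-- date at which the text itself was created -->
    <creation>
      <date when="1992">1992</date>
    </creation>
  </profileDesc>
</teiHeader>
```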
<publicationStmt>
<publisher>University of Victoria Humanities Computing and Media Centre</publisher>
<pubPlace>Victoria, BC</pubPlace>
<date>2011</date>
<availability status="restricted">
<licence target="http://creativecommons.org/licenses/by-sa/3.0/"> Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License
</licence>
</availability>
</publicationStmt>
2.2.5 The Series Statement
The seriesStmt element is the fifth component of the fileDesc element and is optional.
- seriesStmt (series statement) groups information about the series, if any, to which a publication belongs.
In bibliographic parlance, a series may be defined in one of the following ways:
- A group of separate items related to one another by the fact that each item bears, in addition to its own title proper, a collective title applying to the group as a whole. The individual items may or may not be numbered.
- Each of two or more volumes of essays, lectures, articles, or other items, similar in character and issued in sequence.
- A separately numbered sequence of volumes within a series or serial.
A seriesStmt element may contain a prose description or one or more of the following more specific elements:
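- title contains a title for any kind of work.
- idno (identifying number) supplies any standard or non-standard number used to identify a bibliographic item.
- respStmt (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply.
- resp (responsibility) contains a phrase describing the nature of a person's intellectual responsibility.
- name (name, proper noun) contains a proper noun or noun phrase.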
The idno may be used to supply any identifying number associated with the item, including both standard numbers such as an ISSN and particular issue numbers. (Arabic numerals separated by punctuation are recommended for this purpose: 6.19.33, for example, rather than VI/xix:33). Its type attribute is used to categorize the number further, taking the value ISSN for an ISSN for example. Multiple seriesStmt elements may be supplied if the TEI document is associated with more than one series.
<seriesStmt>
<title level="s">Machine-Readable Texts for the Study of
Indian Literature</title>
<respStmt>
<resp>ed. by</resp>
<name>Jan Gonda</name>
</respStmt>
<biblScope unit="volume">1.2</biblScope>
<idno type="ISSN">0 345 6789</idno>
</seriesStmt>
2.2.6 The Notes Statement
The notesStmt element is the sixth component of the fileDesc element and is optional. If used, it contains one or more note elements, each containing a single piece of descriptive information of the kind treated as ‘general notes’ in traditional bibliographic descriptions.
- notesStmt (notes statement) collects together any notes providing information about a text additional to that recorded in other parts of the bibliographic description.
- note contains a note or annotation.
Some information found in the notes area in conventional bibliography has been assigned specific elements in these Guidelines; in particular the following items should be tagged as indicated, rather than as general notes:
- the nature, scope, artistic form, or purpose of the file; also the genre or other intellectual category to which it may belong: e.g. ‘Text types: newspaper editorials and reportage, science fiction, westerns, and detective stories’. These should be formally described within the profileDesc element (section 2.4 The Profile Description).
- an abstract or summary of the content of a document which has been supplied by the encoder because no such abstract forms part of the content of the source. This should be supplied in the abstract element within the profileDesc element (section 2.4 The Profile Description).
- summary description providing a factual, non-evaluative account of the subject content of the file: e.g. ‘Transcribes interviews on general topics with native speakers of English in 17 cities during the spring and summer of 1963.’ These should also be formally described within the profileDesc element (section 2.4 The Profile Description).
- bibliographic details relating to the source or sources of an electronic text: e.g. ‘Transcribed from the Norton facsimile of the 1623 Folio’. These should be formally described in the sourceDesc element (section 2.2.7 The Source Description).
- further information relating to publication, distribution, or release of the text, including sources from which the text may be obtained, any restrictions on its use or formal terms on its availability. These should be placed in the appropriate division of the publicationStmt element (section 2.2.4 Publication, Distribution, Licensing, etc.).
- publicly documented numbers associated with the file: e.g. ‘ICPSR study number 1803’ or ‘Oxford Text Archive text number 1243’. These should be placed in an idno element within the appropriate division of the publicationStmt element. International Standard Serial Numbers (ISSN), International Standard Book Numbers (ISBN), and other internationally agreed upon standard numbers that uniquely identify an item, should be treated in the same way, rather than as specialized bibliographic notes.
Nevertheless, the notesStmt element may be used to record potentially significant details about the file and its features, e.g.:
- dates, when they are relevant to the content or condition of the computer file: e.g. ‘manual dated 1983’, ‘Interview wave I: Apr. 1989; wave II: Jan. 1990’
- names of persons or bodies connected with the technical production, administration, or consulting functions of the effort which produced the file, if these are not named in statements of responsibility in the title or edition statements of the file description: e.g. ‘Historical commentary provided by Mark Cohen’
- availability of the file in an additional medium or information not already recorded about the availability of documentation: e.g. ‘User manual is loose-leaf in eleven paginated sections’
- language of work and abstract, if not encoded in the langUsage element, e.g. ‘Text in English with summaries in French and German’
- The unique name assigned to a serial by the International Serials Data System (ISDS), if not encoded in an idno
- lists of related publications, either describing the source itself, or concerned with the creation or use of the electronic work, e.g. ‘Texts used in Burrows (1987)’
<notesStmt>
<note>Historical commentary provided by Mark Cohen.</note>
<note>OCR scanning done at University of Toronto.</note>
</notesStmt>
<titleStmt>
<title>…</title>
<respStmt>
<persName>Mark Cohen</persName>
<resp>historical commentary</resp>
</respStmt>
<respStmt>
<orgName>University of Toronto</orgName>
<resp>OCR scanning</resp>
</respStmt>
</titleStmt>
2.2.7 The Source Description
The sourceDesc element is the seventh and final component of the fileDesc element. It is a mandatory element and is used to record details of the source or sources from which a computer file is derived. This might be a printed text or manuscript, another computer file, an audio or video recording of some kind, or a combination of these. An electronic file may also have no source, if what is being catalogued is an original text created in electronic form.
- sourceDesc (source description) describes the source(s) from which an electronic text was derived or generated.
<sourceDesc>
<p>Born digital.</p>
</sourceDesc>
Alternatively, it may contain elements drawn from the following three classes:
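- model.biblLike groups elements containing a bibliographic description.
bibl (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.
biblFull (fully-structured bibliographic citation) contains a fully-structured bibliographic citation, in which all components of the TEI file description are present.
biblStruct (structured bibliographic citation) contains a structured bibliographic citation, in which only bibliographic sub-elements appear and in a specified order.
listBibl (citation list) contains a list of bibliographic citations of any kind.
msDesc (manuscript description) contains a description of a single identifiable manuscript.
- model.sourceDescPart groups elements which may be used inside sourceDesc and appear multiple times.
recordingStmt (recording statement) describes a set of recordings used as the basis for transcription of a spoken text.
scriptStmt (script statement) contains a citation giving details of the script used for a spoken text.
- model.listLike groups list-like elements.
list contains any sequence of items organized as a list.
listApp (list of apparatus entries) contains a list of apparatus entries.
listEvent (list of events) contains a list of descriptions, each of which provides information about an identifiable event.
listNym (list of canonical names) contains a list of standardized names for any thing.
listObject (list of objects) contains a list of descriptions, each of which provides information about an identifiable physical object.
listOrg (list of organizations) contains a list of descriptions, each of which provides information about an identifiable organization.
listPerson (list of persons) contains a list of descriptions, each of which provides information about an identifiable person or group of people, for example the participants in a language interaction, or the people referred to in a historical source.
listPlace (list of places) contains a list of places, optionally followed by a list of relationships (other than containment) defined amongst them.
listRelation provides information about relationships identified amongst people, places, and organizations, either informally as prose or as formally expressed relation links.
listWit (witness list) lists definitions for all the witnesses referred to by a critical apparatus, optionally grouped hierarchically.
table contains text displayed in tabular form, in rows and columns.
- bibl (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged.
- biblStruct (structured bibliographic citation) contains a structured bibliographic citation, in which only bibliographic sub-elements appear and in a specified order.
- listBibl (citation list) contains a list of bibliographic citations of any kind.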
<sourceDesc>
<bibl>The first folio of Shakespeare, prepared by
Charlton Hinman (The Norton Facsimile, 1968)</bibl>
</sourceDesc>
<sourceDesc>
<biblStruct xml:lang="fr">
<monogr>
<author>Eugène Sue</author>
<title>Martin, l'enfant trouvé</title>
<title type="sub">Mémoires d'un valet de chambre</title>
<imprint>
<pubPlace>Bruxelles et Leipzig</pubPlace>
<publisher>C. Muquardt</publisher>
<date when="1846">1846</date>
</imprint>
</monogr>
</biblStruct>
</sourceDesc>
When the header describes a text derived from some pre-existing TEI-conformant or other digital document, it may be simpler to use the following element, which is designed specifically for documents derived from texts which were ‘born digital’:
- biblFull (fully-structured bibliographic citation) contains a fully-structured bibliographic citation, in which all components of the TEI file description are present.
For further discussion see section 2.2.8 Computer Files Derived from Other Computer Files.
When the module for manuscript description is included in a schema, this class also makes available the following element:
- msDesc (manuscript description) contains a description of a single identifiable manuscript.
This element enables the encoder to record very detailed information about one or more manuscript or analogous sources, as further discussed in 10 Manuscript Description.
The model.sourceDescPart class also makes available additional elements when additional modules are included. For example, when the spoken module is included, the sourceDesc element may also include the following special-purpose elements, intended for cases where an electronic text is derived from a spoken text rather than a written one:
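- scriptStmt (script statement) contains a citation giving details of the script used for a spoken text.
- recordingStmt (recording statement) describes a set of recordings used as the basis for transcription of a spoken text.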
Full descriptions of these elements and their contents are given in section 8.2 Documenting the Source of Transcribed Speech.
A single electronic text may be derived from multiple source documents, in whole or in part. The sourceDesc may therefore contain a listBibl element grouping together bibl, biblStruct, or msDesc elements for each of the sources concerned. It is also possible to repeat the sourceDesc element in such a case. The decls attribute described in section 15.3 Associating Contextual Information with a Text may be used to associate parts of the encoded text with the bibliographic element from which it derives in either case.
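The following fragment is a minimal sketch of the first approach, assuming a text drawn from two sources. The identifiers srcA and srcB and the citation texts are hypothetical; the decls attribute on each div points at the bibl from which that division derives.

```xml
<sourceDesc>
  <listBibl>
    <!-- hypothetical identifiers and citations, for illustration only -->
    <bibl xml:id="srcA">First source (placeholder citation)</bibl>
    <bibl xml:id="srcB">Second source (placeholder citation)</bibl>
  </listBibl>
</sourceDesc>
<!-- ... later, within the text ... -->
<body>
  <!-- each division declares the source it was transcribed from -->
  <div decls="#srcA">
    <p>Material derived from the first source.</p>
  </div>
  <div decls="#srcB">
    <p>Material derived from the second source.</p>
  </div>
</body>
```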
The source description may also include lists of names, persons, places, etc. when these are considered to form part of the source for an encoded document. When such information is recorded using the specialized elements discussed in the namesdates module (13 Names, Dates, People, and Places), the class model.listLike makes available the following elements to hold such information:
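- listNym (list of canonical names) contains a list of standardized names for any thing.
- listOrg (list of organizations) contains a list of descriptions, each of which provides information about an identifiable organization.
- listPerson (list of persons) contains a list of descriptions, each of which provides information about an identifiable person or group of people, for example the participants in a language interaction, or the people referred to in a historical source.
- listPlace (list of places) contains a list of places, optionally followed by a list of relationships (other than containment) defined amongst them.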
2.2.8 Computer Files Derived from Other Computer Files
If a computer file (call it B) is derived not from a printed source but from another computer file (call it A) which includes a TEI header, then the source text of computer file B is another computer file, A. The five sections of A's file header will need to be incorporated into the new header for B in slightly differing ways, as listed below:
- fileDesc
- A's file description should be copied into the sourceDesc section of B's file description, enclosed within a biblFull element
- profileDesc
- A's profileDesc should be copied into B's, in principle unchanged; it may however be expanded by project-specific information relating to B.
- encodingDesc
- A's encoding practice may or (more likely) may not be the same as B's. Since the object of the encoding description is to define the relationship between the current file and its source, in principle only changes in encoding practice between A and B need be documented in B. The relationship between A and its source(s) is then only recoverable from the original header of A. In practice it may be more convenient to create a new complete encodingDesc for B based on A's.
- xenoData
- B is a new computer file, with a different source than A's source (namely, A). Thus it is unlikely that metadata from other schemes about A or its source can be copied wholesale to B, although there may be similarities.
- revisionDesc
- B is a new computer file, and should therefore have a new revision description. If, however, it is felt useful to include some information from A's revisionDesc, for example dates of major updates or versions, such information must be clearly marked as relating to A rather than to B.
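As a rough sketch of the fileDesc case, the source description of B might wrap a copy of A's file description in a biblFull element. The title, distributor, date, and inner source citation are placeholders, not taken from any real header.

```xml
<!-- inside the fileDesc of file B -->
<sourceDesc>
  <biblFull>
    <!-- copied from the fileDesc of file A (all values hypothetical) -->
    <titleStmt>
      <title>Title of file A</title>
    </titleStmt>
    <publicationStmt>
      <distributor>Hypothetical Text Archive</distributor>
      <date>2010</date>
    </publicationStmt>
    <!-- A's own source description is carried along unchanged -->
    <sourceDesc>
      <bibl>The printed source from which A was transcribed</bibl>
    </sourceDesc>
  </biblFull>
</sourceDesc>
```

Keeping A's sourceDesc nested inside the biblFull preserves the chain back to the original printed source.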
This concludes the discussion of the fileDesc element and its contents.
2.3 The Encoding Description
The encodingDesc element is the second major subdivision of the TEI header. It specifies the methods and editorial principles which governed the transcription or encoding of the text in hand and may also include sets of coded definitions used by other components of the header. Though not formally required, its use is highly recommended.
- encodingDesc (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived.
The encoding description may contain any combination of paragraphs of text, marked up using the p element, along with more specialized elements taken from the model.encodingDescPart class. By default, this class makes available the following elements:
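- projectDesc (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.
- samplingDecl (sampling declaration) contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection.
- editorialDecl (editorial practice declaration) provides details of the editorial principles and practices applied during the encoding of a text.
- tagsDecl (tagging declaration) provides detailed information about the tagging applied to a document.
- styleDefDecl (style definition language declaration) specifies the name of the formal language in which style or renditional information is supplied elsewhere in the document. The specific version of the scheme may also be supplied.
- refsDecl (references declaration) specifies how canonical references are constructed for this text.
- classDecl (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.
- geoDecl (geographic coordinates declaration) documents the notation and the datum used for geographic coordinates expressed as content of the geo element elsewhere within the document.
- unitDecl (unit declarations) provides information about units of measurement that are not members of the International System of Units.
- schemaSpec (schema specification) generates a TEI-conformant schema and documentation for it.
- schemaRef (schema reference) describes or points to a related customization or schema file.
Each of these elements is further described in the appropriate section below. Other modules have the ability to extend this class; examples are noted in section 2.3.12 Module-Specific Declarations.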
2.3.1 The Project Description
The projectDesc element may be used to describe, in prose, the purpose for which a digital resource was created, together with any other relevant information concerning the process by which it was assembled or collected. This is of particular importance for corpora or miscellaneous collections, but may be of use for any text, for example to explain why one kind of encoding practice has been followed rather than another.
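- projectDesc (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected.
<encodingDesc>
<projectDesc>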
<p>Texts collected for use in the
Claremont Shakespeare Clinic, June 1990.</p>
</projectDesc>
</encodingDesc>
2.3.2 The Sampling Declaration
- samplingDecl (sampling declaration) contains a prose description of the rationale and methods used in sampling texts in the creation of a corpus or collection.
- the size of individual samples
- the method or methods by which they were selected
- the underlying population being sampled
- the object of the sampling procedure used
<samplingDecl>
<p>Samples of 2000 words taken from the beginning of the text.</p>
</samplingDecl>
<samplingDecl>
<p>Text of stories only has been transcribed. Pull quotes, captions,
and advertisements have been silently omitted. Any mathematical
expressions requiring symbols not present in the ISOnum or ISOpub
entity sets have been omitted, and their place marked with a GAP
element.</p>
</samplingDecl>
A sampling declaration which applies to more than one text or division of a text need not be repeated in the header of each such text. Instead, the decls attribute of each text (or subdivision of the text) to which the sampling declaration applies may be used to supply a cross-reference to it, as further described in section 15.3 Associating Contextual Information with a Text.
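A minimal sketch of this cross-referencing, assuming a declaration shared by several texts: the samplingDecl is given an identifier (sampleFirst2000 here is a hypothetical name), and each text to which it applies points to it with decls.

```xml
<encodingDesc>
  <!-- hypothetical identifier; declared once in a shared header -->
  <samplingDecl xml:id="sampleFirst2000">
    <p>Samples of 2000 words taken from the beginning of the text.</p>
  </samplingDecl>
</encodingDesc>
<!-- ... -->
<text decls="#sampleFirst2000">
  <body>
    <p>Sampled text goes here.</p>
  </body>
</text>
```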
2.3.3 The Editorial Practices Declaration
The editorialDecl element is used to provide details of the editorial practices applied during the encoding of a text.
- editorialDecl (editorial practice declaration) provides details of the editorial principles and practices applied during the encoding of a text.
It may contain a prose description only, or one or more of a set of specialized elements, members of the TEI model.editorialDeclPart class. Where an encoder wishes to record an editorial policy not specified above, this may be done by adding a new element to this class, using the mechanisms discussed in chapter 23.3 Customization.
Some of these policy elements carry attributes to support automated processing of certain well-defined editorial decisions; all of them contain a prose description of the editorial principles adopted with respect to the particular feature concerned. Examples of the kinds of questions which these descriptions are intended to answer are given in the list below.
- correction
- correction (correction principles) states how and under what circumstances corrections have been made in the text.
status indicates the degree of correction applied to the text.
method indicates the method adopted to indicate corrections within the text.
Was the text corrected during or after data capture? If so, were corrections made silently or are they marked using the tags described in section 3.4 Simple Editorial Changes? What principles have been adopted with respect to omissions, truncations, dubious corrections, alternate readings, false starts, repetitions, etc.?
- normalization
- normalization indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form.
source [att.global.source] specifies the source from which some aspect of this element is drawn.
method indicates the method adopted to indicate normalizations within the text.
Was the text normalized, for example by regularizing any non-standard spellings, dialect forms, etc.? If so, were normalizations performed silently or are they marked using the tags described in section 3.4 Simple Editorial Changes? What authority was used for the regularization? Also, what principles were used when normalizing numbers to provide the standard values for the value attribute described in section 3.5.3 Numbers and Measures and what format used for them?
- punctuation
- punctuation specifies editorial practice adopted with respect to punctuation marks in the original.
marks indicates whether or not punctuation marks have been retained as content within the text.
placement indicates the positioning of punctuation marks that are associated with marked up text as being encoded within the element surrounding the text or immediately before or after it.
Are punctuation marks present in the original source retained? Are they identified with the element pc, or implied by markup? If retained, how are they placed with respect to related elements? For example, do commas and periods appear inside or outside elements marking phrases and sentences?
- quotation
- quotation specifies editorial practice adopted with respect to quotation marks in the original.
marks (quotation marks) indicates whether or not quotation marks have been retained as content within the text.
How were quotation marks processed? Are apostrophes and quotation marks distinguished? How? Are quotation marks retained as content in the text or replaced by markup? Are there any special conventions regarding for example the use of single or double quotation marks when nested? Is the file consistent in its practice or has this not been checked? See section 3.3.3 Quotation for discussion of ways in which quotation marks may be encoded.
- hyphenation
- hyphenation summarizes the way in which hyphenation in a source text has been treated in an encoded version of it.
eol (end-of-line) indicates whether or not end-of-line hyphenation has been retained in a text.
Does the encoding distinguish ‘soft’ and ‘hard’ hyphens? What principle has been adopted with respect to end-of-line hyphenation where source lineation has not been retained? Have soft hyphens been silently removed, and if so what is the effect on lineation and pagination? See section 3.2.2 Hyphenation for discussion of ways in which hyphenation may be encoded.
- segmentation
- segmentation describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc.
How is the text segmented? If s or seg segmentation units have been used to divide up the text for analysis, how are they marked and how was the segmentation arrived at?
- stdVals
- stdVals (standard values) specifies the format used when standardized date or number values are supplied.
In most cases, attributes bearing standardized values (such as the when or when-iso attribute on dates) should conform to a defined W3C or ISO datatype. In cases where this is not appropriate, this element may be used to describe the standardization methods underlying the values supplied.
- interpretation
- interpretation describes the scope of any analytic or interpretive information added to the text in addition to the transcription.
Has any analytic or ‘interpretive’ information been provided—that is, information which is felt to be non-obvious, or potentially contentious? If so, how was it generated? How was it encoded? If feature-structure analysis has been used, are fsdDecl elements (section 18.11 Feature System Declaration) present?
<editorialDecl>
<segmentation>
<p>
<gi>s</gi> elements mark orthographic sentences and
are numbered sequentially
within their parent <gi>div</gi> element
</p>
</segmentation>
<interpretation>
<p>The part of speech analysis applied throughout section 4 was
added by hand and has not been checked.</p>
</interpretation>
<correction>
<p>Errors in transcription controlled by using the
WordPerfect spelling checker.</p>
</correction>
<normalization source="http://szotar.sztaki.hu/webster/">
<p>All words converted to Modern American spelling following
Websters 9th Collegiate dictionary.</p>
</normalization>
<quotation marks="all">
<p>All opening quotation marks represented by entity reference
<ident type="ge">odq</ident>; all closing quotation marks
represented by entity reference <ident type="ge">cdq</ident>.</p>
</quotation>
</editorialDecl>
An editorial practices declaration which applies to more than one text or division of a text need not be repeated in the header of each such text. Instead, the decls attribute of each text (or subdivision of the text) to which it applies may be used to supply a cross-reference to it, as further described in section 15.3 Associating Contextual Information with a Text.
2.3.4 The Tagging Declaration
The tagsDecl element is used to record the following information about the tagging used within a particular document:
- the namespace to which elements appearing within the transcribed text belong.
- how often particular elements appear within the text, so that a recipient can validate the integrity of a text during interchange.
- any comment relating to the usage of particular elements not specified elsewhere in the header.
- a default rendition applicable to all instances of an element.
This information is conveyed by the following elements:
- rendition supplies information about the appearance of an element in the source text.
selector contains a selector or series of selectors specifying the elements to which the contained style description applies, expressed in the language specified in the scheme attribute.
- att.styleDef provides attributes to specify the name of a formal definition language used to provide formatting or rendition information.
scheme identifies the language used to describe the rendition.
schemeVersion supplies a version number for the style language provided in scheme.
- namespace supplies the formal name of the namespace to which the elements documented by its children belong.
- tagUsage supplies information about the usage of a specific element within a text.
The tagsDecl element is descriptive, rather than prescriptive: if used, it simply documents practice in the TEI document containing it. The elements constituting a TEI customization file (discussed in chapter 22 Documentation Elements) by contrast document expected practice in a number of documents, and may thus be used prescriptively. If there is an inconsistency between the actual state of a document and what is documented by its tagsDecl, then the latter should be corrected. If there is an inconsistency between a document and what is required by the customization file, or a schema derived from it, then it will usually be the document that requires correction.
The tagsDecl element consists of an optional sequence of rendition elements, each of which must bear a unique identifier, followed by an optional sequence of one or more namespace elements, each of which contains a series of tagUsage elements, up to one for each element type from that namespace occurring within the associated text element. Note that these tagUsage elements must be nested within a namespace element, and cannot appear directly within the tagsDecl element.
2.3.4.1 Rendition
The rendition element allows the encoder to specify how one or more elements are rendered in the original source in any of the following ways:
- using an informal prose description
- using a standard stylesheet language such as CSS or XSL-FO
- using a project-defined formal language
One or more such specifications may be associated with elements of a document in two ways:
- the selector attribute on any rendition element may be used to select a collection of elements to which it applies
- the global rendition attribute may be used on any element to indicate its rendition, overriding or complementing any supplied default value
The global rend and style attributes may also be used to describe the rendering of an element. See further 1.3.1.1.3 Rendition Indicators.
The content of a rendition element may describe the appearance of the source material using prose, a project-defined formal language, or any standard languages such as the Cascading Stylesheet Language (Bos et al. (eds.) (2011)) or the XML vocabulary for specifying formatting semantics which forms a part of the W3C's Extensible Stylesheet Language (Berglund (ed.) (2006)). A styleDefDecl element (2.3.5 The Default Style Definition Language Declaration) may be supplied within the encodingDesc to specify which of these applies by default, and it may be overridden for one or more specific rendition elements using the scheme attribute.
font-size: 110%;
margin-top: 0.5em;
margin-bottom: 0.5em;
</rendition>
font-size: 100%;
margin-top: 0;
margin-bottom: 0;
</rendition>
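As a further hedged illustration of the two association mechanisms described above, the sketch below uses selector to give hi and emph elements a default rendition, and the global rendition attribute to override it on one instance. The identifiers defaultItalic and bold are hypothetical, and CSS is assumed as the style language.

```xml
<tagsDecl>
  <!-- default: applies to every hi and emph element selected here -->
  <rendition xml:id="defaultItalic" scheme="css"
             selector="hi, emph">font-style: italic;</rendition>
  <!-- no selector: applies only where explicitly referenced -->
  <rendition xml:id="bold" scheme="css">font-weight: bold;</rendition>
</tagsDecl>
<!-- ... later, within the text ... -->
<p>An <hi>italicized</hi> word and
   a <hi rendition="#bold">bold</hi> one.</p>
```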
In the following extended example we consider how to capture the appearance of a typical early 20th century titlepage, such as that in the following figure:
Elements for the encoding of the information on a titlepage are presented in 4.6 Title Pages; here we consider how we might go about encoding some of the visual information as well, using the rendition element and its corresponding attributes.
<styleDefDecl scheme="css" schemeVersion="2.1"/>
<!-- ... -->
<tagsDecl>
<rendition xml:id="center">text-align: center;</rendition>
<rendition xml:id="small">font-size: small;</rendition>
<rendition xml:id="large">font-size: large;</rendition>
<rendition xml:id="x-large">font-size: x-large;</rendition>
<rendition xml:id="xx-large">font-size: xx-large</rendition>
<rendition xml:id="expanded">letter-spacing: +3pt;</rendition>
<rendition xml:id="x-space">line-height: 150%;</rendition>
<rendition xml:id="xx-space">line-height: 200%;</rendition>
<rendition xml:id="red">color: red;</rendition>
</tagsDecl>
<titlePage>
<docTitle rendition="#center #x-space">
<titlePart>
<lb/>
<hi rendition="#x-large">THE POEMS</hi>
<lb/>
<hi rendition="#small">OF</hi>
<lb/>
<hi rendition="#red #xx-large">ALGERNON CHARLES SWINBURNE</hi>
<lb/>
<hi rendition="#large #xx-space">IN SIX VOLUMES</hi>
</titlePart>
<titlePart rendition="#xx-space">
<lb/> VOLUME I.
<lb/>
<hi rendition="#red #x-large">POEMS AND BALLADS</hi>
<lb/>
<hi rendition="#x-space">FIRST SERIES</hi>
</titlePart>
</docTitle>
<docImprint rendition="#center">
<lb/>
<pubPlace rendition="#xx-space">LONDON</pubPlace>
<lb/>
<publisher rendition="#red #expanded">CHATTO &amp; WINDUS</publisher>
<lb/>
<docDate when="1904" rendition="#small">1904</docDate>
</docImprint>
</titlePage>
When CSS is used as the style definition language, the scope attribute may be used to specify CSS pseudo-elements. These pseudo-elements are used to specify styling applicable to only a portion of the given text. For example, the first-letter pseudo-element defines styling to be applied to the first letter in the targeted element, while the before and after pseudo-elements can be used, often in conjunction with the "content" property, to add characters before or after the element content so that it more closely resembles the appearance of the source.
<rendition xml:id="quoteBefore"
scheme="css" scope="before">content:
'“';</rendition>
<rendition xml:id="quoteAfter" scheme="css"
scope="after">content:
'”';</rendition>
The rendition elements above have the identifiers quoteBefore and quoteAfter. Where a q element is actually rendered in the source with initial and final quotation marks, it may then be encoded as follows:
<q rendition="#quoteBefore #quoteAfter">… ago...</q>
2.3.4.2 Tag Usage
As noted above, each namespace element, if present, should contain up to one occurrence of a tagUsage element for each element type from the given namespace that occurs within the outermost text element associated with the teiHeader in which it appears. The tagUsage element may be used to supply a count of the number of occurrences of this element within the text, which is given as the value of its occurs attribute. It may also be used to hold any additional usage information, which is supplied as running prose within the element itself.
</tagUsage>
(1734) edition only </tagUsage>
The content of the tagUsage element is not susceptible of automatic processing. It should not therefore be used to hold information for which provision is already made by other components of the encoding description. A TEI-conformant document is not required to provide any tagUsage elements or occurs attributes, but if it does, then the counts provided must correspond with the number of such elements present in the associated text.
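The requirement that counts match the text can be seen in a small sketch such as the following. The text content is invented for illustration, and the partial attribute is set on the assumption that only some element types (here p and hi, but not text or body) are being documented.

```xml
<tagsDecl partial="true">
  <namespace name="http://www.tei-c.org/ns/1.0">
    <!-- three p elements occur in the text below -->
    <tagUsage gi="p" occurs="3"/>
    <!-- two hi elements occur; the prose is free-form usage information -->
    <tagUsage gi="hi" occurs="2">Used only to mark words
      italicized in the source.</tagUsage>
  </namespace>
</tagsDecl>
<!-- ... -->
<text>
  <body>
    <p>First <hi>paragraph</hi>.</p>
    <p>Second paragraph.</p>
    <p>Third <hi>paragraph</hi>.</p>
  </body>
</text>
```

If a paragraph were later added to the text, the occurs value for p would need to be updated to 4 for the counts to remain consistent.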
2.3.5 The Default Style Definition Language Declaration
The content of the rendition element, the value of its selector attribute, and the value of the style attribute are expressed using one of a small number of formally defined style definition languages. For ease of processing, it is strongly recommended to use a single such language throughout an encoding project, although the TEI system permits a mixture.
The element styleDefDecl, a sibling of the tagsDecl element, is used to supply the name of the default style definition language. The name is supplied as the value of the scheme attribute and may take any of the following values:
- free
- Informal free text description
- css
- Cascading Stylesheet Language
- xslfo
- Extensible Stylesheet Language Formatting Objects
- other
- A user-defined formal description language
The schemeVersion attribute may be used to supply the precise version of the style definition language used, and the content of this element, if any, may supply additional information.
When the style attribute is used, its value must always be expressed using whichever default style definition language is in force. If more than one occurrence of the styleDefDecl is provided, there will be more than one default available, and the decls attribute must be used to select which is applicable in a given context, as discussed in section 15.3 Associating Contextual Information with a Text.
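A minimal sketch of such a declaration follows. It assumes a project that uses CSS throughout; the version number, the rendition identifiers, and the hi element shown are illustrative choices rather than requirements.

```xml
<encodingDesc>
  <!-- CSS 2.1 is declared as the default style definition language -->
  <styleDefDecl scheme="css" schemeVersion="2.1"/>
  <tagsDecl>
    <rendition xml:id="boldface">font-weight: bold;</rendition>
    <rendition xml:id="italicstyle">font-style: italic;</rendition>
  </tagsDecl>
</encodingDesc>
<!-- elsewhere in the document: this style value is read as CSS -->
<hi style="font-variant: small-caps;">Preface</hi>
```

Because only one styleDefDecl is present here, no decls attribute is needed to choose between defaults.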
2.3.6 The Reference System Declaration
The refsDecl element is used to document the way in which any standard referencing scheme built into the encoding works.
- refsDecl (references declaration) specifies how canonical references are constructed for this text.
It may contain either a series of prose paragraphs or the following specialized elements:
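- cRefPattern (canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI.
- refState (reference state) specifies one component of a canonical reference defined by the milestone method.
- att.patternReplacement provides attributes for regular-expression matching and replacement.
  - matchPattern specifies a regular expression against which the values of other attributes can be matched.
  - replacementPattern specifies a ‘replacement pattern’ which, once subpattern substitution has been performed, provides a URI.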
Note that not all possible referencing schemes are equally easily supported by current software systems. A choice must be made between the convenience of the encoder and the likely efficiency of the particular software applications envisaged, in this context as in many others. For a more detailed discussion of referencing systems supported by these Guidelines, see section 3.10 Reference Systems below.
A referencing scheme may be described in one of three ways using this element:
- as a prose description
- as a series of pairs of regular expressions and XPaths
- as a concatenation of sequentially organized milestones
Each method is described in more detail below. Only one method can be used within a single refsDecl element.
More than one refsDecl element can be included in the header if more than one canonical reference scheme is to be used in the same document, but the current proposals do not check for mutual inconsistency.
2.3.6.1 Prose Method
The referencing scheme may be specified within the refsDecl by a simple prose description. Such a description should indicate which elements carry identifying information, and whether this information is represented as attribute values or as content. Any special rules about how the information is to be interpreted when reading or generating a reference string should also be specified here. Such a prose description cannot be processed automatically, and this method of specifying the structure of a canonical reference system is therefore not recommended for automatic processing.
<refsDecl>
<p>The <att>n</att> attribute of each text in this corpus carries a
unique identifying code for the whole text. The title of the text is
held as the content of the first <gi>head</gi> element within each
text. The <att>n</att> attribute on each <gi>div1</gi> and
<gi>div2</gi> contains the canonical reference for each such
division, in the form 'XX.yyy', where XX is the book number in Roman
numerals, and yyy the section number in arabic. Line breaks are
marked by empty <gi>lb</gi> elements, each of which includes the
through line number in Casaubon's edition as the value of its
<att>n</att> attribute.</p>
<p>The through line number and the text identifier uniquely identify
any line. A canonical reference may be made up by concatenating the
<att>n</att> values from the <gi>text</gi>, <gi>div1</gi>, or
<gi>div2</gi> and calculating the line number within each part.</p>
</refsDecl>
2.3.6.2 Search-and-Replace Method
This method often requires a significant investment of effort initially, but permits extremely flexible addressing. For details, see section 16.2.5 Canonical References.
- cRefPattern (canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI.
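The following sketch shows what a declaration using this method might look like. It assumes a text in which each book is a div identified by its n attribute, containing one div per chapter, each containing one div per verse; that structure, and the use of an xpath() pointer in the replacement pattern, are assumptions made for illustration.

```xml
<refsDecl>
  <!-- $1 = book, $2 = chapter, $3 = verse, captured by the three groups -->
  <cRefPattern matchPattern="([A-Za-z0-9]+) ([0-9]+):([0-9]+)"
    replacementPattern="#xpath(//body/div[@n='$1']/div[$2]/div[$3])">
    <p>Matches references of the form <q>Matt 5:7</q> and rewrites them
      as a pointer to the corresponding verse division.</p>
  </cRefPattern>
</refsDecl>
```

Under these assumptions a reference such as ‘Matt 5:7’ would be rewritten to a pointer selecting the seventh div inside the fifth div of the div whose n value is ‘Matt’.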
2.3.6.3 Milestone Method
This method is appropriate when only ‘milestone’ tags (see section 3.10.3 Milestone Elements) are available to provide the required referencing information. It does not provide any abilities which cannot be mimicked by the search-and-replace referencing method discussed in the previous section, but in the cases where it applies, it provides a somewhat simpler notation.
A reference based on milestone tags concatenates the values specified by one or more such tags. Since each tag marks the point at which a value changes, it may be regarded as specifying the refState of a variable. A reference declaration using this method therefore specifies the individual components of the canonical reference as a sequence of refState elements:
- refState (reference state) specifies one component of a canonical reference defined by the milestone method.
  - delim (delimiter) supplies a delimiting string following the reference component.
  - length specifies the fixed length of the reference component.
- att.milestoneUnit provides an attribute to indicate the type of section which is changing at a specific milestone.
  - unit provides a conventional name for the kind of section changing at this milestone. Suggested values include: 1] page; 2] column; 3] line; 4] book; 5] poem; 6] canto; 7] speaker; 8] stanza; 9] act; 10] scene; 11] section; 12] absent; 13] unnumbered
For example, the reference ‘Matthew 12:34’ might be thought of as representing the state of three variables: the book variable is in state ‘Matthew’; the chapter variable is in state ‘12’, and the verse variable is in state ‘34’. If milestone tagging has been used, there should be a tag marking the point in the text at which each of the above ‘variables’ changes its state.8 To find ‘Matthew 12:34’ therefore an application must scan left to right through the text, monitoring changes in the state of each of these three variables as it does so. When all three are simultaneously in the required state, the desired point will have been reached. There may of course be several such points.
The delim and length attributes are used to specify components of a canonical reference using this method in exactly the same way as for the stepwise method described in the preceding section. The other attributes are used to determine which instances of milestone tags in the text are to be checked for state-changes. A state-change is signalled whenever a new milestone tag is found with unit and, optionally, ed attributes identical to those of the refState element in question. The value for the new state may be given explicitly by the n attribute on the milestone element, or it may be implied, if the n attribute is not specified.
<refsDecl>
<refState ed="first" unit="page" length="2" delim="."/>
<refState ed="first" unit="line" length="3"/>
</refsDecl>
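A text encoded to match the two refState elements above would carry milestone tags whose unit and ed values agree with them. The fragment below is a hypothetical sketch: the page and line values and the surrounding text are invented.

```xml
<!-- Hypothetical text fragment; page and line values are invented -->
<p>
  <milestone ed="first" unit="page" n="02"/>
  <milestone ed="first" unit="line" n="014"/>... end of one line
  <milestone ed="first" unit="line" n="015"/>start of the next line ...
</p>
```

With a two-character page component followed by the delimiter ‘.’ and a three-character line component, the reference ‘02.015’ would identify the point immediately after the last milestone shown.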
The milestone referencing scheme, though conceptually simple, is not supported by a generic XML parser. Its use places a correspondingly greater burden of verification and accuracy on the encoder.
A reference system declaration which applies to more than one text or division of a text need not be repeated in the header of each such text. Instead, the decls attribute of each text (or subdivision of the text) to which the declaration applies may be used to supply a cross-reference to it, as further described in section 15.3 Associating Contextual Information with a Text.
2.3.7 The Classification Declaration
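- classDecl (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text.
- taxonomy defines a typology either implicitly, by means of a bibliographic citation, or explicitly, by a structured taxonomy.
- category contains an individual descriptive category within a user-defined taxonomy, possibly nested within a superordinate category.
- catDesc (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal text description.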
<bibl>
<title>Dewey Decimal Classification</title>
<edition>Abridged Edition 12</edition>
</bibl>
</taxonomy>
<bibl>Brown Corpus</bibl>
<category xml:id="b.a">
<catDesc>Press Reportage</catDesc>
<category xml:id="b.a1">
<catDesc>Daily</catDesc>
</category>
<category xml:id="b.a2">
<catDesc>Sunday</catDesc>
</category>
<category xml:id="b.a3">
<catDesc>National</catDesc>
</category>
<category xml:id="b.a4">
<catDesc>Provincial</catDesc>
</category>
<category xml:id="b.a5">
<catDesc>Political</catDesc>
</category>
<category xml:id="b.a6">
<catDesc>Sports</catDesc>
</category>
</category>
<category xml:id="b.d">
<catDesc>Religion</catDesc>
<category xml:id="b.d1">
<catDesc>Books</catDesc>
</category>
<category xml:id="b.d2">
<catDesc>Periodicals and tracts</catDesc>
</category>
</category>
</taxonomy>
<category xml:id="poe">
<catDesc>Poetry</catDesc>
<category xml:id="sonn">
<catDesc>Sonnet</catDesc>
<category xml:id="shakesSonn">
<catDesc>Shakespearean Sonnet</catDesc>
</category>
<category xml:id="petraSonn">
<catDesc>Petrarchan Sonnet</catDesc>
</category>
</category>
<category xml:id="met">
<catDesc>Metrical Categories</catDesc>
<category xml:id="ft">
<catDesc>Metrical Feet</catDesc>
<category xml:id="iamb">
<catDesc>Iambic</catDesc>
</category>
<category xml:id="troch">
<catDesc>Trochaic</catDesc>
</category>
</category>
<category xml:id="ftNm">
<catDesc>Number of feet</catDesc>
<category xml:id="penta">
<catDesc>Pentameter</catDesc>
</category>
<category xml:id="tetra">
<catDesc>Tetrameter</catDesc>
</category>
</category>
</category>
</category>
</taxonomy>
<!-- elsewhere in document -->
<lg ana="#shakesSonn #iamb #penta">
<l>Shall I compare thee to a summer's day</l>
<!-- ... -->
</lg>
<catDesc xml:lang="pl">literatura piękna</catDesc>
<catDesc xml:lang="en">fiction</catDesc>
<category xml:id="litProza">
<catDesc xml:lang="pl">proza</catDesc>
<catDesc xml:lang="en">prose</catDesc>
</category>
<category xml:id="litPoezja">
<catDesc xml:lang="pl">poezja</catDesc>
<catDesc xml:lang="en">poetry</catDesc>
</category>
<category xml:id="litDramat">
<catDesc xml:lang="pl">dramat</catDesc>
<catDesc xml:lang="en">drama</catDesc>
</category>
</category>
2.3.8 The Geographic Coordinates Declaration
The following element is provided to indicate (within the header of a document, or in an external location) that a particular coordinate notation, or a particular datum, has been employed in a text. The default notation is a string containing two real numbers separated by whitespace, of which the first indicates latitude and the second longitude according to the 1984 World Geodetic System (WGS84).
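The element in question is geoDecl. As a minimal sketch, a header that states the default notation explicitly might look like this; the prose note, the place entry, and the approximate coordinates are illustrative assumptions.

```xml
<encodingDesc>
  <!-- States the datum explicitly, although WGS84 is already the default -->
  <geoDecl datum="WGS84">Coordinates are given as latitude followed by
    longitude, in decimal degrees.</geoDecl>
</encodingDesc>
<!-- elsewhere in the document; coordinates are approximate -->
<place>
  <placeName>Taos, New Mexico</placeName>
  <location>
    <geo>36.41 -105.57</geo>
  </location>
</place>
```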
2.3.9 The Unit Declaration
When documents feature units of measurement that are not listed in the International System of Units, the unitDecl element may be used in the encoding description to provide definitions and information about their origins and equivalents.
- unitDecl (unit declarations) provides information about units of measurement that are not members of the International System of Units.
The unitDecl contains one or more unitDef child elements that serve to describe units of measure which may be marked in unit elements within the text.
- unitDef (unit definition) contains descriptive information related to a specific unit of measurement.
- unit contains a symbol, a word or a phrase referring to a unit of measurement in any kind of formal or informal system.
- conversion defines how to calculate one unit of measure in terms of another.
  - formula [att.formula] A formula is provided to describe a mathematical calculation such as a conversion between measurement systems.
  - fromUnit indicates a source unit of measure that is to be converted into another unit indicated in toUnit.
  - toUnit the target unit of measurement for a conversion from a source unit referenced in fromUnit.
  - from [att.datable.w3c] indicates the starting point of the period in standard form.
  - to [att.datable.w3c] indicates the ending point of the period in standard form.
  - notBefore [att.datable.w3c] specifies the earliest possible date for the event in standard form, e.g. yyyy-mm-dd.
  - notAfter [att.datable.w3c] specifies the latest possible date for the event in standard form, e.g. yyyy-mm-dd.
  - when [att.datable.w3c] supplies the value of the date or time in a standard form.
- att.formula provides attributes for defining a mathematical formula.
  - formula A formula is provided to describe a mathematical calculation such as a conversion between measurement systems.
Within the unitDef, a conversion element may be used to store information relating to conversion between units. The conversion element holds a special pair of attributes, fromUnit and toUnit, which serve to indicate the direction of a calculation from one unit of measure (stored in fromUnit) to another (stored in toUnit). A mathematical calculation to define the relation between these units may be stored in formula, as shown in the following examples. The formula attribute takes a value expressed as an XPath expression, which means that division must be expressed with ‘div’ so as not to be confused with the forward slash used in path navigation.
<encodingDesc>
<unitDecl>
<unitDef xml:id="keel" type="weight">
<label>keel</label>
<placeName ref="#england"/>
<conversion fromUnit="#chalder"
toUnit="#keel" formula="$fromUnit * 20" from="1421"
to="1676"/>
<conversion fromUnit="#chalder"
toUnit="#keel" formula="$fromUnit * 16" from="1676"
to="1824"/>
<desc>Keel was a unit measuring weight of coal. It had been equal to 20 chalders from 1421 to 1676, and it was made to be equivalent to 16 chalders from 1676 to 1824.</desc>
</unitDef>
<unitDef xml:id="chalder" type="weight">
<label>chalder</label>
<placeName ref="#england"/>
<conversion fromUnit="#bushel"
toUnit="#chalder" formula="$fromUnit * 32" from="1421"
to="1676"/>
<conversion fromUnit="#bushel"
toUnit="#chalder" formula="$fromUnit * 36" from="1676"
to="1824"/>
<desc>Chalder was a unit measuring weight of coal. It had been equal to 32 bushels from 1421 to 1676, and it was made to be equivalent to 36 bushels from 1676 to 1824.</desc>
</unitDef>
<unitDef xml:id="bushel" type="weight">
<label>bushel</label>
<placeName ref="#england"/>
<desc>Bushel was a unit measuring weight of coal.</desc>
</unitDef>
</unitDecl>
</encodingDesc>
<encodingDesc>
<unitDecl>
<unitDef xml:id="Celsius"
type="temperature">
<label>Celsius or Centigrade scale</label>
<conversion fromUnit="#Fahrenheit"
toUnit="#Celsius" formula="($fromUnit - 32) * (5 div 9)"/>
<desc>To convert from the Fahrenheit to the Celsius scale, subtract 32 from the Fahrenheit temperature and multiply by 5/9.</desc>
</unitDef>
</unitDecl>
</encodingDesc>
2.3.10 The Schema Specification
The schemaSpec element contains a schema specification. When this element appears inside encodingDesc, it allows embedding of a TEI ODD customization file inside a TEI header; alternatively, this element may be used in the body of an ODD document. The use of ODD files, and their relationship to schemas, is described in detail in 22 Documentation Elements.
<encodingDesc>
<!-- Other encoding description elements... -->
<schemaSpec ident="myTEICustomization"
docLang="en" prefix="tei_" xml:lang="en" source="#NONE">
<moduleRef key="core"/>
<moduleRef key="tei"/>
<moduleRef key="header"/>
<moduleRef key="textstructure"/>
</schemaSpec>
</encodingDesc>
url="http://www.tei-c.org/release/xml/tei/custom/odd/tei_lite.odd"/>
<schemaRef type="interchangeRNG"
url="http://www.tei-c.org/release/xml/tei/custom/odd/tei_lite.rng"/>
<schemaRef type="projectODD"
url="file:///schema/project.odd"/>
2.3.11 The Application Information Element
It is sometimes convenient to store information relating to the processing of an encoded resource within its header. Typical uses for such information might be:
- to allow an application to discover that it has previously opened or edited a file, and what version of itself was used to do that;
- to show (through a date) which application last edited the file to allow for diagnosis of any problems that might have been caused by that application;
- to allow users to discover information about an application used to edit the file;
- to allow the application to declare an interest in elements of the file which it has edited, so that other applications or human editors may be more wary of making changes to those sections of the file.
The class model.applicationLike provides an element, application, which may be used to record such information within the appInfo element.
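- appInfo (application information) records information about an application which has edited the TEI file.
- application provides information about an application which has acted upon the document.
  - ident supplies an identifier for the application, independent of its version number or display name.
  - version supplies a version number for the application, independent of its identifier or display name.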
Each application element identifies the current state of one software application with regard to the current file. This element is a member of the att.datable class, which provides a variety of attributes for associating this state with a date and time, or a temporal range. The ident and version attributes should be used to uniquely identify the application and its major version number (for example, ImageMarkupTool 1.5). It is not intended that an application should add a new application element each time it touches the file.
<appInfo>
<application version="1.5"
ident="ImageMarkupTool" notAfter="2006-06-01">
<label>Image Markup Tool</label>
<ptr target="#P1"/>
<ptr target="#P2"/>
</application>
</appInfo>
2.3.12 Module-Specific Declarations
The elements discussed so far are available to any schema. When the schema in use includes some of the more specialized TEI modules, these make available other more module-specific components of the encoding description. These are discussed fully in the documentation for the module in question, but are also noted briefly here for convenience.
The fsdDecl element is available only when the iso-fs module is included in a schema. Its purpose is to document the feature system declaration (as defined in chapter 18.11 Feature System Declaration) underlying any analytic feature structures (as defined in chapter 18 Feature Structures) present in the text documented by this header.
The metDecl element is available only when the verse module is included in a schema. Its purpose is to document any metrical notation scheme used in the text, as further discussed in section 6.4 Rhyme and Metrical Analysis. It consists either of a prose description or a series of metSym elements.
The variantEncoding element is available only when the textcrit module is included in a schema. Its purpose is to document the method used to encode textual variants in the text, as discussed in section 12.2 Linking the Apparatus to the Text.
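As a brief sketch, an encoding description drawing on two of these modules might contain declarations such as the following. The choice of parallel segmentation and the three metrical symbols are illustrative assumptions, and both elements are valid only if the corresponding modules are included in the schema.

```xml
<encodingDesc>
  <!-- Requires the verse module -->
  <metDecl type="met">
    <metSym value="S">stressed syllable</metSym>
    <metSym value="U">unstressed syllable</metSym>
    <metSym value="/">foot boundary</metSym>
  </metDecl>
  <!-- Requires the textcrit module -->
  <variantEncoding method="parallel-segmentation" location="internal"/>
</encodingDesc>
```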
2.4 The Profile Description
The profileDesc element is the third major subdivision of the TEI header. It is an optional element, the purpose of which is to enable information characterizing various descriptive aspects of a text or a corpus to be recorded within a single unified framework.
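- profileDesc (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting.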
In principle, almost any component of the header might be of importance as a means of characterizing a text. The author of a written text, its title or its date of publication, may all be regarded as characterizing it at least as strongly as any of the parameters discussed in this section. The rule of thumb applied has been to exclude from discussion here most of the information which generally forms part of a standard bibliographic style description, if only because such information has already been included elsewhere in the TEI header.
The profileDesc element contains elements taken from the model.profileDescPart class. The default members of this class are the following:
- abstract contains a summary or formal abstract prefixed to an existing source document by the encoder.
- creation contains information about the creation of a text.
- langUsage (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text.
- textClass (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.
- correspDesc (correspondence description) contains a description of the actions related to one act of correspondence.
- calendarDesc (calendar description) contains a description of the calendar system used in any dating expression found in the text.
These elements are further described in the remainder of this section.
When the corpus module described in chapter 15 Language Corpora is included in a schema, three further elements become available within the profileDesc element:
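- textDesc (text description) provides a description of a text in terms of its situational parameters.
- particDesc (participation description) describes the identifiable speakers, voices, or other participants in a linguistic interaction.
- settingDesc (setting description) describes the setting or settings within which a language interaction takes place, either as a prose description or as a series of setting elements.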
For descriptions of these elements, see section 15.2 Contextual Information.
When the transcr module for the transcription of primary sources described in chapter 11 Representation of Primary Sources is included in a schema, the following elements become available within the profileDesc element:
- handNotes contains one or more handNote elements documenting the different hands identified within the source texts.
- listTranspose supplies a list of transpositions, each of which is indicated at some point in a document typically by means of metamarks.
For a description of the handNotes element, see section 11.3.2.1 Document Hands. Its purpose is to group together a number of handNote elements, each of which describes a different hand or equivalent identified within a manuscript. The handNote element can also appear within a structured manuscript description, when the msdescription module described in chapter 10 Manuscript Description is included in a schema. For this reason, the handNote element is actually declared within the header module, but is only accessible to a schema when one or other of the transcr or msdescription modules is included in a schema. See further the discussion at 11.3.2.1 Document Hands.
The listTranspose element is discussed in detail in section 11.3.4.5 Transpositions.
2.4.1 Creation
- creation contains information about the creation of a text.
<creation>
<date when="1992-08">August 1992</date>
<rs type="city">Taos, New Mexico</rs>
</creation>
2.4.2 Language Usage
The langUsage element is used within the profileDesc element to describe the languages, sublanguages, registers, dialects, etc. represented within a text. It contains one or more language elements, each of which provides information about a single language, notably the quantity of that language present in the text. Note that this element should not be used to supply information about any non-standard characters or glyphs used by this language; such information should be recorded in the charDecl element in the encoding description (see further 5 Characters, Glyphs, and Writing Modes).
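- langUsage (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text.
- language characterizes a single language or sublanguage used within a text.
  - usage specifies the approximate percentage (by volume) of the text which uses this language.
  - ident (identifier) supplies a language code constructed as defined in BCP 47, which is used to identify the language documented by this element and which is referenced by the global xml:lang attribute.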
A language element may be supplied for each different language used in a document. If used, its ident attribute should specify an appropriate language identifier, as further discussed in section vi.1. Language Identification. This is particularly important if extended language identifiers have been used as the value of xml:lang attributes elsewhere in the document.
<langUsage>
<language ident="fr-CA" usage="60">Québécois</language>
<language ident="en-CA" usage="20">Canadian business English</language>
<language ident="en-GB" usage="20">British English</language>
</langUsage>
2.4.3 The Text Classification
The textClass element is used to classify a text in some way.
- textClass (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc.
Text classification may be carried out according to one or more of the following methods:
- by reference to a recognized international classification such as the Dewey Decimal Classification, the Universal Decimal Classification, the Colon Classification, the Library of Congress Classification, or any other system widely used in library and documentation work
- by providing a set of keywords, as provided for example by British Library or Library of Congress Cataloguing in Publication data
- by referencing any other taxonomy of text categories recognized in the field concerned, or peculiar to the material in hand; this may include one based on recurring sets of values for the situational parameters defined in section 15.2.1 The Text Description, or the demographic elements described in section 15.2.2 The Participant Description
The last of these may be particularly important for dealing with existing corpora or collections, both as a means of avoiding the expense or inconvenience of reclassification and as a means of documenting the organizing principles of such materials.
The following elements are provided for this purpose:
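- keywords contains a list of keywords or phrases identifying the topic or nature of a text.
  - scheme identifies the controlled vocabulary within which the set of keywords concerned is defined.
- classCode (classification code) contains the classification code used for this text in some standard classification system.
  - scheme identifies the classification system or taxonomy in use.
- catRef (category reference) specifies one or more defined categories within some taxonomy or text typology.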
The keywords element simply categorizes an individual text by supplying a list of keywords which may describe its topic or subject matter, its form, date, etc. In some schemes, the order of items in the list is significant, for example, from major topic to minor; in others, the list has an organized substructure of its own. No recommendations are made here as to which method is to be preferred. Wherever possible, such keywords should be taken from a recognized source, such as the British Library/Library of Congress Cataloguing in Publication data in the case of printed books, or a published thesaurus appropriate to the field.
<term>Babbage, Charles</term>
<term>Mathematicians - Great Britain - Biography</term>
</keywords>
<term>English literature -- History and criticism -- Data processing.</term>
<term>English literature -- History and criticism -- Theory, etc.</term>
<term>English language -- Style -- Data processing.</term>
<term>Style, Literary -- Data processing.</term>
</keywords>
<term>ceremonials</term>
<term>fairs</term>
<term>street life</term>
</keywords>
<!-- elsewhere in the document -->
<taxonomy xml:id="welch">
<bibl>
<title>Notes on London Municipal Literature, and a Suggested
Scheme for Its Classification</title>
<author>Charles Welch</author>
<edition>1895</edition>
</bibl>
</taxonomy>
If no authority file exists, perhaps because the keywords used were assigned directly by an author, the scheme attribute should be omitted.
Alternatively, if the keyword vocabulary itself is locally defined, the scheme attribute will point to the local definition, which will typically be held in a taxonomy element within the classDecl part of the encoding description (see section 2.3.7 The Classification Declaration).
The catRef element categorizes an individual text by pointing to one or more category elements using the target attribute, which it inherits from the att.pointing class. The category element (which is fully described in section 2.3.7 The Classification Declaration) holds information about a particular classification or category within a given taxonomy. Each such category must have a unique identifier, which may be supplied as the value of the target attribute for catRef elements which are regarded as falling within the category indicated.
scheme="http://www.example.com/browncorpus"/>
<catRef target="http://www.example.com/SUC/#A45"/>
In general, it is a matter of style whether to use a single catRef with multiple identifiers in the value of target or multiple catRef elements, each with a single identifier in the value of target. However, note that maintenance of a TEI document with a large number of values within a single target can be cumbersome.
The distinction between the catRef and classCode elements is that the values used as identifying codes are exhaustively enumerated for the former, typically within the TEI header. In the latter case, however, the values use any externally-defined scheme, and therefore may be taken from a more open-ended descriptive classification system.
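The contrast can be sketched as follows. The catRef targets reuse category identifiers from the Brown Corpus taxonomy shown in section 2.3.7; the classification scheme URL and the code 410 are illustrative assumptions rather than values prescribed by these Guidelines.

```xml
<textClass>
  <!-- Points at categories enumerated in the header's own taxonomy -->
  <catRef target="#b.a4 #b.d2"/>
  <!-- Carries a code drawn from an externally defined scheme -->
  <classCode scheme="http://www.udc.org">410</classCode>
</textClass>
```

A processor can check each catRef target against the declared category identifiers, whereas the classCode value can only be interpreted by consulting the external scheme.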
2.4.4 Abstracts
<profileDesc>
<abstract>
<p>This paper is a draft studying
various aspects of using the TEI
as a reference serialization framework
for LMF. Comments are welcome to bring
this to a useful document for the
community.
</p>
</abstract>
</profileDesc>
<profileDesc>
<abstract xml:lang="en">
<p>The recent archaeological emphasis
on the study of settlement patterns,
landscape and palaeoenvironments has
shaped and re-shaped our understanding
of the Viking settlement of Iceland.
This paper reviews the developments
in Icelandic archaeology, examining
both theoretical and practical advances.
Particular attention is paid to new
ideas in terms of settlement patterns
and resource exploitation. Finally,
some of the key studies of the ecological
consequences of the Norse
<foreign xml:lang="is">landnám</foreign>
are presented. </p>
</abstract>
<abstract xml:lang="fr">
<p>L’accent récent des
recherches archéologiques sur l’étude des
configurations spatiales des colonies, de la
géographie des sites ainsi que des éléments
paléo-environnementaux nous mène à réexaminer
et réévaluer nos connaissances acquises sur
la colonisation de l’Islande par les Vikings.
Cet article passe en revue le développement
de l’archéologie islandaise en examinant les
progrès théoriques et pratiques en la matière.
Une attention particulière est portée sur
l’étude des configurations spatiales des
colonies ainsi qu’une considération des
questions d’exploitation des ressources.
Finalement, l’article présente un aperçu des
études principales qui traitent des
conséquences écologiques du
<foreign xml:lang="is">landnám</foreign>
islandais.</p>
</abstract>
</profileDesc>
<profileDesc>
<abstract>
<list>
<item>An annual HBC supply ship is
set to the North West Coast for mid-September.</item>
<item>
<name key="pelly_jh">Pelly</name> writes
to ascertain the British Government's plans
for the lands associated with the Oregon Treaty;
he wants to know what will happen to the HBC's
establishment on the southern <name type="place"
key="vancouver_island">Vancouver Island</name>.
He adds that a former Crown grant, an 1838 exclusive
trade-grant for the lands in question, has yet to
expire.</item>
<item>The minutes discuss the nature of the HBC's
original entitlements and question whether or not,
and in what capacity, the Oregon Treaty affects the
HBC's position. The majority counsel further
investigation, and to reply cautiously and
judiciously to the HBC inquiry.</item>
<item>A
summary of a meeting with <name key="pelly_jh">Pelly</name> is offered in
order to elucidate the HBC's intentions.</item>
<item>
<name key="grey_hg">Lord Grey</name> calls
for greater consideration on the issue of
colonization; he asks that <name key="stephen_j">Stephen</name> write the Company,
asking them to detail their intentions, and to
state their legal opinion for entitlement.
</item>
</list>
</abstract>
</profileDesc>
2.4.5 Calendar Description
The calendarDesc element is used within the profileDesc element to document objects referenced by means of either the calendar attribute on date or the datingMethod attribute on any member of the att.datable class.
- calendarDesc (calendar description) contains a description of the calendar system used in any dating expression found in the text.
This element may contain one or more calendar elements:
- calendar describes a calendar or dating system used in a dating formula in the text.
<calendarDesc>
<calendar xml:id="Gregorian">
<p>Gregorian calendar</p>
</calendar>
<calendar xml:id="Stardate">
<p>Fictional Stardate (from Star Trek series)</p>
</calendar>
<calendar xml:id="BP">
<p>Calendar years before present (measured from 1950)</p>
</calendar>
</calendarDesc>
to the nearest decimal point</date>...</p>
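A date in the text refers to one of these declarations by pointing at its identifier. The sentence and the date value in the following sketch are invented for illustration; only the identifier BP comes from the calendar declarations above.

```xml
<!-- Hypothetical sentence; the calendar attribute points at the BP calendar -->
<p>The charcoal sample was dated to
  <date calendar="#BP">3200 BP</date>.</p>
```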
2.4.6 Correspondence Description
The correspDesc element is used within the profileDesc element to provide detailed correspondence-specific metadata, concerning in particular the communicative aspects (sending, receiving, forwarding etc.) associated with an act of correspondence.
This information is complementary to the detailed descriptions of physical objects (such as letters) associated with correspondence acts, which are typically provided by the sourceDesc element.
- correspDesc (correspondence description) contains a description of the actions related to one act of correspondence.
The correspDesc element contains the elements correspAction and correspContext, describing the actions identified and the context in which the correspondence occurs respectively.
- correspAction (correspondence action) contains a structured description of the place, the name of a person/organization and the date related to the sending/receiving of a message or any other action related to the correspondence.
type describes the nature of the action. Suggested values include: 1] sent; 2] received; 3] transmitted; 4] redirected; 5] forwarded
- correspContext (correspondence context) provides references to preceding or following correspondence related to this piece of correspondence.
<correspContext>
<ref type="prev" target="#CLF0102">Previous letter of <persName>Chamisso</persName> to <persName>de La
Foye</persName>: <date when="1807-01-16">16 January 1807</date>
</ref>
<ref type="next" target="#CLF0104">Next letter of <persName>Chamisso</persName> to <persName>de La Foye</persName>:
<date when="1810-05-07">07 May 1810</date>
</ref>
</correspContext>
Many types of correspondence actions may be distinguished. The type attribute should be used to indicate the type of action being documented, using values such as those suggested above.
<correspAction type="sent">
<persName>Adelbert von Chamisso</persName>
<placeName>Vertus</placeName>
<date when="1807-01-29"/>
</correspAction>
<correspAction type="received">
<persName>Louis de La Foye</persName>
<placeName>Caen</placeName>
<date>unknown</date>
</correspAction>
<correspAction type="sent">
<persName>Hermann Hesse</persName>
<persName>Ninon Hesse</persName>
<placeName>Montagnola</placeName>
</correspAction>
<correspAction type="sent">
<persName>PN0001</persName>
<date when="1999-06-02"/>
</correspAction>
<correspAction type="received" subtype="to">
<persName>PN0002</persName>
</correspAction>
<correspAction type="received" subtype="to">
<persName>PN0003</persName>
</correspAction>
<correspAction type="received" subtype="cc">
<persName>PN0004</persName>
</correspAction>
<correspDesc xml:id="message1">
<correspAction type="sent">
<persName>John Gneisenau Neihardt</persName>
<placeName>Branson (Montgomery)</placeName>
<date when="1932-12-17"/>
</correspAction>
<correspAction type="received">
<persName xml:id="JTH">Julius Temple House</persName>
<placeName>New York</placeName>
</correspAction>
</correspDesc>
<correspDesc xml:id="message2">
<correspAction type="sent">
<persName>Enid Neihardt</persName>
<placeName>Branson (Montgomery)</placeName>
<date when="1932-12-17"/>
</correspAction>
<correspAction type="received">
<persName sameAs="#JTH"/>
<placeName>New York</placeName>
</correspAction>
</correspDesc>
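Since correspAction and correspContext may both appear within a single correspDesc, the fragments shown earlier in this section can be combined. The following sketch simply reassembles the Chamisso material given above; grouping it into one correspDesc is an assumption about how an encoder might arrange it, not a prescribed layout.

```xml
<correspDesc>
  <!-- The act of sending -->
  <correspAction type="sent">
    <persName>Adelbert von Chamisso</persName>
    <placeName>Vertus</placeName>
    <date when="1807-01-29"/>
  </correspAction>
  <!-- The act of receiving; the date is not known -->
  <correspAction type="received">
    <persName>Louis de La Foye</persName>
    <placeName>Caen</placeName>
    <date>unknown</date>
  </correspAction>
  <!-- Pointers to the neighbouring letters in the exchange -->
  <correspContext>
    <ref type="prev" target="#CLF0102">Previous letter of <persName>Chamisso</persName>
      to <persName>de La Foye</persName>: <date when="1807-01-16">16 January 1807</date>
    </ref>
    <ref type="next" target="#CLF0104">Next letter of <persName>Chamisso</persName>
      to <persName>de La Foye</persName>: <date when="1810-05-07">07 May 1810</date>
    </ref>
  </correspContext>
</correspDesc>
```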
2.5 Non-TEI Metadata
Projects often maintain metadata about their TEI documents in more than one form or system. For example, a project may have a database of bibliographic information on the set of documents they intend to encode. From this database, both a MARC record and a teiHeader are generated. The document is then encoded, during which process additional information is added to the teiHeader manually. Then, when the document is published on the web, a Dublin Core record is generated for discoverability of the resource. It is sometimes advantageous to store some or all of the non-TEI metadata in the TEI file.
Such non-TEI data may be placed anywhere within a TEI file (other than as the root element), as it does not affect TEI conformance. However, it is easier for humans to manage these kinds of data if they are grouped together in a single location. In addition, such grouping makes it easy to avoid accidentally flagging non-TEI data as errors during validation of the file against a TEI schema. The xenoData element, which may appear in the TEI header after the fileDesc but before the optional revisionDesc, is provided for this purpose.
- xenoData (non-TEI metadata) provides a container element into which metadata in non-TEI formats may be placed.
The xenoData element may contain anything except TEI elements. It may contain one or more elements from outside the TEI or data in some non-XML text format.
In the following example, the prefix rdf is bound to the namespace http://www.w3.org/1999/02/22-rdf-syntax-ns#, the prefix dc is bound to the namespace http://purl.org/dc/elements/1.1/, and the prefix cc is bound to the namespace http://web.resource.org/cc/.
<xenoData>
<rdf:RDF>
<cc:Work rdf:about="">
<dc:title>Applied Software Project Management - review</dc:title>
<dc:type rdf:resource="http://purl.org/dc/dcmitype/Text"/>
<dc:license rdf:resource="http://creativecommons.org/licenses/by-sa/2.0/uk/"/>
</cc:Work>
<cc:License rdf:about="http://creativecommons.org/licenses/by-sa/2.0/uk/">
<cc:permits rdf:resource="http://web.resource.org/cc/Reproduction"/>
<cc:permits rdf:resource="http://web.resource.org/cc/Distribution"/>
<cc:requires rdf:resource="http://web.resource.org/cc/Notice"/>
<cc:requires rdf:resource="http://web.resource.org/cc/Attribution"/>
<cc:permits rdf:resource="http://web.resource.org/cc/DerivativeWorks"/>
<cc:requires rdf:resource="http://web.resource.org/cc/ShareAlike"/>
</cc:License>
</rdf:RDF>
</xenoData>
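The placement rule and the non-XML case can be sketched together. The outline below shows where xenoData might sit in a header and uses a small JSON object as its content. The JSON keys and values, and the date on the change element, are hypothetical. The children of fileDesc are omitted for brevity, so the fragment is well-formed but not valid as it stands.

```xml
<teiHeader>
  <fileDesc>
    <!-- titleStmt, publicationStmt, and sourceDesc omitted in this sketch -->
  </fileDesc>
  <!-- xenoData follows fileDesc and precedes the optional revisionDesc -->
  <xenoData>
    { "identifier": "example-0001", "format": "application/tei+xml" }
  </xenoData>
  <revisionDesc>
    <change when="2007-08-08">Header created.</change>
  </revisionDesc>
</teiHeader>
```

Because the content of xenoData is still part of an XML document, any < or & characters occurring in non-XML data of this kind would need to be escaped.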
2.6 The Revision Description
The final sub-element of the TEI header, the revisionDesc element, provides a detailed change log in which each change made to a text may be recorded. Its use is optional but highly recommended. It provides essential information for the administration of large numbers of files which are being updated, corrected, or otherwise modified as well as extremely useful documentation for files being passed from researcher to researcher or system to system. Without change logs, it is easy to confuse different versions of a file, or to remain unaware of small but important changes made in the file by some earlier link in the chain of distribution. No significant change should be made in any TEI-conformant file without corresponding entries being made in the change log.
- revisionDesc (revision description) summarizes the revision history for a file.
- listChange groups a number of change descriptions associated with either the creation of a source text or the revision of an encoded text.
- change summarizes a particular change or correction made to a particular version of an electronic text which is shared between several researchers.
The main purpose of the revision description is to record changes in the text to which a header is prefixed. However, it is recommended TEI practice to include entries also for significant changes in the header itself (other than the revision description itself, of course). At the very least, an entry should be supplied indicating the date of creation of the header.
The log consists of a list of entries, one for each change. Changes may be grouped and organised using either the listChange element described in section 11.6 Identifying Changes and Revisions or the simple list element described in section 3.7 Lists. Alternatively, a simple sequence of change elements may be given. The attributes when and who may be supplied for each change element to indicate its date and the person responsible for it respectively. The description of the change itself can range from a simple phrase to a series of paragraphs. If a number is to be associated with one or more changes (for example, a revision number), the global n attribute may be used to indicate it.
It is recommended to give changes in reverse chronological order, most recent first.
<!-- ... --><revisionDesc>
<change n="RCS:1.39" when="2007-08-08"
who="#jwernimo.lrv">Changed <val>drama.verse</val>
<gi>lg</gi>s to <gi>p</gi>s. <note>we have opened a discussion about the need for a new
value for <att>type</att> of <gi>lg</gi>, <val>drama.free.verse</val>, in order to address
the verse of Behn which is not in regular iambic pentameter. For the time being these
instances are marked with a comment note until we are able to fully consider the best way
to encode these instances.</note>
</change>
<change n="RCS:1.33" when="2007-06-28"
who="#pcaton.xzc">Added <att>key</att> and <att>reg</att>
to <gi>name</gi>s.</change>
<change n="RCS:1.31" when="2006-12-04"
who="#wgui.ner">Completed renovation. Validated.</change>
</revisionDesc>
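In the revisionDesc example above, the who attribute of each change points to a respStmt element supplied in the titleStmt of the same header:
<titleStmt>
<title>The Amorous Prince, or, the Curious Husband, 1671</title>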
<author>
<persName ref="#abehn.aeh">Behn, Aphra</persName>
</author>
<respStmt xml:id="pcaton.xzc">
<persName>Caton, Paul</persName>
<resp>electronic publication editor</resp>
</respStmt>
<respStmt xml:id="wgui.ner">
<persName>Gui, Weihsin</persName>
<resp>encoder</resp>
</respStmt>
<respStmt xml:id="jwernimo.lrv">
<persName>Wernimont, Jacqueline</persName>
<resp>encoder</resp>
</respStmt>
</titleStmt>
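As an alternative to a simple sequence of change elements, the same entries might be grouped with listChange. The sketch below reuses two of the entries from the revisionDesc example in this section; wrapping them in a single listChange is shown for illustration only.

```xml
<revisionDesc>
  <listChange>
    <!-- Entries are given in reverse chronological order, most recent first -->
    <change n="RCS:1.33" when="2007-06-28" who="#pcaton.xzc">Added <att>key</att>
      and <att>reg</att> to <gi>name</gi>s.</change>
    <change n="RCS:1.31" when="2006-12-04" who="#wgui.ner">Completed renovation.
      Validated.</change>
  </listChange>
</revisionDesc>
```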
2.7 Minimal and Recommended Headers
The TEI header allows for the provision of a very large amount of information concerning the text itself, its source, its encodings, and revisions of it, as well as a wealth of descriptive information such as the languages it uses and the situation(s) in which it was produced, together with the setting and identity of participants within it. This diversity and richness reflects the diversity of uses to which it is envisaged that electronic texts conforming to these Guidelines will be put. It is emphatically not intended that all of the elements described above should be present in every TEI header.
The amount of encoding in a header will depend both on the nature and the intended use of the text. At one extreme, an encoder may expect that the header will be needed only to provide a bibliographic identification of the text adequate to local needs. At the other, wishing to ensure that their texts can be used for the widest range of applications, encoders will want to document as explicitly as possible both bibliographic and descriptive information, in such a way that no prior or ancillary knowledge about the text is needed in order to process it. The header in such a case will be very full, approximating to the kind of documentation often supplied in the form of a manual. Most texts will lie somewhere between these extremes; textual corpora in particular will tend more to the latter extreme. In the remainder of this section we demonstrate first the minimal, and next a commonly recommended, level of encoding for the bibliographic information held by the TEI header.
<teiHeader>
<fileDesc>
<titleStmt>
<title>Thomas Paine: Common sense, a
machine-readable transcript</title>
<respStmt>
<resp>compiled by</resp>
<name>Jon K Adams</name>
</respStmt>
</titleStmt>
<publicationStmt>
<distributor>Oxford Text Archive</distributor>
</publicationStmt>
<sourceDesc>
<bibl>The complete writings of Thomas Paine, collected and edited
by Philip S. Foner (New York, Citadel Press, 1945)</bibl>
</sourceDesc>
</fileDesc>
</teiHeader>
The only mandatory component of the TEI header is the fileDesc element. Within this, titleStmt, publicationStmt, and sourceDesc are all required constituents. Within the title statement, a title is required, and an author should be specified, even if it is unknown, as should some additional statement of responsibility, here given by the respStmt element. Within the publicationStmt, a publisher, distributor, or other agency responsible for the file must be specified. Finally, the source description should contain at the least a loosely structured bibliographic citation identifying the source of the electronic text if (as is usually the case) there is one.
<teiHeader>
<fileDesc>
<titleStmt>
<title>Common sense, a machine-readable transcript</title>
<author>Paine, Thomas (1737-1809)</author>
<respStmt>
<resp>compiled by</resp>
<name>Jon K Adams</name>
</respStmt>
</titleStmt>
<editionStmt>
<edition>
<date>1986</date>
</edition>
</editionStmt>
<publicationStmt>
<distributor>Oxford Text Archive.</distributor>
<address>
<addrLine>Oxford University Computing Services,</addrLine>
<addrLine>13 Banbury Road,</addrLine>
<addrLine>Oxford OX2 6RB,</addrLine>
<addrLine>UK</addrLine>
</address>
</publicationStmt>
<notesStmt>
<note>Brief notes on the text are in a
supplementary file.</note>
</notesStmt>
<sourceDesc>
<biblStruct>
<monogr>
<editor>Foner, Philip S.</editor>
<title>The complete writings of Thomas Paine</title>
<imprint>
<pubPlace>New York</pubPlace>
<publisher>Citadel Press</publisher>
<date>1945</date>
</imprint>
</monogr>
</biblStruct>
</sourceDesc>
</fileDesc>
<encodingDesc>
<samplingDecl>
<p>Editorial notes in the Foner edition have not
been reproduced. </p>
<p>Blank lines and multiple blank spaces, including paragraph
indents, have not been preserved. </p>
</samplingDecl>
<editorialDecl>
<correction status="high"
method="silent">
<p>The following errors
in the Foner edition have been corrected:
<list>
<item>p. 13 l. 7 cotemporaries contemporaries</item>
<item>p. 28 l. 26 [comma] [period]</item>
<item>p. 84 l. 4 kin kind</item>
<item>p. 95 l. 1 stuggle struggle</item>
<item>p. 101 l. 4 certainy certainty</item>
<item>p. 167 l. 6 than that</item>
<item>p. 209 l. 24 publshed published</item>
</list>
</p>
</correction>
<normalization>
<p>No normalization beyond that performed
by Foner, if any. </p>
</normalization>
<quotation marks="all">
<p>All double quotation marks
rendered with ", all single quotation marks with
apostrophe. </p>
</quotation>
<hyphenation eol="none">
<p>Hyphenated words that appear at the
end of the line in the Foner edition have been reformed.</p>
</hyphenation>
<stdVals>
<p>The values of <att>when-iso</att> on the <gi>time</gi>
element always end in the format <val>HH:MM</val> or
<val>HH</val>; i.e., seconds, fractions thereof, and time
zone designators are not present.</p>
</stdVals>
<interpretation>
<p>Compound proper names are marked. </p>
<p>Dates are marked. </p>
<p>Italics are recorded without interpretation. </p>
</interpretation>
</editorialDecl>
<classDecl>
<taxonomy xml:id="lcsh">
<bibl>Library of Congress Subject Headings</bibl>
</taxonomy>
<taxonomy xml:id="lc">
<bibl>Library of Congress Classification</bibl>
</taxonomy>
</classDecl>
</encodingDesc>
<profileDesc>
<creation>
<date>1774</date>
</creation>
<langUsage>
<language ident="en" usage="100">English.</language>
</langUsage>
<textClass>
<keywords scheme="#lcsh">
<term>Political science</term>
<term>United States -- Politics and government --
Revolution, 1775-1783</term>
</keywords>
<classCode scheme="#lc">JC 177</classCode>
</textClass>
</profileDesc>
<revisionDesc>
<change when="1996-01-22" who="#MSM"> finished proofreading </change>
<change when="1995-10-30" who="#LB"> finished proofreading </change>
<change notBefore="1995-07-04" who="#RG"> finished data entry at end of term </change>
<change notAfter="1995-01-01" who="#RG"> began data entry before New Year 1995 </change>
</revisionDesc>
</teiHeader>
Many other examples of recommended usage for the elements discussed in this chapter are provided here, in the reference index and in the associated tutorials.
2.8 Note for Library Cataloguers
A strong motivation in preparing the material in this chapter was to provide in the TEI header a viable chief source of information for cataloguing computer files. The TEI header is not a library catalogue record, and so will not make all of the distinctions essential in standard library work. It also includes much information generally excluded from standard bibliographic descriptions. It is the intention of the developers, however, to ensure that the information required for a catalogue record be retrievable from the TEI file header, and moreover that the mapping from the one to the other be as simple and straightforward as possible. Where the correspondence is not obvious, it may prove useful to consult one of the works which were influential in developing the content of the TEI header. These include:
- ISBD
- ISBD: International Standard Bibliographic Description is an international standard setting out what information should be recorded in a description of a bibliographical item. Until a consolidated edition published in 2011, there was a general standard called ISBD(G) and separate ISBDs covering different types of material, e.g. ISBD(M) for monographs, ISBD(ER) for electronic resources. These separate ISBDs follow the same general scheme as the main ISBD(G), but provide appropriate interpretations for the specific materials under consideration.
- AACR2
- The Anglo-American Cataloguing Rules (second edition) were published in 1978, with revisions appearing periodically through 2005. AACR2 provides guidelines for the construction of catalogues in general libraries in the English-speaking world. AACR2 is explicitly based on the general framework of the ISBD(G) and the subsidiary ISBDs: it explains how to describe bibliographic items and how to create access points such as subject or name headings and uniform titles. Other national cataloguing codes exist as well, including the Z44 series of standards issued by the Association française de normalisation (AFNOR), Regeln für die alphabetische Katalogisierung in wissenschaftlichen Bibliotheken (RAK-WB), Regole italiane di catalogazione per autore (RICA), and Система стандартов по информации, библиотечному и издательскому делу. Библиографическая запись. Библиографическое описание. Общие требования и правила составления (ГОСТ 7.1).
- ANSI Z39.29
- The American National Standard for Bibliographic References was an American national standard governing bibliographic references for use in bibliographies, end-of-work lists, references in abstracting and indexing publications, and outputs from computerized bibliographic databases. A revised version is maintained by the National Information Standards Organization (NISO). The related ISO standard is ISO 690. Other relevant national standards include BS 5605:1990, BS 6371:1983, DIN 1505-2, and ГОСТ 7.0.5.
Since the TEI file description elements are based on the ISBD areas, it should be possible to use the content of the file description as the basis for a catalogue record for a TEI document. However, cataloguers should be aware that the permissive nature of the TEI Guidelines may lead to divergences between practice in using the TEI file description and the comparatively strict recommendations of AACR2 and other national cataloguing codes. Such divergences as the following may preclude automatic generation of catalogue records from TEI headers:
- The TEI Guidelines do not require that text be transcribed from the ‘chief source of information’ using normalized capitalization and punctuation.
- The TEI title statement may not categorize constituent titles in the same way as prescribed by a national cataloguing code.
- The TEI title statement contains authors, editors, and other responsible parties in separate elements, with names which may not have been normalized; it does not necessarily contain a single statement of responsibility.
- There is no specific place in a TEI header to specify the main entry or added entries (name or title headings under which a catalogue record is filed) for the catalogue record.
- The TEI header does not require use of a particular vocabulary for subject headings nor require the use of subject headings.
2.9 The TEI Header Module
The module described in this chapter makes available the following components:
- Module header: The TEI header
- Elements defined: abstract appInfo application authority availability biblFull cRefPattern calendar calendarDesc catDesc catRef category change classCode classDecl conversion correction correspAction correspContext correspDesc creation distributor edition editionStmt editorialDecl encodingDesc extent fileDesc funder geoDecl handNote hyphenation idno interpretation keywords langUsage language licence listChange listPrefixDef namespace normalization notesStmt prefixDef principal profileDesc projectDesc publicationStmt punctuation quotation refState refsDecl rendition revisionDesc samplingDecl schemaRef scriptNote segmentation seriesStmt sourceDesc sponsor stdVals styleDefDecl tagUsage tagsDecl taxonomy teiHeader textClass titleStmt unitDecl unitDef xenoData
- Classes defined: att.patternReplacement
The selection and combination of modules to form a TEI schema is described in 1.2 Defining a TEI Schema.
< and & respectively. See Character References.