Text Encoding Initiative
15. Figures and Graphics
Not all the components of a document are necessarily textual. Even the most straightforward text will often contain diagrams or illustrations, to say nothing of documents in which image and text are inextricably intertwined, or electronic resources in which the two are complementary.
The encoder may simply record the presence of a graphic within the text, possibly with a brief description of its content, by using the elements described in this section. The same elements may also be used to embed digitized versions of the graphic within an electronic document.
Any textual information accompanying the graphic, such as a heading and/or caption, may be included within the <figure> element itself, in a <head> and one or more <p> elements, as may also any text appearing within the graphic itself. It is strongly recommended that a prose description of the image be supplied, as the content of a <figDesc> element, for the use of applications which are not able to render the graphic, and to render the document accessible to vision-impaired readers. (Such text is not normally considered part of the document proper.)
In the simplest case, the <figure> element merely records the position of a graphic, here between two page breaks:
<pb n="412"/> <figure></figure> <pb n="413"/>
(Note that the end-tag may not be omitted, even though the element has no content.) More usually, a graphic will have at least an identifying title, which should be encoded using the <head> element. It is also often convenient to include a brief description of the image, as in the following example:
<figure> <head>Mr Fezziwig's Ball</head> <figDesc>A Cruikshank engraving showing Mr Fezziwig leading a group of revellers.</figDesc> </figure>
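The same example can be extended with accompanying caption text in a <p> element, as described earlier in this section. The following is only a sketch: the wording of the caption is invented for illustration and is not taken from any edition, and the ordering of the child elements (heading, then paragraphs, then description) is an assumption that should be checked against the DTD in use.

```xml
<figure>
  <!-- Identifying title of the graphic -->
  <head>Mr Fezziwig's Ball</head>
  <!-- Hypothetical caption text, invented for this sketch -->
  <p>The dance in full swing.</p>
  <!-- Prose description for applications that cannot render the graphic -->
  <figDesc>A Cruikshank engraving showing Mr Fezziwig leading a group of revellers.</figDesc>
</figure>
```

Keeping the caption in <p> and the description in <figDesc> separates text that belongs to the document from text supplied by the encoder.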
When a digitized version of the graphic concerned is available, it is clearly preferable to embed it at the appropriate point within the document. Graphic elements such as pictures are typically stored in separate entities (files) from those containing the text of a document, and using a different notation (storage format). The TEI Lite DTD supports graphics encoded using the CGM, PNG, TIFF, GIF, or JPEG standards under the SGML notation names cgm, png, tiff, gif, and jpeg respectively.1
Whatever format is used to encode the image, it may be embedded within the document in the same way. The first step is to declare an entity of a particular type, which specifies a name for the entity, an external identifier (such as a file name) for it, and the notation used. For example, assuming that the digitized image of Mr Fezziwig's ball were held in TIFF format in the file fezzi.tff, an entity declaration like the following would be necessary:
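<!ENTITY fezziPic SYSTEM "fezzi.tff" NDATA tiff>
All such declarations must be processed before the document itself; ways of doing this are beyond the scope of the present document, but are discussed in the Gentle Introduction to XML and in many other introductory texts on SGML and XML.
Once the entity has been declared, the image is embedded by supplying the entity name as the value of the entity attribute on the <figure> element: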
<figure entity="fezziPic"> <head>Mr Fezziwig's Ball</head> <figDesc>A Cruikshank engraving showing Mr Fezziwig leading a group of revellers.</figDesc> </figure>
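One common way of ensuring that the entity declaration is processed before the document is to place it in the internal DTD subset of the document type declaration. The sketch below combines the declaration and the reference in a single XML document. It rests on several assumptions not stated above: that the XML version of the TEI Lite DTD is available locally under the file name teixlite.dtd, that the root element is TEI.2, and that the minimal header shown is acceptable; the title and header wording are placeholders, and the result has not been validated.

```xml
<?xml version="1.0"?>
<!-- Assumption: the TEI Lite XML DTD is stored locally as teixlite.dtd -->
<!DOCTYPE TEI.2 SYSTEM "teixlite.dtd" [
  <!-- Entity declaration from the text, placed in the internal subset
       so that it is read before the document instance -->
  <!ENTITY fezziPic SYSTEM "fezzi.tff" NDATA tiff>
]>
<TEI.2>
  <teiHeader>
    <fileDesc>
      <!-- Placeholder header content, invented for this sketch -->
      <titleStmt><title>Sample document with an embedded figure</title></titleStmt>
      <publicationStmt><p>Unpublished sketch.</p></publicationStmt>
      <sourceDesc><p>No source; created for illustration.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <!-- The entity attribute names the unparsed entity declared above -->
      <figure entity="fezziPic">
        <head>Mr Fezziwig's Ball</head>
        <figDesc>A Cruikshank engraving showing Mr Fezziwig leading a group of revellers.</figDesc>
      </figure>
    </body>
  </text>
</TEI.2>
```

Because the entity is declared with NDATA, a parser does not read fezzi.tff itself; it passes the entity's identifier and notation name to the application, which decides how to display the image.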