1 Standardization and the TEI

Standards come into being for a variety of reasons, and in a variety of ways, not always entirely explicable. They may be entirely market-defined, for example by manufacturers' attempts, singly or as a group, to control market share, or by consumers' desire to simplify purchase decisions. Standards also result from pressure applied by well-intentioned groups of experts, or as a consequence of legislation in the public interest. And finally, standards come about as the expression of some emergent consensus within some large community. This last route is the most likely to endure, but the most difficult to achieve.

In creating such a consensus, there is an inevitable tension between the need to transform what is simply tried and tested into something normative and binding on the one hand, and the reluctance to straitjacket or constrain unanticipated development on the other. This is particularly true of the research and development arena, which depends for its survival on innovation, and thus on the ability to provide answers to as yet unformulated questions, while at the same time being as concerned as any other community to codify existing practice. The research community is populated by experts, who need maximal flexibility and who distrust constraint, but also by novices, who need access to that accumulated expertise in a consistent and codified form, if only in order to rebel against it and thus become experts in their turn.

Standardization of the way in which information is stored and represented (rather than processed) is the key to a number of closely related problems, all of central concern to users of modern Information Technology, be they academic or commercial. For creators of language resources in particular, it addresses the difficulty of ensuring that information is reusable; the difficulty of ensuring that information represented in different ways can be seamlessly integrated; and the difficulty of facilitating loss-free information interchange between the widest choice of different platforms, different application systems and different languages.

By standardizing at the level of text representation, we can hope to retain the flexibility needed to develop new applications, while ensuring that old ones continue to function. By attempting a theory-neutral standardization, at the level where consensus exists, we avoid the need to reinvent the wheel, without requiring that everyone drive a particular brand of bicycle.

In this spirit, the TEI Guidelines, which form the topic of this paper, aim to provide not a set of normative rules for particular applications, but rather a modular and extensible framework within which particular application-specific norms can be defined. The development of such TEI-aware norms is already under way in a number of contexts: most significantly for the present audience, within EAGLES and related EU projects such as Multext, but also in a wide variety of corpus building, scholarly editing and digital library projects. Such projects have in common the need to customize the framework defined by the TEI and make it less generic, while retaining the capacity for interchange for which it was developed. The general principles, and many of the specific mechanisms, underlying this approach are of clear relevance to all large-scale users of information technology.

This paper [See note 1] describes the origins and organization of the TEI scheme, including some technical details of how it may be customized for multiple application areas, and an overview of its coverage.

