Extending msDescription to Describe Printed Books (and to describe manuscripts better)

(Dot Porter)

The original abstract for this paper stated that it would discuss options for extending the MS Description module to support the description of printed books.¹ Options mentioned specifically in that abstract include

Selective renaming and slight modification of element definitions to allow printed in addition to handwritten text;
The addition of a few new elements to describe elements of printed books not also found in handwritten manuscripts (typeface, printing press/practices); and
The addition of new elements <collationFormula> and <codexStructure>, already proposed by the PB working group, which will strengthen description of manuscripts as well as satisfy the needs of printed book descriptions.

Once I began research into the practice of printed book description, however, it became clear that the subject requires a more systematic investigation than I have been able to make, preferably involving the work of multiple subject specialists. As written, then, this paper does not make specific recommendations; instead it discusses a few issues regarding the description of printed books and makes some proposals for further activity, to be undertaken either through the existing Physical Bibliography workgroup or in a new Special Interest Group. Workgroups are officially composed and tasked by the TEI Council to formally recommend extensions or modifications to the TEI Guidelines, while Special Interest Groups are community-led groups that may make recommendations to the Council but have no official charge. In practice, however, the Council has been very open to the suggestions that have come out of the existing SIGs.

A few examples of “descriptions of printed books”

For this paper I consulted with Jim Birchfield, Curator of Rare Books at the University of Kentucky Libraries, who directed me to several different places where I might find descriptions of early printed books. There are a number of different types of descriptions, and for this paper I looked at those found in three places: rare books sales catalogues, library catalogues, and printed bibliographies. There may in fact be additional types, and the group responsible for further activity would need to take care that all types are considered in their work.

Although all types can contain the same basic information, the focus may be quite different. See attached appendix for multiple examples of each different type.

1. Rare books sales catalogues

Focus on individual copies
Variation from bookseller to bookseller. Some catalogues focus very much on physical description, others on historical background – of the copy itself, influences of the work, importance of the author, etc.
Examples: François Levaillant, Histoire naturelle des oiseaux de paradis… Entry 70 in Catalogue of Science, Medicine, Natural History Fifteen, William Patrick Watson Antiquarian Books, 2008 (description with three illustrations); “A ‘Masque’ for the Artillery Company”, entry 2 in Early English Books and Shakespeare in Translation, including books to be exhibited at the Globe on Bankside, Bernard Quaritch Ltd, Autumn 2008 (two works in one volume)

2. Library catalogues

Focus on individual copies
Focus on physical description rather than historical background
Electronic catalogues will normally be encoded using bibliographical standards, e.g. MARC, tending to be more data-centric than narrative description.
Examples: William Shakespeare, The life and death of King Richard the Second, (London, 1631), described in Hamnet, the Folger Library online catalogue; Four short but complete descriptions (entries 787-790) from pp. 308-309, James E. Walsh, A Catalogue of the Fifteenth-Century Printed Books in the Harvard University Library, Volume 1. Books Printed in Germany, German-Speaking Switzerland, And Austria-Hungary. Medieval & Renaissance Texts & Studies, Vol. 84 (Binghamton, NY, 1991)

3. Bibliographies

Tend to focus on the “ideal copy” with reference made to individual copies
Serious variation depending on the subject of the bibliography. The “Four Specimen Bibliographical Descriptions” provided by Gaskell have four very different foci. Specimen 2, for example, from a bibliography of the output of Cambridge University Press, focuses on the process of manufacture (including an incredibly detailed collation formula, description of types, and details of production including the names of compositors, pressmen and correctors, and details of payment).
Examples: Entry 202 (a) from W. W. Greg, A bibliography of the English printed drama to the Restoration (London, 1939); Entry 131 from Appendix I of D. F. McKenzie, The Cambridge University Press 1696-1712 (Cambridge, 1966) (both reprinted in Philip Gaskell, A New Introduction to Bibliography (New York & Oxford, 1972)).

A major difference between descriptions of manuscripts and descriptions of printed books is the focus of manuscript description (and thus msDesc) on the item at hand, and the focus of printed books description (indeed, of bibliography in general, including the bibl* family of tags in TEI) on more general levels of description (edition, issue, work, etc.).² In thinking about the relationship between the specific and general in description it may be helpful to turn to the Functional Requirements for Bibliographic Records (FRBR), a conceptual framework for considering bibliography from the very general (the “work”) through the realizations of that work (the “expression” and “manifestation”) down to the specific physical objects (the “item”). Printed books descriptions appear to include a combination of all the FRBR levels, which may indicate the need for a combination of the bibl* family of tags for description of work, expression, and manifestation, with msDesc for item level description. In any case, work will need to be done to ensure that all levels required for complete description of printed books are available in msDesc.
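The division of labour suggested above can be written down as a small lookup. This is only a sketch of one reading of the paragraph: the four level names come from FRBR, while the function name `suggested_container` and the use of "bibl" as shorthand for the whole bibl* family are assumptions made for illustration, not anything defined in the TEI Guidelines.

```python
# FRBR levels from most general to most specific, as named in the paragraph above.
FRBR_LEVELS = ("work", "expression", "manifestation", "item")


def suggested_container(level: str) -> str:
    """Return the TEI container tentatively associated with a FRBR level.

    "bibl" stands in for the whole bibl* family; the mapping is a reading
    of the discussion above, not a rule from the TEI Guidelines.
    """
    if level not in FRBR_LEVELS:
        raise ValueError(f"unknown FRBR level: {level!r}")
    # Only the item level (the physical copy) is handed to msDesc.
    return "msDesc" if level == "item" else "bibl"


mapping = {level: suggested_container(level) for level in FRBR_LEVELS}
```

A description that mixes levels, as many printed book descriptions appear to do, would under this reading need both kinds of container in the same record.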

What’s already being done?

Since 2004, the TEI Workgroup on Physical Bibliography has been working to develop recommendations for encoding the physical structure of codex books, which would be useful for describing both manuscripts and printed codex books, although not useful for other non-codex printed or handwritten materials (such as broadsides, papyrus scrolls, or inscribed stones).

“Domain” section from the “Charge for the Workgroup on Physical Bibliography” (complete charge here)

This work group is charged with developing guidelines for encoding information about the physical structure of printed books: specifically, information about how individual pages are located and identified within the larger structures of signatures and gatherings. The audience being served is fairly narrowly construed as the community of descriptive and analytical bibliographers for whom this information is essential as a way of documenting the construction of the physical book and as the basis for further analysis of printing practices and the history of particular editions.

The charge, therefore, is limited to the structural description of the codex. The recommendations coming out of the workgroup include creating children for the <collation> element including <collationFormula> and <codexStructure>. collationFormula is described as being

designed to be used to encode any of the standard kinds of collation formulae, such as the type of collation formula specified by Fredson Bowers in his influential book Principles of Bibliographical Description, the kinds used by manuscript cataloguers, and the kind employed in the Gesamtkatalog der Wiegendrucke, or to be adaptable to a project-specific style of collation.

codexStructure is described as

enclos[ing] a complex of elements that together describe the full physical form of a printed or handwritten book, such as <gathering> , <leaf> , and <page> . In the case of multi-volume works, <codexStructure> may be repeated for each volume.

Current draft recommendations from the PB workgroup, including collation details and book structure, are available online: http://www.tei-c.org/Activities/Workgroups/PB/PB-draft.xml .
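As a rough illustration of how the two proposed children of <collation> might sit side by side, the following Python sketch parses a hypothetical fragment and counts the leaves in each gathering. The element names are those quoted above from the draft; the `n` attributes, the nesting of <page> inside <leaf>, the formula text, and the two-leaf gatherings are all assumptions invented for this example and are not taken from the PB draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names follow the PB draft as quoted above;
# the n attributes, nesting, and formula text are illustrative assumptions.
COLLATION = """
<collation>
  <collationFormula>2°: A-B²</collationFormula>
  <codexStructure>
    <gathering n="A">
      <leaf n="A1"><page n="A1r"/><page n="A1v"/></leaf>
      <leaf n="A2"><page n="A2r"/><page n="A2v"/></leaf>
    </gathering>
    <gathering n="B">
      <leaf n="B1"><page n="B1r"/><page n="B1v"/></leaf>
      <leaf n="B2"><page n="B2r"/><page n="B2v"/></leaf>
    </gathering>
  </codexStructure>
</collation>
"""


def leaves_per_gathering(xml_text: str) -> dict:
    """Map each gathering's n attribute to the number of leaves it contains."""
    root = ET.fromstring(xml_text.strip())
    return {g.get("n"): len(g.findall("leaf")) for g in root.iter("gathering")}


counts = leaves_per_gathering(COLLATION)
formula = ET.fromstring(COLLATION.strip()).findtext("collationFormula")
```

The point of pairing the two elements is that the compact formula and the explicit structure describe the same book, so one can in principle be checked against the other.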

There is much information contained in descriptions of printed books that is not covered by the PB workgroup charge. In the last half of 2008, however, the TEI Council made a few notable changes to the Guidelines that move forward the encoding of the description of printed materials within the msDescription module. These changes have been motivated by specific requirements arising out of the ENRICH project, which seeks to “create seamless access to distributed information about manuscripts and rare old printed books in Europe.”

The first proposal (submitted 2008-08-16 by Lou Burnard: http://sourceforge.net/tracker/index.php?func=detail&aid=2055122&group_id=106328&atid=644065) was to “permit the elements docAuthor, docDate, docImprint, and docTitle (currently permitted only within a titlePage) within msItem and msContents.” This measure will help with one problem that was immediately obvious to me when I started researching this paper: the current msDescription module lacks the kinds of descriptors needed to identify materials that, unlike most manuscripts, have specific imprints, publishers and printers, dates, authors, and titles associated with them.
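A minimal sketch of what the first proposal would permit, using the Folger example cited earlier in this paper (Shakespeare, Richard the Second, London, 1631). The four doc* elements are those named in the proposal; the child elements <titlePart> and <pubPlace>, the `when` attribute, and the function name `imprint_summary` are assumptions about how such an entry might be filled in, not the proposal's own wording.

```python
import xml.etree.ElementTree as ET

# Hypothetical msItem: doc* elements as named in the proposal; the inner
# structure (titlePart, pubPlace, @when) is an illustrative assumption.
MS_ITEM = """
<msItem>
  <docAuthor>William Shakespeare</docAuthor>
  <docTitle>
    <titlePart>The life and death of King Richard the Second</titlePart>
  </docTitle>
  <docImprint><pubPlace>London</pubPlace>, <docDate when="1631">1631</docDate></docImprint>
</msItem>
"""


def imprint_summary(xml_text: str) -> dict:
    """Pull author, title, place and date out of a single msItem."""
    item = ET.fromstring(xml_text.strip())
    return {
        "author": item.findtext("docAuthor"),
        "title": item.findtext("docTitle/titlePart"),
        "place": item.findtext("docImprint/pubPlace"),
        "date": item.find("docImprint/docDate").get("when"),
    }


summary = imprint_summary(MS_ITEM)
```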

The second proposal (submitted 2008-08-16 by Lou Burnard (?): http://sourceforge.net/tracker/index.php?func=detail&aid=2055116&group_id=106328&atid=644065) was to add new elements <typeDesc> and <typeNote>, “to contain descriptions of the typographic features of a printed source, within the physDesc.” These elements would be analogous to handDesc and handNote, which are used to describe scribal hands. In his final comment on the proposal, Lou Burnard states that “[When we find a prefix to replace ms*] then we can probably get rid of both typeDesc and handDesc in favour of XXXdesc, with nested typeNote and handNotes as appropriate.”
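The parallel between the two pairs of elements can be sketched as follows. The element names come from the proposal and from the existing Guidelines; the descriptive text inside each note is invented for illustration, and the function name `notes_by_kind` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical physDesc for a printed book with manuscript annotations.
# The note contents are invented; only the element names are from the proposal.
PHYS_DESC = """
<physDesc>
  <handDesc>
    <handNote>Marginal annotations in a later cursive hand.</handNote>
  </handDesc>
  <typeDesc>
    <typeNote>Main text in roman type.</typeNote>
    <typeNote>Headings in blackletter.</typeNote>
  </typeDesc>
</physDesc>
"""


def notes_by_kind(xml_text: str) -> dict:
    """Collect handNote and typeNote texts under separate keys."""
    root = ET.fromstring(xml_text.strip())
    return {
        "hand": [n.text for n in root.findall("handDesc/handNote")],
        "type": [n.text for n in root.findall("typeDesc/typeNote")],
    }


notes = notes_by_kind(PHYS_DESC)
```

A single copy can need both descriptions at once, which is one reason a shared wrapper of the kind Burnard anticipates might be attractive.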

A third proposal that is not specifically related to the issue of describing printed materials but which nevertheless will have a bearing on it is that of the content model for physDesc. This proposal (submitted 2008-07-30 by Lou Burnard: http://sourceforge.net/tracker/index.php?func=detail&aid=2032879&group_id=106328&atid=644065) was to loosen the content model for physDesc, which up until then had been either entirely “unstructured” or entirely composed of specialized model.physDescPart elements. Even from the small amount of research done for this paper, looking at several different types of book descriptions, it is fairly clear that any new workgroup or SIG will have to deal with issues of structured vs. unstructured information in the same record, and this proposal (which according to the record on SourceForge has so far been implemented only in <bindingDesc> and <binding>) will be a helpful starting point for those discussions.

What’s yet to be done?

Although the PB working group and the TEI Council have made several important changes and recommendations, dedicated effort will be required to gauge what further modifications need to be made to the TEI Guidelines to support the encoding of descriptions of printed books. This activity should take place in an organized way, under the auspices of a workgroup or SIG, rather than in reaction to requests from specific individuals and projects (exemplified by the changes made on behalf of the ENRICH project).

Suggestions for future activity

Following are a few suggestions for specific areas that should be considered for future activity.

Recommendations on how to encode all levels of description (according to e.g. the FRBR model), including the “ideal copy”: msDesc is designed for the description of individual copies. Printed books and other printed materials, in addition to being described as individual copies and according to different bibliographic levels (see discussion of FRBR above), may also be described as the so-called “ideal copy.” As defined by Roy Stokes in The Function of Bibliography (Aldershot: Gower, 1982), the ideal copy is “an assessment of the physical details of the book and their exact relationship to the state in which the book was planned to appear at the time of its initial publication.” Although it may be of use for describing manuscript stemmata, the concept of the ideal copy isn’t really appropriate for manuscript descriptions, and msDesc offers no obvious way to encode it.
Reconsideration of the form and function of msIdentifier: As defined, msIdentifier identifies a manuscript in reference to its current physical location, usually in a library or other bibliographic institution, using the elements bloc, country, district, geogName, institution, msName, placeName, region, repository, and settlement. Judging from the descriptions that I have seen, these identifiers will not always be useful for the identification of printed books, especially those listed in rare books catalogues. A bookseller may be considered a “repository” and a catalogue number may be considered an “idno,” but books listed in catalogues may be sold and no longer repose in the bookseller’s warehouse, while the numbers referring to specific books will change from catalogue to catalogue. This information, unless paired with a specific catalogue, does not provide the “unambiguous means of uniquely identifying a particular manuscript” or book, as promised in the Guidelines.
Consider expanding the definition of <history>, or create new parallel elements for the inclusion of broader historical information related to the content of the book: <history> is defined specifically as “describing the full history of a manuscript or manuscript part.” The descriptions of printed books that I have seen, especially those in rare books sales catalogues, include not only the history of the specific object but also discussions of the history of the work (the object being just one representation of that work) and biographical information about the authors, printers, publishers, etc.
Evaluate the draft recommendations put forth by the Physical Bibliography workgroup.
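The msIdentifier problem raised above can be made concrete with the two sales catalogue examples cited earlier in this paper. In this sketch the bookseller is treated as the <repository> and the entry number as the <idno>, as the suggestion describes; using <collection> to carry the catalogue title is an assumption made purely for illustration, and the <sketch> wrapper and the function name `catalogue_keys` are hypothetical and not part of TEI.

```python
import xml.etree.ElementTree as ET

# <sketch> is a non-TEI wrapper used only so the fragment has a single root.
# Putting the catalogue title in <collection> is an illustrative assumption.
IDENTIFIERS = """
<sketch>
  <msIdentifier>
    <repository>William Patrick Watson Antiquarian Books</repository>
    <collection>Catalogue of Science, Medicine, Natural History Fifteen</collection>
    <idno>70</idno>
  </msIdentifier>
  <msIdentifier>
    <repository>Bernard Quaritch Ltd</repository>
    <collection>Early English Books and Shakespeare in Translation</collection>
    <idno>2</idno>
  </msIdentifier>
</sketch>
"""


def catalogue_keys(xml_text: str) -> list:
    """Return (bookseller, catalogue, entry number) for each identifier."""
    root = ET.fromstring(xml_text.strip())
    return [
        (
            ident.findtext("repository"),
            ident.findtext("collection"),
            ident.findtext("idno"),
        )
        for ident in root.findall("msIdentifier")
    ]


keys = catalogue_keys(IDENTIFIERS)
```

Even the full triple identifies only an entry in a particular catalogue, not the present whereabouts of the book, which is the gap the suggestion points to.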

Appendix

Footnotes

1. Sincere thanks to Lou Burnard and James Cummings for reading and commenting on earlier versions of this paper, and to Jim Birchfield for tutoring me in the finer points of rare books description.
2. This is admittedly a soft division; manuscript descriptions may discuss other manuscripts and other versions of the text contained therein, while printed books descriptions, especially those found in rare books sales catalogues and library catalogues, will include detailed descriptions of the specific copy.

TEI

Members Meeting 2008
