Notes from the TEI Print Dictionary Group Meeting - April 3rd, 1991
S. Warwick-Armstrong

Date: 3.16.91
Place: Tempe, Arizona
Participants: B. Amsler (chair), N. Calzolari, L. Guthrie, N. Ide, F. Tompa, C. Van Ess-Dykema, S. Warwick-Armstrong

Agenda:
1. Identification of a base level tag set
2. Identification of higher-order grouping of tags
3. Standards of documentation
4. Standards for errata and comments
5. Desirability and feasibility of a separate data base format standard

This report is a summary of the major issues discussed during the meeting and the conclusions that were reached.

0. Organization of the TEI dictionary groups

Some discussion was devoted to clarifying the current situation w.r.t. dictionary working groups in the TEI. The following points were made:

- for phase II, two working groups on the topic of dictionaries have been organized:
    - one to work on the tagging of printed dictionaries, whose members are: B. Amsler (co-chair), B. Boguraev, N. Calzolari (co-chair), N. Ide, F. Tompa, C. Van Ess-Dykema, S. Warwick-Armstrong, and L. Guthrie (observer)
    - one to consider tagging computational lexica, whose members are: B. Ingria (chair), N. Calzolari, J. Pustejovsky, S. Warwick-Armstrong
- the goal of the print dictionary working group is to rewrite the dictionary section of the TEI guidelines
- the preliminary goal of the computational group is to study the feasibility of tagging computational lexica with a view to exchanging data between systems (if possible, the group will define a common tag set for the information that computational lexica contain)
- given the closely related aims of the TEI and the newly created CLR, the print dictionary working group will keep the CLR informed of all activities. The representative of the CLR is Louise Guthrie (currently participating in the TEI group as an observer).

In discussing what guidelines the print group is aiming for, the question of the actual format of the entries arose, e.g.
for typesetting purposes. Providing information adequate for 'generating' a dictionary entry is not the same as providing the information necessary for 'regenerating' a given printed entry. It was agreed that the first goal is to provide a structural definition of dictionary entries and then to consider what must be added for a given printed entry. Tags that preserve the printed form may already be foreseen in other parts of the guidelines.

In order to account for the specific needs of lexicographers and publishing houses who are very much concerned with the physical presentation of the entries, we discussed devoting one meeting to this topic with invited participants from the various publishing houses. In preparation, a preliminary set of tags accounting for structure and content of the entries (1st draft of dictionary guidelines) would be drawn up and thus provide a basis for discussing the needs of the publishing community. Invited participants would be asked to comment on the proposals and to identify their specific needs not addressed in the draft document. This meeting could take place in Oxford (September/October) as many of the potential participants may be attending the OED meeting.

1. & 2. Identification of tags for dictionaries

Initial work will concentrate on defining a required set of base tags and a recommended set of grouping tags. The starting point for defining the base tags will be the current TEI guidelines dictionary section (p. 183-187) plus other relevant sections (e.g. bibliographic citations, linguistic analysis, etc.). Other documents to be consulted are the Acquilex preliminary report, a document previously circulated on bilingual dictionaries (TEI AIW20), a new paper from Vassar-CNRS "Outline of a Data Base Model for Electronic Dictionaries", and the list of tags from the Waterloo work on Oxford dictionaries and tags proposed in the Amsler & Tompa proposal.
Monolingual tags will be the starting point; bilingual tags will be added as necessary. An initial proposal for base and higher-order tags is foreseen for June 1st.

3. Standards of documentation

Though the group did agree that the working goal was to provide a DTD for dictionaries, there was considerable confusion as to what this actually implied. It became clear that the group needs to gain familiarity with DTDs in general before working on the actual guidelines for the dictionary section.

The dictionary section foreseen for the TEI will essentially consist of three types of information (with pointers to related sections):

- definitions of required base tags and recommended higher-order tags
- sample entries
- DTDs for dictionaries with a discussion of extensions for given samples

An open question remained as to whether the group is responsible for a section on implicit information such as semantic relations, "is-a", "synonym", etc. The encoding of this type of information might overlap with the work carried out in the computational dictionary group.

Another type of information that might be included explicitly or merely as a pointer to another section of the guidelines concerns 'headers' to dictionaries. The two aspects identified were information about machine-readable dictionaries and how to use the header section guidelines (or suggestions on which parts to use). The first task the group will undertake is to look at the 'header section' in order to determine how it can be used and whether any additions (in the dictionary or the header sections) are necessary. The document provided with "Webster's 7th" by Peterson was suggested as a sample of what information can be provided with a dictionary.

4. Standards for errata and comments

The only information mentioned in this context was a history of corrections, which was considered potentially desirable. As N.
Ide will be attending the text criticism group where this issue will also be addressed, the topic was postponed until she reports back from that meeting.

5. On database formats

The concern of mapping to and from SGML and databases was briefly discussed. The conclusion was that this topic was not and will not be within the mandate of the print dictionary working group. It was however noted that some of the concerns are indirectly addressed in the higher-order tags to be specified.

6. Dates and plans

With a view to providing new guidelines for March '92, a first draft of the dictionary section should be prepared by the end of September '91 (for the meeting in Oxford). A preliminary set of tags will be defined by June 1st for both mono- and bilingual dictionaries. It is recommended that a meeting to address lexicographers' and publishers' concerns be held at the OED meeting in Oxford. The date fixed for the next dictionary group meeting is October 2nd in Oxford.

Specific tasks were assigned as follows:

Existing monolingual tags will be circulated to all:

- TEI and Amsler & Tompa tags: B. Amsler
- OED ("standardized") tags: F. Tompa
- OALD -> TEI mapping: N. Ide

Areas to concentrate on (expertise in) were divided up as follows:

    monolingual: B. Amsler, N. Ide, F. Tompa
    bilingual:   N. Calzolari, C. Van Ess-Dykema, S. Warwick-Armstrong
    DTD:         B. Amsler, N. Ide