Introduction to XPATH

[9 Sep (Sun) Morning] Half-day Workshops


Syd Bauman (Northeastern University), Sarah Stanley (Florida State University), Elisa Beshero-Bondar (University of Pittsburgh at Greensburg)

(Room A3)

Abstract

This workshop will cover the basics of XPath, an extremely powerful language for navigating and processing XML. With XPath, you can search for highly specific textual features, such as finding all the <div> elements with a specific @type value, or determining how often a certain character in a play speaks in verse or in prose. XPath is the underlying language used by XSLT, XQuery, Schematron, and the TEI Processing Model, as well as by some XPointer schemes and TEI ODD constructs. While experience with writing valid TEI is desirable, participants only need to know how to write well-formed XML to benefit from this workshop.
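As a rough illustration of the two searches mentioned above, the following sketch runs XPath-style queries from Python's standard library. It is not part of the workshop materials, which use the oXygen XML Editor. The sample markup, the speaker identifier #ham, and the variable names are invented for this example; the markup omits the TEI namespace for brevity, and xml.etree.ElementTree supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI-like sample (no namespace, invented content).
SAMPLE = """
<text>
  <body>
    <div type="act">
      <div type="scene">
        <sp who="#ham"><l>To be, or not to be, that is the question:</l></sp>
        <sp who="#oph"><p>Good my lord, how does your honour?</p></sp>
        <sp who="#ham"><p>I humbly thank you; well, well, well.</p></sp>
      </div>
      <div type="scene">
        <sp who="#ham"><l>Soft you now!</l><l>The fair Ophelia!</l></sp>
      </div>
    </div>
  </body>
</text>
"""

root = ET.fromstring(SAMPLE)

# All <div> elements, at any depth, whose @type is "scene".
scenes = root.findall(".//div[@type='scene']")

# Speeches by one speaker that contain at least one verse line (<l>) ...
verse_speeches = root.findall(".//sp[@who='#ham'][l]")

# ... and speeches by the same speaker that contain a prose paragraph (<p>).
prose_speeches = root.findall(".//sp[@who='#ham'][p]")

print(len(scenes), len(verse_speeches), len(prose_speeches))
```

In a full XPath processor the same counts could be requested directly with the count() function, for example count(//sp[@who='#ham'][l]).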

Outline

This workshop will cover the basics of XPath, the syntax used for navigating and processing XML documents. Learning XPath will help any encoder of TEI to comprehend how their code fits and nests in the XML tree, and how to express relationships among its various parts. You can apply XPath to check for accuracy of text encoding and to identify patterns and irregularities in your coding. Learning XPath will also improve your vocabulary for describing the various units (“nodes”) of XML that are navigable, and you will learn how these are “walkable” and tractable for processing. Writing XPath is a fundamental skill for the transformation and querying of XML documents, and for developing schemas to check your encoding. XPath is also used by the TEI community in various contexts, for adapting the TEI Processing Model and for designing TEI ODD customizations and TEI XPointer schemes.

We will not cover all the applications of XPath in this workshop, but participants will gain perspective and experience to continue practicing and learning. In this workshop, you will learn about the vocabulary that XPath uses to talk about the component parts of XML trees. You will also learn about XPath syntax, including how to find ancestors and descendants and how to refine queries with predicates. While familiarity with TEI is desirable, it is not absolutely necessary for this workshop. At minimum, however, we recommend that participants have written an XML document and have basic familiarity with the XML syntax and the concepts of well-formedness and validity before attending this workshop. If you have written code with angle brackets and want to understand how you can process it and what you can build with it, this workshop is for you.
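To make descendants, predicates, and ancestors a little more concrete, here is a minimal sketch in Python, again using only the standard library rather than the workshop's own tooling. The sample document and the helper function ancestors() are hypothetical. Because xml.etree.ElementTree has no ancestor axis, the sketch emulates what a full XPath processor would express as ancestor::div by walking a parent map.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document (invented content, no namespace).
SAMPLE = """
<body>
  <div type="chapter" n="1">
    <head>Chapter One</head>
    <div type="section" n="1.1">
      <p>First paragraph.</p>
      <p>Second paragraph with a <name>Mary</name>.</p>
    </div>
  </div>
</body>
"""

root = ET.fromstring(SAMPLE)

# Descendants: ".//p" finds <p> elements at any depth below the root.
paragraphs = root.findall(".//p")

# Positional predicate: the last <p> child of the section-level <div>.
last_p = root.find(".//div[@type='section']/p[last()]")

# Child-existence predicate: only <div> elements that have a <head> child.
divs_with_head = root.findall(".//div[head]")

# Ancestors: ElementTree offers no ancestor axis, so record each
# element's parent and walk upward from the node of interest.
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """Return the ancestors of node, nearest first (hypothetical helper)."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


name = root.find(".//name")
# Emulates ancestor::div/@n, listed from the nearest <div> outward.
ancestor_divs = [a.get("n") for a in ancestors(name) if a.tag == "div"]
print(ancestor_divs)
```

The parent map is a common workaround for this library's limits; in an XPath-aware tool such as an XSLT or XQuery processor, the ancestor axis is available directly and no such helper is needed.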

During the course of the workshop, you will learn how XPath can help you transform data from your documents into various new formats and visualizations, and you will see some examples of how XPath is used in practice. The workshop will include a short demonstration of XSLT and XQuery, and participants will get to look at how XPath is used as a basis for transforming documents. While the workshop will not cover how to write XSLT or XQuery, which are used to transform and process XML documents, you should leave with a good grasp of how XPath is used in these contexts, and the workshop will prepare you for more advanced work in writing these kinds of programs.

This half-day workshop will consist of:

Introduction, setup, and basics of XPath vocabulary

Setup (check oXygen installations, load files)
Understanding XML as a Tree
Understanding XPath expressions:

Navigation: XPath Axes
Path Steps
XML Node types

Introduction to XPath syntax

Constructing XPath queries
Refining your searches with predicate “filters”
XPath functions (a selection)

Hands-on practice writing XPath expressions
Future directions

Overview of some uses of XPath in the TEI
Testing TEI with XPath (demonstration of Schematron)
Transforming TEI with XPath (demonstration of XSLT)

Participation and requirements

To participate in this workshop, attendees must bring their own laptops. Participants will receive an e-mail in advance of the workshop with information about installing the <oXygen/> XML Editor, and will be provided with an extended complimentary trial license key courtesy of SyncroSoft. No other installation is necessary.

The workshop should be held in a room with a projector with an HDMI connection (please let us know if HDMI is not available). Preferably, the space would be configured with participants seated in a semi-circle, rather than in rows, but if that is not possible, classroom-style seating will suffice.

The workshop should be capped at 15 participants.

Workshop Instructors

Syd Bauman is a Senior XML Programmer/Analyst with the Northeastern University Digital Scholarship Group. He is a hard-core devotee of descriptive markup, and has taught workshops in TEI since 2002 and XSLT (including XPath) since 2002. Syd served as the North American Editor of the TEI from 2001 to 2007, and currently serves on the TEI Technical Council.

Sarah Stanley is the Digital Humanities Librarian at Florida State University. She received a master’s in English Literature from Northeastern University in 2015. In her capacity as digital humanities librarian, Sarah is interested in creating sustainable infrastructures for digital scholarship that make concerted efforts to guide digital humanities practitioners towards digital literacy. Sarah also works with text encoding and the creation of digital editions. She currently serves on the Text Encoding Initiative Technical Council.

Elisa Beshero-Bondar is a member of the TEI Technical Council, as well as an Associate Professor of English and Director of the Center for the Digital Text at the University of Pittsburgh at Greensburg. Her projects investigate complex texts such as epics, plays, and multi-volume voyage logs, and involve her in experimentations with the TEI, including refining methods for computer-assisted collation of editions and probing questions of interoperability to reconcile diplomatic and critical edition encodings. She is the founder and organizer of the Digital Mitford project and its annual coding school.
