Daily Archives: February 1, 2016

Why We TEI

I originally wrote this in November 2015 in an email to a team at Wooster working on a digital Independent Study, or “IS” as it’s called at Wooster. I have been lucky to be involved with a Wooster senior, Tess Henthorne, who wants to encode and study a heretofore unpublished work by a little-known early twentieth-century woman writer. In our project meetings we were wrestling with a couple of issues: (a) what is our desired scholarly output and (b) do we need the Text Encoding Initiative (TEI) guidelines to achieve that? I wrote this email as a reminder – mostly as a touchstone for myself – of how we might frame the utility of textual encoding in and of itself, and how we might frame later phases of the project that allow our student researcher to further (or further complicate) her arguments about the text.

Tess is giving a presentation to the English department later this week, so I post this (slightly revised version) here, in part, as a primer not on the process of encoding but on the rationale for deciding to do it, and in TEI. My hope is that it can provide some context for the kinds of questions with which students (and the rest of us) wrestle when thinking through digital projects. I also hope (gentle reader) that you’ll correct or complicate my ideas herein, either by comment below or on our Facebook page as a comment on the post.

We should remember that one encodes in TEI for (maybe) two distinct, if related, reasons. Primarily TEI allows a scholar to plug into and share with a larger community of textual and literary scholars. I think that, for the purposes of the IS, we’re talking about a tiered argument for the utility of TEI encoding for which this is our baseline purpose. In this scenario, to my mind, the IS looks similar to a traditional IS with the addition of a digital edition that is basic TEI, enriched with header data and a potential contribution to the larger TEI/literary studies community. The argument here isn’t necessarily the making-public (i.e. publication) of a digital version of Montel, but a digital edition that can form the basis for future or additional digital analysis. What Tess has been calling the “Critical Introduction” (I think?) is the thing that looks like a typical argument in literary studies. But the IS will also include a section that explains “why a digital edition,” which is basically the argument above: in an increasingly digital scholarly landscape this edition will (a) contribute to the community, (b) increase the potential for future work on Montel/Fornell [the author] because of discoverability and standard markup (i.e. TEI), and (c) create a digital surrogate that has a greater chance of survival as technologies and digital infrastructure evolve.

This is all related to the primary reason that we TEI. (Yes, I’m using it as a verb :-). )

I actually think there’s a sub-reason in here that represents another possible tier as we consider the IS, and that’s the critical encoding. (Maybe it’s not a “sub-reason” but a more specific example of why we TEI.) So, whether or not it’s displayed to the world in (say) Boilerplate, markup is an argument. This is a very important point when we’re talking about TEI to literary scholars. One makes a critical decision to tag something (e.g. the word “Jersey”), and one then makes a critical decision about how to tag it (i.e. is that a place? or a cow? or an article of clothing? a person’s nickname? Is it actually two or more of these at once: a metaphor?). So the argument for this tier would look something like this: Tess has an interpretation of Montel and Fornell, she decides how to represent the interpretation via markup of Montel, and then produces a TEI-encoded digital edition that is a representation of that interpretation.[note] Because it uses standard guidelines, all of the points above about its contribution to the “digital scholarly landscape” still hold. The IS then has to fold in an argument, similar to the first, that justifies the existence of this digital component. And this is an important point: this is all true whether or not a text is displayed on the web or otherwise visible. It’s the existence of this digital object that’s important, not necessarily that it’s human-readable. Not yet, anyway.
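To make the “Jersey” example concrete, here is a minimal sketch in Python (standard library only) of two competing encodings of the same word. The sentence is invented for illustration and is not from Montel, and the element choices (`placeName`, and `term` with a `type` attribute) are just one plausible way to apply the TEI guidelines, not a recommendation for Tess’s edition.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sentence, invented for illustration; not taken from Montel.
READING_AS_PLACE = (
    f'<p xmlns="{TEI_NS}">She longed for <placeName>Jersey</placeName>.</p>'
)
READING_AS_GARMENT = (
    f'<p xmlns="{TEI_NS}">She longed for <term type="clothing">Jersey</term>.</p>'
)


def describe_tagging(xml_text):
    """Return (element name, attributes) for each tagged child of the paragraph."""
    paragraph = ET.fromstring(xml_text)
    tagged = []
    for child in paragraph:
        local_name = child.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
        tagged.append((local_name, dict(child.attrib)))
    return tagged


def plain_text(xml_text):
    """Return the paragraph's text with all markup stripped out."""
    return "".join(ET.fromstring(xml_text).itertext())


place = describe_tagging(READING_AS_PLACE)
garment = describe_tagging(READING_AS_GARMENT)

print(plain_text(READING_AS_PLACE) == plain_text(READING_AS_GARMENT))
print(place, garment)
```

The words on the page are identical in both versions; only the markup differs, and that difference is the encoder’s interpretive claim.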

This brings me to the last (only secondary?) point: a second reason one encodes in TEI is the potential for further digital research that the digital edition affords. Once we’re talking about doing something with that marked-up, structured digital object (creating a display of critically encoded text using, say, Boilerplate), we’re talking about a different kind of digital humanities project. This represents the advanced tier of this enterprise because it involves programming and a different (new, additional, necessarily subsequent) set of research questions. Then it becomes less about the literary argument as such and more about how we can leverage technology to (a) make that argument more visible or (b) read the argument through a different lens altogether. It’s about reading something other than the text itself, something that’s at one remove from the text itself. It’s a valuable enterprise, and a digital humanities enterprise, but it’s different in its methods and logic from the encoding itself. And, importantly, it relies on that thoughtful encoding (on the digital, structured representation of an interpretation) for it to tell us anything.

It might be useful to think of these tiers as different kinds of reading. The first two are still, at their core, about reading the text, which is something that looks familiar to all of us, especially us literature/humanities types. The latter tier, though, is about reading an interpretation of an interpretation. In its best form it utilizes the encoded text as data: it reads your reading, interprets your interpretation. And then the researcher is interpreting the interpretation of your original interpretation. I know, right? Inception.
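As a rough illustration of what “the encoded text as data” could mean in practice, the sketch below (Python, standard library only) tallies the names an encoder chose to tag. The passage, the names in it, and the helper `tally_encoding` are all hypothetical, made up for this example; a real project would read the actual TEI file and would likely ask richer questions than a simple count.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical encoded passage, invented for illustration; not from Montel.
ENCODED = f"""
<body xmlns="{TEI_NS}">
  <p><persName>Anna</persName> left <placeName>Jersey</placeName> at dawn.</p>
  <p>In <placeName>Paris</placeName>, <persName>Anna</persName> wrote home.</p>
</body>
"""


def tally_encoding(xml_text, element_names):
    """Count how often each (element name, tagged text) pair occurs."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for name in element_names:
        # ElementTree addresses namespaced elements as "{namespace}localname".
        for element in root.iter(f"{{{TEI_NS}}}{name}"):
            tagged_text = "".join(element.itertext()).strip()
            counts[(name, tagged_text)] += 1
    return counts


tally = tally_encoding(ENCODED, ["persName", "placeName"])
for (name, text), count in sorted(tally.items()):
    print(name, text, count)
```

Note that the tally counts the encoder’s decisions rather than the words of the text: an untagged “Jersey” would never appear in it, which is exactly why this tier depends on the thoughtful encoding that precedes it.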

None of this is to say that we can’t do this latter tier: get something up on the internet using Boilerplate, hack the CSS to render your encoding in a special way, or even think about network graphs. In fact, I’d love to ask Jon Breitenbucher if he has a student who might be interested in hacking Boilerplate as a side project, or in trying to do something interesting with visualizing Tess’s interpretation; in either case the results would benefit Tess’s IS, but the IS doesn’t depend upon it. I just want to say that I think there’s an argument to be made for this enterprise exclusive of a representation that is translated/rendered for the world.



Note: I’ll complicate this point a little below, but I know that, in addition to “representing” text or interpretation thereof, encoding also facilitates further analysis.