Fuzzy Places and Ambiguous Data

-by Gabi Kirilloff

The types of visualizations that GIS enable can help us trace threads and commonalities through a subtle reworking of our visual perspective. Maps and graphs can help us see old data in a new light. This ability is something that I associate with the application of digital methodologies to literary studies overall. However, a potential pitfall often accompanies these benefits, namely, that maps and other types of visualizations have the potential to obscure the acts of interpretation that are taking place behind the scenes. As Johanna Drucker writes:

…graphical tools [such as GIS] are a kind of intellectual Trojan horse, a vehicle through which assumptions about what constitutes information swarm with potent force. These assumptions are cloaked in a rhetoric taken wholesale from the techniques of the empirical sciences that conceals their epistemological biases under a guise of familiarity.

Drucker is not opposing the use of maps altogether; rather, she is pointing out the ways in which visualizations, much like other critical tools, include acts of interpretation that aren’t always visible. Sometimes the scholar drives these acts of interpretation. Sometimes technical constraints mandate interpretive decisions. However, what makes a map different from, say, a journal article or even TEI encoding is the degree to which the map can appear to present an unbiased (and non-interpretive) “truth.”

I’ve been considering these issues a lot in my own work; I’m attempting to create a series of interactive maps that showcase the places that the American author Willa Cather writes about in her novels. For example, the following is a portion of a map of the places Cather mentions in her novel My Ántonia:


Countries and states that Cather mentions are indicated with orange coloring, while cities are indicated with a purple dot. The size of the dot corresponds to the frequency of mentions. Seems simple enough. The intended goal of this and other maps is to explore the relationship between Cather’s writing about place and her firsthand experience of place. Maps of Cather’s travels and maps of the places that she mentions in her letters would also be available and would, for example, spark questions such as: Does Cather frequently write about the places she traveled to? What types of places does Cather write about that she never experienced? When Cather writes letters to specific people, does she tend to talk about specific places?
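Even the dot sizing involves a choice. The sketch below is one possible way to turn mention counts into dot sizes, not a description of how the map above was actually built. It assumes that dot area, rather than radius, should grow in proportion to the number of mentions. The function name, the scale factor, and the counts are all hypothetical.

```python
import math

def dot_radius(mentions, scale=3.0):
    """Return a radius such that the dot's AREA is proportional to mentions.

    Assumption: area-proportional scaling. Scaling the radius directly
    would make frequently mentioned cities look disproportionately large.
    """
    if mentions < 1:
        raise ValueError("a mapped place needs at least one mention")
    return scale * math.sqrt(mentions)

# Hypothetical counts, invented for illustration (not taken from the novel).
mention_counts = {"Lincoln": 9, "Omaha": 4, "Chicago": 1}

radii = {place: dot_radius(count) for place, count in mention_counts.items()}
```

Under this assumption, a city mentioned four times gets a dot with twice the radius (and four times the area) of a city mentioned once. Choosing radius-proportional scaling instead would tell a visibly different story from the same counts.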

In order to make my research more straightforward, I decided early on to focus on the “real” concrete places that Cather mentions (excluding, for example, fictional places that are based on real locations). Further, because of the nature of the tool I am using to extract this information from Cather’s writing (the Stanford Named Entity Recognizer, which can extract proper place names from a text), I decided to examine only her use of proper geographic place names. So, for example, “It was raining in New York City” would be extracted and counted, while “It was raining in the big apple” would not.
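To make the consequence of that decision concrete, here is a minimal sketch of the counting step. It does not call the Stanford tool itself. It assumes the tagger’s output has already been converted into (token, tag) pairs and that place tokens carry a LOCATION tag. The function name and the tagged sample are hypothetical.

```python
from collections import Counter
from itertools import groupby

def extract_places(tagged_tokens, place_tag="LOCATION"):
    """Join each run of consecutive place-tagged tokens into one place name."""
    places = []
    for is_place, run in groupby(tagged_tokens, key=lambda pair: pair[1] == place_tag):
        if is_place:
            places.append(" ".join(token for token, _ in run))
    return places

# Hypothetical tagger output for the two example sentences above.
tagged = [
    ("It", "O"), ("was", "O"), ("raining", "O"), ("in", "O"),
    ("New", "LOCATION"), ("York", "LOCATION"), ("City", "LOCATION"), (".", "O"),
    ("It", "O"), ("was", "O"), ("raining", "O"), ("in", "O"),
    ("the", "O"), ("big", "O"), ("apple", "O"), (".", "O"),
]

place_counts = Counter(extract_places(tagged))
```

Only the first sentence contributes a place to the count; the nickname in the second never reaches the map. Note that this simple grouping would also merge two different places that happen to sit side by side with no untagged token between them.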

These decisions, made partially because of technical constraints, have deeply theoretical implications. They are themselves interpretive acts that call into question the very use of the term “place” in my research. The above map, then, far from displaying all of the “places” Cather references in My Ántonia, is really displaying something far more specific: all of Cather’s references to real geographic locations in which the reference takes the form of a proper place name. If you think about it, these are two very different things. Dialect, cultural references, and physical description are all ways that an author may write about a place without ever using a proper place name. This is not to say that the specific references to place that I am extracting are insignificant, only that it’s important to keep in mind the constraints of the methodologies I’ve chosen.

I’ve run into a couple of issues that have further pushed me to think about exactly what I mean by “place.” I’ve been wondering how I should represent geographic locations that have “fuzzy” boundaries. For example, the Stanford NER will extract “the South” as a place reference. However, the implied meaning of this location varies based on context; “the South” in Sapphira and the Slave Girl (a novel set in Virginia) may mean something different than “the South” in Shadows on the Rock (a novel set in Quebec). Even if the implied place were consistent, what are the boundaries of “the South,” and what did this geographic region mean to Cather? As visualizations, maps are not especially amenable to displaying these types of ambiguities.

Another problem that I’ve run into has to do with place names that are used to describe an object rather than to signify the place in question. I’m not extracting the adjectival forms of place names (e.g. French), but I do end up extracting place names that function like adjectives: “Panama hat,” “India ink,” and “Lombardy Poplars,” to name just a few. There are also place names tied up with people, e.g. “Our lady of Guadalupe.” I find these instances pretty fascinating because they raise some interesting questions about what constitutes a reference to, or an understanding of, a place. My first instinct is to say that of course “Panama hat” isn’t a reference to a place; these two words together become something entirely different, something more idiomatic. But it’s not so easy to draw that type of line. For example, are the distinctions between “Jamaican rum,” “Jamaica rum” (this one is in my corpus), and “rum from Jamaica” meaningful? Should one or more of these count as a reference to Jamaica but not the others?
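One way to keep this decision visible, rather than burying it in the extraction step, would be to label such mentions for review instead of silently dropping them. The sketch below is only an illustration under stated assumptions: the list of object nouns is hypothetical and hand-curated, the function name is invented, and the rule looks only at the first occurrence of the place name in a passage.

```python
import re

# Hypothetical, hand-curated list of object nouns; not derived from the corpus.
OBJECT_NOUNS = {"hat", "ink", "rum", "poplars", "carpets", "grapes", "wine"}

def classify_mention(place, passage, object_nouns=OBJECT_NOUNS):
    """Label a mention 'modifier' if an object noun directly follows the
    place name, and 'place' otherwise. Only the first occurrence is checked."""
    pattern = re.compile(r"\b" + re.escape(place) + r"\s+(\w+)", re.IGNORECASE)
    match = pattern.search(passage)
    if match and match.group(1).lower() in object_nouns:
        return "modifier"
    return "place"

# Invented example passages, for illustration only.
labels = {
    "Jamaica rum": classify_mention("Jamaica", "a bottle of Jamaica rum"),
    "rum from Jamaica": classify_mention("Jamaica", "rum from Jamaica"),
    "Panama hat": classify_mention("Panama", "He wore a Panama hat."),
}
```

A rule like this does not answer the question of whether “Jamaica rum” refers to Jamaica. It only records where the line was drawn, so that both readings can be mapped and compared.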

Part of me resists the idea of removing these types of references, since in a way the inclusion of the place in such phrases offers us an indication of the rich relationship that these objects have with specific geographic locations and cultures. Take, for example, the following from The Song of the Lark: “The bed was very wide, and the mattress thin and hard. Over the fat pillows were ‘shams’ embroidered in Turkey red…” After looking this up, I found out that Turkey red is a color that was used in the 18th and 19th centuries. The dye originated in Turkey. When Cather makes such a reference, there is no way to know what knowledge she had access to: did she know the history of this term? Did she know that it referred to the country of Turkey? However, the same could be true of many of her more concrete references as well. A character in O Pioneers! quotes the Bible, and this quote includes a reference to Lebanon. Given that this is a quote from another text, should this be considered a place reference? They may not be Cather’s words, but she did choose to include them.

I think that these examples help to highlight the ambiguous, complicated, and fuzzy nature of humanities data. They also bring to mind Drucker’s rejection of the term “data” altogether in favor of the term “capta,” “which is ‘taken’ actively while data is assumed to be a ‘given’ able to be recorded and observed.” This point offers a useful reminder of the active and interpretative nature of my work, and it has certainly given me something to think about as I slog through “Bokhara carpets,” “Malaga grapes,” and “Moselle wine.”


