When I started my work with KPLEX, I was not expecting to encounter so many references to literature. Specifically, to works of fiction I have read in my capacity as an erstwhile undergraduate and graduate student of literature who had (and still has) a devout personal interest in the very particular, paranoid postmodern fictions that crawled out of the Americas (North and South) like twitchy angst-ridden spiders in the mid-to-latter half of the 20th century. The George Orwell references did not surprise me all that much; after all, everyone loves to reference 1984. But Jorge Luis Borges, Thomas Pynchon, and Don DeLillo? These guys produced (the latter two are still producing) the kind of paranoiac post-Orwellian literature that could be nicely summed up by the Nirvana line “Just because you’re paranoid/ Don’t mean they’re not after you,” which is itself a slightly modified lift straight out of Joseph Heller’s Catch-22.
It seems, however, that when it comes to outlining, theorising and speculating over the state, uses, and value of data in 21st century society, the paranoid tinfoil hat wearing Americans and their close predecessor, the Argentinian master of the labyrinth, got there first.
We are all by now familiar with, or have at least likely heard reference to, the surveillance system in operation in 1984: a two-way screen that captures image and sound so that the inhabitants of Orwell’s world are always potentially being watched and listened to. In a post-Snowden era this all-seeing, all-hearing panoptic Orwellian entity has already been referenced to death, and indeed, as Rita Raley points out, Orwell’s two-way screen has long been considered inferior to the “disciplinary and control practice of monitoring, aggregating, and sorting data.” In other words, to the practice of “dataveillance.” But Don DeLillo’s vision of the role data would play in our future was somewhat different, more nuanced, and, most importantly, less overtly classifiable as dystopian; in fact, it reads rather like a description of an assiduous Google Search, yet it is to be found in the pages of a book first published in 1985:
It’s what we call a massive data-base tally. Gladney, J.A.K. I punch in the name, the substance, the exposure time and then I tap into your computer history. Your genetics, your personals, your medicals, your psychologicals, your police-and-hospitals. It comes back pulsing stars. This doesn’t mean anything is going to happen to you as such; at least not today or tomorrow. It just means you are the sum total of your data. No man escapes that.
Dataveillance is interesting because its function is not just to record and monitor, but also to speculate, to predict, and maybe even to prescribe. As a result, as Raley points out, its future value is speculative: “it awaits the query that would produce its value.” By value Raley is referring to the economic value this data may have in terms of its potential to influence people to buy and sell things; and so, we have a scenario wherein data is traded in a manner akin to shares or currency, where “data is the new oil of the internet”:
Data speculation means amassing data so as to produce patterns, as opposed to having an idea for which one needs to collect supporting data. Raw data is the material for informational patterns to come, its value unknown or uncertain until it is converted into the currency of information. And a robust data exchange, with so-termed data handlers and data brokers, has emerged to perform precisely this work of speculation. An illustrative example is BlueKai, “a marketplace where buyers and sellers trade high-quality targeting data like stocks,” more specifically, an auction for the near-instant circulation of user intent data (keyword searches, price searching and product comparison, destination cities from travel sites, activity on loan calculators).
This environment of highly sophisticated, near-constant amassing of data leads us back to DeLillo and his observation, made back in 1985, that “you are the sum total of your data.” And this is perhaps the very environment that leads Geoffrey Bowker to declare, in his provocative afterword to the collection of essays ‘Raw Data’ is an Oxymoron (2013), that we as humans are “entering into”, are “being entered” into, “the dataverse.” Within this dataverse, Bowker—who is being self-consciously hyperbolic—claims it is possible to “take the unnecessary human out of the equation,” envisioning a scenario wherein “our interaction with the world and each other is being rendered epiphenomenal to these data-program-data cycles” and one where, ultimately, “if you are not data, you don’t exist.” But this is precisely where we must be most cautious, particularly when it comes to the nascent dataverse of humanities researchers. Because while we might tentatively make the claim to be within a societal dataverse now, the alignment of data with existence and experience is still far from total. We cannot yet fully capture the entirety of the white noise of selfhood.
And this is where things start to get interesting, because what is perhaps dystopian from a contemporaneous perspective (that is, the presence somewhere out there of near infinitesimal quanta of data pertaining to you, your preferences, your activities), a scenario that might reasonably lead us to reach for those tinfoil hats, is, conversely, a desirable one from the perspective of historians and other humanities researchers. A data sublime, a “single database fantasy” wherein one could access everything, where nothing is hidden, and where the value, the intellectual, historical, and cultural value of the raw data is always speculative, always potentially of value to the researcher, and thus amassed and maintained with the same reverence associated with high value data traded today on platforms such as BlueKai. Because as it is, the amassing of big data for humanities researchers, particularly when it comes to converting extant analogue archives and collections, subjects the material to a hierarchising process wherein items of potential future value (speculative value) are left out or hidden, greatly diminishing their accessibility and altering the richness or fertility of the research landscape available to future scholars. After all, “if you are not data, you don’t exist.” But if you don’t exist then, to paraphrase Raley, you cannot be subjected to the search or query of future scholars and researchers, the search or query that would determine your value.
As we move towards these data sublime scenarios, it is important not to lose sight of the fact that that which is considered data now, this steadily accumulating catalogue of material pertaining to us as individuals or humans en masse, still does not capture everything. And if this is true now then it is doubly true (ability to resist Orwellian doublespeak at this stage in blogpost = zero) of our past selves and the analogue records that constitute the body of humanities research. How do we incorporate the “not data” in an environment where data is currency?
Happy Day of DH!
 Raley, “Dataveillance and Countervailance” in Gitelman ed., “Raw Data” is an Oxymoron, 124.
 Roger Clarke, quoted in ibid.
 Don DeLillo, White Noise, quoted in Gitelman ed., “Raw Data” is an Oxymoron, 121, emphasis in original.
 Raley, “Dataveillance and Countervailance” in ibid., 123–4.
 Julia Angwin, “The Web’s New Gold Mine: Your Secrets,” quoted in ibid., 123.
 Raley, “Dataveillance and Countervailance” in ibid., 123.
 Geoffrey Bowker, “Data Flakes: An Afterword to ‘Raw Data’ Is an Oxymoron” in ibid., 167.
 Bowker, in ibid., 170.
 Raley, “Dataveillance and Countervailance” in ibid., 128.
 Bowker, in ibid., 170.
Featured image is a still taken from the film version of 1984.