“there is no persistently objective view […] ‘Raw Data’ is an oxymoron”.

I’m reading “Raw Data” Is an Oxymoron, edited by Lisa Gitelman, at the moment. It’s an interesting book that attempts to situate some of the hullabaloo around big data in a more critical frame, and it features contributions by Geoffrey C. Bowker, Kevin R. Brine, Ellen Gruber Garvey, Lisa Gitelman, Steven J. Jackson, Virginia Jackson, Markus Krajewski, Mary Poovey, Rita Raley, David Ribes, Daniel Rosenberg, Matthew Stanley, and Travis D. Williams.

The broad methodological thrust of the book applies approaches originating in science studies (e.g. critical appraisals of the practices of science) toward an expanded understanding of the boundaries of knowledge production as a socio-cultural activity. Many of the chapters touch on issues of data objectivity, or what I call the ‘problem of representation’ in big data.

In the opening chapter, Daniel Rosenberg excavates the etymology of ‘data’ as a theological term from the 17th century referring to incontrovertible scriptural truths. Around the same time, scientists began to use ‘data’ to refer to empirical evidence gathered to test scientific hypotheses. We’ve inherited this double understanding of the term, namely (i) data as a rhetorical and transcendental form and (ii) data as an object generated through observational processes. While in the former human agency is entirely absent (data is ‘God-given’), in the latter human subjectivity is ideally removed as a contaminating agent in the generation of truths.

It’s interesting to note that ‘transcendence’ (of the human, of subjectivity) is key to both understandings and underpins a common contemporary ‘belief’ in data’s neutrality or objectivity.

Data, as Lisa Gitelman and Virginia Jackson remind us in the introduction, needs to be interpreted against the complexity of a seamless world, which entails the use of the imagination.

“every discipline and disciplinary institution has its own norms and standards for the imagination of data”

To go further, we can argue that an imaginative relation to data is not merely confined to acts of interpretation, where ontological biases are brought to bear, but is present throughout the entire chain of data gathering, structuring (syntactical), and presentation (semantic). This point is picked up by Brine and Poovey later in the book in their work on economic modelling. In order to be usable, data has to be cleaned and made compatible with the various databases and algorithmic processes within which it will be stored and linked. By the time it reaches the public domain, it already exists at many levels of removal from an original referent.

Data, then, always exceeds its technical base and has a complex relationship to the claims of objectivity commonly attributed to it. Or, to paraphrase Deleuze, data is always social first. However, and it’s a big however, data practices broadly do adhere to phenomena ‘in the world’, demonstrably establishing objective relations to it in ways that, for example, save lives and enable us to understand climate change.

In this sense data can have a claim on ‘truth’ but, to borrow the Facebook relationship status option, ‘it’s complicated’, and human subjectivity is always implicated. The issue here is not a problematizing of objectivity in data but a more complex set of issues around how data (or the idea of objectivity itself) is situated as a historically specific set of social and technical activities, and precisely how an artistic practice might inhabit these. Data has emerged from a very specific set of epistemic conditions. What happens when artists engage with these, and how does this affect art practice?

Drawing on seminal work in science studies by Daston and Galison, Gitelman and Jackson correlate the rhetorics of ‘mechanical objectivity’ associated with data with the early history of photography. From the earliest days, the virtuous function of the photographic image as an impartial indexing of the world (Fox Talbot’s “Pencil of Nature” or André Bazin’s argument that photography removed “the sin” of subjectivity) was problematized by dissenting discourses and practices that stressed photography’s ability to also establish subjective relations. For Gitelman and Jackson, the ability of photography to gather up complex arrangements of subjectivity, materiality and mechanical objectivity describes not an either/or situation but a hybrid in which:

“mechanical objectivity is an ‘epistemic virtue’ among other competing virtues. The presumptive objectivity of the photographic image, like the presumptive rawness of data, seems necessary somehow – resilient in common parlance, utile in commonsense”

It’s a useful analogy, but we also need to emphasise that the epistemologies of data are very different from those of photography, even if the advent of digital photography has blurred the distinction.

But to summarise: data is always social first, subject to all manner of subjective framing. In such terms it cannot be considered ‘persistently’ objective in its relations, but there is, I would argue, a form of indexicality at play, a stickiness of data to that which it measures.