For example, Shanks draws a sharp distinction between Harryhausen’s stop motion creations and computer-generated imagery: the technique of stop motion animation never quite disappears into the background, which is part of both its charm and its effect, unlike the emphasis on photorealistic models in CGI.
In CGI the objective is often to have the imagery fabricated by the computer blend in so one doesn’t notice where the fabrication begins or ends. The rhetorical purpose of CGI is to fool, to deceive. Harryhausen’s models don’t look “real”. More precisely, they don’t look “natural”. No one need be fooled. One admires the craft in their making. (Shanks 2020)
Shawn Graham recently pointed me (and a number of colleagues!) to a new paper entitled ‘Computer vision, human senses, and language of art’ by Lev Manovich (2020) in a tweet in which he asked what we made of it … so, challenge accepted!
Lev Manovich is, of course, a professor of computer science and a prolific author, focusing on cultural analytics, artificial intelligence, and media theory, amongst other things. In this particular paper, he proposes that numbers, and their associated analytical methods, offer a new language for describing cultural artefacts. The idea that this is novel may be news to those who have been engaged in quantitative analyses across the humanities since before the introduction of the computer, but aspects of his argument go further than this. The actual context of the paper is as yet unclear since it is online first and not yet assigned to a volume. That said, a number of other open access online first papers in AI & Society seem to address similar themes, so one might imagine it to be a contribution to a collection of digital humanities-related papers concerning images and computer vision.
It’s an interesting paper, not least since – as Manovich says himself (p2) – it presents the perspective of an outside observer writing about the application of technological methods within the humanities. Consequently, it can be tempting to grump about how he “doesn’t understand” or “doesn’t appreciate” what is already done within the humanities, but it’s perhaps best to resist that temptation as far as possible.
There was a flurry of interest in the technical press during the summer with the news that GitHub had placed much of the open source code it held into an almost improbably long-term Arctic archive (e.g. Kimball 2020; Metcalf 2020; Vaughan 2020). GitHub’s timing seemed propitious: in the midst of a global pandemic, with wild fires burning out of control on the west coast of the USA and elsewhere, and with upgrades to the nearby Global Seed Vault recently finished after being flooded as a consequence of global warming.
The Arctic World Archive was set up by Piql in 2017 and is situated in a decommissioned mineshaft deep within the permafrost near Longyearbyen on the Svalbard archipelago. The data are stored on reels of piqlFilm (see Piql 2019, Piql nd), a high-resolution photosensitive film claimed to be secure for 750 years (and over 1000 years in cold, low-oxygen conditions), and hence to require no cycle of refresh and migrate, unlike all other forms of digital archive. The film holds both analog (text, images etc.) and digital information, with digital data stored as high-resolution QR codes. Explanations of how to decode and retrieve the information are included as text at the beginning of each reel, which can be read simply by holding the film up to a light source with a magnifying glass. Piql claim that only a camera/scanner and a computer of some kind will be required to restore the information in the future, which means that the archive outlives any technology used to store the data in the first place.
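Stripped of Piql’s engineering, the underlying principle – digital bytes written as a grid of visible cells, with the decoding convention stated in plain text alongside – can be sketched in a few lines. This is a toy illustration of why such a scheme is technology-independent, not Piql’s actual encoding (which uses high-resolution QR codes with error correction):

```python
def encode_frame(data: bytes, width: int = 16) -> list[list[int]]:
    """Turn bytes into rows of cells, one bit per cell (1 = a black cell)."""
    bits = []
    for byte in data:
        bits.extend((byte >> i) & 1 for i in range(7, -1, -1))  # MSB first
    while len(bits) % width:  # pad the final row so the frame is rectangular
        bits.append(0)
    return [bits[i:i + width] for i in range(0, len(bits), width)]


def decode_frame(frame: list[list[int]], length: int) -> bytes:
    """Read the cells back into bytes. All a future reader needs to know is
    the cell-to-bit convention, which the reel states in human-readable text."""
    bits = [cell for row in frame for cell in row]
    out = bytearray()
    for i in range(0, length * 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


message = b"hello, future"
frame = encode_frame(message)
assert decode_frame(frame, len(message)) == message
```

Because the “format” is nothing more than a visible pattern plus a stated convention, any future device that can photograph the film and run the few lines of logic above can recover the data – no proprietary drive, operating system, or file format needs to survive.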
We’ve all experienced that rush of recollection when we uncover some long-hidden or long-lost object from our past in the bottom of a drawer or box, triggering memories of encounters, activities, people, and places. We’re accustomed to the idea that we use evocative things as stored memories, deliberately or inadvertently, and as distributed extensions of our embodied memory (e.g. Heersmink 2018). Is it the same with digital objects? For example, van Dijck asks:
Are analog and digital objects interchangeable in the making, storing, and recalling of memories? Do digital objects change our inscription and remembrance of lived experience, and do they affect the memory process in our brains? (2007, xii).
Perhaps it’s a neurosis brought on by contemplation of my excavation backlog, but I think there is a difference: not all analog objects are equally interchangeable with digital equivalents in terms of their functioning as distributed memories, and this difference is significant when we consider the archaeological narratives we are able to construct from our digital records. It may be that this perspective is coloured by the physical nature of my backlog from the 1980s and 1990s, which for various reasons sits on the cusp of analog/digital recording. Ruth Tringham recalls how in the 1980s the digital recording of hitherto paper records was distrusted (Tringham 2010, 87), not least because of concerns about the fragility of the hardware and the impermanence of the product; in my case the reasons were rather more prosaic: as someone working with computers full-time in my day job, I had no desire to turn my excavation experience into a busman’s holiday as the on-site computer technician. The downside was that I subsequently gave myself the monumental task of manually entering the record sheets into the database and scanning/digitising the plans and sections in the off-season. In retrospect, however, this provides an opportunity to consider the different affordances of the two sets of analog and digital records, a perception reinforced by the pre-pandemic experience of packing up my office, which involved two days of sorting and moving the physical archive and about five minutes transferring the digital files.
There are quite a few metaphors associated with archaeological data, many of which relate to its apparent mystery. For example, Gavin Lucas has described the archaeological record as “haunted by absences” created by decay and destruction (Lucas 2012, 178). In a similar vein, Alison Wylie has described archaeological data as “shadowy”, suggesting that archaeology is defined “by the challenges of working with gaps and absences in its primary data” (Wylie 2017, 204). In a special issue of the journal Science, Technology, & Human Values on ‘Data Shadows’, Leonelli et al. describe data in terms of their presence, but also in terms of their unavailability, inaccessibility, or absence, defining absence as a descriptor of how “data are missing, incomplete, unreliable, ignored, unwanted, or untagged” (Leonelli et al. 2017, 192). As Christopher Chippindale put it,
Archaeology is plagued in many an instance with poorly defined variables (usually thought of as ‘data’) drawn from ill-understood populations, and with uncertain articulations between the entities whose logical relations we seek to understand. (2000, 611)
Bill Caraher has recently been considering the nature of ‘legacy data’ in archaeology (Caraher 2019, with a commentary by Andrew Reinhard). Amongst other things, he suggests there has been a shift from paper-based archives designed with an emphasis on the future to digital archives which often seem more concerned with present utility. Coincidentally, Bill’s post landed just as I was pondering the nature of the relationship between digital archives and our use of data.
So do digital archives represent a paradigm shift from traditional archives and archival practice, or are they simply a technological development of them? Digital archives are commonly understood to be a means of storing, organising, maintaining, and making data accessible in digital format. Relative to traditional archives they are therefore not limited by physical space or its associated costs, and so can make much more information available more easily, cheaply, and widely. But a consequence of this can be a kind of ‘storage mania’, in which data become easier to accumulate than to delete because of digitalisation, and in which data are released from the limitations of time and space through their dematerialisation (Sluis 2017, 28). This is akin to David Berry’s “infinite archives” (2017, 107); Berry suggests that “One way of thinking about computational archives and new forms of abstraction they produce is the specific ways in which they manage the ‘derangement’ of knowledge through distance” (2017, 119). At the same time, digital archives represent new technological material structures built on the performativity of the software which delivers large-scale processing of these apparently dematerialised data (Sluis 2017, 28).
“research data which has not been shared or published by any means and is thus in contravention of the ‘FAIR’ principles which require data to be Findable, Accessible, Interoperable and Reusable”.
Although the DPC jury hopes that this is a small group, I rather suspect that there is an unseen mountain of unpublished research data in archaeology (and in the interest of full disclosure: reader, I have some).
This crossed my screen at the same time as a paper published in the Harvard Data Science Review by Stephen Stigler: ‘Data Have a Limited Shelf Life’, in which he argues that data, unlike wines, do not improve with age. He suggests that old data are “Often … no more than decoration; sometimes they may be misleading in ways that cannot easily be discovered”, while emphasising that this is not the same as saying they have no value. Using three examples of old statistical data, he shows how misleading and incomplete such data can be if their full background is not known. In each case, the data were selected from a prior source, not always accurately referenced if at all. In some instances, uncovering the original data flagged problems with the sample that had been taken; in others, it revealed a greater breadth and depth of information which had gone unused because the particular research question had stripped it away.
Given the years, the money, the expertise, and the energy we’ve spent on creating and managing archaeological data archives, the relative lack of evidence of reuse is a problem. Making our data open and available doesn’t equate to their reuse, nor does making them accessible necessarily correspond to making them usable. But if we’re not reusing data, how can we justify these resources? In their reflections on large-scale online research infrastructures, Holly Wright and Julian Richards (2018) have recently suggested that we need to understand how to optimize archives and their interfaces in order to maximize the use and reuse of archaeological data, and to explore how archaeological archives can better respond to user needs alongside ways to document and understand both quantitative and qualitative reuse.
However, I would argue that all these kinds of issues (alongside those of citation, recognition, training, etc.), while not resolved, are at least known and mostly acknowledged. The real challenges to data reuse lie elsewhere and entail a much deeper understanding and appreciation of what reuse involves: issues associated with the re-presentation and interpretation of old data, the nature and purpose of reuse, and the opportunities and risks presented by reuse. Such questions are not specific to digital data; however, digital data change the terms of engagement with their near-instant access, volume, and flexibility, and their potentially transformative effects on the practice of archaeology now and in the future.
I recently published a paper, ‘Resilient Scholarship in the Digital Age’, which looked at the tensions between digital practice and academic labour (Huggett 2019). My focus was on the nature of academic experience within the modern university and the way in which the professional and personal life of the university academic is influenced by the digital technologies which enable and support the neoliberal commodification and commercialisation of universities (at least in the UK, North America and Australasia). It was a difficult paper to write, not least because of a strong personal interest and involvement, but also because of the way it ranged across digital sociology, the sociality of labour, resilience theory, management theory, feminist and Marxist theory, and so on, most of which was entirely new to me.
The referees were very positive in their comments (thankfully!), but one particular observation they made was that in focussing on university academia, I overlooked the implications for archaeological scholarship more widely, given that much of it occurs within the realms of Cultural Resource Management and related contract work, within governmental departments and non-governmental agencies, as well as within community initiatives. This is certainly true, as is underlined in the periodic surveys of archaeological employment in the UK (e.g. Aitchison 2019). However, in my response to the editors I argued that this was too broad a definition of scholarship for the scope of this particular paper, and, perhaps more importantly, would require a level of knowledge about the scholarly experience outside the university environment that I simply didn’t have – it’s some 30 years since I worked in contract archaeology, for example. Other people are better qualified than I to discuss scholarship in these working contexts.
We’re becoming increasingly accustomed to talk of Big Data in archaeology, and at the same time beginning to see the resurgence of Artificial Intelligence in the shape of machine learning. And we’ve spent the last 20 years or so assembling mountains of data in digital repositories which are becoming big data resources to be mined for machine learning training data. At the same time, we are increasingly aware of the restrictions that those same repositories impose upon us – the use of pre-cooked ‘what/where/when’ queries, the need to (re)structure data in order to integrate different data sources and suppliers, and their largely siloed nature which limits cross-repository connections, for example. More generally, we are accustomed to the need to organise our data in specific ways in order to fit the structures imposed by database management systems, or indeed, to fit our data into the structures predefined by archaeological recording systems, both of which shape subsequent analysis. But what if it doesn’t need to be this way?
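The grip that predefined structures have on our data can be made concrete. In the sketch below (an illustrative toy, not any real recording system – the table, fields, and the excavator’s note are all invented), a fixed relational schema silently discards whatever the context sheet’s designer did not anticipate, while a schema-less record keeps the observation whole:

```python
import json
import sqlite3

# A conventional recording structure: the schema decides in advance which
# attributes a context can have. Anything else has nowhere to go.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE contexts (id INTEGER, type TEXT, description TEXT)")

observation = {
    "id": 101,
    "type": "fill",
    "description": "dark silty fill of pit 100",
    # An observation the form's designer never anticipated:
    "excavator_note": "smells faintly of burning; possible hearth rake-out",
}

# Only the predefined columns survive the insert.
db.execute(
    "INSERT INTO contexts VALUES (:id, :type, :description)",
    {k: observation[k] for k in ("id", "type", "description")},
)
stored = db.execute("SELECT * FROM contexts").fetchone()
print(stored)  # the excavator's note is gone

# A schema-less alternative keeps the full record; structure can be
# imposed later, at analysis time, rather than at the moment of recording.
document = json.dumps(observation)
print(json.loads(document)["excavator_note"])
```

The point is not that one storage model is better, but that the structuring decision happens at different moments: the relational schema shapes the record before it exists, whereas the document approach defers that shaping, for better or worse, to whoever later reuses the data.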