Dipping in Data Lakes

We’re becoming increasingly accustomed to talk of Big Data in archaeology and at the same time beginning to see the resurgence of Artificial Intelligence in the shape of machine learning. And we’ve spent the last 20 years or so assembling mountains of data in digital repositories which are becoming big data resources to be mined for machine learning training data. Yet we are also increasingly aware of the restrictions that those same repositories impose upon us – for example, the use of pre-cooked ‘what/where/when’ queries, the need to (re)structure data in order to integrate different data sources and suppliers, and their largely siloed nature, which limits cross-repository connections. More generally, we are accustomed to the need to organise our data in specific ways in order to fit the structures imposed by database management systems, or indeed, to fit our data into the structures predefined by archaeological recording systems, both of which shape subsequent analysis. But what if it doesn’t need to be this way?

Towards a digital ethics of agential devices

Image by Rawpixel CC0 1.0 via Creative Commons

Discussion of digital ethics is very much on trend: for example, the Proceedings of the IEEE special issue on ‘Ethical Considerations in the Design of Autonomous Systems’ has just been published (Volume 107 Issue 3), and the Philosophical Transactions of the Royal Society A published a special issue on ‘Governing Artificial Intelligence – ethical, legal and technical opportunities and challenges’ late in 2018. In that issue, Corinne Cath (2018, 3) draws attention to the growing body of literature surrounding AI and ethical frameworks and to debates over laws governing AI and robotics across the world, and points to an explosion of activity in 2018 with a dozen national strategies published and billions in government grants allocated. She also notes that many of the leaders in both the debates and the technologies are based in the USA, which itself presents an ethical issue in terms of the extent to which AI systems mirror US culture rather than socio-cultural systems elsewhere around the world (Cath 2018, 4).

Agential devices, whether software or hardware, essentially extend the human mind by scaffolding or supporting our cognition. This broad definition therefore runs the gamut of digital tools and technologies, from digital cameras to survey devices (e.g. Huggett 2017), through software supporting data-driven meta-analyses and their incorporation in machine-learning tools, to remotely controlled terrestrial and aerial drones, remotely operated vehicles, autonomous surface and underwater vehicles, and lab-based robotic devices and semi-autonomous bio-mimetic or anthropomorphic robots. Many of these devices augment archaeological practice, reducing routinised and repetitive work in the office environment and in the field. Others augment work by developing data-driven methods which represent, store, and manipulate information in order to undertake tasks previously thought to be uncomputable or incapable of being automated. In the process, each raises ethical issues of various kinds. Whether agency can be associated with such devices can be questioned on the basis that they have no intent, responsibility or liability, but I would simply suggest that anything we ascribe agency to acquires agency, especially bearing in mind the human tendency to anthropomorphise our tools and devices. What I am not suggesting, however, is that these systems have a mind or consciousness themselves, which would raise a wholly different set of ethical questions.

Intrinsic Digitality

One might imagine that a claim that

“The archaeological record is intrinsically digital, not in the sense that it turns digital once the data have been entered and processed, but, more radically, in the sense that it is by its very nature digital, in its genesis and its structure.” (Buccellati 2017, 232)

would pique the interest of any digital archaeologist. But strangely, that seems not to be the case: Giorgio Buccellati’s book appears to be currently unreviewed and largely, it seems, unremarked upon. Two exceptions to this generalisation are Gavin Lucas and Bill Caraher. In his latest book, Gavin Lucas suggests that Buccellati’s characterisation of archaeology as natively digital is problematic (2019, 91), but the critique is limited as the book’s focus lies elsewhere, on textuality. In his response to Sara Perry and James Taylor’s ‘Theorising the Digital’ paper (2018), in which they point to the disconnect between the demonstrable impact of digital archaeology on archaeological method relative to its comparative lack of effect on archaeological theory, Bill Caraher suggests (2018) that Buccellati’s book represents a rare example of the interplay between digital theory and broader archaeological theory. So why does Buccellati argue that archaeology is natively digital? And is his characterisation of digitality useful to digital archaeology, as well as to archaeology more broadly?

A Push Button Archaeology

Adapted from original by włodi (via Wikimedia Commons) CC-BY-SA 2.0

Buttons figure large in the world around us. Just in the last year we’ve seen everything from presidents boasting about the size of their nuclear buttons, to Apple being faced with a class action over the failure of their new ‘improved’ butterfly keys, to Amazon’s Dash buttons being barred in Germany for not providing information about price prior to being pressed. In archaeology, we’ve become accustomed to buttons and button-presses generating data, performing analyses, and presenting results, ranging across the digital instruments we employ and the software tools we rely on. So, to pick a random example, “researchers will be able to compare ceramics across thousands of sites with a click of the button” (Smith et al. 2014, 245).

Rachel Plotnick has recently discussed the place of buttons in our cultural imaginary:

… push a button and something magical begins. A sound erupts that seems never to have existed before. A bomb explodes. A vote registers. A machine animates, whirling and processing. A trivial touch of the finger sets these forces in motion. The user is all powerful, sending the signal that turns on a television, a mobile phone, a microwave. She makes everything go. Whether or not she understands how the machine works, she determines the fate of the universe. (Plotnick 2018, xiv).

Explainability in digital systems

Created via http://www.hetemeel.com/

Some time ago, I suggested that machine-learning systems in archaeology ought to be able to provide human-scale explanations in support of their conclusions, noting that many of the techniques used in ML were filtering down into automated methods used to classify, extract and abstract archaeological data. I concluded: “We would expect an archaeologist to explain their reasoning in arriving at a conclusion; why should we not expect the same of a computer system?”

This seemed fair enough at the time, if admittedly challenging. What I hadn’t appreciated, though, was the controversial nature of such a claim. For sure, in that piece I referred to Yoshua Bengio’s argument that we don’t understand human experts and yet we trust them, so why should we not extend the same degree of trust to an expert computer (Pearson 2016)? But it transpires this is quite a common argument posited against claims that systems should be capable of explaining themselves, not least among high-level Google scientists. For example, Geoff Hinton recently suggested in an interview that to require that you can explain how your AI system works (as, for example, the GDPR does) would be a disaster:

Digital Place, Cognitive Space

CC0 Adapted from photo by Jordan Madrid on Unsplash

To what extent does our use of digital devices to capture and process archaeological data affect our perceptions of what was there? Mark Altaweel (2018) has recently asked a similar question in relation to GPS technologies – how do these affect our understanding and experience of place? He suggests that they diminish our sense of place and the experiences that we might otherwise have as we navigate according to their recommendations. Certainly, satnavs are notorious for taking our navigational cognitive load upon themselves and consequently leading drivers who are insufficiently aware of their surroundings into undesirable, even dangerous situations. We might think that the human cognitive load that is thereby freed up by such devices ought to be capable of being diverted into more useful, more extensive, areas – we literally have the space to think about bigger and deeper things as a consequence of their application. This kind of argument frequently arises in relation to the value of automation, and can be seen, for example, in the kinds of discussions surrounding the use of structure-from-motion photogrammetric recording on archaeological excavations. But is this supposed release of cognitive space an unalloyed good? Or is this a case of the technologies distancing us from the physicality of the archaeological material and space in front of us?

Artificial Archaeologies

Adapted from original by Adam Purves CC BY 2.0

In his book Homo Deus, Yuval Noah Harari rather randomly chooses archaeology as an example of a job area that is ‘safe’ from Artificial Intelligence:

The likelihood that computer algorithms will displace archaeologists by 2033 is only 0.7 per cent, because their job requires highly sophisticated types of pattern recognition, and doesn’t produce huge profits. Hence it is improbable that corporations or government will make the necessary investment to automate archaeology within the next twenty years (Harari 2015, 380; citing Frey and Osborne 2013).

It’s an intriguing proposition, but is he right? Certainly, archaeology is far from a profit-generating machine, but he rather assumes that it’s down to governments or corporations to invest in archaeological automation: a very limited perspective on the origins of much archaeological innovation. However, the idea that archaeology is resistant to artificial intelligence is something that is worth unpicking.

Archaeology of the Datanthropocene

Original by Clarence Alford CC0 via Pixabay

Some time ago, David Berry introduced the term ‘infrasomatization’ (Berry 2016), which he defines as the production of constitutive infrastructures; specifically the way that digital algorithms are deployed and change existing infrastructures, and how they alter rationalities by introducing computational interdependencies and structural brittleness into our systems (Berry 2018). In the process, he has just coined another new term: the Datanthropocene, the data-intensive society. This is closely linked to ‘big data’ approaches and data-intensive science, and he suggests that it “creates new economic structures but also new social realities and data-intensive subjectivities and hence new problems for society to negotiate”.

Of course, debates continue about the Anthropocene, not least whether or not it can even be defined as a specific epoch – does it start with the atomic era, for instance, or maybe even with the introduction of agriculture, or is it primarily associated with human-created climate change, pollution and extinctions?

Interactive Visualisation

CC0 1.0 Public Domain. Original by Stefan Keller via Pixabay

If a visualisation is to be perceived as realistic, is it increasingly required to respond to the viewer’s actions? Is static visualisation becoming old hat? Has interactivity become a necessary part of engendering perception, action, and emotion in our response to a visualisation? And what do we mean by interactivity?

Of course, interactivity may take various forms. For instance, it may entail navigation facilities: an ability to change the viewpoint, to move through the visualisation. It may also entail manipulation facilities: the ability to modify the visualisation, to move and re-organise elements. But what are we actually interacting with?

Evidently, we see a visual representation or simulation of an environment, so we are interacting with that simulation. But this implies a single interface, between us as the physical embodied viewer/actor and the visualisation. Indeed, Virtual Reality is characterised as the transparent invisible interface which is all-encompassing and three-dimensional; the user is surrounded by an immersive, total simulation in which the interface both disappears and becomes the experienced simulation at one and the same time (Pold 2005). But is this true?

Digital Data Relations

Data is the new oil
(adapted from original by Gerd Leonhard, CC-BY-SA 2.0)

We sometimes underestimate the impact of digital data on archaeology because we have become so accustomed to the capture, processing, and analysis of data using our digital tools. Of course, archaeology is by no means alone in this respect. For example, Sandra Rendgren, who writes about data visualisation, infographics and interactive media, recently pointed to the creation of a new genre of journalism that has arisen from the availability of digital data and the means to analyse them (2018a). But this growth in reliance on digital data should lead to a re-consideration of what we actually mean by data. Indeed, Sandra Rendgren suggests that the term ‘data’ can be likened to a transparent fluid – “always used but never much reflected upon” – because of its ubiquity and apparent lack of ambiguity (2018b).

