Faith, Trust, and Pixie Dust

Trust - broken egg
Adapted from an original image by Kumar’s Edit (CC BY 2.0)

It’s been some time since I last blogged, largely because my focus has lain elsewhere in recent months, writing long-form pieces for more traditional outlets. The most recent of these considers the question of trust in digital things, a topic spurred by the recent (and ongoing) scandal surrounding the Post Office Horizon computer system here in the UK, which saw hundreds of people falsely convicted of theft, fraud, and false accounting. One of the things that came to the fore as a result of the scandal was the way that English law presumes the reliability of a computer system:

In effect, the ‘word’ of a computational system was considered to be of a higher evidential value than the opinion of legal professionals or the testimony of witnesses. This was not merely therefore a problem with digital evidence per se, but also the response to it. (McGuire and Renaud 2023: 453)

Continue reading

Discovery Machines

A model robot reading a kindle
Adapted from the original by Brian J. Matis (CC BY-NC-SA 2.0)

Michael Brian Schiffer is perhaps best known (amongst archaeologists of a certain age in the UK, at least) for his development of behavioural archaeology, which looked at the changing relationships between people and things as a response to the processual archaeology of Binford et al. (Schiffer 1976; 2010), and for his work on the formation processes of the archaeological record (Schiffer 1987). But Schiffer also has an extensive track record of work on archaeological (and behavioural) approaches to modern technologies and technological change (e.g., Schiffer 1992; 2011) which receives little attention in the digital archaeology arena. This is in part because, despite his interest in a host of other electrical devices involved in knowledge creation (e.g., Schiffer 2013, 81ff), he has little to say about computers beyond observing their use in modelling and simulation, or as an example of an aggregate technology constructed from multiple technologies and having a generalised functionality (Schiffer 2011, 167-171).

In his book The Archaeology of Science, Schiffer introduces the idea of the ‘discovery machine’. In applying such an apparatus,

Continue reading

HARKing to Big Data?

Aircraft Detection Before Radar
A 1920s aircraft detector

Big Data has been described as a revolutionary new scientific paradigm, one in which data-intensive approaches supersede more traditional scientific hypothesis testing. Conventional scientific practice entails the development of a research design with one or more falsifiable theories, followed by the collection of data which allows those theories to be tested and confirmed or rejected. In a Big Data world, the relationship between theory and data is reversed: data are collected first, and hypotheses arise from the subsequent analysis of those data (e.g., Smith and Cordes 2020, 102-3). Lohr described this as “listening to the data” to find correlations that appear to be linked to real-world behaviours (2015, 104). Classically this is associated with Anderson’s (in)famous declaration of the “end of theory”:

With enough data, the numbers speak for themselves … Correlation supersedes causation, and science can advance even without coherent models, unified theories, or really any mechanistic explanation at all. (Anderson 2008).

Such an approach to investigation has traditionally been seen as questionable scientific practice, since patterns will always be found even in the most random data, given enough data and sufficiently powerful computers to process it.
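To see why, consider a minimal sketch (in Python, using entirely synthetic data; the record and variable counts are purely illustrative) of what happens when correlations are hunted across enough variables at once:

```python
import numpy as np

# Pure noise: no real relationships exist in this 'dataset'.
rng = np.random.default_rng(42)
n_records, n_variables = 100, 200
data = rng.normal(size=(n_records, n_variables))

# Compute every pairwise correlation between the 200 variables.
corr = np.corrcoef(data, rowvar=False)
pairs = np.triu_indices(n_variables, k=1)   # each pair counted once
strongest = np.abs(corr[pairs]).max()

print(f"Pairs tested: {len(pairs[0])}")     # 19,900 pairs
print(f"Strongest correlation 'discovered': {strongest:.2f}")
# Across ~20,000 pairs of random variables, correlations of |r| > 0.4
# routinely emerge: 'patterns' with no causal mechanism behind them.
```

Listen to enough data, in other words, and something will always answer back.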

Continue reading

Dipping in Data Lakes

We’re becoming increasingly accustomed to talk of Big Data in archaeology, and at the same time we are beginning to see the resurgence of Artificial Intelligence in the shape of machine learning. We’ve spent the last 20 years or so assembling mountains of data in digital repositories, which are becoming big data resources to be mined for machine-learning training data. At the same time, we are increasingly aware of the restrictions that those same repositories impose upon us: the use of pre-cooked ‘what/where/when’ queries, the need to (re)structure data in order to integrate different data sources and suppliers, and their largely siloed nature which limits cross-repository connections, for example. More generally, we are accustomed to the need to organise our data in specific ways to fit the structures imposed by database management systems, or indeed to fit our data into the structures predefined by archaeological recording systems, both of which shape subsequent analysis. But what if it doesn’t need to be this way?
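By way of contrast, here is a minimal ‘schema-on-read’ sketch of the data lake idea (in Python; the directory name, file layout, and field names are all hypothetical): records are kept in their raw form, and structure is imposed at the moment of query rather than at the moment of deposit:

```python
import json
from pathlib import Path

def iter_records(lake_dir):
    """Yield every raw JSON record in the lake, whatever its shape."""
    for path in Path(lake_dir).glob("**/*.json"):
        with open(path) as f:
            yield json.load(f)

def query(lake_dir, **criteria):
    """Impose structure at read time: match on whichever fields a
    record happens to have, rather than rejecting it at deposit time
    for not fitting a predefined schema."""
    for record in iter_records(lake_dir):
        if all(record.get(field) == value for field, value in criteria.items()):
            yield record

# e.g. pull Roman-period finds without first agreeing a shared schema:
# for r in query("finds_lake", period="Roman"):
#     print(r.get("site", "unknown site"), r.get("material", "?"))
```

The point is not the code itself but where the structuring happens: nothing about the records has to be decided before they are deposited.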

Continue reading

Towards a digital ethics of agential devices

Image by Rawpixel CC0 1.0 via Creative Commons

Discussion of digital ethics is very much on trend: for example, the Proceedings of the IEEE special issue on ‘Ethical Considerations in the Design of Autonomous Systems’ has just been published (Volume 107, Issue 3), and the Philosophical Transactions of the Royal Society A published a special issue on ‘Governing Artificial Intelligence – ethical, legal and technical opportunities and challenges’ late in 2018. In that issue, Corinne Cath (2018, 3) draws attention to the growing body of literature surrounding AI and ethical frameworks and to debates over laws governing AI and robotics across the world, and points to an explosion of activity in 2018, with a dozen national strategies published and billions in government grants allocated. She also notes that many of the leaders in both the debates and the technologies are based in the USA, which itself presents an ethical issue: the extent to which AI systems mirror US culture rather than socio-cultural systems elsewhere around the world (Cath 2018, 4).

Agential devices, whether software or hardware, essentially extend the human mind by scaffolding or supporting our cognition. This broad definition therefore runs the gamut of digital tools and technologies, from digital cameras to survey devices (e.g. Huggett 2017), through software supporting data-driven meta-analyses and their incorporation in machine-learning tools, to remotely controlled terrestrial and aerial drones, remotely operated vehicles, autonomous surface and underwater vehicles, and lab-based robotic devices and semi-autonomous bio-mimetic or anthropomorphic robots. Many of these devices augment archaeological practice, reducing routinised and repetitive work in the office environment and in the field. Others augment work by developing data-driven methods which represent, store, and manipulate information in order to undertake tasks previously thought to be uncomputable or incapable of being automated. In the process, each raises ethical issues of various kinds. Whether agency can be associated with such devices can be questioned on the basis that they have no intent, responsibility, or liability, but I would simply suggest that anything we ascribe agency to acquires agency, especially bearing in mind the human tendency to anthropomorphise our tools and devices. What I am not suggesting, however, is that these systems have a mind or consciousness themselves, which raises a whole different set of ethical questions.

Continue reading

Explainability in digital systems

Created via http://www.hetemeel.com/

Some time ago, I suggested that machine-learning systems in archaeology ought to be able to provide human-scale explanations in support of their conclusions, noting that many of the techniques used in ML were filtering down into automated methods for classifying, extracting, and abstracting archaeological data. I concluded: “We would expect an archaeologist to explain their reasoning in arriving at a conclusion; why should we not expect the same of a computer system?”.
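What might a human-scale explanation look like? As a minimal sketch (using scikit-learn, with synthetic data and invented pottery-like feature names), a shallow decision tree can at least state the rules behind its classifications in a form an archaeologist could inspect and contest, which a deep neural network typically cannot:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic stand-in for an archaeological classification problem;
# the feature names are invented for illustration.
X, y = make_classification(n_samples=200, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)
features = ["rim_diameter", "fabric_hardness", "wall_thickness", "gloss"]

# A deliberately shallow tree: less powerful than a deep network,
# but its 'reasoning' can be printed as nested if/else rules.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(export_text(tree, feature_names=features))
```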

This seemed fair enough at the time, if admittedly challenging. What I hadn’t appreciated, though, was the controversial nature of such a claim. For sure, in that piece I referred to Yoshua Bengio’s argument that we don’t understand human experts and yet we trust them, so why should we not extend the same degree of trust to an expert computer (Pearson 2016)? But it transpires that this is quite a common argument against claims that systems should be capable of explaining themselves, not least among high-level Google scientists. For example, Geoff Hinton recently suggested in an interview that to require that you can explain how your AI system works (as, for example, the GDPR regulations do) would be a disaster:

Continue reading

Artificial Archaeologies

Adapted from original by Adam Purves CC BY 2.0

In his book Homo Deus, Yuval Noah Harari rather randomly chooses archaeology as an example of a job area that is ‘safe’ from Artificial Intelligence:

The likelihood that computer algorithms will displace archaeologists by 2033 is only 0.7 per cent, because their job requires highly sophisticated types of pattern recognition, and doesn’t produce huge profits. Hence it is improbable that corporations or government will make the necessary investment to automate archaeology within the next twenty years (Harari 2015, 380; citing Frey and Osborne 2013).

It’s an intriguing proposition, but is he right? Certainly, archaeology is far from a profit-generating machine, but he rather assumes that it’s down to governments or corporations to invest in archaeological automation: a very limited perspective on the origins of much archaeological innovation. However, the idea that archaeology is resistant to artificial intelligence is something that is worth unpicking.

Continue reading

Looking for explanations

Miracle cure
(US Food and Drug Administration – Public Domain)

In 2014 the European Union determined that a person’s ‘right to be forgotten’ by Google’s search was a basic human right, but it remains the subject of dispute. If requested, Google currently removes links to specific search results about an individual on any Google domain accessed from within Europe, and on any European Google domain wherever it is accessed from. Google is currently appealing against a proposed extension which would require the right to be forgotten to apply to searches across all Google domains regardless of location, so that something which might be perfectly legal in one country would be removed from sight because of the laws of another. Not surprisingly, Google sees this as a fundamental challenge to the accessibility of information.

As if the ‘right to be forgotten’ were not problematic enough, the EU has recently published its General Data Protection Regulation 2016/679, to be introduced from 2018, which places limits on the use of automated processing for decisions concerning individuals and requires explanations to be provided where an adverse effect on an individual can be demonstrated (Goodman and Flaxman 2016). This seems like a good idea on the face of it: shouldn’t a self-driving car be able to explain the circumstances behind a collision? Why wouldn’t we want a computer system to explain its reasoning, whether it concerns access to credit, the acquisition of an insurance policy, or the classification of an archaeological object?

Continue reading