We sometimes underestimate the impact of digital data on archaeology because we have become so accustomed to the capture, processing, and analysis of data using our digital tools. Of course, archaeology is by no means alone in this respect. For example, Sandra Rendgen, who writes about data visualisation, infographics and interactive media, recently pointed to the creation of a new genre of journalism that has arisen from the availability of digital data and the means to analyse them (2018a). But this growth in reliance on digital data should lead to a re-consideration of what we actually mean by data. Indeed, Rendgen suggests that the term ‘data’ can be likened to a transparent fluid – “always used but never much reflected upon” – because of its ubiquity and apparent lack of ambiguity (2018b).
Although there has been a dramatic growth in the development of autonomous vehicles and consequent competition between different companies and different methodologies, and despite the complexities of the task, the number of incidents remains remarkably small, though no less tragic where the death of the occupants or other road users is involved. Of course, at present autonomous cars are not literally autonomous, in the sense that a human agent is still required to be available to intervene, and accidents involving such vehicles are usually a consequence of the human component of the equation failing to react as they should. A recent fatal accident involving a Tesla Model X (e.g. Hruska 2018) has resulted in some push-back by Tesla, who have sought to emphasise that the blame lies with the deceased driver rather than with the technology. One of the company’s key concerns in this instance appears to be the defence of the functionality of their Autopilot system, and in relation to this, a rather startling comment on the Tesla blog recently stood out:
“No one knows about the accidents that didn’t happen, only the ones that did. The consequences of the public not using Autopilot, because of an inaccurate belief that it is less safe, would be extremely severe.” (Tesla 2018).
So here’s a thing. A while ago, I asked whether there was any way to quantify the extent to which archaeologists were citing their reuse of data. I used the Thomson Reuters/Clarivate Analytics Data Citation Index (DCI) as a starting point, but it didn’t go too well … Back then, the DCI indicated that 56 of the 476 data studies derived from the UK’s Archaeology Data Service repository had apparently been cited elsewhere in the Web of Science databases (the figure is currently 58 out of 515). But I also found that the citations themselves were problematic: the citation of the published paper/volume was frequently incomplete or abbreviated, many appeared to be self-citations from within interim or final reports, in some cases the citations preceded the dates of the project being referenced, and in many instances it was possible to demonstrate that the data had been cited (in some form or other) but this had not been captured in the DCI. At that point I concluded that the DCI was of little value at present. So what was going on?
Recent years have seen a flurry of publications and statements concerning the importance and value of the open science movement in archaeology. Examples include the collection of papers published in 2012 in World Archaeology (see Lake 2012), the volume on Open Source Archaeology edited by Andrew Wilson and Ben Edwards (2015), and, most recently, a series of papers by Ben Marwick (2016; Marwick et al 2017). The idea that publications, data, and methods (including code) should be freely accessible in order to make archaeological research more reproducible is evidently a ‘good thing’ and very much in vogue.
“Our very diverse work ranging from excavation, over lab tests, to interpretations is often only made available through a summarising publication that is rarely accessible to anyone other than institutions paying huge amounts of money. This is just not the way science works anymore. In such a system, how can we find out all the details of excavation results? How can we reproduce lab tests? How can we evaluate the empirical and historical background to a published interpretation in exhaustive detail? The answer is: we can’t.”
Rob Barrett has recently said something similar specifically in relation to 3D reconstruction. The value of opening up archaeological research seems undeniable, and the set of practices put forward by the new Open Science Interest Group (Marwick et al 2017, 12-13) makes a great deal of sense and is highly desirable. But there are some implicit underlying assumptions behind all this which don’t seem to have been addressed. They don’t detract from the importance of pursuing a truly open archaeology, but not recognising them risks not learning from past experience.
I’ve commented here and here about the question of data reuse (or more accurately, the lack of it) and the implications for archaeological digital repositories. It’s frequently argued that the key incentive for making data available for reuse is providing credit through citation. So how’s that going? I’ve not seen any attempt to actually quantify this, so out of curiosity I thought I’d have a go.
A logical starting point is the Thomson Reuters Data Citation Index – according to its owners (it’s a licensed rather than public resource), this indexes the contents of a large number of the world’s leading data repositories, and, on checking, the UK’s Archaeology Data Service (ADS) appears among them. So far so good.
We often hear of the active archive, but what about an idle one? In a post on Digital Data Realities, I suggested that, although we might wish otherwise, our digital archaeological data repositories seemed relatively little-used. The Archaeology Data Service access statistics did not suggest a large uptake for the project archives it holds, and the ADS had not found it easy to attract entries to its Digital Data Reuse Awards in the past. In that light, I commented that it would be interesting to see how the OpenContext & Carleton Prize for Archaeological Visualization would get on. Well, the jury is now in, and the winner is … the ‘Poggio Civitate VR Data Viewer’, an impressive-looking data viewer, though as it requires an HTC Vive to use, I can sadly only watch the video rather than experience it myself …
However, as interesting are Shawn Graham’s reflections on the experience of organising the contest:
“We offered real money – up to a $1000 in prizes. We promoted the hang out of it. We made films, we wrote tutorials, we contacted professors across the anglosphere. We had very little uptake.”
(accompanied in his presentation by an image of tumbleweed) … Indeed, only the one winner was announced for the team prize – no individual or student prizes were awarded as was originally intended. So what’s going on?
We’re becoming increasingly accustomed to our digital technologies acting as gatekeepers – perhaps most obviously in the way that the smartphone acts as gatekeeper to our calendar and/or email. In fact, this technological gatekeeping functionality appears everywhere you look, whether it’s in the form of physical devices providing access to information, software interfaces providing access to tools, or web interfaces providing access to data, for example. A while ago, I mused about the way that archaeological data are increasingly made available via key gatekeepers, and that consequently “negotiating access is often not as straightforward or clear-cut as it might be – both in terms of the shades of ‘openness’ on offer and the restrictions imposed by the interfaces to those data.” Since writing that, I’ve essentially left that statement hanging. What was I thinking of?
Infrastructures are all around us. They make the modern world work – whether we’re thinking of infrastructures in terms of gas, electric or water supply, telephony, fibre networks, road and rail systems, or organisations such as Google, Amazon and others. Infrastructures are also what we are building in archaeology. Data distribution systems have increasingly become an integral part of the archaeological toolkit, and the creation of a digital infrastructure – or cyberinfrastructure – underpins the set of grand challenges for archaeology laid out by Keith Kintigh and colleagues (2015), for example. But what are the consequences and challenges associated with these kinds of infrastructures? What are we knowingly or unknowingly constructing?
Patrik Svensson (2015) has pointed to a lack of critical work and an absence of systemic awareness surrounding the developments of infrastructures within the humanities. While he points to archaeology as one of the more developed in infrastructural terms, this isn’t necessarily a ‘good thing’ in the light of his critique. As he says, “Humanists do not … necessarily think of what they do as situated and conditioned in terms of infrastructures” (2015, 337) and consequently:
“A real risk … is that new humanities infrastructures will be based on existing infrastructures, often filtered through the technological side of the humanities or through the predominant models from science and engineering, rather than being based on the core and central needs of the humanities.” (2015, 337).
The UK is suddenly wakening from the reality distortion field that has been created by politicians on both sides and only now beginning to appreciate the consequences of Brexit – our imminent departure from the European Union. But – without forcing the metaphor – are we operating within some kind of archaeological reality distortion field in relation to digital data?
Undoubtedly one of the big successes of digital archaeology in recent years has been the development of digital data repositories and, correspondingly, increased access to archaeological information. Here in the UK we’ve been fortunate enough to have seen this develop over the past twenty years in the shape of the Archaeology Data Service, which offers search tools, access to digital back-issues of journals, monograph series and grey literature reports, and the availability of downloadable datasets from a variety of field and research projects. In the past, large-scale syntheses took years to complete (for instance, Richard Bradley’s synthesis of British and Irish prehistory took four years of paid research leave with three years of research assistant support in order to travel the country to seek out grey literature reports accumulated over 20 years (Bradley 2006, 10)). At this moment, there are almost 38,000 such reports in the Archaeology Data Service digital library, with more added each month (a more than five-fold increase since January 2011, for example). The appearance of projects of synthesis such as the Rural Settlement of Roman Britain is starting to provide evidence of the value of access to such online digital resources. And, of course, other countries increasingly have their own equivalents of the ADS – tDAR and OpenContext in the USA, DANS in the Netherlands, and the Hungarian National Museum’s Archaeology Database, for instance.
But all is not as rosy in the archaeological digital data world as it might be.
[To interrupt the blogging hiatus, here’s the introduction to a recently published paper …]
Since the mid-1990s the development of online access to archaeological information has been revolutionary. Easy availability of data has changed the starting point for archaeological enquiry, and the openness, quantity, range and scope of online digital data have long since passed a tipping point at which online access became useful, even essential. However, this transformative access to archaeological data has not itself been examined in a critical manner. Access is good, exploitation is an essential component of preservation, openness is desirable, comparability is a requirement, but what are the implications for archaeological research of this flow – some would say deluge – of information?