Looking for explanations

(US Food and Drug Administration – Public Domain)

In 2014 the European Union determined that a person’s ‘right to be forgotten’ by Google’s search was a basic human right, but it remains the subject of dispute. If requested, Google currently removes links to an individual’s specific search result on any Google domain that is accessed from within Europe and on any European Google domain from wherever it is accessed. Google is currently appealing against a proposed extension to this which would require the right to be forgotten to be extended to searches across all Google domains regardless of location, so that something which might be perfectly legal in one country would be removed from sight because of the laws of another. Not surprisingly, Google sees this as a fundamental challenge to accessibility of information.

As if the ‘right to be forgotten’ was not problematic enough, the EU has recently published its General Data Protection Regulation 2016/679 to be introduced from 2018 which places limits on the use of automated processing for decisions taken concerning individuals and requires explanations to be provided where an adverse effect on an individual can be demonstrated (Goodman and Flaxman 2016). This seems like a good idea on the face of it – shouldn’t a self-driving car be able to explain the circumstances behind a collision? Why wouldn’t we want a computer system to explain its reasoning, whether it concerns access to credit or the acquisition of an insurance policy or the classification of an archaeological object?

Continue reading

Digital Data Realities

The Cost of Digital Data (Ainsley Seago via Wikimedia Commons) CC BY 4.0

The UK is suddenly wakening from the reality distortion field that has been created by politicians on both sides and only now beginning to appreciate the consequences of Brexit – our imminent departure from the European Union. But – without forcing the metaphor – are we operating within some kind of archaeological reality distortion field in relation to digital data?

Undoubtedly one of the big successes of digital archaeology in recent years has been the development of digital data repositories and, correspondingly, increased access to archaeological information. Here in the UK we’ve been fortunate enough to have seen this develop over the past twenty years in the shape of the Archaeology Data Service, which offers search tools, access to digital back-issues of journals, monograph series and grey literature reports, and the availability of downloadable datasets from a variety of field and research projects. In the past, large-scale syntheses took years to complete (for instance, Richard Bradley’s synthesis of British and Irish prehistory took four years of paid research leave with three years of research assistant support in order to travel the country to seek out grey literature reports accumulated over 20 years (Bradley 2006, 10)). At this moment, there are almost 38,000 such reports in the Archaeology Data Service digital library, with more added each month (a more than five-fold increase since January 2011, for example). The appearance of projects of synthesis such as the Rural Settlement of Roman Britain is starting to provide evidence of the value of access to such online digital resources. And, of course, other countries increasingly have their own equivalents of the ADS – tDAR and OpenContext in the USA, DANS in the Netherlands, and the Hungarian National Museum’s Archaeology Database, for instance.

But all is not as rosy in the archaeological digital data world as it might be.

Continue reading

Biggish Data

Big Data 😉

Big Data is (are?) old hat … Big Data dropped off Gartner’s Emerging Technologies Hype Cycle altogether in 2015, having slipped into the ‘Trough of Disillusionment’ in 2014 (Gartner Inc. 2014, 2015a). The reason given for this was simply that it had evolved and had become the new normal – the high-volume, high-velocity, high-variety types of information that classically defined ‘big data’ were becoming embedded in a range of different practices (e.g. Heudecker 2015).

At the same time, some of the assumptions behind Big Data were being questioned. It was no longer quite so straightforward to claim that ‘big data’ could overcome ‘small data’ by throwing computer power at a problem, or that quantity outweighed quality such that the large size of a dataset offset any problems of errors and inaccuracies in the data (e.g. Mayer-Schönberger and Cukier 2013, 33), or that these data could be analysed in the absence of any hypotheses (Anderson 2008).

For instance, boyd and Crawford had highlighted the mythical status of ‘big data’; in particular that it somehow provided a higher order of intelligence that could create insights that were otherwise impossible, and assigned them an aura of truth, objectivity and accuracy (2012, 663). Others followed suit. For example, McFarland and McFarland (2015) have recently shown how most Big Data analyses produce “precisely inaccurate” results simply because the sample size is so large that they give rise to statistically highly significant results (hence the debacle over Google Flu Trends – for example, Lazer and Kennedy 2015). Similarly, Pechenick et al (2015) showed how, counter-intuitively, results from Google’s Books Corpus could easily be distorted by a single prolific author, or by the fact that there was a marked increase in scientific articles included in the corpus after the 1960s. Indeed, Peter Sondergaard, a senior vice president at Gartner and global head of Research, underlined that data (big or otherwise) are inherently dumb without algorithms to work on them (Gartner Inc. 2015b). In this regard, one might claim Big Data have been superseded by Big Algorithms in many respects.

Continue reading

Let’s talk about digital archaeology

Andre Costopoulos lays down a series of provocations in his opening editorial for the new Digital Archaeology section of the Frontiers in Digital Humanities journal. So far, there doesn’t seem to have been much response – Twitter chatter, for example, simply draws attention to the article without comment (except perhaps in one instance where it may or may not be addressed tongue-in-cheek – such is the danger of social media!).

Mimi and Eunice – (CC BY-SA 3.0)

He starts by saying simply:

“I want to stop talking about digital archeology. I want to continue doing archeology digitally … I would like to lay the groundwork for the journal as a place primarily to do archeology digitally, rather than as a place to discuss digital archeology”.

There’s certainly nothing wrong with a journal focussed on digital archaeological applications, but what’s wrong with talking about digital archaeology? He goes on:

Continue reading

A Digital Detox for Digital Archaeology?

Digital Detox sign
Adapted from original image by davitydave (CC BY 2.0)

Digital detox has been very much in the news of late, with celebrities from film stars to pop singers to video bloggers attempting to digitally detox for a host of different reasons. Ten years ago, Thomas Friedman, the New York Times op-ed columnist and three-time Pulitzer Prize winner, wrote about continuous partial attention – the consequence of our attempts to multitask when on the Internet or cellphone while watching television, typing an email or paper, and trying to hold a conversation with someone – he called it “the malady of modernity. We have gone from the Iron Age to the Industrial Age to the Information Age to the Age of Interruption”.

He was certainly not the first to draw attention to this – for example, in 1971 Herbert Simon, an American political scientist and Nobel Prize winner, wrote:

“In an information-rich world, the wealth of information means a dearth of something else: a scarcity of whatever it is that information consumes. What information consumes is rather obvious: it consumes the attention of its recipients. Hence a wealth of information creates a poverty of attention and a need to allocate that attention efficiently among the overabundance of information sources that might consume it.” (Simon 1971, 40-1).

Continue reading

Open data and the transformation of archaeological knowledge

[To interrupt the blogging hiatus, here’s the introduction to a recently published paper …]

Since the mid-1990s the development of online access to archaeological information has been revolutionary. Easy availability of data has changed the starting point for archaeological enquiry and the openness, quantity, range and scope of online digital data has long since passed a tipping point when online access became useful, even essential. However, this transformative access to archaeological data has not itself been examined in a critical manner. Access is good, exploitation is an essential component of preservation, openness is desirable, comparability is a requirement, but what are the implications for archaeological research of this flow – some would say deluge – of information?

Continue reading

Unconscious Bias

Modified from the original by grahamc99. CC-BY-2.0

My employer has decided to send all those of us involved in recruitment and promotion on Unconscious Bias training, in recognition that unconscious bias may affect our decisions in one way or another. Unconscious bias in our dealings with others may be triggered by both visible and invisible characteristics, including gender, age, skin colour, sexual orientation, (dis)ability, accent, education, class, professional group etc. That started me thinking – what about unconscious bias in relation to digital archaeology?

‘Unconscious bias’ isn’t a term commonly encountered within archaeology, although Sara Perry and others have written compellingly about online sexism and abuse experienced in academia and archaeology (Perry 2014, Perry et al 2015, for example). ‘Bias’, on the other hand, is rather more frequently referred to, especially in the context of our relationship to data. Most of us are aware, for instance, that as archaeologists we bring a host of preconceptions, assumptions, as well as cultural, gender and other biases to bear on our interpretations, and recognising this, seek means to reduce if not avoid it altogether. Nevertheless, there may still be bias in the sites we select, the data we collect, and the interpretations we place upon them. But what happens when the digital intervenes?

Continue reading

Shaping Boxes

Flight data recorder black box: image by Rameshng [CC BY-SA 3.0] via Wikimedia Commons

Bethany Nowviskie has written recently about black boxes:

“Nobody lives with conceptual black boxes and the allure of revelation more than the philologist or the scholarly editor. Unless it’s the historian—or the archaeologist—or the interpreter of the aesthetic dimension of arts and letters. Okay, nobody lives with black boxes more than the modern humanities scholar, and not only because of the ever-more-evident algorithmic and proprietary nature of our shared infrastructure for scholarly communication. She lives with black boxes for two further reasons: both because her subjects of inquiry are themselves products of systems obscured by time and loss (opaque or inaccessible, in part or in whole), and because she operates on datasets that, generally, come to her through the multiple, muddy layers of accident, selection, possessiveness, generosity, intellectual honesty, outright deception, and hard-to-parse interoperating subjectivities that we call a library.” (Nowviskie 2015 – her emphases)

Leaving aside the textual emphasis that is frequently the focus of digital humanities, these “multiple, muddy layers” certainly speak to the archaeologist in me. The idea that digital archaeologists (and archaeologists using digital tools for that matter) work with black boxes has a long history – for instance, the black-boxing of archaeological multivariate quantitative analyses in the 1960s and 1970s was a not uncommon criticism at the time. During the intervening forty-odd years, however, it has become a topic that we rarely discuss. What are the black boxes we use? Where do they appear? Do we recognise them? What is their effect? Nowviskie talks of black boxes in terms of the subjects of enquiry – which as archaeologists we can certainly understand! – and the datasets about them, but, as she recognises, black boxing extends far beyond this.

Continue reading

A Post-Digital Archaeology?

Given the current state of digital archaeology, is it more properly referred to as post-digital archaeology? What does this mean? There’s a lot of confusion about the term ‘post-digital’, not least because it’s often used by techno-boosters in the sense of “what next?”, assuming that since everything is now digital, we’re looking to the next ‘big thing’ – a presumption that is questionable to say the least.

Continue reading