The historian sits down at her desk, flicking on the lamp. She begins to pore over a stack of badly photocopied court proceedings from late 18th-century London, transcribing the text. As she works, she begins to notice interesting patterns in the language used to describe young female prisoners. ‘I wonder…’ She turns to the Old Bailey Online and begins to search. Soon, she has a corpus of a thousand court proceedings featuring women prisoners. She downloads the complete transcriptions and loads them into Voyant Tools. Moments later, she has a graph of key words, their collocations, and their frequencies over time. A suspicion grows. She turns to MALLET and begins to look for the underlying semantic structure in the records. The algorithm, after much exploration, seems to suggest that 23 topics account for the majority of the words in each text.

But what do these topics, these lists of words, mean? She begins to explore the relationship between the topics and the texts, uncovering a web of discourse, seemingly surrounding the moral duty of the state towards women prisoners. She takes this web and begins to explore its formal characteristics as a network: what words, what ideas, are doing the heavy semantic lifting? At the same time, she runs the RezoViz tool (part of Voyant Tools) on the corpus to extract the named individuals and organizations in the documents. She begins to query the social network that she has extracted, and is able to identify sub-communities of women and warders, children and men, zeroing in on a smaller set of key individuals who tied the prison community together. Soon, she has a powerful, macroscopic sense of not just the discourses surrounding a century of women’s trials, but also of the key individuals, organizations, and their connections. She looks at the clock; two hours have passed. Satisfied, she turns off her historical macroscope, her computer, and turns once again to the transcription at hand.

We live in an era where humanities scholars need to understand what digital media, with their algorithms, assumptions, usage, and agency, are doing to the traditional projects of humanistic scholarship. The humanities and digital media (new media) go back decades, and have often informed each other’s development. Taking a broader view of what ‘new media’ can mean, we note that previous revolutions in communication technology, and the ways they represent and construct human knowledge, have similarly required new perspectives and new methods in response. Our scholar above illustrates one way historians might engage with so-called ‘big data’ in history. There are others. Between the three authors of this work, we have explored many different tools and perspectives in big data for historical and humanistic scholarship. This volume represents our view of what some of the most useful of these developing approaches are, how to use them, what to be wary of, and the kinds of questions and new perspectives our macroscope opens up.

We call this book the Historians’ Macroscope to suggest both a tool and a perspective. We are not implying that this is the way historians will ‘do’ history when it comes to big data; rather, it is but one piece of the toolkit, one more way of dealing with ‘big’ amounts of data that historians now have to grapple with. What is more, a ‘macroscope’, a tool for looking at the very big, deliberately suggests a scientist’s workbench, where the investigator moves between different tools for exploring different scales, keeping notes in a lab notebook. Similarly, an approach to big data for the historian (we argue) needs to be a public approach, with the historian keeping an open notebook so that others may explore the same paths through the information, while possibly reaching very different conclusions. This is a generative approach: big data for the humanities is not only about justifying a story about the past, but generating new stories, new perspectives, given our new vantage points and tools.

