To know where we are today, and indeed where we are going, we need to understand where we as a discipline came from. In this section, we provide a brief overview of the evolution of the digital humanities and digital history: the intellectual tradition that has led to the projects discussed earlier in this chapter. This is not an exhaustive history, but it provides a basic sense of where our discipline has, in part, emerged from; it does not capture the many national and linguistic traditions that underlie the very diverse field of the broader digital humanities today. While today the digital humanities seems centered within universities, dominated by academics in one form or another, the field often traces its unlikely beginnings to the hopes and dreams of a Roman Catholic priest. Writing a history of a field that draws from so many different origin stories – corpus linguistics, computer science, English literature, historical studies, archaeology, and so forth – is difficult, but we anchor our work within the twin fields of digital humanities and quantitative history.
In 1946, Father Roberto Busa had a problem. He had just defended his doctoral dissertation at the Pontifical Gregorian University of Rome, in which he called for a comprehensive concordance of the works of St. Thomas Aquinas. A concordance provides a list of where a given word appears, in its context, everywhere in a given work: for example, if one took the Old Bailey sessions above and wanted to see the immediate context in which every instance of the word “poison” appeared, one would use a concordance. This is relatively simple to do with the computer programs of today (we will discuss one easy way, AntConc, in Chapter Three), but in 1946 it was a tall order. Busa conceived of a series of cards, which would – he estimated – number thirteen million in total. It would be his Index Thomisticus, a new way to understand the works of St. Thomas Aquinas. Recognizing the scale of the task, at a 1948 conference in Barcelona, Spain, Busa appealed “[for] any information [fellow scholars could] supply about such mechanical devices as would serve to achieve the greatest possible accuracy, with a maximum economy of human labor.”
Busa’s subsequent search would bring him to the United States and into contact with International Business Machines Corporation, or IBM. Fortuitously for Busa, IBM had some machine-time to spare. Its long-time president, Thomas J. Watson, met with Busa and, despite staff reservations that what Busa wanted was impossible, agreed to help. Using mechanical punch card readers, Busa set out to produce a concordance. As the cards were limited to eighty characters each, this called for short lines. A test case using St. Thomas Aquinas’ poetry was carried out. This 1951 test represented a groundbreaking moment in the development of humanities computing. As recounted by Thomas N. Winter, the mechanical (not computational, yet) construction of a concordance would require five steps: the transcription of phrases found in the text; multiplying cards by the number of words on each; breaking words down into entries (lemmas, or the roots of words, so that various forms of the same word would appear as one; e.g., “go” and “goes” represent the same concept); selecting and alphabetizing cards; and then publishing the final product in print. Importantly, the final product offered the following outcomes:
- Words and their frequency, forwards
- Words and their frequency, written backwards
- Words set out under their lemmata (e.g. aemulis under aemulus aemula aemulum), with frequency
- The Lemmata
- Index of the words
- Keyword-in-context concordance
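To make these outputs concrete, here is a minimal Python sketch of how a few of them might be produced today. It is an illustration only, not a reconstruction of Busa’s punch card procedure: the sample sentence, the function names, and the tiny hand-written lemma table are all our own assumptions, and the tokenizer is deliberately simplistic (lowercase ASCII letters only, so it would not cope with accented text as written). We also read “written backwards” as a reverse-spelling index, which is one plausible interpretation.

```python
import re
from collections import Counter

# Hypothetical sample text, invented for illustration only.
SAMPLE = "She goes to buy poison. He said the poison was for rats. They go home."


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how often each word form occurs."""
    return Counter(tokens)


def reverse_sorted(words):
    """Sort distinct words by their reversed spelling, so shared endings group together."""
    return sorted(set(words), key=lambda w: w[::-1])


def group_by_lemma(freqs, lemma_of):
    """Collect word forms and their counts under a headword (lemma).

    `lemma_of` is a hand-made mapping from word form to lemma; forms
    missing from the mapping are treated as their own lemma.
    """
    grouped = {}
    for form, count in freqs.items():
        lemma = lemma_of.get(form, form)
        grouped.setdefault(lemma, {})[form] = count
    return grouped


def kwic(tokens, keyword, width=3):
    """Return each occurrence of `keyword` with up to `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


if __name__ == "__main__":
    tokens = tokenize(SAMPLE)
    freqs = word_frequencies(tokens)
    print(freqs.most_common(3))
    print(reverse_sorted(tokens))
    print(group_by_lemma(freqs, {"goes": "go"}))
    for line in kwic(tokens, "poison", width=2):
        print(line)
```

A real project would replace the hand-written lemma table with a proper lemmatizer for the language in question; the point here is only that each of the listed outputs is a small transformation of the same token list.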
Busa’s high standards and unwillingness to compromise on a lesser version meant that the Index was many years in the making, but it eventually appeared in printed form in 1974 and online in 2005. Busa had lofty dreams. As IBM employee Paul Tasman, assigned to Busa, prophesied in 1957, “[t]he use of the latest data-processing tools developed primarily for science and commerce may provide a significant factor in facilitating future literary and scholarly studies.” This is now a reality.
As Susan Hockey recounts in her history of humanities computing, the sixties saw the rise of scholars interested in the opportunities offered by large-scale concordances. Scholars began with collections of texts, and subsequently moved into areas such as authorship attribution and quantitative approaches to literary style. Notably, in 1964, statisticians Frederick Mosteller and David Wallace used computers to attempt to identify the authors of a dozen disputed Federalist Papers, an attempt generally deemed successful. Conferences and journals emerged, such as Computers and the Humanities, accompanied by the establishment of research centres.
Historians would seem initially well positioned, by the 1950s and 1960s, to become involved with large-scale computational inquiries. The Annales school, today an inspiration for literary scholars such as the aforementioned Franco Moretti, aimed to dramatically expand the scope of historical inquiry. In particular, the work of Fernand Braudel is worth briefly discussing. Braudel, amongst the most influential historians of the twentieth century, pioneered a distant approach to history: his inquiries spanned large expanses of time and space, that of civilizations, the Mediterranean world, as well as smaller (only by comparison) regional histories of Italy and France. His approach to history did not necessitate disengagement from the human actors on the ground; rather, he saw that beyond the narrow focus of individual events lay the constant ebb and flow of poverty and other endemic structural features. While his methodology does not lend itself well to rapidly changing societies (his focus was on long-term, slow change), his points around distant reading and the need for collaborative interdisciplinary research are a useful antecedent for today’s researchers.
Quantitative and computational history was on the rise by the late 1960s. Articles and special issues on the subject littered several contemporary journals, including The American Historical Review, The Journal of American History, The Journal of Contemporary History, and History and Theory. In 1965, thirty-five historians attended a three-week seminar on computing in history at the University of Michigan; by 1967, over 800 scholars were receiving a newsletter aimed at computing for history. Two AHA conferences on quantification in history were held in 1967, and many of the earliest issues of Computers and the Humanities featured the work of historians.
For historians, however, computational history became associated with demographic, population, and economic histories. For a time in the 1970s, it looked like history might move wholesale into quantitative approaches, with the widespread application of math and statistics to the understanding of the past. By 1972, at least half a dozen new journals and magazines were devoted to some aspect of computing and history. Literature scholars pursued textual analysis; historians, to generalize a bit, preferred to count. A book like Michael Katz’s The People of Hamilton, Canada West, which traced economic mobility over decades using manuscript censuses, was a North American emblem of this form of work. These were fruitful undertakings, providing invaluable context to the more focused social history studies that fleshed out periods under study. A potential downside, however, was that computational history became narrowly identified with quantitative studies. This was not aided by some of the hyperbole that saw computational history as making more substantial “truth” claims, or the invocation of a “scientific method” of history. As mainstream historians increasingly questioned this strand of ‘objectivism’, itself a trend dating back to the 1930s, cliometrics became estranged from the mainstream of the profession. This stigma would persist into the early 21st century, even while literary scholars pursued increasingly sophisticated forms of textual analysis, social networking, and online exploratory portals. This 1960s and 1970s explosion of digital work represented the first “wave” of computational history.
Yet, in part due to hubris, debates over seminal works such as Time on the Cross, and a more general move towards social history within some elements of the historical profession, computational history retreated in the 1970s. The first wave had come to an end. It would re-emerge in the 1990s with the advent of personal computing, easy-to-use graphical user interfaces, and improvements in overall accessibility. This represented a second “wave” of computational history. Painstaking punchcards and opaque input syntax gave way to relatively easy-to-use databases, Geographical Information Systems (GIS), and even early online networks such as H-Net and USENET. Early conferences, like Hypertext and the Future of the Humanities, held at Yale in 1994, included some of the founders of modern digital history. The Journal of the Association for History and Computing (JAHC) was published between 1998 and 2010, growing out of annual meetings of the American Association for History and Computing, itself founded in Cincinnati in January 1996. Seeing digital methods as transforming both the creation and dissemination of history, the JAHC published forward-thinking articles on hypertext, digital teaching methods, barriers to adoption, and new means of representing the past (whether through sound, graphics, maps, or the Web). The back issues represent a snapshot of the cutting-edge work going on. Out of this work of crunching away at census manuscripts, attempting to identify authorship, and counting words, the broad interdisciplinary scholarly field known as Humanities Computing emerged, bringing together historians, philosophers, literary scholars, and others under a broad computational tent.
One could spend pages defining humanities computing, and its eventual successor the digital humanities. Indeed, several other authors have. In a provocative essay, “What is Humanities Computing and What is Not?,” John Unsworth defined the field as “a practice of representation, a form of modeling or [...] mimicry. It is [...] a way of reasoning and a set of ontological commitments, and its representational practice is shaped by the need for efficient computation on the one hand, and for human communication on the other.” Simply using word processors, e-mail, or communicating by listservs did not a humanist computationist make. To be ‘DH’ requires a new method of thinking. The shift towards the digital humanities was not simply a shift in nomenclature, although there are elements of that as well. Patrik Svensson has traced the shift from humanities computing to the digital humanities, showing how the change in name staked out a new definition that was broader and more inclusive: it took in questions of design, the born-digital, and new media studies, and placed more emphasis on tools and less on straightforward methodological discussions.
As recounted in Matthew Kirschenbaum’s take on the history of the digital humanities, the term also places different emphasis on different parts of the phrase. As the National Endowment for the Humanities Chief Information Officer put it to him, he “appreciated the fact that it seemed to cast a wider net than ‘humanities computing’ which seemed to imply a form of computing, whereas ‘digital humanities’ implied a form of humanism.”
This expanded definition seemed to extend to digital history as well. As previously discussed, digital history has an uneasy relationship to digital humanities; some see the former as a subset of the latter, while others see them as overlapping but not hierarchically connected communities. Digital history, for one, sits closer to the public humanities than many of its counterparts. The group is also much less well represented at digital humanities conferences than those coming from literature and modern language backgrounds, leading some recent popular accounts of the digital humanities to ignore digital history entirely. Conversely, digital historians often do not trace their roots back to Father Busa. We do not plan on resolving those differences here, but instead will draw from every corner of the big tent of digital humanities in order to facilitate training new digital historians.
In short, the term “digital humanities” is difficult to define. A fun way to open a digital humanities or history course is to visit Jason Heppler’s succinctly named website, “What is Digital Humanities?” Participants in the annual Day of DH (which began with the University of Alberta but now rotates between universities every two years based on an open call for proposals) are asked each year to provide their own definitions. Between 2009 and 2012, Heppler compiled 511 entries, all made available in raw data format. A visitor to his website can refresh it for a new definition, ranging from the short and whimsical (“[a]s a social construct,” “[t]aking people to bits”), to the long and comprehensive (“[d]igital Humanities is the critical study of how the technologies and techniques associated with the digital medium intersect with and alter humanities scholarship and scholarly communication”), to more specific definitions focused on making or digital preservation. Amongst such crowded and thoughtful conversation, we hesitate to add our own definition. A definition, however, is in order for the purposes of this book. We believe that the digital humanities are partly about understanding what digital tools have to offer, but also – and perhaps more importantly – about understanding what the digital does and has done to our understanding of the past and ourselves. In this book, with this in mind, we peel back the layers of a particular approach to big data using specific tools such as topic modeling and network analysis. Yet we realize that these tools need to be critically studied, as they have come from divergent disciplines and domains. With an understanding of how the digital humanities have evolved, from Father Busa through to humanities computing, we seek to explore the implications of a new era: the challenge and opportunity of big data.
The annual digital humanities conference is testament to the diversity of the field; the DH 2014 Conference Call for Proposals alone was translated into more than twenty-three languages. As historians, we acknowledge that ‘digital history’ can be viewed as having a rather different foundational narrative, emerging out of work in oral and public history (though our own personal trajectories are more from the digital humanities side of things than from those subfields of history). See for instance the post by Stephen Robertson, where he draws out the differences between ‘digital humanities’ and ‘digital history’. http://drstephenrobertson.com/2014/05/23/the-differences-between-digital-history-and-digital-humanities/
See Susan Hockey, “The History of Humanities Computing,” in A Companion to Digital Humanities, ed. Susan Schreibman, Ray Siemens, John Unsworth (Oxford: Blackwell, 2004), http://www.digitalhumanities.org/companion/view?docId=blackwell/9781405103213/9781405103213.xml&chunk.id=ss1-2-1.
The longue durée stands opposite the history of events, of instances, covering instead a very long time span in an interdisciplinary social science framework. In a long essay, Braudel noted that one can then see both structural crises of economies, and structures that constrain human society and development. For more, see Fernand Braudel, “History and the Social Sciences: The Longue Durée” in Fernand Braudel, On History, trans. Sarah Matthews (Chicago: University of Chicago Press, 1980), 27. The essay was originally printed in the Annales E.S.C., no. 4 (October-December 1958).
Michael Katz, The People of Hamilton, Canada West: Family and Class in a Mid-Nineteenth-Century City (Cambridge: Harvard University Press, 1975). See also A. Gordon Darroch and Michael D. Ornstein, “Ethnicity and Occupational Structure in Canada in 1871: The Vertical Mosaic in Historical Perspective,” Canadian Historical Review, 61.3 (1980): 305-333.
Computational historians in 1967 were already arguing against the common notion that quantitative history needed to be positivist history. Vern L. Bullough, “The Computer and the Historian–Some Tentative Beginnings,” Computers and the Humanities, 1.3 (1967).
Witness the debate over Robert William Fogel and Stanley L. Engerman, Time on the Cross: The Economics of American Negro Slavery (New York: W.W. Norton and Company, 1974), which was condemned for reducing the human condition of slavery to numbers. On the flip side, it also provided context in which to situate individual stories. The debate continues in countless historical methods classes today. For the general estrangement between the historical profession and cliometrics, see Ian Anderson, “History and Computing,” Making History: The Changing Face of the Profession in Britain, 2008, http://www.history.ac.uk/makinghistory/resources/articles/history_and_computing.html (viewed 20 December 2012).
Stephen Robertson, “The Differences between Digital History and Digital Humanities,” drstephenrobertson.com, 23 May 2014, http://drstephenrobertson.com/2014/05/23/the-differences-between-digital-history-and-digital-humanities/