Structure of the Book

The book is structured in three broad sections. In the first two chapters, we provide a general overview of the field. Chapter One introduces you to the era of Big Data and explains why we believe that it matters for historians. We begin by discussing what big data means for humanities researchers, survey the current state of the field by pointing towards a few major projects, and then provide a brief history of our own field from its unlikely origins with a priest at the Pontifical Gregorian University of Rome. This first chapter then concludes with a discussion of Big Data and the academy, as well as of how we believe that historians are now embarking upon a "third wave" of computational engagement. Chapter Two continues this general introduction by focusing on the more particular question of what the Digital Humanities, or DH, moment is. We provide an overview of several key terms, including open access, copyright, and data mining; argue that all historians are already digital, even if they do not self-identify as digital historians; and begin to gently show you how to build your own historian's toolkit. The section ends with a treatment of how to get your own historical data, gearing you up for the section that follows.

The second section of our book, which emphasizes hands-on textual analysis tools, kicks off with Chapter Three and our explanation of several data mining tools that can access the data that you began to grab at the end of Chapter Two. We start gently, with word clouds and other off-the-shelf software that can help you quickly make sense of large amounts of text. Things then begin to ramp up with an exploration of regular expressions. We warn you here that they are challenging to learn, but think of this part of the chapter as both an introduction to what they are and a reference manual. You can do amazingly powerful things with them, and regular expressions are a useful thing for any historian who works with text to have in their back pocket.

One catch with many of the tools in Chapter Three is that you have to know what you are looking for. Accordingly, in Chapter Four, we provide an in-depth exploration of various ways to "topic model" your sources. Topic modelling is one of the most exciting new tools in the digital humanities; in short, it reconstructs the various "topics" that make up the body of data, or corpus, that you are exploring. It can be a bit complicated, so we open with a "topic modelling by hand" exercise to help explain things. This is definitely a hands-on chapter, with a great payoff by the end.

Chapters Five, Six, and Seven make up the rough third section of our book, which places a strong emphasis on networks. As networks are both a kind of analysis and a powerful form of visualization, we spend some time discussing the basics of visualizations. We address one of the key fears that scholars have with big data: that it could mean losing the trees for the forest. We argue that network analysis allows historians to connect the micro and macro, situating individual actors within a complex interconnected economy. Network analysis has been one of the most fruitful ways in which topic models, discussed in the previous chapter, have been visualized. For historians, network analysis fruitfully explores concepts of space and time. Chapter Five provides a basic breakdown of the concept and vocabulary of network analysis, which is a unique and transformative modelling technique. All of these can be performed with a spreadsheet program or other rudimentary software. Chapter Seven then explores more detailed topics and the opportunities offered by them, as well as several more hands-on tools to get right into network analysis.

In the conclusion, we point to a few more things that you might be interested in learning, notably how to disseminate all the fantastic work that you have been able to do as a result of progressing through this book. We will also check in on our good friend from the preface, to see how her work is going after a while spent experimenting with her own Macroscope.

This book is not an exhaustive introduction to the entire field of Digital History, nor could any work be. We are three scholars based within North America, two in Canada and one in the United States, and our examples are often drawn from the scholarly milieu with which we are most comfortable. This book mostly concerns itself with text. We make an effort to draw on diverse examples, but we want to recognize both where we are writing from and our bias towards English-language sources (something which our tools occasionally share, as noted). Furthermore, the book has a particular interest in textual analysis, manipulation, and networks: we do not, for example, deal with historical GIS or database theory. Quite a bit of work has been done on GIS, including the recent open-access publication, Historical GIS Research in Canada, released by the University of Calgary Press and available for download at http://uofcpress.com/books/9781552387085.

