Chapter 3 Footnotes & Links

These links all worked as of September 18, 2015. Please ping us if you discover broken links.

1 Adam Crymble (5 August 2013), “Can We Reconstruct a Text from a Wordcloud?” Thoughts on Digital and Public History,

2 Stéfan Sinclair and Geoffrey Rockwell. — The Rhetoric of Text Analysis,

3 More tools are available in Voyant: click the “save” icon at the top right of the page, in the blue “Voyant Tools: Reveal Your Texts” title bar. This icon opens a pop-up with five different export options. The first, “a URL for this tool and current data,” will provide you with a direct URL to your corpus, which you may then share with others or return to at a later time; the final option, “a URL for a different tool/skin and current data,” will open another menu allowing you to select which tool you’d like to use.

If you selected “RezoViz” (a tool for constructing a network of the organizations, individuals, and place names extracted from your texts), you would end up with a URL like this: The string of numbers in that URL is the corpus ID for your texts. If you know the name of another tool, you can type it in after /tool/ and before /?corpus. Edit September 18: That particular corpus no longer works because the Voyant server is periodically cleaned of materials. Should you wish for something to be archived indefinitely, you will need to contact Voyant and ask for a corpus number to be archived.
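The URL edit described above can be sketched in Python. This is only an illustration under stated assumptions: the host example.org, the corpus ID, and the function name swap_tool are placeholders invented here, not Voyant's actual address or API, and “Cirrus” simply stands in for the name of another tool.

```python
import re

def swap_tool(url: str, new_tool: str) -> str:
    """Replace the tool name that sits between /tool/ and /?corpus."""
    # Capture the fixed parts on either side of the tool name.
    pattern = r"(/tool/)[^/]+(/\?corpus)"
    new_url, count = re.subn(
        pattern,
        lambda m: m.group(1) + new_tool + m.group(2),
        url,
    )
    if count == 0:
        raise ValueError("URL does not contain /tool/<name>/?corpus")
    return new_url

# Placeholder URL: host and corpus ID are invented for illustration.
example = "https://example.org/tool/RezoViz/?corpus=1234567890.5678"
print(swap_tool(example, "Cirrus"))
```

Everything after ?corpus is left untouched, so the corpus ID carries over to the new tool.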

4 See Shawn Graham, Guy Massie and Nadine Feuerherm (2013), “The HeritageCrowd Project: A Case Study in Crowdsourcing Public History,” in Writing History in the Digital Age, Jack Dougherty and Kristen Nawrotzki (eds), Ann Arbor, MI: University of Michigan Press, available at

5 Jonathan Stray has written an excellent piece on using Overview as part of a “data journalism” workflow, many points of which are appropriate to the historian. See Jonathan Stray (14 March 2014), “You Got the Documents. Now What? — Learning — Source: An OpenNews Project,” Source,

6 The documentation for Overview may be found at; the software itself can be downloaded at. Edit September 18: Overview has recently updated its instructions; the latest are at


8 As in this example

9 Regular expressions are sometimes implemented differently depending on which program you are working with. They simply do not work in Microsoft Word. For best results, try TextWrangler (on Mac) or Notepad++ (on Windows).

10 We have also lodged a copy of this file at

11 Notepad++ (for Windows) can be downloaded at; TextWrangler (for Mac) can be found at

12 Remember, these are markers for “word boundaries.” See
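A minimal sketch of what a word boundary does, assuming the markers in question are written \b, as they are in Python and most other regex flavours. The sample sentence is invented for illustration.

```python
import re

text = "The cat sat in the category catalogue; a cat, concatenated."

# Without boundaries, "cat" also matches inside longer words.
loose = re.findall(r"cat", text)

# With \b on both sides, only the standalone word "cat" matches.
strict = re.findall(r"\bcat\b", text)

print(len(loose), len(strict))
```

The second pattern skips “category,” “catalogue,” and “concatenated” because the letters on either side of “cat” mean there is no boundary there.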

13 This is described in more detail here

14 It’s worth pointing out that once you have cleaned data in CSV or TSV format, your data can be imported into a variety of other tools or be ready for other kinds of analysis. Many online visualization tools like Raw and Palladio accept and expect data in these formats.
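If a tool expects TSV and you have CSV, the conversion is short. The following is a sketch using Python's standard csv module; the function name csv_to_tsv and the sample rows are invented here for illustration.

```python
import csv
import io

def csv_to_tsv(csv_text: str) -> str:
    """Re-write comma-separated text as tab-separated text."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    # Tabs as the delimiter; "\n" keeps line endings predictable.
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()

# Invented sample data: note the quoted field containing a comma.
sample = 'name,place,year\n"Smith, John",Ottawa,1871\n'
print(csv_to_tsv(sample))
```

Letting the csv module parse the input, rather than replacing commas with tabs directly, keeps a field such as “Smith, John” in one piece.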

15 Jonathan Stray (14 March 2014), “You Got the Documents. Now What? —Learning — Source: An OpenNews Project,” Source,

16 Since it is open source, you can make and maintain your own copy, in the event that the original “Tabula” website goes offline. Indeed, this is a habit you should get into.
(This is called “forking” on GitHub; you create an account on GitHub, log in, then go to the repository you wish to copy. Click “fork” and you’ve got your own copy!)

17 David W. Gill and Christopher Chippindale (October 1993), “Material and Intellectual Consequences of Esteem for Cycladic Figures,” American Journal of Archaeology, 97(4), 601. Edit September 18: If you have access to JSTOR, you can download this article at

18 Another open source project, “Raw,” lets you paste your data into a box on a webpage and then render that data using a variety of different kinds of visualizations. You can download it at. Raw does not send any data over the Internet; it performs all calculations and visualizations within your browser, so your data stay secure. It is possible (but not easy) to install Raw locally on your own machine, if you wish. Follow the links on the Raw website to its GitHub code repository.