An experiment in writing in public, one page at a time, by S. Graham, I. Milligan, & S. Weingart

Voyant Tools

With your appetite whetted, you might want a more sophisticated way to explore large quantities of information. The suite of tools known as Voyant (previously known as Voyeur) provides this: it produces complex output from simple input. Growing out of the Hermeneuti.ca project[1], Voyant is an integrated textual analysis platform. Getting started is quick. Simply navigate to http://voyant-tools.org/ and either paste a large body of text into the box, provide a website address, or click on the ‘upload’ button to load text or PDF files into the system.

[Insert Figure 3.7 The standard Voyant Tools interface screen]

Voyant works on a single document or on a larger corpus. For the former, just upload one file or paste the text in; for the latter, upload multiple files at that initial stage. After uploading, the workbench will appear as shown in figure 3.7. The workbench puts an array of basic visualization and text analysis tools at your disposal. For customization or more options, click on the ‘gear’ icon in any of the smaller panes to get advanced options, including how you want to treat case (do you want upper and lower case to be treated the same?) and whether you want to include or exclude common stop words.
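
To make the case and stop word options concrete, here is a minimal Python sketch of what such settings do to a word count. It is not Voyant's own code: the function name `word_frequencies`, the tiny `STOP_WORDS` list, and the sample sentence are all our assumptions for illustration, and real stop lists are far longer.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stop list; real ones are much longer.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it"}

def word_frequencies(text, ignore_case=True, remove_stop_words=True):
    """Count words, optionally folding case and dropping stop words."""
    # Keep alphabetic words, allowing one internal apostrophe (e.g. "don't").
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)
    if ignore_case:
        words = [w.lower() for w in words]
    if remove_stop_words:
        # Compare in lower case so "The" is dropped even when case is kept.
        words = [w for w in words if w.lower() not in STOP_WORDS]
    return Counter(words)

sample = "The Canal opened in 1832. The canal is a canal of the Ottawa valley."

# Case folded: "Canal" and "canal" are counted together.
print(word_frequencies(sample).most_common(3))
# Case kept: "Canal" and "canal" are counted separately.
print(word_frequencies(sample, ignore_case=False).most_common(3))
```

Switching `ignore_case` changes whether ‘Canal’ and ‘canal’ count as one word, which is the same decision the gear icon asks you to make.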

With a large corpus, you can do the following things:

  • In the “Summary” box, track words that rise and fall. For example, with multiple documents uploaded in order of their year, you can see which words increase or decrease significantly over time.
  • For each individual word, see how its frequency varies over the length of the corpus. Clicking on a word in the text box will generate a line chart in the upper right. You can control for case.
  • For each individual word, you can also see the “Keyword-in-Context” in the lower right-hand pane: by default, the three words to the left and the three words to the right.
  • Track the distribution of a word by clicking on it and seeing where it occurs within the document, shown along the left-hand side of the central text column.
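
The “Keyword-in-Context” view in the list above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming simple whitespace tokenization; the function name `keyword_in_context` and the sample sentence are ours, and Voyant's own tokenizer will differ in its details.

```python
import string

def keyword_in_context(text, keyword, window=3, ignore_case=True):
    """Return (left, match, right) tuples with `window` words on each side."""
    tokens = text.split()  # naive whitespace tokenization (an assumption)

    def normalize(token):
        # Strip surrounding punctuation so "canal," still matches "canal".
        token = token.strip(string.punctuation)
        return token.lower() if ignore_case else token

    target = normalize(keyword)
    hits = []
    for i, token in enumerate(tokens):
        if normalize(token) == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits

text = "The canal opened in 1832, and the canal trade grew quickly after that."
for left, match, right in keyword_in_context(text, "canal"):
    print(f"{left:>20} | {match} | {right}")
```

The `window=3` default mirrors the three words on either side mentioned above; widening it gives you more context per hit.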

If you hold Ctrl and click on multiple words, you can compare those words in each of these windows.
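
If you are curious what such a comparison involves, the following Python sketch computes, for several words at once, how often each appears per 1,000 words in a sequence of documents. It is a rough stand-in for the line chart rather than Voyant's actual calculation; the function name `relative_frequencies`, the per-1,000 scaling, and the toy documents are our assumptions.

```python
import re

def relative_frequencies(documents, words):
    """For each word, list its frequency per 1,000 tokens in each document."""
    trends = {word: [] for word in words}
    for text in documents:
        # Lower-case everything so that case differences are ignored.
        tokens = re.findall(r"[a-z]+", text.lower())
        total = len(tokens) or 1  # avoid dividing by zero on empty documents
        for word in words:
            trends[word].append(1000 * tokens.count(word.lower()) / total)
    return trends

# Two toy 'documents', assumed to be in chronological order.
documents = ["steam steam canal horse", "steam rail rail rail"]
trends = relative_frequencies(documents, ["canal", "rail"])
for word, values in trends.items():
    print(word, values)
```

Reading the two lists side by side shows one word falling while the other rises, which is the pattern the comparison view lets you spot visually.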

These are all useful ways to interpret documents, and they offer a low barrier to entry for this sort of textual analysis. Voyant is ideal for smaller corpora or for classroom purposes.

Voyant, however, is best understood (like Wordle, albeit far more sophisticated) as a “gateway drug” when it comes to textual analysis. The default version is hosted on the McGill University servers, which limits its ability to process very large datasets. The developers do offer a home server installation as well, which was under development at the time of writing, and so at this point we recommend learning the basics from the Programming Historian, which can help you achieve similar things while learning some code along the way.

None of this, however, is to minimize the importance and utility of Voyant Tools, arguably the best research portal in existence. Even the most seasoned Big Data humanist can turn to Voyant for quick checks, or when dealing with smaller (yet still large) repositories. A few megabytes of textual data is no issue for Voyant, and the lack of programming expertise required is a good thing, even for old hands. We have several years of programming experience amongst us, and we often use Voyant for both specialized and generalized inquiries: if a corpus is small enough, Voyant is the right tool to use.[2]

Installing Voyant-Tools on Your Own Machine: A Quick Aside

It is possible to install Voyant Tools on your own machine. You might wish to do this to keep control of your documents. It could be a condition of the ethics review at your institution, for instance, that all oral history interview files are stored on a local machine without access to the Internet. You might like to run text analysis on the transcriptions, but you cannot upload them to the regular Voyant-Tools server. Finally, you might have a very large quantity of data that crashes the Voyant-Tools online site, but that your own computer could handily work with. If you had Voyant-Tools on your own machine, this would not be a problem.

The instructions could change with newer versions of Voyant, but for the moment if you go to http://docs.voyant-tools.org/resources/run-your-own/voyant-server/, you will find all of the information and files that you need. In essence, Voyant-Tools installs itself on your machine as a ‘server’ – it will serve up files and the results of its analysis to you via your web browser, even though you are not in fact pulling anything over the Internet.

It is a very easy installation process. Download the server software, unzip it, and then execute the VoyantServer.jar file. A control console will open, and when you click ‘start server,’ your browser will open as well. In the control console, you can change how much memory Voyant-Tools can access to perform its operations. By default, it will use one gigabyte of memory. This should be enough for most of your text analysis, but if you get an error saying you need more memory (if you are putting in lots of data, for example) you can increase this. When you are done with Voyant, you can click ‘stop server’ and it will close.

In the browser, the address bar will show a local address rather than a website on the Internet. This means that the page is being served to you locally, through port 8888. You do not need to worry about that, unless you are running other servers on your machine at the same time.
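
If you do run other servers and want to know whether port 8888 is already taken before starting Voyant, a short Python check like the following can tell you. This is a sketch under assumptions: the function name `port_in_use` is ours, and we assume the local server answers on the loopback address 127.0.0.1, which the text above does not state.

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when the connection succeeds,
        # that is, when some server is answering on that port.
        return sock.connect_ex((host, port)) == 0

if __name__ == "__main__":
    if port_in_use(8888):
        print("Port 8888 is busy: another server may already be running.")
    else:
        print("Port 8888 looks free.")
```

Run it before clicking ‘start server’; once Voyant is running, the same check should report the port as busy.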

[1] Sinclair, Stéfan and Geoffrey Rockwell. Hermeneuti.ca – The Rhetoric of Text Analysis. http://hermeneuti.ca

[2] More tools are available in Voyant by clicking on the ‘save’ icon at the top right of the page, in the blue ‘Voyant Tools: Reveal Your Texts’ title bar. This icon opens a pop-up with five different export options. The first, ‘a URL for this tool and current data’, will provide you with a direct URL to your corpus, which you may then share with others or return to at a later time; the final option, ‘a URL for a different tool/skin and current data’, will open another menu allowing you to select which tool you’d like to use. If you selected ‘RezoViz’ (a tool for constructing a network with organizations, individuals, and place names extracted from your texts), you would end up with a URL like this: http://voyant-tools.org/tool/RezoViz/?corpus=1394819798940.8347 The string of numbers is the corpus ID for your texts. If you know the name of another tool, you can type it in after /tool/ and before /?corpus.
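
The URL pattern described in this note can also be assembled in code, which is handy if you want links to several tools for the same corpus. The sketch below simply follows the /tool/NAME/?corpus=ID pattern quoted above; the function name `tool_url` is ours, and whether a given tool name or corpus ID still resolves on the live site is an assumption you would need to check.

```python
from urllib.parse import urlencode

BASE_URL = "http://voyant-tools.org"

def tool_url(tool, corpus_id, base_url=BASE_URL):
    """Build a Voyant URL of the form /tool/<tool>/?corpus=<corpus_id>."""
    query = urlencode({"corpus": corpus_id})
    return f"{base_url}/tool/{tool}/?{query}"

# The tool name and corpus ID are the ones quoted in the note above.
print(tool_url("RezoViz", "1394819798940.8347"))
```

Swapping ‘RezoViz’ for another tool name does in code what typing it between /tool/ and /?corpus does by hand.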

Source: http://www.themacroscope.org/?page_id=639