An experiment in writing in public, one page at a time, by S. Graham, I. Milligan, & S. Weingart

Compiling several text files into a single CSV file

There may be times when you have a folder of text files that you wish to merge into a single CSV file. For instance, you might wish to use the Stanford Topic Modeling Tool on your data; this tool requires a CSV file to import. Here is a simple script that writes each text file to a new line of a CSV file. We found this script on the question-and-answer forum Stack Overflow and have modified it slightly.[1] (N.B.: this only works on Windows. We're looking for Mac and Linux options.)

Open your preferred text editor (we like Notepad++). Type in the following:

Save it with the file name text-to-csv.bat. The .bat extension is important: it tells Windows that this is a batch file, meaning it is executable. Save the file in the folder where your text files are located. Before you double-click on it to run it, open one of your text files and examine it. How many line breaks does it contain? If it has three paragraphs, this script will only grab the first two. In this line:

echo %%f, !line1!, !line2!, >> result.csv

add !line3!, just before the >> markers, so that the line ends with !line2!, !line3!, >> result.csv. When you run the script, it will create a CSV file in which the first column is the original text file's name, the second column holds the contents of line 1 (everything up to the first line break), the third column holds the contents of line 2 (everything up to the second line break), and so on.
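For readers on Mac or Linux, one possible alternative is a short Python script that produces the same layout: file name in the first column, then one column per line of text. This is a minimal sketch under stated assumptions, not a port of the batch script above. The function name texts_to_csv, the UTF-8 encoding, and the choice to skip blank lines are all assumptions made for illustration. Unlike the batch version, it writes every line it finds rather than a fixed number of them, and the csv module quotes any cell that itself contains a comma.

```python
import csv
from pathlib import Path


def texts_to_csv(folder, out_name="result.csv"):
    """Write one CSV row per .txt file in `folder`: file name, then each line.

    `texts_to_csv` is a hypothetical helper name, not part of the original script.
    """
    folder = Path(folder)
    out_path = folder / out_name
    # Sort so the row order is the same from run to run.
    txt_files = sorted(folder.glob("*.txt"))
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        for txt in txt_files:
            # Assumption: the files are UTF-8; change the encoding if yours differ.
            lines = txt.read_text(encoding="utf-8").splitlines()
            # Assumption: blank lines between paragraphs should not become empty columns.
            cells = [line.strip() for line in lines if line.strip()]
            writer.writerow([txt.name] + cells)
    return out_path
```

To try it, save the code in a file, then call texts_to_csv(".") from a Python session started in the folder that holds your text files. The result.csv file is written to that same folder.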

It's a handy little script. It can be downloaded from GitHub Gist.[2]

  1. Posed by the user 'Dynamite Media'. http://stackoverflow.com/questions/20963773/reading-lines-from-txt-file-into-csv
  2. https://gist.github.com/shawngraham/9120038

Source: http://www.themacroscope.org/?page_id=437