The Distant Reader Workbook
Posted on January 31, 2020 in Distant Reader by Eric Lease Morgan
I am in the process of writing a Distant Reader workbook, which will make its debut at a Code4Lib preconference workshop in March. Below are the “finished” introduction and table of contents.
Hands-on with the Distant Reader: A Workbook
This workbook outlines sets of hands-on exercises surrounding a computer system called the Distant Reader (https://distantreader.org).
By going through the workbook, you will become familiar with the problems the Distant Reader is designed to address, how to submit content to the Reader, how to download the results (affectionately called “study carrels”), and how to interpret them. The bulk of the workbook is about the latter. Interpretation can be as simple as reading a narrative report in your Web browser, as complex as doing machine learning, or anything in between.
You will need to bring very little to the workbook in order to get very much out. At the very least, you will need a computer with a Web browser and an Internet connection. A text editor such as Notepad++ for Windows or BBEdit for Macintosh will come in very handy, but a word processor of any type will do in a pinch. You will want some sort of spreadsheet application for reading tabular data, and Microsoft Excel or Macintosh Numbers will both work quite well. All the other applications used in the workbook are freely available for downloading and cross-platform in nature. You may need to install a Java virtual machine in order to use some of them, but Java is probably already installed on your computer.
I hope you enjoy using the Distant Reader. It helps me use and understand large volumes of text quickly and easily.
Table of contents
I. What is the Distant Reader, and why should I care?
   A. The Distant Reader is a tool for reading
   B. How it works
   C. What it does
II. Five different types of input
   A. Introduction
   B. A file
   C. A URL
   D. A list of URLs
   E. A zip file
   F. A zip file with a companion CSV file
   G. Summary
III. Submitting "experiments" and downloading "study carrels"
IV. An introduction to study carrels
V. The structured data of study carrels; taking inventory through the manifest
VI. Using combinations of desktop tools to analyze the data
   A. Introduction - The three essential types of desktop tools
   B. Text editors
   C. Spreadsheet/database applications
   D. Analysis applications
      i. Wordle and Wordle recipes
      ii. AntConc and AntConc recipes
      iii. Excel and Excel recipes
      iv. OpenRefine and OpenRefine recipes
      v. Topic Modeling Tool and Tool recipes
VII. Using command-line tools to dig even deeper
VIII. Summary/conclusion
IX. About the author
As per usual these days, the “code” is available on GitHub.