What’s Eric Reading?
Posted on July 4, 2014 in Uncategorized by Eric Lease Morgan
I have resurrected an application/system of files used to archive and disseminate things (mostly articles) I’ve been reading. I call it What’s Eric Reading? From the original About page:
I have been having fun recently indexing PDF files.
For the past six months or so I have been keeping the articles I’ve read in a pile, and I was rather amazed at the size of the pile. It was about a foot tall. When I read these articles I “actively” read them, meaning I write, scribble, highlight, and annotate the text with my own special notation denoting names, keywords, definitions, citations, quotations, list items, examples, etc. This active reading process: 1) makes for better comprehension on my part, and 2) makes it easier to review the articles and pick out the ideas I thought were salient. Being the librarian I am, I thought it might be cool (“kewl”) to make the articles into a collection. Thus, the beginnings of Highlights & Annotations: A Value-Added Reading List.
The techno-weenie process for creating and maintaining the content is something this community might find interesting:
- Print article and read it actively.
- Convert the printed article into a PDF file — complete with embedded OCR — with my handy-dandy ScanSnap scanner.
- Use MyLibrary to create metadata (author, title, date published, date read, note, keywords, facet/term combinations, local and remote URLs, etc.) describing the article.
- Save the PDF to my file system.
- Use pdftotext to extract the OCRed text from the PDF and index it along with the MyLibrary metadata using Solr.
- Provide a searchable/browsable user interface to the collection through a mod_perl module.
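The extract-and-index step in the list above can be sketched in a few lines of Python. This is only an illustration, not the code behind the actual system, which is built around MyLibrary and mod_perl. It assumes pdftotext is on the PATH, that the metadata has already been pulled out of MyLibrary as a plain dictionary, and that Solr accepts JSON at its update handler. The field names (id, fulltext, and so on), the function names, and the Solr URL are all hypothetical.

```python
import json
import subprocess
import urllib.request


def extract_text(pdf_path):
    """Run pdftotext and return the PDF's embedded (OCRed) text as a string."""
    # Passing "-" as the output file makes pdftotext write to standard output.
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def build_document(doc_id, metadata, fulltext):
    """Merge the metadata and the full text into one flat Solr document."""
    doc = dict(metadata)                # copy so the caller's dict is untouched
    doc["id"] = doc_id                  # hypothetical unique-key field
    doc["fulltext"] = fulltext.strip()  # hypothetical full-text field
    return doc


def index_document(doc, solr_url):
    """POST one document to Solr's JSON update handler and commit it."""
    request = urllib.request.Request(
        solr_url + "/update?commit=true",
        data=json.dumps([doc]).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

A call might look like index_document(build_document("article-1", metadata, extract_text("article-1.pdf")), "http://localhost:8983/solr/reading"), where the file name, the identifier, and the core name are placeholders. Keeping the document-building step separate from the two steps that touch the outside world makes it easy to check what will be sent to the index before anything is sent.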
Software is never done, and if it were then it would be called hardware. Accordingly, I know there are some things I need to do before I can truly deem the system version 1.0. At the same time my excitement is overflowing, and I thought I’d share some geekdom with my fellow hackers.
Fun with PDF files and open source software.