LexisNexis hacks
Posted on December 18, 2017 in Uncategorized by Eric Lease Morgan
This blog posting briefly describes and makes available two Python scripts I call my LexisNexis hacks.
The purpose of the scripts is to enable the researcher to reformulate LexisNexis full text downloads into tabular form. To accomplish this goal, the researcher is expected to first search LexisNexis for items of interest, and then to do a full text download of the results as a plain text file. An example download of about five records ought to be attached. The first of my scripts, results2files.py, parses the search results into individual records. The second script, files2table.py, reads the output of the first script and parses each file into individual but selected fields. The output of the second script is a tab-delimited file suitable for further analysis in any number of applications.
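To make the two steps above concrete, here is a minimal sketch of the same idea in Python. It is not the code of results2files.py or files2table.py, and it rests on two assumptions about the download that you should check against your own file: that each record starts with a line like "1 of 5 DOCUMENTS", and that fields appear as an uppercase name followed by a colon at the start of a line. The function names and both patterns are hypothetical.

```python
import csv
import io
import re

# Assumption: each record begins with a line such as "1 of 5 DOCUMENTS".
RECORD_DELIMITER = re.compile(r'^[ \t]*\d+ of \d+ DOCUMENTS[ \t]*$', re.MULTILINE)

# Assumption: a field looks like "NAME: value" at the start of a line.
FIELD = re.compile(r'^([A-Z][A-Z-]*):[ \t]*(.+)$', re.MULTILINE)


def split_records(text):
    """Split a full text download into a list of individual record strings."""
    parts = RECORD_DELIMITER.split(text)
    return [part.strip() for part in parts if part.strip()]


def extract_fields(record, names):
    """Return the selected fields of one record, in the order given.

    A field missing from the record becomes an empty string; if a field
    name occurs more than once, the last occurrence wins.
    """
    found = {name: value.strip() for name, value in FIELD.findall(record)}
    return [found.get(name, '') for name in names]


def records_to_table(records, names):
    """Render the records as tab-delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
    writer.writerow(names)
    for record in records:
        writer.writerow(extract_fields(record, names))
    return buffer.getvalue()
```

In this sketch the list of field names is a parameter, so asking for a different set of columns means changing the list passed to records_to_table rather than editing the parsing code.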
These two scripts can work for a number of people, but there are a few caveats. First, results2files.py saves its results as a set of files with randomly generated file names. It is possible, albeit unlikely, that files will get overwritten because the same randomly generated file name was… generated. Second, the output of files2table.py only includes fields required for a specific research question. It is left up to the reader to edit files2table.py for additional fields.
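If the first caveat worries you, one way around it is to refuse to overwrite. The sketch below is not how results2files.py works; it is a hypothetical save_record function that names each file with a random UUID and opens it in Python's exclusive-creation mode ('x'), which raises an error instead of replacing an existing file. On the off chance of a collision, it simply tries another name.

```python
import os
import uuid


def save_record(record, directory):
    """Write one record to a new, uniquely named file and return its path."""
    os.makedirs(directory, exist_ok=True)
    while True:
        path = os.path.join(directory, uuid.uuid4().hex + '.txt')
        try:
            # Mode 'x' fails rather than overwrites if the file already exists.
            with open(path, 'x', encoding='utf-8') as handle:
                handle.write(record)
            return path
        except FileExistsError:
            # Extremely unlikely; loop around and pick another name.
            continue
```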
In short, your mileage may vary.