Three data webinars

Between Monday, November 8 and Thursday, November 11 I participated in three data webinars, a subset of a larger series facilitated by the ICPSR. This posting outlines what I learned from them.

Data Management Plans

The first was called “Data Management Plans” and was presented by Katherine McNeill (MIT). She gave the briefest of histories of data sharing and noted the ICPSR has been doing this since 1962. With the advent of the recent National Science Foundation announcement requiring data curation plans, the interest in curation has become keen, especially in the sciences. The National Institutes of Health has had a similar mandate for grants over $250,000. Many of these mandates only specify the “what” when it comes to a plan, and not necessarily the “how”. This is slightly different from the United Kingdom’s way of doing things.

After evaluating a number of plans from a number of places, McNeill identified a set of core issues common to many of them:

  • a description of the project and data
  • standards to be applied
  • short-term storage specifications
  • legal and ethical issues
  • access policies and provisions
  • long-term archiving stipulations
  • funder-specific requirements

How do and/or will libraries support data curation? She answered this question by listing a number of possibilities:

  • instituting an interdisciplinary librarian model
  • creating a dedicated data center
  • getting any (all) librarians up to speed
  • having the scholarly communications librarian lead the efforts
  • creating partnerships with other campus departments
  • participating in a national data service
  • getting funder support
  • offering activities through the local office of research
  • doing more inter-university collaborations
  • providing services through professional societies

Somewhere along the line McNeill advocated reading ICPSR’s “Guidelines for Effective Data Management Plans”, which outlines the elements of data plans and includes a number of examples.

America’s Most Wanted

The second webinar was “America’s Most Wanted: Top US Government Data Resources” presented by Lynda Kellam (The University of North Carolina at Greensboro). Kellam is a data librarian, and this session was akin to a bibliographic instruction session where a number of government data sources were described:

  • Data.gov – has a lot of data from the Environmental Protection Agency; works a lot like ICPSR; includes “chatter” around data; includes “cool” preview function
  • Geospatial One Stop – a geographic information system portal with a lot of metadata; good for tracking down sources with a geographic interface
  • FactFinder – a demographic portal for commerce and census data; will include in the near future a more interactive interface
  • United States Bureau of Labor Statistics – lot o’ labor statistics
  • National Center for Education Statistics – includes demographics for school statistics and provides analysis online
  • DataFerrett – provides you with an applet to download, run, and use to analyze data

Students Analyzing Data

The final webinar I listened to was “Students Analyzing Data in the Large Lecture Class: Active Learning with SDA Online Analysis” by Jim Oberly (University of Wisconsin-Eau Claire). [5] As a historian, Oberly is interested in making history come alive for his students. To do this, he uses ICPSR’s Analyze Data Online service, and this webinar demonstrated how. He began by asking questions about the Civil War such as “For economic reasons, would the institution of slavery have died out naturally, and therefore the Civil War would have been unnecessary?” Second, he identified a data set (New Orleans Slave Sale Sample, 1804-1862) from the ICPSR containing information on the sale of slaves. Finally, he used ICPSR’s online interface to query the data looking for trends in prices. In the end, I believe he was not so sure the War could have been avoided because the prices of slaves seemed unaffected by the political environment. The demonstration was fascinating, and the interface seemingly easy to use.
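For readers who would rather work offline than through a web interface, the "look for trends in prices" step amounts to grouping records by year and averaging. What follows is only a minimal sketch of that idea, not Oberly's method nor ICPSR's interface. The field names ("year", "price") are my assumptions, and the rows are invented placeholders, not values from the ICPSR data set.

```python
from collections import defaultdict
from statistics import mean

# Placeholder rows invented purely for illustration; they are NOT taken
# from the ICPSR data set. The field names "year" and "price" are
# assumptions as well.
sales = [
    {"year": 1850, "price": 100.0},
    {"year": 1850, "price": 120.0},
    {"year": 1855, "price": 150.0},
    {"year": 1860, "price": 170.0},
    {"year": 1860, "price": 190.0},
]


def mean_price_by_year(rows):
    """Group rows by year and return {year: average price}, ordered by year."""
    prices = defaultdict(list)
    for row in rows:
        prices[row["year"]].append(row["price"])
    return {year: mean(values) for year, values in sorted(prices.items())}


trend = mean_price_by_year(sales)
for year, avg in trend.items():
    print(year, round(avg, 2))
```

With a real extract one would load the rows from a downloaded file instead of typing them in, and a flat or rising series of yearly averages would be the sort of pattern examined in the webinar.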

Summary

Based on these webinars it is an understatement to say the area of data is wide, varied, broad, and deep. Much of Library Land is steeped in books, but in the current environment books are only one of many manifestations of data, information, and knowledge. The profession is still grappling with every aspect of raw data. From its definition to its curation. From its organization to its use. From its politics to its economics.

I especially enjoyed seeing how data is being used online. Such is a growing trend, I believe, and represents an opportunity for the profession. The finding and acquisition of data sets is somewhat of a problem now, but it will become less of a problem later. The bigger problem is learning how to use and understand the data. If the profession integrates functions for data’s use and understanding into its systems, then libraries have a growing responsibility. If the profession only seeks to enable find and access, then the opportunities are limited and short-lived. Find and access are things we know how to do. Use and understanding require an adjustment of our skills, resources, and expertise. Are we up to the challenge?
