Find your team, and get to work(flows)

Posted on June 30, 2020 by Abigail Shelton

By Peggy Griesinger and Mikala Narlock

Digital collections are deceptive; when users turn to online platforms to access content, the amount of work required to provide uninterrupted access to digital materials is not always obvious or clear. One aspect of this work is taken care of by our software developers and product owners, who work to ensure that the interface and user experience are as simple and effective as possible to our end-users. While that aspect of the work is handled, other library staff and faculty must then take on the not insignificant job of digitizing, describing, and making accessible library, archive, and museum materials before they can be accessed online. While it is beyond the scope of this blog post to discuss the nuances and particulars of digital collections writ large, we will briefly describe how digital collections are produced at Hesburgh Libraries, why workflows are so crucial, and how we are focusing on sustainability to ensure the long-term success of this type of work.

The Workflow Team hard at work.

About Digital Collection Management at Hesburgh Libraries

At Hesburgh Libraries, the process of creating digital collections is managed by a team of ‘Case Managers,’ individuals responsible for overseeing projects and ensuring that collections are processed efficiently and following standardized processes. Established in 2018, this low-tech approach to workflow management is based on project management principles to ensure the timely completion of requests. The case manager provides guidance and support for digital collections, and serves as a liaison between units. As Case Managers, we are tasked with working with subject selectors to identify collections, triage requests, and determine how best to digitize the content. Using modular workflow components, such as “Send to Conservation,” or “Route to Metadata Services,” we can build custom workflows to ensure all collections receive the appropriate care without stalling in a bottleneck area. Additionally, Case Managers serve as the primary contact for all project participants, and keep everyone apprised of progress and hindrances. Every digital collection, ranging in complexity and size from one diary to hundreds of boxes, is assigned a Case Manager to ensure there is always formal project oversight and that collections are not left behind in the event of personnel change. This team of case managers is called the Digital Collections Oversight Team (DCOT).

The oversight provided by DCOT is crucial to avoiding stumbling blocks in the complex process of creating digital surrogates for collection materials. Potential stumbling blocks include, but are not limited to: the digitization itself; additional processing requirements, such as in the case of archival collections; physical conservation needed to ensure materials are not damaged in the digitization process; additional descriptive information requiring the intervention of experienced catalogers; copyright review; and other concerns unique to particular collections, such as a need to be able to batch-process catalog records or transport materials to a vendor. Every collection calls for a slightly different workflow, but, at the same time, each project provides more experience and practice for our team to more comprehensively understand digital collections workflows at the Libraries. Given efforts at Hesburgh Libraries to identify a Libraries-wide digital collections platform, we are aware that the workflows we develop and document will be crucial for effectively creating and delivering digital content in ongoing and future efforts, including the MARBLE site.

A soldier climbing a mountain of books, title reads Knowledge Wins

A recently added World War I-era poster proclaiming the importance of libraries .

Workflows for MARBLE

In keeping with the collaborative spirit of the MARBLE project, we have focused on simplifying our workflows across departments and aligning processes between the library and the museum. In reducing the number of discrepancies between the workflows, we can automate processes and simplify the burden of labor on Case Managers and other stakeholders. We started this effort by sketching out anticipated workflows, taking into account variables such as: the existence of descriptive metadata, whether an item was a library or archival material type, and the custodial department responsible for the item. During this process, it became apparent that, despite anticipated discrepancies in workflows, many of our procedures were already aligned, even if we sometimes used different terms to explain related concepts. Additionally, the work of our developers has allowed us to align our existing workflows in a way that will enable each unit to continue using its own best practices while still automating the process of content ingest into MARBLE in a consistent and standardized manner. Building this ability on top of our existing systems will allow us to continue using workflows that are effective for each unit while at the same time easing the process of conciliating diverse collections so they can be accessed on a shared platform like MARBLE.

Ensuring Sustainability

Workflows for creating and managing digital content are critical to the success of any digital library. We expect demand for digital facsimiles to increase as traffic to the MARBLE site continues to grow, especially during this period of heavy reliance on online classes and virtual instruction. Our team is looking to grow sustainably, and ensure that many people, particularly those embedded in custodial departments, are able to upload content to the MARBLE site without confusion, significant efforts, and bottlenecks. Our overall focus is on making use of existing expertise, staffing, and workflows while also ensuring those disparate pieces work well together and lead to solutions, like the MARBLE project, that make the cultural heritage of Notre Dame institutions more widely accessible.

Documenting Decisions to Build Buy-In

Posted on April 28, 2020 by Abigail Shelton

by Jeremy Friesen

Introduction

In any long-running project at Hesburgh Libraries, our developer teams make countless decisions every day. Some decisions are big and some are small. — some affect a few people while others have an impact on the entire organization.

Inevitably, these decisions evolve over time.

Sometimes we have to adjust or even reverse a decision after gathering more information or gaining experience with a tool or software. We don’t sweat the ebb and flow — we welcome it. Embracing decision-making as an evolutionary process is one of our guiding principles for a healthy team culture.

We also realize that decisions are only as good as the documentation and communication processes that underpin them.

Photograph of a horse from various angles

Documenting our decisions from every angle helps us understand where we’re going and why. Image: Eadweard Muybridge, “Eagle” Walking, Free, plate 576 from Animal Locomotion, 1845-1904, albumen silver print. The Janos Scholz collection of 19th century photography, Snite Museum of Art, University of Notre Dame, 1981.031.543.

To this end, we use consistent documentation and transparent communication to serve as a two-way roadmap for new challenges, team discussions, and retrospectives in the midst of a rapidly changing landscape.

These decision documents also help to facilitate conversations with stakeholders and build enduring relationships with project partners.

The cases below illustrate how decision documents and transparent communications during the MARBLE project have contributed to team success and project impact.

A tale of two documented decisions

One of the goals of the MARBLE software development project funded by the Andrew W. Mellon Foundation is to create a unified discovery for digitized cultural heritage collections held by the Snite Museum of Art and Hesburgh Libraries at the University of Notre Dame.

Our immediate aim is to make these objects discoverable in the context of a collaborative digital collections platform. We also surmised that we may want to someday make the digitized objects discoverable through our general library catalog.

Given these aspirations, we decided to leverage the library-wide discovery system as our search and discovery interface. library-wide discovery system is a vendor-supplied search index software used by many libraries around the world as their primary catalog.

On the surface, this decision ran contrary to another goal of our project: to develop and release open-source software.

To reconcile these apparent contradictions and keep our road map intact, I wrote a decision document supporting the use of our library-wide discovery system and how we would proceed with delivering open-source software. We clarified that we would use an API to interact with our search index. (An API, or Application Programming Interface, is a protocol or specification that allows information to transfer from one system to another. Using an API is a simple and common practice in developing new software.)

In this case, we viewed the decision to interact with an API as a way to support other institutions or potential adopters that don’t use our discovery system. In other words, our solution would be built in such a way that another institution could connect with their search index of choice.

We shared our draft decision with project leadership, stakeholders, and developer teams to solicit feedback. From the feedback, we amended the draft document to reflect any new considerations, questions, and challenges.

With a decision document firm in hand, we began working on implementing our solution. We gathered help from other library-wide discovery system adopters. (Thank you, Northwestern Libraries!). We dove deeper into our usage of library-wide discovery system, expanding our expertise and understanding of a technology we have long used.

Then we hit a wall.

Our user interviews identified full-text search as a key desired feature. According to library-wide discovery system documentation, this functionality should have worked. But, it didn’t, and we entered into a “waiting on vendor response” holding pattern.

While waiting, one of our developers explored ElasticSearch as another option.. After only a few afternoons of work and testing, ElasticSearch proved to be a viable alternative. Within a week, we referenced our documents. We reassessed our prior decision to leverage our library-wide discovery system and chose to pivot towards ElasticSearch.

Pivoting on a decision takes balance and flexibility. Image: Edgar Degas, Study of a Ballet Dancer, ca. 1880-1885, brown conte crayon and pink chalk on paper. Gift of John D. Reilly ND’63, ’64 B.S., Snite Museum of Art, University of Notre Dame, 2004.053.004.

Again, I wrote up a decision document outlining the rationale, process, and lessons learned. For example:

We found that ElasticSearch allowed us to implement the full-text search feature.
ElasticSearch also performed faster searches
There existed open-source ReactJS components for facet rendering, something we were going to need to create in our previous approach.
Since ElasticSearch is open-source, our own developers can work out bugs instead of waiting on a vendor.
Our decision to explore our existing library-wide discovery system also produced useful outcomes in that we have a deeper understanding of how to better leverage our library-wide discovery system in our current workflows.
The quick swap from one system to another confirmed for us that we have a robust architecture.
Finally, we have postponed the goal of ensuring that all campus cultural heritage content is in our library search index, but our software design will make this work easier going forward.

Amazon Web Services: Are you being serverless?

Another problem we encountered during the development of the MARBLE project was choosing an International Image Interoperability Framework (IIIF) compliant image server.

Early in the project, we chose to implement Cantaloupe from the list of known server options. With that decision documented and shared, we built blueprints to deploy our Cantaloupe instance into Amazon Web Services (AWS) as a Fargate container.

This worked to get us started.

However, as we added more and more images to Cantaloupe, we encountered problems such as spikes in response times, incidents of high error rates, numerous restarts. We soon discovered the root cause: Cantaloupe’s architecture conflicts with AWS’s Fargate container implementation.

Our options were to move to a more expensive AWS service or look for something else and a possible contender emerged.

Our colleagues at Northwestern University, David Schober and Michael Klein, presented “Building node-iiif: A performant, standards-compliant IIIF service in < 500 lines of code” at Open Repositories 2019. After a quick conversation, they pointed us to their implementation, a serverless service.

Learning from our community is crucial to the development process.
Image: Flemish, The Lawyer’s Office, after Marinus van Reymerswaele, 1535-1590, oil on cradled panel. Gift of Dr. and Mrs. Dudley B. Kean, Snite Museum of Art, University of Notre Dame, 1954.005.

As has become our practice, we documented a plan to experiment with the serverless implementation.

We kept Cantaloupe running for our pre-beta site, while we tested and expanded on Northwestern’s implementation.
On October 8th, we made the decision to move away from Cantaloupe.
On November 7th, we switched from using Cantaloupe to using Northwestern’s IIIF-Serverless in our pre-beta instance. This was done without downtime or disruption to our site.
Based on our findings we believe we’ll be able to reduce our image server costs by two orders of magnitude.

You can see our archived image-server repository and a snapshot of the blueprints to build this out in AWS. Here is the code commit that moved our blueprints from Cantaloupe to Serverless. You can also look at our documentation evaluating migrating from Cantaloupe to serverless.

Conclusion

The key takeaway is that it’s worth taking the time to document decisions and have consistent communications.

It’s true that not every decision necessitates thorough documentation. However, the decisions that require widespread buy-in, impact a key tool or process, or re-orient project goals deserve an organization-wide commitment to this evolving decision-making process.

For me, decision documents should identify the problem that needs to be solved and includes context, considerations, and constraints. Teams should build decision documents by seeking the input of those with a significant stake in this problem.

Because we have taken the time to document milestones and decisions, our project is modeling how to have a more robust memory of a particular problem and attempted solutions. We are able to be visionary and more agile as we create solutions to meet stakeholder needs.

Simply said, decision documents make all the difference.

And, as a bonus, it was much easier to write this blog post. So, go forth and document!

The MARBLE team on the road

Posted on March 27, 2020 by Abigail Shelton

Last fall, the Snite Museum staff working on the MARBLE project presented at two of the biggest conferences in their fields: the Museum Computing Network (MCN) and the Association of Registrars and Collections Specialists (ARCS). What follows are their key takeaways from each conference as well as links to their presentation materials.

Museum Computing Network 2019
Abby Shelton

From November 5-8, 2019, hundreds of museum technologists converged on San Diego, California for the Museum Computing Network conference.

It seemed to me that attendees and panelists displayed an increased sense of fatigue towards emerging technology. The deep unease with some of the ways that GLAM institutions have used technology without thinking about important ethical considerations was palpable in conference presentations, Twitter conversations, and break-time discussions.

There was also an emphasis on nurturing a culture of diversity and inclusion, not only in organizational structures but in collections metadata, copyright policy, and social media content.

Three of the best sessions of the conference were the plenary panels held with leaders in the museum field to discuss pressing issues. Microsoft sponsored the first panel on emerging technology with representatives from the Cleveland Museum of Art, MIT Museum, and Cooper Hewitt. Panelists began by addressing the question of how to approach emerging technology, particularly in a situation where they would be called upon to advise museum leadership. Several suggested that when it comes to emerging technology, like artificial intelligence or virtual reality application, GLAM institutions don’t ask “why?” enough. As someone who thinks about user needs a lot, this mindset resonates. We should be able to provide a use case for every project that our institutions undertake. Instead of jumping into something because it’s flashy or new or proposed by a donor usually makes for a less-than-compelling visitor experience.

The other plenary panel that struck home for me was the Thursday morning discussion on data ethics, privacy, and security. As someone who works in a collections data adjacent role, I know that we have policy guidelines for some of our sensitive information: gift agreements from anonymous donors, insurance valuations, storage locations. But, what struck me about this panel was how little we have trained museum staff in data security and privacy best practices. I appreciated hearing about the Denver Art Museum’s data on-boarding for new staff and the annual re-training of staff on how to safely store and classify information. The panel made me wonder if academic museums in particular rely heavily on their parent institutions to train staff on data collection and handling, rather than educating staff about particular record-keeping best practices for museums.

Another major theme highlighted by several presentations was open access or the open GLAM movement. Two panels on domestic and international copyright law and one featuring institutions that have implemented open access policies underscored the importance of open access for community engagement and involvement in making and remaking cultural heritage collections. While it was exciting to hear about the open access work that the Yale University museums, the Getty Research Institute, and other large GLAM institutions are doing, the sessions certainly prompted questions of how open access could be feasible for small institutions. In my conversations with faculty and students at our campus, the need for reusable images is strong but we often have to balance that need against our capacity for copyright review.

Our MCN panel, l-r: Juliet Vinegra (Philadelphia Museum of Art), Katrina Wratschko (Philadelphia Museum of Art), Adrienne Figus (Smith College), Jessica Breiman (University of Utah), and Abby Shelton (Snite Museum of Art, MARBLE).
Photo credit: Laura Shea, Mount Holyoke College Art Museum

Association of Registrars and Collections Specialists 2019
Hanna Bertoldi

The 2019 Association of Registrars and Collections Specialists (ARCS) Conference was held in Philadelphia during the first week of November. The keynote, Joan Baldwin, kicked off the conference talking about leadership in museums. She spoke about problems in the museum industry that were not surprising to anyone in the room—lack of diversity, salary disparity, sexual harassment, and inflexibility to change. Baldwin concluded her talk by urging everyone in the room to be the change that they wish to see. Although she was talking about leadership, I think that this concept can apply to collections data as well. Throughout the conference, I was attracted to projects that used collections data to leverage change in their organization.

The San Francisco Arts Commission (SFAC) staff—in conjunction with the data scientists at Data-Science-SF—used their collections data to write a ten-year master plan for prioritizing care and detailing costs for maintaining their permanent collection. They used this plan to successfully advocate for more funding from the Capital Division. By presenting their collections data in a way that their funding organization understood, they were able to show how deferred maintenance on cultural assets results in greater future expenses.

In the case of SFAC, the project started with an inventory. The staff completed a wall-to-wall inventory to gather data for their cost analysis. This was a common point made during other presentations as well. Parks Canada inventoried objects while moving a collection of over 2 million items into an existing collections facility an hour away. Due to time constraints, this was completed with only the essential information, and further reconciliation was planned for the future. This method of inventory was not their preferred method, but was a necessary solution because of the backlog of bad information in the database.

I was struck by how many projects would have been smoother if they could have started with good data. In the first year and a half of the MARBLE project, one of the lessons that I have learned is the importance of an inventory, a notion supported in the book, Managing Previously Unmanaged Collections: A Practical Guide for Museums (2016) by Angela Kipp. An inventory creates a solid foundation on which a data clean-up project can firmly stand. My position as the Database Collections Coordinator presents a unique opportunity to leverage collections information, or lack thereof, to raise awareness about current needs of the organization and inspire a response. This is a small way that I can be a leader.

MARBLE Team Conference Presentations

In addition to attending these conferences, museum and library colleagues presented alongside peers in the cultural heritage field. At the Museum Computing Network, Abby Shelton joined Jessica Breiman (University of Utah), Adrienne Figus (Smith College), Katrina Wratschko (Philadelphia Museum of Art), and Juliet Vinegra (Philadelphia Museum of Art) to discuss collaboration between archives, libraries, and museum on digital projects. The panelists each presented from a unique institutional perspective on how to plan for an achievable project scope, how to sustain collaboration, and how to plan for post-grant work. You can find the slides and speaker notes linked below.

Museum Computing Network 2019 MARBLE slides [linked to the Philadelphia Museum of Art website]

Hanna Bertoldi, Jeremy Friesen, and Victoria Perdomo, all from the MARBLE team, presented on the project at the Association of Registrars and Collections Specialists (ARCS) conference. Their panel focused on the Snite Museum’s effort to remediate collections metadata and Hesburgh Libraries’ development work on the MARBLE technical infrastructure. While there was no time for questions during the conference, their panel has engendered a wonderful post-conference discussion among registrars who have reached out to the presenters.

Association of Registrars and Collections Specialists 2019 slides [linked to the ARCS website]

Student Profiles: Liam Maher assists with exhibition history records for the Snite

Posted on November 26, 2019 by Abigail Shelton

By Liam Maher

This is the first post in a series written by students who have worked on various parts of the MARBLE project.

Liam Maher, an alumnus of Notre Dame and current graduate student at the University of Oregon, assisted in the clean-up of exhibition history records for the Snite Museum of Art in preparation for publishing the collections online in the MARBLE platform. Liam’s process included verifying the paper exhibition files against digital records in the museum’s internal collection database while also consulting Hesburgh Libraries’ resources and publications.

Metadata is to museums as the skeleton is to the human form

When I tell people I do metadata cleanup for an art museum, I am usually met with quizzical looks. Few people think of computer systems, let alone organizing them, as integral to the functioning of an art museum.

A museum’s metadata, or information about its objects, is akin to the human skeleton—it gives form to the “body” of the museum and the many things contained therein. Like a misplaced or broken bone, messy data can result in lots of complications. And, inaccurate metadata, such as an incorrect artist attribution, improper dimensions, or wrong media, can spell disaster for an object and its maintenance.

Exhibition files staged for the summer project. Messy metadata comes in paper form too!

Nine decades of records reveal strong local, national and international connections

Over the summer, I worked with metadata that chronicles the history of exhibitions held at or organized by the Snite Museum of Art. The records span over a nearly ninety-year period, going back to the days when the Snite Museum of Art was simply known as the “Art Gallery.”

I digitized roughly 214 exhibitions from the Snite’s files. As I reviewed the files, it became abundantly clear that the Snite has been a valuable asset for the community, with connections to the Metropolitan Museum of Art, Museum of Modern Art, High Museum, Art Institute of Chicago, and the Louvre that go back as far as 1950.

Such ties have brought priceless works by artists of international acclaim to South Bend. The Snite has and continues to serve as a cultural bridge from Notre Dame to the global art community, providing access to some of the world’s greatest treasures.

Online collections expand global access and impact

The Snite hopes to continue this tradition by increasing its reach through its online presence.

Updated metadata for exhibitions and objects will enable the Snite to launch an online collections site. New features will allow visitors to view the collections more in-depth and immerse themselves in exhibitions from days-gone-by.

Both scholars and casual visitors to the Snite will appreciate being able to more fully understand the institution’s history and the part it has played in the global art scene.

Moving files at the Snite Museum…Messy metadata everywhere!

Looking back. Looking ahead.

When I first started this position, it was easy to get lost in the seemingly endless stacks of files, mysterious cabinets full of unsorted papers, and loose sheets with scribbled notes that somehow corresponded with one of the 900 exhibitions on file at the Snite.

Keeping our team’s long-term goal in mind, however, has been a helpful means for understanding the importance of digital scholarship.

Every digitized collection requires months of planning and development, usable and flexible metadata, and meticulous editing and updating information. The opportunity to participate in this process has given me valuable insight into how museums stay relevant in an increasingly digital age.

Maher presents his work to library and museum colleagues at a July 2019 forum.

Metadata connections

Posted on October 29, 2019 by Abigail Shelton

By Hanna Bertoldi

What is metadata?

I am constantly thinking about metadata, or the data that describes other data.

While most people may not often think about metadata, everyone who uses search tools benefits from clean, well-structured metadata.

Think about a music file. Its metadata might contain the artist’s name, song title, album title, and year it was released. Spotify uses this metadata to help you find the songs you want and also to suggest music you might like. Some of these terms may be hidden from you, such as terms that describe the mood or genre of a song, but they all work in the background to create easily browsable categories.

Figure 1: Spotify browse page

Museums and metadata

Like Spotify, museums also create and use metadata, but, instead of music, the metadata describes pieces of art.

As its best self, museum metadata has the potential to bridge gaps between cultural heritage institutions and provide access beyond virtual and physical barriers. This unbounded potential can only be leveraged by using accurate and standardized metadata.

But, when I think about museum metadata, I mostly think about how messy it is. (Apparently, Spotify thinks about this too). In the screenshot below, the fields “Creation Date” and “Century” reveal some examples of data inconsistency and redundancy. For instance, you can see that at one time items without an identifiable creation date were marked with “no date” whereas others were marked with the abbreviation “n.d.” In addition, the same date is repeated in both the “Creation Date” and “Century” fields for many of these objects. These kinds of metadata errors can prevent database users from finding objects that meet their search criteria.

Screenshot of database page showing lists of dates

Figure 2: Screenshot of creation date export from EmbARK, the museum’s collections management system

Organizing the “mess” with controlled vocabularies

Controlled vocabularies are a critical element of creating consistent metadata. A controlled vocabulary is an organized arrangement of words and phrases used to index content and to retrieve content by browsing and searching.

If we wanted to build an application where you could search every artwork in the world, every arts organization or collector would have to agree to use the same vocabulary, or a “controlled vocabulary,” to describe their objects. We couldn’t have the Getty Museum using “Claude Monet” while the Louvre was using “Oscar-Claude Monet” to describe the artist that painted water lilies in the nineteenth century. They would have to agree on the same version of the artist’s name.

Beginning in the 1980s, the Getty Research Institute started developing controlled vocabularies for cataloging visual arts and cultural heritage. The terms and associated information in the Getty Vocabularies are valued as authoritative because they are derived from published sources and represent current research and usage in art history and cultural heritage communities.

In short, the Getty Vocabularies provide a convenient system for mapping data, without spending a lot of wasted time reinventing what already exists.

Database screenshot showing list of keywords

Figure 3: Searching EmbARK, the museum’s collections management system, by subject terms

Applying the Getty Vocabularies to Snite Museum collections

Now, back to the messy data part.

The Snite Museum of Art has been thinking about subject retrieval for quite a while. Perhaps as early as 2010, Snite curators began cataloging works of art with subject terms or keywords. This metadata allows curators to search for things about an object, such as who or what is depicted. Unfortunately, these terms were used without standardization.

Building off of the curators’ previous work, my goal as Collections Database Coordinator was to transform these subject terms or keywords so that they fit within the Getty Vocabularies. On a superficial level, this process has helped to clean up things such as errors in spelling and inconsistencies with formatting and style. On a deeper level, the process created relationships among objects and increased discoverability.

Photograph of man pulling woman in a rickshaw

Figure 4: Unidentified photographer, Geisha in a Ricksha, Japan, ca. 1880-1890, albumen silver print with applied color. Snite Museum of Art, University of Notre Dame. Acquired with funds provided by Robert E. (ND ’63) and Beverly (SMC ‘63) O’Grady, 2008.054.002.

Such immediate improvements in discoverability are a result of the way Getty Vocabularies are structured: as hierarchical thesauri.

Getty Vocabularies are hierarchical to allow similar objects to be grouped together. For example, you want to discover all of the works in your collection that depict transportation.

You would search the term “vehicles” and return all objects that use the word “vehicles”
You would also return any objects with more specific terms that are organized underneath the term “vehicles” such as “milk carts,” “fire engines,” “wheelbarrows,” etc.

With the hierarchical organization, you would not have to think of or know all the various subsets of vehicles.

Getty Vocabularies are also thesauri because they connect variable terms that express the same concept. It doesn’t matter if you search for “ricksha,” “rickshaws,” or the French pousse-pousse — the computer will always know what you mean.

By leveraging the research that Getty has done, I can make the Snite collections more usable to a diverse audience.

Contributing to Getty Vocabularies to pave the way for MARBLE

Creating a thesaurus for the entirety of human knowledge is a big task, so Getty relies on contributions from the user community to expand their vocabularies.

I’ve come across many terms that weren’t part of the Getty Vocabularies in the Snite Museum’s database. These terms could have stayed “local,” which would have made the information only available to our staff. Our purpose, however, is to make the Museum’s collections searchable alongside the University Archives, Rare Books & Special Collections, and other cultural heritage materials on Notre Dame’s campus.

To make sure our collections are cataloged in a way that computers can understand, I needed to add these missing terms to the Getty Vocabularies.

Screenshot of Getty vocabulary entry for carrots

Figure 5: Record for carrots created from my submission

Using the Snite’s collections, I have contributed over 80 terms to the Getty Vocabularies within the past year. The terms are reviewed and published so that they can be used by others. Terms like these will be a core piece of the MARBLE website so that users can search across different kinds of collections through a single search portal.

MARBLE: Combining controlled vocabularies

The Getty Vocabularies are just one example of a controlled vocabulary. Other vocabularies are appropriate for cataloguing different kinds of materials, such as the Library of Congress subject terms for print materials or the National Institute for Health’s Medical Subject Headings for biomedical information. Although we don’t have many biomedical items slated for inclusion in the MARBLE site, we will need to accommodate the different controlled vocabularies used by art museum, library, and archival catalogers.

Our Metadata Team is currently working on a solution for combining the authoritative vocabularies used by the Hesburgh Libraries and the Snite Museum. We’ve considered everything from enforcing a singular controlled vocabulary across collections to experimenting with Linked Open Data solutions that would allow us to combine different vocabularies using emerging web technologies. Cleaning up the Snite’s subject terms is just the first step towards revealing metadata’s best self. Stay tuned for future posts on our progress.

While our immediate aim is to improve discoverability, our future aim is to provide some of the same services as Spotify. Users may not be aware that controlled vocabularies are working to suggest artworks or texts that might be of interest to them. My hope is that specific metadata clean up will seamlessly improve searching and browsing on a large scale.