Transforming Metadata into Linked Data

Bahnemann, Greta, Michael Carroll, Paul Clough, Mario Einaudi, Chatham Ewing, Jeff Mixter, Jason Roy, Holly Tomren, Bruce Washburn, and Elliot Williams. 2021. Transforming Metadata into Linked Data to Improve Digital Collection Discoverability: A CONTENTdm Pilot Project. Dublin, OH: OCLC Research. https://doi.org/10.25333/fzcv-0851.

This newly released report provides insights from OCLC’s CONTENTdm Linked Data Pilot project, with participation from five institutional libraries: The Huntington Library, the Cleveland Public Library, the Minnesota Digital Library, Temple University Libraries, and the University of Miami Libraries.

In a promotional video associated with the project, Mario Einaudi, Digital Projects Librarian at the Huntington, described the value of doing this type of work with linked data: “Things do not exist in a vacuum. They are connected and interconnected in ways that sometimes we don't know or don't see. Having linked data allows for that connection, allows for that context to be formed, allows you to see things in new ways that you may not have thought of.”

Findings from the OCLC pilot suggest “that there is significant potential for improved discovery and more efficient data management when the materials that have been digitized are described using a shared data model, where headings are associated with linked data entities and relationships and when the entities and relationships are brought together into a single aggregation.” Doing so, however, requires “substantial and shared resource commitments from a decentralized community of practitioners” using well-designed data transformation tools and decentralized workflows.

Among the conclusions from the report: 

“The project confirmed key aspects of the linked data value proposition, that cultural material discovery and data management can be significantly improved when the materials are described using a shared and extensible data model, when metadata string-based headings are transformed to linked data entities and relationships, and when those entities and relationships are brought together into a single discovery system. In this environment, the technology works in service to both the staff, who can more easily and accurately impart the expertise they have about the collections they steward, and to the researcher, who can see more robust connections between— and context about—the cultural materials that make up CONTENTdm collections.”  

“Several of the prototype applications developed during the pilot point the way to advantageous additions to the CONTENTdm toolkit. In particular, the Image Annotator encourages domain experts to enrich material descriptions, and the Field Analyzer helps CONTENTdm users make sense of the variations in field definitions and uses across their collections (a prerequisite for more holistic data rationalization and transformation).”

The full text of the 75-page report may be downloaded at https://doi.org/10.25333/fzcv-0851.