The centuries-old tradition of the library’s mission continues to resonate in the information profession, even amid today’s fast-moving development of mobile technology.1 This mission is indisputably integral to the user’s research experience.
In the last two decades, information professionals have been under pressure to remain relevant in the world of web data.2 Information professionals, in particular those who provide bibliographic description, have had to rethink and retrain themselves in the face of a new data service model for the records that they create and curate.
Library communities initiated several projects that attempted to respond to the shifting information landscape and remain relevant to their mission.3 On May 13, 2011, the Library of Congress (LC) issued a statement on transforming the bibliographic framework.4 Zepheira5 was engaged to spearhead the process of rethinking bibliographic control beyond the MARC communication format in a way that could extend to a wider bibliographic framework—content agnostic,6 and able to support traditional bibliographic, authority, and holdings data, in addition to aligning them with services that go beyond traditional information structures, both physical and virtual. For practitioners—in this case, cataloging professionals—to begin working in this new environment, a change in their understanding of the anatomy of a record must occur. A record consists of various components—author, title, publisher, physical description, etc. To think and work with each component as data instead of text strings is the basis of the revolution. Data can be recognized by machine methods, and connections between data can be made among any resources containing an identifier. These data can be organized or regarded as an assertion or a set of assertions about a resource. These assertions state a named relationship between resources.
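The shift from text strings to assertions can be sketched in a few lines of Python. This is only an illustration of the idea, not BIBFRAME itself: the `example.org` identifiers and the relationship names (`title`, `creator`, `name`) are hypothetical placeholders rather than terms from any published vocabulary.

```python
from typing import NamedTuple


class Assertion(NamedTuple):
    """One statement: a named relationship between a resource and a value."""
    subject: str    # identifier of the resource being described
    predicate: str  # name of the relationship (hypothetical names here)
    obj: str        # another resource's identifier, or a plain text value


# Hypothetical identifiers, for illustration only.
BOOK = "http://example.org/work/1"
PERSON = "http://example.org/person/1"

assertions = [
    Assertion(BOOK, "title", "An Example Title"),
    Assertion(BOOK, "creator", PERSON),  # a link to a resource, not a text string
    Assertion(PERSON, "name", "Example, Author"),
]


def about(resource: str, statements: list[Assertion]) -> list[Assertion]:
    """Return every assertion whose subject is the given resource."""
    return [s for s in statements if s.subject == resource]


def follow(resource: str, predicate: str, statements: list[Assertion]) -> list[str]:
    """Follow a named relationship from a resource to its values."""
    return [s.obj for s in statements
            if s.subject == resource and s.predicate == predicate]


# Because the creator is an identifier, a program can hop from book to person.
creator_ids = follow(BOOK, "creator", assertions)
creator_names = [name for cid in creator_ids
                 for name in follow(cid, "name", assertions)]
print(creator_names)
```

Storing the creator as an identifier rather than as a name string is what lets any other dataset that uses the same identifier connect to this description.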
BIBFRAME (Bibliographic Framework) seeks to serve as the foundation for the future of bibliographic description. Because its approach to system platforms is agnostic, it hopes to integrate with the wider information community beyond libraries and organizations. The data model employs a linked data conceptual design and language (RDF/XML7) that is common for web architecture. A framework consisting of a web of data will leverage the web as an architecture that allows the assembly and reassembly of data defined in higher or granular levels. This model enables the integration of existing bibliographic standards and provides a roadmap toward the development of alternative approaches to information service. The structure organizes data in the following classes: Creative Work, Instance, Authority, and Annotation.
The relationship model is based on FRBR (Functional Requirements for Bibliographic Records)8 and RDA (Resource Description and Access)9 elements. It is expressed as entities with properties and attributes that assert relationships between two linked resources: a person, family, corporate body, concept, place, and so on. Each assertion conveys a meaningful interpretation. Figures 1 and 2 illustrate, via an application profile, the concept of entity in the FRBR/RDA environment, and a possible alignment to BIBFRAME classes.10
Serialization of the BIBFRAME RDF model is not locked to a single format, so the modeling does not impede communication and interoperability of the data. Several serializations were put in place for demonstration, such as RDF/XML, Turtle,11 and N-Triples12 in the hope that data points can connect seamlessly. Thus the model design is optimized to serve as a network hub that advances data analytics and transforms research, simply because it makes interconnections among things commonplace.
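To make the point about serialization concrete, the sketch below writes the same two statements in N-Triples style and in Turtle style. It is a deliberately simplified, hand-written illustration and not a complete implementation of either syntax: it only escapes backslashes and double quotes in literals, and it only abbreviates URIs whose local part is alphanumeric. The `example.org` namespaces and the `title` and `creator` terms are assumptions made for the example, not BIBFRAME vocabulary.

```python
def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes (a simplification of the full rules)."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def nt_term(value: str, is_uri: bool) -> str:
    """Format a term N-Triples style: <uri> or "literal"."""
    return f"<{value}>" if is_uri else f'"{escape_literal(value)}"'


def to_ntriples(triples) -> str:
    """One full statement per line, every URI written out in full."""
    return "\n".join(
        f"{nt_term(s, True)} {nt_term(p, True)} {nt_term(o, o_is_uri)} ."
        for s, p, o, o_is_uri in triples
    )


def shorten(uri: str, prefixes: dict) -> str:
    """Abbreviate a URI as prefix:local when a declared namespace matches."""
    for prefix, namespace in prefixes.items():
        local = uri[len(namespace):]
        if uri.startswith(namespace) and local.isalnum():
            return f"{prefix}:{local}"
    return f"<{uri}>"


def to_turtle(triples, prefixes: dict) -> str:
    """Same statements, with prefix declarations to shorten the URIs."""
    header = [f"@prefix {p}: <{ns}> ." for p, ns in prefixes.items()]
    body = []
    for s, p, o, o_is_uri in triples:
        obj = shorten(o, prefixes) if o_is_uri else f'"{escape_literal(o)}"'
        body.append(f"{shorten(s, prefixes)} {shorten(p, prefixes)} {obj} .")
    return "\n".join(header + [""] + body)


# Hypothetical namespaces and terms, for illustration only.
PREFIXES = {"ex": "http://example.org/id/", "v": "http://example.org/vocab/"}
TRIPLES = [  # (subject, predicate, object, object_is_uri)
    ("http://example.org/id/work1", "http://example.org/vocab/title",
     "An Example Title", False),
    ("http://example.org/id/work1", "http://example.org/vocab/creator",
     "http://example.org/id/person1", True),
]

nt = to_ntriples(TRIPLES)
ttl = to_turtle(TRIPLES, PREFIXES)
print(nt)
print(ttl)
```

Both outputs carry the same assertions; only the notation differs, which is why the choice of serialization need not constrain the model.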
When the BIBFRAME initiative surfaced in 2012, its design characteristics struck a chord with the George Washington University Libraries (GW) administration: customization, openness, productivity, shareability, and resource development. They also recognized that GW staff could make an important contribution by participating in the initiative.13 By being an early experimenter (EE), GW Libraries had a unique opportunity to contribute and establish a new standard that would benefit researchers navigating the information sphere. An institutional commitment to be involved on this scale challenged both the lead participants and library staff members, who were called upon to contribute a portion of their skills and talents to the project. It was a journey for our small group that helped solidify our professional beliefs.
GW’s data were created, contributed, and collected over a long period of time, and were migrated from various platforms. Given that situation, it would be unrealistic to expect data consistency throughout the lifecycle, and the possibility existed that these data might be erroneous. The analyses of GW’s bibliographic data conducted in its consortial knowledge base, Voyager,14 validated this assumption.15
The BIBFRAME Initiative established an ambitious roadmap.16 It called for the creation of a test set to be funneled through the Library of Congress and Zepheira pipelines in October 2012. A draft of a local adaptive process was prepared by December 2012, and data modeling feedback occurred in January 2013.
Compared to other early experimenters, GW’s smaller size allowed it to more easily get a team ready to meet the established benchmarks.17 However, it required a completely different mindset for catalogers, who view their work of describing, recording, and classifying a library item from a holistic angle, with an endpoint being the creation of a bibliographic record. Programmers, on the other hand, interpret a record differently: as data. BIBFRAME’s approach is to dissect a record into data components, treating text as data, which can connect with other data in many different ways, on many different levels of granularity.
Learning to assess datasets, from analysis to selection and then transformation, was an excellent opportunity to build staff confidence. At that time, neither cataloging staff nor programmers at GW had needed to immerse themselves regularly in RDF/XML vocabulary and data structure. With limited technical and programming skills and competing library priorities, GW narrowed its data focus. Staff worked on transforming selected datasets, and examined the results with an eye both on the current “clinical” process and on using these data as building blocks for the future.
GW’s modeling used simple bibliographic records of a monographic nature. Data contained mixed publication and creation date ranges, but excluded records describing multiple versions and complex holdings locations. Authority files were considered out-of-scope for this initial phase. Extracted data were placed in Washington Research Library Consortium (WRLC) servers for testing. Figure 3 illustrates the dissection of MARC data and its transformation to the proposed BIBFRAME vocabulary.
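The kind of dissection described here can be sketched as follows, assuming a simple monographic record. The sketch is not GW's actual transformation and does not reproduce Figure 3. The MARC tags used (100, 245, 260, 300) are standard, but the sample values, the `example.org` identifiers, and the output keys (`creator`, `title`, `instanceOf`, `publisher`, `date`, `extent`) are hypothetical labels chosen for illustration, not the proposed BIBFRAME vocabulary.

```python
# A simplified MARC-like record: tag -> {subfield code: value}.
# Sample values are invented for illustration.
marc_record = {
    "100": {"a": "Example, Author"},
    "245": {"a": "An example title /"},
    "260": {"a": "Washington, D.C. :", "b": "Example Press,", "c": "2012"},
    "300": {"a": "200 p. ;"},
}


def clean(value: str) -> str:
    """Strip trailing ISBD punctuation such as ' /', ' :', ' ;' and ','."""
    return value.rstrip(" /:;,")


def subfield(record: dict, tag: str, code: str):
    """Return the cleaned subfield value, or None when it is absent."""
    value = record.get(tag, {}).get(code)
    return clean(value) if value is not None else None


def dissect(record: dict, work_id: str, instance_id: str):
    """Split one record into a Work description and an Instance description."""
    work = {
        "id": work_id,
        "type": "Work",
        "creator": subfield(record, "100", "a"),
        "title": subfield(record, "245", "a"),
    }
    instance = {
        "id": instance_id,
        "type": "Instance",
        "instanceOf": work_id,  # the link that ties the Instance to its Work
        "publisher": subfield(record, "260", "b"),
        "date": subfield(record, "260", "c"),
        "extent": subfield(record, "300", "a"),
    }
    # Keep only the components that were actually present in the record.
    work = {k: v for k, v in work.items() if v is not None}
    instance = {k: v for k, v in instance.items() if v is not None}
    return work, instance


work, instance = dissect(
    marc_record,
    "http://example.org/work/1",      # hypothetical identifiers
    "http://example.org/instance/1",
)
print(work)
print(instance)
```

Separating the description in this way means the publication details can vary from one Instance to another while each Instance still points to the same Work.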
Aligning tasks closely with the existing skills and interests of library staff encouraged GW to envision what a transition to the BIBFRAME environment would be like. The process recognized the value of starting with simpler scenarios, reinforcing staff confidence in preparation for the more complex endeavors ahead. Throughout the process, a learning environment was established, and new relationships among staff were forged and nurtured. Finally, collaborations with other early experimenters helped GW identify and plan for skills improvements. They also strengthened GW’s commitment to service within its traditional confines and beyond, encompassing GW’s faculty and students in an expanding circle of benefit as library staff continue their engagement in future collaborative projects.
BIBFRAME Next Steps
By autumn 2013, early experimenters had drafted more than a dozen point papers.18 Some topics have more than one draft available for public comment.19 Refinements to the initial pages of the Vocabulary Navigator20 help to apply Work, Instance, Authority, and Annotation relationships to MARC 21.21 Transformation tools that have been in place on the BIBFRAME website22 will become generally available. The group has also begun preparing use cases for public review.23 Annotations are inserted into BIBFRAME classes to help the end user better understand the intended and potential usages, as shown in Figure 4.
GW invested a great deal of effort and resources in the BIBFRAME project. The library administration took the view that even if the project were terminated abruptly, staff would have gained valuable lessons by participating in the process. In the overall scheme of things, the investment of resources (staff, equipment, time, and skills) will eventually pay off, if not in this direction, then in another venue. BIBFRAME opens up the library world in more ways than one could imagine. The information world, in particular the library world, has been transformed as information has exploded out of the book into many formats, some of them as yet unimagined. GW staff, as early experimenters, had a taste of this shift, which prepared them to accommodate new approaches.
The proposed BIBFRAME vocabularies and data modeling were tested. Some appeared to have passed and validated the original goal. Stakeholders from diverse information communities actively participated in data modeling and refinements.24 BIBFRAME’s adaptability can be extended beyond MARC 21 to UKMARC, UNIMARC, etc. However, completely replacing the MARC format as a cataloging system is a monumental task. Any replacement system, whether implemented in the current environment or deployed in a cloud, may take a few months or even years. Prediction is hard. BIBFRAME has made a good start. More awaits.
Jackie Shieh (email@example.com) is Coordinator, Resource Description Group at the George Washington University Libraries. ORCID: http://orcid.org/0000-0003-3214-8846