February 2011

There is a point in the process of developing and selling products when standardization makes perfect sense. Prior to that point, there are individual or ad hoc approaches, workarounds, and other tech community "hacks." Essentially, what organizations are doing is trying to make old processes and strategies work for the new environment. However, as product usage grows and customers demand integration and interoperability, the old hacks, workarounds, and shortcuts begin to cost more and are not well suited to the new production reality. At that moment, it begins to make sense to invest in standardizing those things that cause the most friction and add little differentiation value to the end product.

We are now entering that phase with the widespread adoption of digital book content. With the modest scale of e-book sales in the past decade, having e-book production as an appendage of the print-book production workflow was a rational business decision. It made sense to use print-book identifiers and our print-based model for managing and distributing content. Retooling library management systems for massive e-book collections or for repositories like HathiTrust and Google Books seemed unnecessary. However, as the scale of e-book availability and use rapidly accelerates, the pain of getting by with workarounds is increasing exponentially.

This is most apparent in the issues facing the ISBN system. While the standard itself is clear that each different version of a book or e-book should get its own ISBN, publishers and suppliers are applying the standard so inconsistently that unambiguous identification is becoming increasingly difficult. One of the problems with e-books is that we don't have a good understanding of what makes one e-book different from another. Is it the file format? Is it the rendering of the file and its appearance on different screens? Is it the rights a user has to access the file? Is it the digital rights management? The source file distributed by the publisher may include all the functionality, but what the end user receives may have only some of that functionality included or turned on, or the file may have been stripped or enhanced by a third party in the supply chain.

The ISBN system and the publisher community more generally do not yet have good answers to these questions. The International ISBN Agency conducted a survey last year and released revised guidelines on the use of ISBNs for e-books. Simultaneously, the Book Industry Study Group (BISG) carried out a similar Identification of E-Books Research Project studying identification practices for e-books in the US. The working group within BISG (on which I serve) is reviewing the report and its recommendations. A good summary of the report, the issues, and the committee meeting where the report was discussed appears in the Media Stories section below.

There is a need, though, to think more holistically about the standards we have for managing print materials and about where these identification and metadata structures begin to totter under the weight of the options for digital content. For example, how will we deal with content "chunks" that are re-packaged on the fly, and what will the content management systems that handle those materials look like? How can we tie together the various manifestations for collection management? Are there ways to better manage the metadata and ensure its accuracy?

NISO plays a critical role in this work, both by representing US interests to the ISO technical committee that develops and revises the ISBN standard and by acting as Secretariat for that committee. We have also been working closely with BISG and the International ISBN Agency to educate the community about the problems and potential solutions. And we provide a place to bring together the affected content producers, libraries, and technology companies in a way that the interests of each can be represented equally.

This is just one prominent example of where the pain of using hacks to solve e-book problems is driving the community to work toward consensus. There will be many more. If we as a community are unable to address these challenges, we risk the collapse of vital systems that underpin so much of the exchange of information. I am confident, though, that we are well placed to solve these challenges, primarily because the pain and cost of not doing so would certainly be too much to bear.

Todd Carpenter

Managing Director

NISO Reports

February Webinar: Back from the Endangered List: Using Authority Data to Enhance the Semantic Web

Librarian use of authority files dates back to Callimachus and the Great Library of Alexandria around 300 BC. With the evolution of powerful computerized searching and retrieval systems, authority data appears to have outlived its usefulness. However, the semantic web provides an opportunity to use authority data to enable computers to search, aggregate, and combine information on the Web.

Join the Back from the Endangered List: Using Authority Data to Enhance the Semantic Web webinar on February 9, 2011, from 1:00 - 2:30 p.m. (Eastern Time) to learn about the amazing services that can result when the rich data included in name authority files and other standardized vocabularies is linked via the Semantic Web.

Speakers and topics are:

  • The Virtual International Authority File (VIAF): Jeff Young, Software Architect, OCLC Research, will explain how Linked Data tools, VIAF, and its contributors illustrate the potential interplay between centralized and decentralized interoperability of authority information.

  • Authorities as Linked Data Hubs: Richard Wallis, Technology Evangelist, Talis, will discuss how what we collectively refer to as authorities have the potential, if published openly, simply, and soon, to become hubs for the linking of library and non-library information across the Web of Data.

  • The Getty Authority Files: Murtha Baca, Head, Digital Art History Access, will provide real-life examples of authority data in art and architecture applications at the Getty.

For more information and to register, visit the event webpage.

March Webinar: Patrons, ILL, and Acquisitions

Patron-Driven Acquisitions (PDA) is emerging as a new library collection development model and challenging existing business and service models for vendors and publishers. PDA is moving beyond individual projects and becoming yet another model to build and maintain library collections. What guidelines and standards will be required to support PDA? NISO's March webinar, Patrons, ILL, and Acquisitions, to be held on March 9 from 1:00 - 2:30 p.m. (Eastern Time), will provide perspectives from three libraries on this new acquisition model.

The three speakers for this webinar—Nancy Gibbs (Duke University), Peter Spitzform (University of Vermont), and Lynn Wiley (University of Illinois Urbana-Champaign)—will:

  • Present an overview of the introduction and evolution of PDA.

  • Describe the kinds of PDA—both print and electronic—that have developed (e.g., ILL requests and loading records into an OPAC based on library-defined parameters, including approval plans), and whether best practices are in place yet.

  • Describe the mechanics of workflow (vendor systems, ILS and ILL systems, publisher data), and discuss whether existing standards support PDA, or if there are standards that need to be developed.

  • Discuss the long-term effects on budgets (which can be spent very quickly), collection development (will the print or electronic book collection remain relevant?), interlibrary loan (will ILL borrowing decrease significantly if items are purchased rather than borrowed?), and publishing models.

For more information and to register, visit the event webpage.

NISO/DCMI to Cosponsor Webinar Series Beginning in March with Metadata Harmonization: Making Standards Work Together

NISO and DCMI (Dublin Core Metadata Initiative) are happy to be partnering once again to bring you three joint NISO/DCMI webinars in 2011. And we are offering a discount package if you would like to participate in all three webinars. Buy all three, and get NISO's June 8th webinar, Linking the Semantic Web Together, free. This is a 25% discount on the total registration price! To register for the whole series, visit the NISO/DCMI Webinar Packages webpage.

The first joint webinar, to be held on March 16 from 1:00 - 2:30 p.m. (Eastern), is Metadata Harmonization: Making Standards Work Together. Metadata plays an increasingly central role as a tool enabling the large-scale, distributed management of resources. However, metadata communities that have traditionally worked in relative isolation have struggled to make their specifications interoperate with others in the shared web environment.

This webinar explores how metadata standards with significantly different characteristics can productively coexist and how previously isolated metadata communities can work towards harmonization. It presents a solution-oriented analysis of current issues in metadata harmonization with a focus on specifications of importance to the learning technology and library environments, notably Dublin Core, IEEE Learning Object Metadata, and W3C's Resource Description Framework. Providing concrete illustrations of harmonization problems and a roadmap for designing metadata for maximum interoperability, the webinar will offer a bird's-eye perspective on the respective roles of metadata syntaxes, formats, semantics, abstract models, vocabularies, and application profiles in achieving metadata harmonization.
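
To make the idea of harmonization through a shared abstract model a bit more concrete, here is a minimal illustrative sketch (not drawn from the webinar itself) that uses the Python rdflib library to describe one resource with properties from two different vocabularies in a single RDF graph. The resource URI and the LOM-style namespace are invented for the example.

```python
# A minimal sketch, assuming the rdflib library; all URIs are hypothetical.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS

# Hypothetical namespace standing in for a LOM-style educational vocabulary.
LOM = Namespace("http://example.org/lom/")

resource = URIRef("http://example.org/resources/intro-to-rdf")

g = Graph()
g.bind("dcterms", DCTERMS)
g.bind("lom", LOM)

# Descriptive properties drawn from Dublin Core Terms...
g.add((resource, DCTERMS.title, Literal("Introduction to RDF")))
g.add((resource, DCTERMS.creator, Literal("Jane Example")))

# ...and an educational property from the (hypothetical) LOM-style vocabulary,
# coexisting in the same graph because both map onto the RDF data model.
g.add((resource, LOM.typicalAgeRange, Literal("18-25")))

print(g.serialize(format="turtle"))
```

Because both sets of terms are expressed against the same abstract model, a consuming application can process the combined description without format-level translation, which is the kind of interoperability the webinar addresses.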

Speakers for the Metadata Harmonization webinar:

  • Mikael Nilsson, who holds a PhD in media technology from the Royal Institute of Technology in Stockholm, has extensive expertise in metadata standardization and interoperability, particularly at the crossroads between the Dublin Core, World Wide Web Consortium, and IEEE Learning Object Metadata communities.

  • Thomas Baker, Chief Information Officer of the Dublin Core Metadata Initiative, was recently co-chair of the W3C Semantic Web Deployment Working Group and currently co-chairs a W3C Incubator Group on Library Linked Data.

To register for the Metadata Harmonization webinar, visit the event webpage.

SERU Survey – Respond by February 4

NISO is interested in learning more about how libraries and publishers are currently using SERU: Shared E-Resource Understanding. You are invited to respond to a very brief (4-question) online survey that seeks to identify and confirm those interested in and using SERU. Following this survey, the SERU Standing Committee plans to engage the community in discussions about the potential use of SERU for e-books and to gather feedback on the ONIX-PL version. Please respond by Friday, February 4.

For an update on the activities of the SERU Standing Committee, join the free NISO open teleconference on February 14 from 3:00 - 4:00 p.m. (Eastern Time). Just call 877-375-2160 and, when prompted, enter the conference code: 17800743#

The NISO SERU (Shared E-Resource Understanding) Standing Committee provides maintenance and support for NISO RP-7-2008, SERU: A Shared Electronic Resource Understanding. The committee has submitted to the Business Information Topic Committee a proposal to minimally revise the SERU document to allow for easier use with e-books. This primarily entails adjusting current language that specifically references subscriptions so that SERU can be applied more broadly, and adding a new paragraph on ILL. If approved, the standing committee will be reaching out to stakeholders for early vetting in the next month, followed by a draft release for public comment expected in late spring/early summer.

Hold the Date for the NISO Forum on Mobile Technologies in Libraries

The NISO one-day in-person forum on Mobile Technologies in Libraries that was previously announced for fall has been moved to May 20, 2011, in Philadelphia. More information about the forum and the agenda will be posted shortly to the event webpage.

New Specs & Standards

ANSI/ARMA 18-2011, Implications of Web-Based, Collaborative Technologies in Records Management

This new American National Standard provides requirements and best practice recommendations related to policies, procedures, and processes for an organization's use of internally facing or externally directed (public or private), web-based, collaborative technologies such as wikis, blogs, mashups, and classification (tagging) sites. (It does not address e-commerce, e-mail, instant messaging, or workflow solutions). Adherence to ARMA International's Generally Accepted Recordkeeping Principles® (GARP®) is also supported and encouraged by advice contained within this publication.

ISO/IEC 19788-1:2011, Information technology – Learning, education and training – Metadata for learning resources – Part 1: Framework

This new international standard specifies metadata elements and their attributes for the description of learning resources and of resources directly related to learning resources. It provides principles, rules, and structures for specifying the description of a learning resource, and it identifies and specifies the attributes of a data element as well as the rules governing their use. The standard is information-technology-neutral. Five more parts to this standard are under development: Dublin Core elements, Basic application profile, Technical elements, Educational elements, and Availability, distribution, and intellectual property elements.

LC Network Development and MARC Standards Office, MARC Code Lists Available as Linked Data

The Library of Congress (LC) web service Authorities and Vocabularies provides access to LC authority and vocabulary data as Linked Data. The vocabulary data are published in RDF using the SKOS vocabulary and are available for bulk download. Newly added to the site are: MARC List for Countries, MARC List for Geographic Areas, and MARC List for Languages. The MARC Countries entries include references to their equivalent ISO 3166 codes. The MARC Languages have been cross-referenced with ISO standards 639-1, 639-2, and 639-5, where appropriate.
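
For readers who want to experiment with these vocabularies, the sketch below shows one way the SKOS data might be consumed programmatically. It is an illustration only: it assumes the Python rdflib library and that an individual concept (here the MARC country code "us") can be downloaded as RDF at the URL pattern shown; the actual download locations and linking properties may differ.

```python
# A minimal sketch, assuming the rdflib library and that the concept below
# resolves to an RDF document at this URL pattern (it may differ in practice).
from rdflib import Graph, URIRef
from rdflib.namespace import SKOS

concept_uri = "http://id.loc.gov/vocabulary/countries/us"

g = Graph()
g.parse(concept_uri + ".rdf")  # fetch the RDF description of the concept

concept = URIRef(concept_uri)

# Print the SKOS preferred label(s) and any skos:exactMatch links, which is
# where equivalences to other code lists (such as ISO 3166) could appear.
for label in g.objects(concept, SKOS.prefLabel):
    print("prefLabel:", label)
for match in g.objects(concept, SKOS.exactMatch):
    print("exactMatch:", match)
```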

LITA Standards Task Force, Strategic Directions White Paper

The LITA Executive Committee approved the creation of a LITA Standards Task Force in March 2010. The Task Force was charged to: 1) explore and recommend strategies and initiatives LITA can implement to become more active in the creation and adoption of new technology-related standards that align with the library community; and 2) propose an organizational structure that will support and sustain LITA's increased involvement in the standards arena, both within ALA and beyond. This white paper includes the results of a survey of the members and the Task Force's recommendations on shorter-term and longer-term strategies. Comments are requested; they can be posted online or sent by e-mail to Yan Han, Chair of the LITA Standards Task Force.

PREMIS Editorial Committee, PREMIS Data Dictionary for Preservation Metadata version 2.1

This incremental revision of the PREMIS Data Dictionary includes corrections of errors, clarifications of some semantic units, changes for consistency, and the addition of a few semantic units that resulted from requests to the PREMIS Editorial Committee. The revision is considered non-substantial in that there are no major changes affecting existing PREMIS descriptions, so it is designated an incremental version 2.1. Both the full data dictionary and the schema are revised.

Unicode Consortium, Unicode 6.0.0, Chapters 1-6

Version 6.0 is the first major version of the Unicode Standard to be published solely in online format. Published chapters include: Introduction, General Structure, Conformance, Character Properties, Implementation Guidelines, Writing Systems and Punctuation, and European Alphabetic Scripts. Version 6 adds 2,088 characters, adds new properties and data files, amends the text of the Standard with many changes to the core specification, corrects character properties for existing characters, and provides format improvements. Not yet published are the chapters on Middle Eastern scripts, South Asian scripts, East Asian scripts, additional modern scripts, archaic scripts, symbols, and special areas and format characters. Version 6 of the Unicode Standard is synchronized with the forthcoming second edition of ISO/IEC 10646 (ISO/IEC 10646:2011), which represents the republication of ISO/IEC 10646:2003 plus the rolled-up content additions from amendments 1 through 8.

W3C Launches RDF Working Group

The newly launched W3C RDF Working Group is charged with updating the cornerstone standard for the Semantic Web: the Resource Description Framework (RDF). The scope of work is to extend RDF to include some of the features that the community has identified as both desirable and important for interoperability, based on experience with the 2004 version of the standard, but without having a negative effect on existing deployment. Some of the anticipated features include JSON and Turtle serializations and a standard model and semantics for multiple graphs and graph stores. The working group is co-chaired by David Wood (Talis Inc.) and Guus Schreiber (Vrije Universiteit).
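
As a rough illustration of two of those features, the sketch below uses the Python rdflib library to place statements in two named graphs within one store and to print each graph as Turtle. It is a hypothetical example with made-up URIs, not code from the working group.

```python
# A minimal sketch, assuming the rdflib library; all URIs are invented.
from rdflib import ConjunctiveGraph, Literal, Namespace, URIRef

EX = Namespace("http://example.org/")

store = ConjunctiveGraph()  # a store that can hold several named graphs

# Put one statement in each of two named graphs.
g1 = store.get_context(URIRef("http://example.org/graphs/catalog-a"))
g1.add((EX.book1, EX.title, Literal("First Book")))

g2 = store.get_context(URIRef("http://example.org/graphs/catalog-b"))
g2.add((EX.book1, EX.title, Literal("First Book, 2nd ed.")))

# Serialize each named graph individually using the Turtle syntax.
for g in (g1, g2):
    print("#", g.identifier)
    print(g.serialize(format="turtle"))
```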

Media Stories

eBook Identifier Confusion Shakes Book Industry
Go to Hellman [blog], January 19, 2011; by Eric Hellman

The book publishing industry is undergoing earthquake-like shifts in the move from print to digital formats. While the International Standard Book Number (ISBN) has long provided the structure that holds the book supply chain together, it too is being shaken by e-book distribution. In a Book Industry Study Group (BISG) meeting on e-book identification, held on January 13, Michael Cairns (Information Media Partners) presented the results of a study he did for BISG on current practices in assigning ISBNs to e-books. From his interviews with 75 experts from all areas of the supply chain, Cairns reported no consistent practice or policy in e-book ISBN assignment and no great interest in fixing the situation. Companies have already patched their workflows and information systems to work around the issues. A recent move to the "agency model" of e-book distribution, where the publishers set the prices and retailers get fixed commissions, has given some impetus to e-book ISBN assignment, as publishers use the ISBN to distinguish between enhanced and non-enhanced versions of an e-book and their related pricing. E-books that hadn't previously been assigned separate ISBNs suddenly needed them. The business model of licensing e-books also creates problems: if the same e-book has multiple licensing terms, thus creating multiple "products," does each product require a separate ISBN? Brian Green, Executive Director of the International ISBN Agency, reported on a study the agency had commissioned that was more international in scope and had findings similar to those of Cairns's study. The International ISBN Agency has released an updated FAQ with its recommendations for e-book ISBN assignment. A BISG committee will be working to develop agreement on a common approach to the problem.
(Link to Web Source)

Web Scale Discovery: What and Why?
Web Scale Discovery Services, Chapter 1, Library Technology Reports, 47(1), January 2011; by Jason Vaughan

Defined for this report as "a service capable of searching across a vast range of preharvested and indexed content quickly and seamlessly" through a web interface, web scale discovery services are the latest evolution in a long history of search and discovery that includes the ILS, Z39.50, commercial federated search tools, OpenURL link resolvers, and next-generation library catalogs. Key characteristics of web scale discovery include: a centralized index at the article level to both local and remotely hosted content, a single "Google-like" search box as well as advanced searching, quick delivery of relevancy-ranked results, and greater flexibility and openness in the information architecture. Among the reasons for libraries to adopt web scale discovery are: competition from services on the Internet, the perception that library systems are harder to use and less comprehensive, and too many library-acquired resources that aren't discovered and used. Five web scale discovery services from major vendors were identified by the author: OCLC WorldCat Local, Serials Solutions Summon, EBSCO Discovery Service, Innovative Interfaces Encore Synergy, and Ex Libris Primo Central; all but Encore Synergy are profiled in other chapters of this report. Each of the profiled discovery services exposes contracted providers' content and open access content as well as a library's local records, digitized content, and institutional repository. A large amount of citation-level content is available to the end user even where the library does not license the full text, or searches can be limited to only content that the user can access in full text. Most web scale discovery services are less than a year old. This report endeavors to give libraries an overview of this quickly evolving and competitive marketplace. (Link to Web Source)

NISO Note: NISO voting members mentioned in this article (or whose products were mentioned) are: Ebsco, Ex Libris, IEEE, Innovative Interfaces, National Library of Medicine, OCLC, ProQuest, Reed Elsevier, and Serials Solutions.

Author Identifier Overview
GobbledyGook [PLoS Blog]; by Martin Fenner

The benefits of having unique identifiers for authors include: the ability to find research collaborators, enabling institutions to highlight faculty activities, simplifying peer review and the publishing workflow, improving grant submission and tracking of grant-funded research, and identifying scholarly society members' publications. Several author ID systems already exist, some new, some a decade old. These include: AuthorClaim (Open Library Society), LATTES (Brazil National Council for Scientific and Technological Development), VIAF (OCLC and 15 national libraries), NARCIS (Royal Netherlands Academy of Arts and Sciences), arXiv author ID (Cornell University Library), Scopus author ID (Elsevier), Names Project (Mimas, British Library), ResearcherID (Thomson Reuters), ORCID, and PubMed author ID (National Library of Medicine). The forthcoming International Standard Name Identifier (ISNI) is broader in scope, covering all creators, not just authors of scholarly works. OpenID is a de facto standard for internet identification and authentication. There are three important aspects of an author ID system: identity, reputation, and trust. One approach to the identity aspect is to give all students an identifier upon graduation, which Brazil and the Netherlands do with science students. Another is to assign an ID when someone creates a scholarly work, as arXiv and Scopus do. More difficult issues are how to retrospectively assign IDs to authors who have written in the past (and may even be deceased), how to distinguish authors in countries where many people have the same name (e.g., China and Korea), and how to create universal identifiers that cross disciplines and nationalities. An identifier by itself isn't especially meaningful without related metadata that conveys the reputation of the author through a profile proxy. This may involve the use of other identifiers, such as the DOI, to collect or point to an author's output. In some cases, though, such as in peer review, the identifier used must also keep the author's identity secret, which would preclude the use of a universal public identifier. Trust underlies both identity and reputation. Authors have to trust the identifier system and its privacy controls, as well as its openness and the organization that runs the service. Other users have to trust what is contained in the author's profile, which is not possible if the profile is totally author-supplied and unverified. Although much progress is being made in author identification, many challenges still remain.
(Link to Web Source)

NISO Note: NISO as Secretariat for ISO TC46/SC9 has overseen the development of the International Standard Name Identifier (ISNI) standard, ISO 27729. The final version of the standard has been approved and it will be published as soon as the arrangements for the ISNI Registration Authority have been finalized with ISO.

An Interview with Ted Koppel, Auto-Graphics, Regarding Standards
The Charleston Advisor, v. 12, no. 3, pp. 61-64; by George Machovec

In the five years since his last interview, Koppel identifies several standards or initiatives that have made, or show promise of making, a substantial impact on improving library systems and interoperability. These include the Standardized Usage Statistics Harvesting Initiative (SUSHI), which allows automated retrieval of COUNTER reports in a standard format; Knowledge Base and Related Tools (KBART), a recommended practice for improving the quality of OpenURL link resolver information; the DAISY Authoring and Interchange Framework, a revision to the Digital Talking Book standard that provides major updates in digital authoring; and ERMI-2, a gap analysis being conducted to identify the work that needs to be undertaken in phase 2 of electronic resource management standardization. Some standards, like ISBN, ISSN, and Z39.50, are examples of those that have "stood the test of time," and OpenURL, Dublin Core, and NCIP are also among the stalwarts. Among those that Koppel thinks may not stand the test of time are FRBR (Functional Requirements for Bibliographic Records), RDA (Resource Description and Access), and the Institutional Identifier initiative (I2), which may get integrated into the international ISNI standard. The CORE standard, discussed in a previous interview, was slowed down by the recession and vendors' unwillingness to take on its implementation at that time. It was issued as a recommended practice and will be revisited for standardization once some implementation experience has been gained. ONIX is actually several standards: ONIX for Books, ONIX for Serials, and ONIX-PL (publication licenses), which as yet have relatively low adoption levels. The ISSN-L linking identifier that was added to the ISSN standard in its last revision has also not had much visible uptake as yet. But linking with both ISSN and ISBN is going to become far more important, bringing a need for resolution services, which raises many issues around governance and how costs are recovered. Some new standards Koppel would like to see are a global library Standard Address Number (SAN) and more data exchange standards that cross over application types. But the critical issue is compliance; if everyone interprets and implements a standard in a nonstandard way, then it destroys the interoperability the standard was designed to accomplish. "One of my major concerns these days is ensuring compliance, performance, and interoperability into every standard we develop." (Link to Web Source)

NISO Note: SUSHI, KBART, the DAISY standard revision, the ERM gap analysis, Z39.50, OpenURL, Dublin Core, NCIP, CORE, and SAN are all NISO standards or initiatives. NISO is also the secretariat of the ISO committee responsible for the ISBN, ISSN, and ISNI standards. Ted Koppel was NISO's first co-chair of the Content and Collection Management Topic Committee.

Supporting Science through the Interoperability of Data and Articles
D-Lib Magazine, v. 17, no. 1/2, January/February 2011; by IJsbrand Jan Aalbersberg and Ove Kähler

Elsevier is using several methods to add value to scientific articles through external resources, including dataset linking, entity linking, and contextualization. Dataset linking takes a full-text article on the SciVerse ScienceDirect platform and links it to the underlying research data hosted on another platform. Currently, Elsevier has established bidirectional linking with the PANGAEA and Cambridge Crystallographic Data Centre (CCDC) data repositories. Entity linking connects concepts within an article to related datasets. This approach requires that the article be tagged to identify concepts and that the data repository support entity-related URLs. Elsevier currently asks authors to mark up their articles to link to data entities in the NCBI GenBank, the Worldwide Protein Data Bank, the CCDC, the Molecular INTeractions Database, or the Universal Protein Resource Knowledgebase. Elsevier has also begun using two automated text-mining tools to do the entity markup. Contextualization would allow readers, while still within the article, either to see what data is in the repository or even to view the data itself. Elsevier uses an application to extract article-specific data in real time. They are currently doing this between the Earth and Ocean Sciences journals and a PANGAEA/Google Maps application, and with a 3D Protein Viewer that links the Journal of Molecular Biology to the Protein Data Bank.
(Link to Web Source)

NISO Note: Reed Elsevier is a NISO voting member. NISO's working group on Supplemental Journal Article Materials is developing a Recommended Practice for publisher inclusion, handling, display, and preservation of supplemental journal article materials. Call in to the free open teleconference on March 14 from 3:00 - 4:00 p.m. Eastern for an update on this working group. For two additional recent articles on this topic, see Open and Accessible Supplemental Data: How Librarians Can Solve the Supplemental Data Arms Race and Abelard and Héloise: Why Data and Publications Belong Together.