Home | About NISO | Blog

Archive for the ‘library’ Category

Introduction to NISO webinar on ebook preservation

Wednesday, May 23rd, 2012

Below are my welcoming remarks to the NISO webinar on Heritage Lost?: Ensuring the Preservation of Ebooks on May 23rd.

“Good afternoon and welcome to the second part of this NISO Two-Part Webinar on Understanding Critical Elements of E-books: Acquiring, Sharing, and Preserving.  This part is entitled Heritage Lost? Ensuring the Preservation of E-books.

Perhaps it is due to the fact that electronic journals were adopted much earlier and more rapidly, that we are more familiar with the archiving and preservation of e-journal content than e-book content. However, just as it did in the late 1990s after e-journals became prevalent, so too the topic of preservation of e-books is now rising up in the minds of people deeply concerned with the long-term preservation of cultural materials.

That is not to say that no one is considering these issues.  Some of the bigger digitization projects involve libraries and as such include preservation as part of their mission.  I’m thinking in particular about the Internet Archive, Portico and the HaithiTrust in this regard, but there are certainly others.  Today we’ll here from two of these groups and what they are doing to support

Another big preservation issue that is frequently overlooked is the model of distribution that many publishers are moving toward, which is a license model rather than a sale model.  I won’t get into either the legal or business rationale for this shift, but I do want to focus on this shift’s implications for preservation and in particular publishers.  An important analogy that I make to publishers is that of renting a house versus selling a house.  When a publisher sells a house (in this case a book), it passes on all the responsibility for the house and it’s upkeep onto the new owner.  Now if a person rents that same house, the responsibility for fixing the leaking roof, for painting the walls and repairing the broken windows generally falls back to the landlord who is renting the house.  Obviously, there is money to be made and the terms of the lease impact who is responsible for what, but in general, the owner is still the primary person responsible for the major upkeep of the house.

In the case of the sale of a book, the publisher is no longer responsible for that item and its preservation onto the new owner, say the library.  It is then up to the library to ensure that the book doesn’t fall apart, that the cover stays clean, or the pages don’t rip.  However, as we move to a license environment, the long-term responsibility of upgrading file formats, of continuing to provide access and functionality falls back to the publisher.  The publisher is the landlord, renting e-books to the publishing community.  And this responsibility requires a great deal more effort than simply hosting the file.  The publishers will eventually need to repaint, to refurbish, to fix the broken plumbing to speak on this digital collection.  I expect that this will be no small feat, and something that few publishers are prepared to address.

The Library of Congress has begun thinking about this problem from the perspective of their demand deposit requirement related to copyright registration for LC’s own collection.  While they are at the moment focused on electronic-only journals, one can envision a scenario where electronic-only books are not that far away.  LC has not explicitly discussed e-book preservation and their current work is only focused on e-journals.  However, the problems that LC is facing is illustrative of the larger issues that they likely will face.  There are standards for journal article formatting using XML, such as the soon to be released Journal Article Tag Suite or (JATS), formerly the NLM DTD.  This project developed by the National Library of Medicine in the US was specifically focused on developing an archival tagging model for journal article content distribution and preservation.  There is no similar model for books that is widely adopted.  If the variation of journal markup is significant, the same complexity for book content is some exponential increase over that.

No archive can sustain a stream of ingest from hundreds or thousands of publishers without standards.  It is simply unmanageable to accept any file in any format from thousands of publishers.    And this is of course, where standards comes in, although this isn’t the forefront of either of our presentations today, it does sit there in the not so distant background.

And there has been a great deal of focus over the past year on the adoption of the new EPUB 3.0 specification. This is a great advancement and it will certainly help speed adoption of e-books and their overall interoperability with existing systems.  However, it should be clear that EPUB is not designed as an archival format.  Many of the things that would make EPUB 3 archival exist within the structure but their inclusion by publishers is optional, not mandatory.  In the same way that accessibility and archiving functionality is possible within PDF files, but it is functionality that most publishers don’t take advantage of or implement.  We as a community, need to develop profiles of EPUB for preservation that publishes can target, if not for their distribution, at least for their long-term preservation purposes both internally and externally.

It will be a long-term project that we will be engaged in.  And it is something that we need to focus concerted attention on, because preservation isn’t the first thing on content creator’s minds.  However, we should be able to continue to press the issue and make progress on these issues.

Mandatory Copyright Deposit for Electronic-only Materials

Thursday, April 1st, 2010

In late February, the Copyright Office at the Library of Congress published a new rule that expands the requirement for the mandatory deposit to include items published in only in digital format.   The interim regulation, Mandatory Deposit of Published Electronic Works Available Only Online (37 CFR Part 202 [Docket No. RM 2009–3]) was released in the Federal Register.  The Library of Congress will focus its first attention on e-only deposit of journals, since this is the area where electronic-only publishing is most advanced.  Very likely, this will move into the space of digital books as well, but it will likely take sometime to coalesce.

I wrote a column about this in Against the Grain last September outlining some of these issues that this change will require.  A free copy of that article is available here.  The Library of Congress is aware, and will become painfully more so when this stream of online content begins to flow their way.  To support an understanding about these new regulations, LC hosting a forum in Washington in May to discuss publisher’s technology for providing these data on a regular basis.  Below is the description about the meeting that LC provided.

Electronic Deposit Publishers Forum
May 10-11, 2010
Library of Congress — Washington, DC

The Mandatory deposit provision of the US Copyright Law requires that published works be deposited with the US Copyright Office for use by the Library of Congress in its collection.  Previously, copyright deposits were required only for works published in a physical form, but recently revised regulations now include the deposit of electronic works published only online.  The purpose of this workshop is to establish a submission process for these works and to explore technical and procedural options that will work for the publishing community and the Library of Congress.

Discussion topics will include:

  • Revised mandatory deposit regulations
  • Metadata elements and file formats to be submitted

Space for this meeting is very limited, but if you’re interested in participating in the meeting, you should contact the Copyright Office.

  • Proposed transfer mechanisms
  • Open Source isn’t for everyone, and that’s OK. Proprietary Systems aren’t for everyone, and that’s OK too.

    Monday, November 2nd, 2009

    Last week, there was a small dust up in the community about a “leaked” document from one of the systems suppliers in the community about issues regarding Open Source (OS) software.  The merits of the document itself aren’t nearly as interesting as the issues surrounding it and the reactions from the community.  The paper outlined from the company’s perspective the many issues that face organizations that choose an open source solution as well as the benefits to proprietary software.  Almost immediately after the paper was released on Wikileaks, the OS community pounced on its release as “spreading FUD {i.e., Fear Uncertainty and Doubt}” about OS solutions.  This is a description OS supporters use for corporate communications that support the use and benefits of proprietary solutions.

    From my mind the first interesting issue is that there is a presumption that any one solution is the “right” one, and the sales techniques from both communities understandably presume that each community’s approach is best for everyone.  This is almost never the case in a marketplace as large, broad and diverse as the US library market.  Each approach has it’s own strengths AND weaknesses and the community should work to understand what those strengths and weaknesses are, from both sides.  A clearer understanding and discussion of those qualities should do much to improve both options for the consumers.  There are potential issues with OS software, such as support, bug fixing, long-term sustainability, and staffing costs that implementers of OS options need to consider.  Similarly, proprietary options could have problems with data lock-in, interoperability challenges with other systems, and customization limitations.   However, each too has their strengths.  With OS these include and openness and an opportunity to collaboratively problem solve with other users and an infinite customizability.  Proprietary solutions provide a greater level of support and accountability, a mature support and development environment, and generally known fixed costs.

    During the NISO Library Resources Management Systems educational forum in Boston last month, part of the program was devoted to a discussion of whether an organization should build or buy LRMS system.  There were certainly positives and downsides described from each approach.  The point that was driven home for me is that each organization’s situation is different and each team brings distinct skills that could push an organization in one direction or another.  Each organization needs to weigh the known and potential costs against their needs and resources.  A small public library might not have the technical skills to tweak OS systems in a way that is often needed.  A mid-sized institution might have staff that are technically expert enough to engage in an OS project.  A large library might be able to reallocate resources, but want the support commitments that come with a proprietary solution.  One positive thing about the marketplace for library systems is the variety of options and choices available to management.

    Last year, during the Charleston Conference during a discussion of Open Source, I made the comment that, yes, everyone could build their own car, but why would they.  I personally don’t have the skills or time to build my own, I rely on large car manufacturers to do so for me.  When it breaks, I bring it to a specialized mechanic who knows how to fix it.  On the other hand, I have friends who do have the skills to build and repair cars. They save lots of money doing their own maintenance and have even built sports cars and made a decent amount of money doing so.  That doesn’t make one approach right or wrong, better or worse.  Unfortunately, people frequently let these value judgments color the debate about costs and benefits. As with everything where people have a vested interest in a project’s success, there are strong passions in the OS solutions debate.

    What make these systems better for everyone is that there are common data structures and a common language for interacting.  Standards such as MARC, Z39.50, and OpenURL, among others make the storage, discovery and delivery of library content more functional and more interoperable.  As with all standards, they may not be perfect, but they have served the community well and provide an example of how we can as a community move forward in a collaborative way.

    For all of the complaints hurled at the proprietary systems vendors (rightly or wrongly), they do a tremendous amount to support the development of voluntary consensus standards, which all systems are using.  Interoperability among library systems couldn’t take place without them.  Unfortunately, the same can’t be said for the OS community.  As Carl Grant, President of Ex Libris, made the point during the vendor roundtable in Boston, “How many of the OS support vendors and suppliers are members of and participants in NISO?”  Unfortunately, the answer to that question is “None” as yet.  Given how critical open standards are to the smooth functioning of these systems, it is surprising that they haven’t engaged in standards development.  We certainly would welcome their engagement and support.

    The other issue that is raised about the release of this document is its provenance.  I’ll discuss that in my next post.

    Upcoming Forum on Library Resource Management Systems

    Thursday, August 27th, 2009

    In Boston on October 8-9, NISO will host a 2-day educational forum, Library Resource Management Systems: New Challenges, New Opportunities. We are pleased to bring together a terrific program of expert speakers to discuss some of the key issues and emerging trends in library resource management systems as well as to take a look at the standards used and needed in these systems.

     

    The back end systems upon which libraries rely have become the center of a great deal of study, reconsideration and development activity over the past few years.  The integration of search functionality, social discovery tools, access control and even delivery mechanisms to traditional cataloging systems are necessitating a conversation about how these component parts will work together in a seamless fashion.  There are a variety of approaches, from a fully-integrated system to a best-of-breed patchwork of systems, from locally managed to software as a service approaches.  No single approach is right for all institutions and there is no panacea for all the challenges institutions face providing services to their constituents.  However, there are many options an organization could choose from.  Careful planning can help to find the right one and can save the institution tremendous amounts of time and effort.  This program will provide some of the background on the key issues that management will need to assess to make the right decision.

     

    Registration is now open and we hope that you can join us.