NISO Standards Bearer Blog

How the Information World Connects

Introduction to NISO webinar on ebook preservation

May 23rd, 2012

Below are my welcoming remarks to the NISO webinar on Heritage Lost?: Ensuring the Preservation of Ebooks on May 23rd.

“Good afternoon and welcome to the second part of this NISO Two-Part Webinar on Understanding Critical Elements of E-books: Acquiring, Sharing, and Preserving.  This part is entitled Heritage Lost? Ensuring the Preservation of E-books.

Perhaps because electronic journals were adopted much earlier and more rapidly, we are more familiar with the archiving and preservation of e-journal content than of e-book content. However, just as preservation became a concern in the late 1990s after e-journals became prevalent, the preservation of e-books is now rising in the minds of people deeply concerned with the long-term preservation of cultural materials.

That is not to say that no one is considering these issues. Some of the bigger digitization projects involve libraries and as such include preservation as part of their mission. I’m thinking in particular about the Internet Archive, Portico and the HathiTrust in this regard, but there are certainly others. Today we’ll hear from two of these groups about what they are doing to support e-book preservation.

Another big preservation issue that is frequently overlooked is the model of distribution that many publishers are moving toward, which is a license model rather than a sale model. I won’t get into either the legal or business rationale for this shift, but I do want to focus on its implications for preservation and, in particular, for publishers. An important analogy that I make to publishers is that of renting a house versus selling a house. When a publisher sells a house (in this case a book), it passes all the responsibility for the house and its upkeep on to the new owner. Now if a person rents that same house, the responsibility for fixing the leaking roof, for painting the walls and repairing the broken windows generally falls back to the landlord who is renting out the house. Obviously, there is money to be made and the terms of the lease affect who is responsible for what, but in general, the owner is still the primary person responsible for the major upkeep of the house.

In the case of the sale of a book, the publisher is no longer responsible for that item; its preservation passes to the new owner, say the library. It is then up to the library to ensure that the book doesn’t fall apart, that the cover stays clean, and that the pages don’t rip. However, as we move to a license environment, the long-term responsibility for upgrading file formats and for continuing to provide access and functionality falls back to the publisher. The publisher is the landlord, renting e-books to the library community. And this responsibility requires a great deal more effort than simply hosting the file. The publishers will eventually need to repaint, to refurbish, to fix the broken plumbing, so to speak, of this digital collection. I expect that this will be no small feat, and something that few publishers are prepared to address.

The Library of Congress has begun thinking about this problem from the perspective of their demand deposit requirement related to copyright registration for LC’s own collection. While they are at the moment focused on electronic-only journals, one can envision a scenario where electronic-only books are not that far away. LC has not explicitly discussed e-book preservation, but the problems that LC is facing with e-journals are illustrative of the larger issues that they likely will face. There are standards for journal article formatting using XML, such as the soon-to-be-released Journal Article Tag Suite (JATS), formerly the NLM DTD. This project, developed by the National Library of Medicine in the US, was specifically focused on developing an archival tagging model for journal article content distribution and preservation. There is no similar model for books that is widely adopted. If the variation in journal markup is significant, the complexity of book content is greater by some exponential factor.

No archive can sustain a stream of ingest from hundreds or thousands of publishers without standards. It is simply unmanageable to accept any file in any format from thousands of publishers. This is, of course, where standards come in. Although they aren’t at the forefront of either of our presentations today, they do sit there in the not-so-distant background.

There has been a great deal of focus over the past year on the adoption of the new EPUB 3.0 specification. This is a great advancement and it will certainly help speed adoption of e-books and their overall interoperability with existing systems. However, it should be clear that EPUB is not designed as an archival format. Many of the things that would make EPUB 3 archival exist within the structure, but their inclusion by publishers is optional, not mandatory. In the same way, accessibility and archiving functionality is possible within PDF files, but it is functionality that most publishers don’t take advantage of or implement. We, as a community, need to develop profiles of EPUB for preservation that publishers can target, if not for their distribution, at least for their long-term preservation purposes both internally and externally.

It will be a long-term project that we will be engaged in. And it is something that we need to focus concerted attention on, because preservation isn’t the first thing on content creators’ minds. However, we should be able to continue to press the issue and make progress on it.”

NISO response to the National Science Board on Data Policies

January 18th, 2012

Earlier this month, the National Science Board (NSB) announced it was seeking comments from the public on the report from the Committee on Strategy and Budget Task Force on Data Policies, Digital Research Data Sharing and Management.  That report was distributed last December.

NISO has prepared a response on behalf of the standards development community, which was submitted today.  Here are some excerpts of that response:

The National Science Board’s Task Force on Data Policies comes at a watershed moment in the development of an infrastructure for data-intensive science based on sharing and interoperability. The NISO community applauds this effort and the focused attention on the key issues related to a robust and interoperable data environment.

….

NISO has particular interest in Key Challenge #4: The reproducibility of scientific findings requires that digital research data be searchable and accessible through documented protocols or method. Beyond its historical involvement in these issues, NISO is actively engaged in forward-looking projects related to data sharing and data citation. NISO, in partnership with the National Federation of Advanced Information Services (NFAIS), is nearing completion of a best practice for how publishers should manage supplemental materials that are associated with the journal articles they publish. With a funding award from the Alfred P. Sloan Foundation and in partnership with the Open Archives Initiative, NISO began work on ResourceSync, a web protocol to ensure large-scale data repositories can be replicated and maintained in real-time. We’ve also had conversations with the DataCite group for formal standardization of their IsCitedBy specification. [Todd Carpenter serves] as a member of the ICSTI/CODATA task force working on best practices for data citation and NISO is looking forward to promoting and formalizing any recommendations and best practices that derive from that work.

….

We strongly urge that any further development of data-related best practices and standards take place in neutral forums that engage all relevant stakeholder communities, such as the one that NISO provides for consensus development. As noted in Appendix F of the report, Summary Notes on Expert Panel Discussion on Data Policies, standards for descriptive and structural metadata and persistent identifiers for all people and entities in the data exchange process are critical components of an interoperable data environment. We cannot agree more with this statement from the report of the meeting: “Funding agencies should work with stakeholders and research communities to support the establishment of standards that enable sharing and interoperability internationally.”

There is great potential for NSF to expand its leadership role in fostering well-managed use of data. This would include not only support of the repository community, but also in the promulgation of community standards. In partnership with NISO and using the consensus development process, NSF could support the creation of new standards and best practices. More importantly, NSF could, through its funding role, provide advocacy for—even require—how researchers should use these broad community standards and best practices in the dissemination of their research. We note that there are more than a dozen references to standards in Digital Research Data Sharing and Management report, so we are sure that this point is not falling on unreceptive ears.

The engagement of all relevant stakeholders in the establishment of data sharing and management practices as described in Recommendation #1 is critical in today’s environment—at both the national and international levels. While the promotion of individual communities of practice is a laudable one, it does present problems and issues when it comes to systems interoperability. A robust system of data exchange by default must be one grounded on a core set of interoperable data. More often than not, computational systems will need to act with a minimum of human intervention to be truly successful. This approach will not require a single schema or metadata system for all data, which is of course impossible and unworkable. However, a focus on and inclusion of core data elements and common base-level data standards is critical. For example, geo-location, bibliographic information, identifiers and discoverability data are all things that could be easily standardized and concentrated on to foster interoperability. Domain-specific information can be layered over this base of common and consistent data in a way that maintains domain specificity without sacrificing interoperability.

One of the key problems that the NSB and the NSF should work to avoid is the proliferation of standards for the exchange of information. This is often the butt of standards jokes, but in reality it does create significant problems. It is commonplace for communities of interest to review the landscape of existing standards and determine that existing standards do not meet their exact needs. That community then proceeds to duplicate seventy to eighty percent of existing work to create a specification that is custom-tailored to their specific needs, but which is not necessarily compatible with existing standards. In this way, standards proliferate and complicate interoperability. The NSB is uniquely positioned to help avoid this unnecessary and complicating tendency. Through its funding role, the NSB should promote the application, use and, if necessary, extension of existing standards. It should aggressively work to avoid the creation of new standards, when relevant standards already exist.

The sharing of data on a massive scale is a relatively new activity and we should be cautious in declaring fixed standards at this state. It is conceivable that standards may not exist to address some of the issues in data sharing or that it may be too early in the lifecycle for standards to be promulgated in the community. In that case, lower-level consensus forms, such as consensus-developed best practices or white papers could advance the state of the art without inhibiting the advancement of new services, activities or trends. The NSB should promote these forms of activity as well, when standards development is not yet an appropriate path.

We hope that this response is well received by the NSB in the formulation of its data policies. There is terrific potential in creating an interoperable data environment, but that system will need to be based on standards and rely on best practices within the community to be fully functional. The scientific community, in partnership with the library, publisher and systems provider communities can all collectively help to create this important infrastructure. Its potential can only be helped by consensus agreement on base-level technologies. If development continues in a domain-centered path, the goal of interoperability and delivering on its potential will only be delayed and quite possibly harmed.

The full text PDF of the entire response is available here.  Comments from the public related to this document are welcome.

What is a book today?

November 5th, 2011

One sign of the profound implications of a technology is the type of questions it forces one to ask. Last week at the Internet Archive Books in Browsers meeting, one of the questions that kept arising reflected on the impact of the move to digital distribution in our information exchange environment. Eric Hellman first posed this question in his presentation about his forthcoming service GlueJar, although several others asked similar versions of the same question:

“What is a book [in a digital context]?”

What made a physical thing a book in the analog context is no longer what makes a thing a book today. Certainly, there still are physical things we call books. But in our new digital age, books are much more complex things. Books are no longer just the text. Nor are they even constrained by a certain text length or form, given the potential provided by a linking environment. Using semantic markup and linked data opportunities, there are no longer constraints on even the content of the item. During his talk Networked books and networked reading, Kevin Kelly said that the ultimate goal of every Wikipedia entry should be for every word or phrase in the text to be hyperlinked to another, richer document.

There are even more complex content forms, which might still be variations on the single digital media form of a book. The text of a file might be rendered with text to speech technology so that a digital text file is now possibly also an audio book. This could be an author or dramatic reading included as part of the package, or simply machine processed reading. As translation tools improve even a text expression of an item (to use FRBR terminology) could also be a simultaneous expression in other languages, simply rendered onto the screen. Even the idea that a book is a self-contained thing that could be packaged, distributed and preserved is an open question, as we consider the possibility of a book that links outside to a live streaming embedded video.

All of these questions about what a book is, and about the implications of the networked multimedia experience we now face, present a variety of challenges for the information distribution community. For the editors and publishers who are lovingly and carefully creating their content, on which enhancements of the content-expression should one focus the most attention? This goes directly to the cost of content creation and the potential return for the authors and publishers. Hugh McGuire said during his presentation that ebooks should be no more difficult to create than a website, which is fine, but of course putting together a great website isn’t an easy piece of work either.

From a cataloging and library perspective, where do those various elements get described and stored within a catalog? How best to expose those forms, and how they will be indexed and ultimately served to patrons and users, is another challenge for both search and discovery services. Finally, the preservation of this digital content stream is nearly painful to consider. This is due, in part, to the fact that these different media forms are developing and changing on different timescales and are driven by different constituencies. Expertise in text preservation, video preservation, and software preservation is not usually held by the same people.

Often the technical questions drive deeper philosophical conversations about the meaning and impact of the changes at hand. It is likely that with ebooks we have entered that phase, where the questions and the answers to those questions will drive much of our forward momentum for decades to come.

I was asked at the end of a panel discussion at the Frankfurt Tools of Change meeting what will we be working on five years from now. I punted on the question, not that there aren’t a variety of things that we need to accomplish. However, I expect that we will likely be trying to answer some of these deep philosophical questions we are beginning to pose now. The answers to these questions are not obvious, but their implications are profound. In all likelihood, we’ll be working our way through these questions for many years to come.

There are so many angles and issues to address, purely from a standards and technology perspective. This does not take into account the other vexing cultural and business challenges that ebooks present us. This is absolutely not to say that we are spinning our wheels and being unproductive. These questions and issues are difficult, complex and interwoven. Coming to some resolution will take time, energy, attention to detail and sustained commitment.

Reflecting on Mr. Jobs

October 6th, 2011

I don’t have a cool Steve Jobs story. Just a deep love of the products he and his partners and his teams over the years (who are often disregarded, though no one works alone) created.

My first introduction to the computer was in Boy Scouts when the parents of my friends all had computers. I grew up in Rochester, NY, where Kodak ruled the city and nearly everyone I knew had a father that worked for Kodak. They were mostly engineers and they tinkered with computers in their spare time over the weekends. My real love for computers began on an Apple. My fifth grade math teacher had an Apple ][ in the corner and I recall spending far more time learning to program on the Apple than I did learning math that year. My middle school had a computer lab, where some of my geekier friends and I learned basic programming and how to have fun with the computers. About that time, the Macintosh came out and I recall riding my bike up to the only computer store about 4 miles from my house to gawk at the interface and play with the mouse. Despite my prodding, my parents, who were not engineers by any stretch, refused to buy into—literally or figuratively—my desire for a computer. In retrospect, given the $1,995 price (at the time, which would be roughly $4,500 today adjusted for inflation), I can completely understand the reticence to purchase such a very expensive toy for their son. While I tinkered with electronics through my early high school years, I spent more time using glorified electric typewriters than I did using computers.

I flirted again with Macs in college, although my roommate (whose computer I used) and most of my friends had PCs. I worked a bit on computers, but I was too busy with other things to get deeply involved. Once I got out into “the real world”, I was often thrown tech projects and database projects because I just loved working with computers. It would be some 15 years later before I had my own Apple computer. More or less, I’ve never turned back.

Like many in the technology world, I can’t say I ever had any direct interaction with Mr. Jobs, or even secondary contact. But he and his company did have a tremendous impact on me. Probably if I hadn’t had one of those Apple ][ computers to play with, I’d never have been as interested in those funny boxes, or programming, or data systems, which is where I later ended up. I might not be where I am today. I’m sure there are many others who share my appreciation. Whether we’ve lost our Edison, as some have stated, I’ll leave it up to history. My thinking is that was probably Steve Wozniak, who’s gotten somewhat short shrift of late. But we might have lost a Carnegie. Regardless, I’m saddened by the news. I feel like a bit of my childhood died yesterday.

Why are there so many standards?

July 20th, 2011

One of the most often quipped complaints about standards is that there are so many to choose from.  Standards have a way of proliferating without control, much like summer weeds in the garden.

One of my favorite formulations of this conundrum is a quote from Connie Morella, former congresswoman and former ambassador to the Organization for Economic Cooperation and Development. She spoke at ANSI’s World Standards Day awards dinner in 2006, when she received ANSI’s Ronald H. Brown Standards Leadership Award. During her speech she said, “Standards are like toothbrushes. Everybody wants one but nobody wants to use anybody else’s.”

Thanks to @ljndawson for the pointer to this XKCD cartoon, which summarizes this problem quite well.

There are a variety of reasons for this. Some are more reasonable than others. One reason that I have been spending a lot of time considering lately is that different communities create their own specifications because they are not aware of developments taking place in adjacent communities and don’t see the overlap and common goals of the two (or more) related specifications. One of the things we are trying to achieve in the space of ebooks, with the recently launched special interest group, is to help foster cross-community discussion and collaboration. Hopefully, we can avoid the problem described in the cartoon.

American National Standard Safety Requirements for Dry Martinis ANSI K100.1-1974

April 1st, 2011

We all realize the critical role that standards play in everyday life, even if we don’t recognize their application in our busy schedules. This is true even of the most obvious activities. I expect that most bartenders are unaware of the American National Standard Safety Requirements for Dry Martinis ANSI K100.1-1974. This standard may be in either a stable or a continuous maintenance state, given that it has apparently been unchanged since 1974. It was a revision of the original and groundbreaking 1966 standard, which is still available from unauthorized archive sources. The standard committee, led by Gilbey Gordon Booth, was convened under the authority of the Water Conservation League, a now defunct industry non-profit representing organizations such as The American Society of Bar Supporters, the Gin Council of America, the Standard Stirrers of the United States, the Olive Institute, and the Vermouth Council.

The Scope of the standard is described as:

“This Standard on dry martini cocktails includes nomenclature, size, ingredients, proportions, mixing methods, and test procedures. It applies to martini cocktails prepared for personal consumption, for distribution in bars, restaurants, and other places of public gathering, and to cocktails served in the home or offices of business and social acquaintance.”

As per NISO’s policy of providing standards of significant value to the community, we are providing a link to the copy of the standard free of charge.  Authorized copies of the standard are still available for delivery from the IHS Standard Store free of charge.

When is a new thing a new thing?

June 10th, 2010

I recently gave a presentation at the National Central Library in Taiwan at a symposium on digital publishing and international standards that they hosted. It was a tremendous meeting and I am grateful to my hosts, Director General Karl Min Ku and his staff, for a terrific visit. One of the topics that I discussed was the identification of ebooks. This is becoming an increasingly important issue in our community and I am serving on a BISG Working Group to explore these issues. Below are some notes from one slide that I gave during that presentation, which covers one of the core questions: At what point do changes in a digital file qualify it as a new product? The full slide deck is here. I’ll be expanding on these ideas in other forums in the near future, but here are some initial thoughts on this question.

——-

In a print world, what made one item different from another was generally its physical form. Was the binding hardcover or soft-cover? Was the type regular or large-size for the visually impaired, or was it even printed using Braille instead of ink? Whether the item was a book or a reading of the book, i.e. an audio book, was about as far afield as the form question had gone prior to the rise of the internet in the mid 1990s. In a digital environment, what constitutes a new item is considerably more complex. This poses tremendous issues regarding the supply chain, identification, and collections management in libraries.

This is a list of some of the defining characteristics for a digital text that are distinct from those in a print environment.  Each poses a unique challenge to the management and identification of digital items.

  • Encoding structure possibilities (file formats)
  • Platform dependencies (different devices)
  • Reflowable (resize)
  • Mutable (easily changed/updated)
  • Chunked (the entire item or only elements)
  • Networkable (location isn’t applicable)
  • Actionable/interactive
  • Linkable (to other content)
  • Transformable (text to speech)
  • Multimedia capable
  • Extensible (not constrained by page)
  • Operate under license terms (not copyright)
  • Digital Rights Management (DRM)

Just some of these examples pose tremendous issues for the supply chain of ebooks when it comes to fitting our current business practices, such as ISBN into this environment.

One question is whether the form of the ebook which needs a new identifier is the file format. If the publisher is distributing a single file format, say an epub file, but that item must be transformed into a different file format, that of the Kindle, in order to be displayed on a Kindle, at what point does the transformation of that file become a new thing? Similarly, if you wrap that same epub file with a specific form of digital rights management, does that create a new thing? From an end-user perspective, the existence and type of DRM could render a file as useless to the user as a Braille version would be to someone who can’t read Braille.

To take another, even thornier question, let’s consider location. What does location mean in a network environment? While I was in Taiwan, if I wanted to buy a book using my Kindle from there, where “am I” and where is the transaction taking place? In the supply chain, this makes a tremendous amount of difference. A book in Taiwan likely has a different ISBN, assigned to a different publisher, because the original publisher might not have worldwide distribution rights. The price might be different, and even the content of the book might be slightly different, based on cultural or legal sensitivities. But while I may have been physically located in Taiwan, my Amazon account is based in Maryland, where I live and where my Kindle is registered. Will Amazon recognize me as the account holder in the US, or will it recognize my present physical location in Taiwan, despite the fact that I traveled back home a week later and live in the US? This isn’t even considering where the actual transaction is taking place, which could be a server farm somewhere in California, Iceland or Tokyo. The complexity and potential challenges for rights holders and rights management could be tremendous.

These questions about when a new thing is a new thing are critically important in the identification of objects and in the registration systems that underlie them. How we manage this information and the decisions we take now about what is important, what we should track, and how we should distinguish between these items will have profound impacts on how we distribute information decades into the future.
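To make the identification question concrete, here is a minimal sketch in Python. It is not drawn from any ISBN or BISG rule; the `EbookOffering` class, its field names, and the three example policies are all hypothetical, chosen only to show that whether two files count as "the same thing" depends entirely on which characteristics a policy treats as significant.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EbookOffering:
    """Hypothetical description of one way a digital text is offered."""
    work: str
    file_format: str
    drm: str
    territory: str


def product_key(offering, significant):
    """Return the attribute values that a given policy treats as defining a product."""
    values = asdict(offering)
    return tuple(values[name] for name in sorted(significant))


def is_new_thing(a, b, significant):
    """Two offerings are distinct products only if they differ on a significant attribute."""
    return product_key(a, significant) != product_key(b, significant)


# Illustrative offerings (invented values, not real products).
epub_us = EbookOffering("Example Title", "epub", "none", "US")
kindle_us = EbookOffering("Example Title", "kindle", "vendor-drm", "US")
epub_tw = EbookOffering("Example Title", "epub", "none", "TW")

# Three hypothetical policies for what makes a product distinct.
work_only = {"work"}
format_and_drm = {"work", "file_format", "drm"}
everything = {"work", "file_format", "drm", "territory"}

print(is_new_thing(epub_us, kindle_us, work_only))
print(is_new_thing(epub_us, kindle_us, format_and_drm))
print(is_new_thing(epub_us, epub_tw, format_and_drm))
print(is_new_thing(epub_us, epub_tw, everything))
```

Under the first policy the epub and Kindle files are one product; under the second they are two. Adding territory to the policy splits the US and Taiwan offerings as well. The sketch does not say which policy is right, which is exactly the decision the community has to make.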

Mandatory Copyright Deposit for Electronic-only Materials

April 1st, 2010

In late February, the Copyright Office at the Library of Congress published a new rule that expands the requirement for mandatory deposit to include items published only in digital format. The interim regulation, Mandatory Deposit of Published Electronic Works Available Only Online (37 CFR Part 202 [Docket No. RM 2009–3]), was released in the Federal Register. The Library of Congress will focus its first attention on e-only deposit of journals, since this is the area where electronic-only publishing is most advanced. Very likely, this will move into the space of digital books as well, but it will likely take some time to coalesce.

I wrote a column about this in Against the Grain last September outlining some of the issues that this change will raise. A free copy of that article is available here. The Library of Congress is aware of these issues, and will become painfully more so when this stream of online content begins to flow their way. To support an understanding of these new regulations, LC is hosting a forum in Washington in May to discuss publishers’ technology for providing these data on a regular basis. Below is the description of the meeting that LC provided.

Electronic Deposit Publishers Forum
May 10-11, 2010
Library of Congress — Washington, DC

The Mandatory deposit provision of the US Copyright Law requires that published works be deposited with the US Copyright Office for use by the Library of Congress in its collection.  Previously, copyright deposits were required only for works published in a physical form, but recently revised regulations now include the deposit of electronic works published only online.  The purpose of this workshop is to establish a submission process for these works and to explore technical and procedural options that will work for the publishing community and the Library of Congress.

Discussion topics will include:

  • Revised mandatory deposit regulations
  • Metadata elements and file formats to be submitted
  • Proposed transfer mechanisms

Space for this meeting is very limited, but if you’re interested in participating in the meeting, you should contact the Copyright Office.

ISTC and Ur-Texts

April 1st, 2010

Tuesday, I attended a meeting on the International Standard Text Code (ISTC), organized by the Book Industry Study Group (BISG) in Manhattan. The meeting was held in conjunction with the release of a white paper on the ISTC by Michael Holdsworth entitled ISTC: A Work in Progress. This is a terrific paper, well worth reading for those interested in this topic, and I commend it to you all if you haven’t seen it. The paper provides a detailed introduction to the ISTC and the role this new identifier will play in our community.

During the meeting, as I was tweeting about the standard, I got into a brief Twitter discussion with John Mark Ockerbloom at the University of Pennsylvania Library. Unfortunately, as wonderful as Twitter is for instantaneous conversation, it is not at all easy to communicate nuance. For that, a longer form is necessary, hence this blog post.

As a jumping off point, let us start with the fact that the ISTC has a fairly good definition of what it is identifying: the text of a work as a distinct abstract item that may be the same or different across different products or manifestations. Distinguishing between those changes can be critical, as is tying together the various manifestations for collection development, rights and product management reasons.

    One of the key principles of the ISTC is that:

    “If two entities share identical ISTC metadata, they shall be treated as the same textual work and shall have the same ISTC.”

    Where to draw this distinction is quite an interesting point.  As John pointed out in his question to me, “How are works with no definitive original text handled? (e.g. Hamlet) Is there an #ISTC for some hypothetical ur-Hamlet?”  The issue here is that there are multiple “original versions” of the text of Hamlet.  Quoting from Wikipedia: “Three different early versions of [Hamlet] have survived: these are known as the First Quarto (Q1), the Second Quarto (Q2) and the First Folio (F1). Each has lines, and even scenes, that are missing from the others.”

    In this case, each of the three versions would be assigned its own ISTC, since the text of each version is different.  Each could be noted as related to the other ISTCs (as well as to the cascade of other related editions) in the descriptive metadata fields.  Hamlet is a perfect example of where the ISTC could be of critical value: because there are significant differences among the three versions, those who have an interest in the variances would want to know which text is the basis of the copy of Hamlet they are purchasing.

    Perhaps the most stringent solution in keeping with the letter of the standard would be to treat the First Quarto as the source text, since it was the first known to be published: it was the first to appear in the Stationers’ Register, in 1602, although it likely was not published until summer or fall 1603.  The Second Quarto and First Folio were published later, in 1604 and 1623 respectively.  Although the First Quarto is often considered “inferior” to later versions, assigning it the “Source” ISTC would be no different than if it were published today and subsequently re-published as a revision (which would be assigned a related ISTC).  Controversy about the source text of Hamlet probably began not long after the day it was published and has certainly grown as the field of Shakespeare scholarship has grown.  But for the purposes of identification and linking, does the “Ur-text” matter?

    Certainly, a user would want to know which version is the canonical one, be that the Second Quarto or the First Folio.  The critical point is that we identify things differently when there are important reasons to make the distinctions.  In the case of Hamlet, there is a need to make the distinction.  Which copy is considered “original” and which is a derivative isn’t nearly as important as making the distinction.

    It is valuable to note the description in the ISTC User’s Manual in the section on original works and derivations.  Quoting from the Manual:

    7.1    What is an “original” work?

    For the purposes of registration on the ISTC database, a work may be regarded as being “original” if it cannot be adequately described using one or more of the controlled values allowed for the “Derivation Type” element (specified elsewhere in this document).

    A work is considered to be “original” for registration purposes unless it replicates a significant proportion of a previously existing work or it is a direct translation of the previously existing one (where all the words may be different but the concepts and their sequence are the same). It should be noted that this is a different approach from that used by FRBR, which regards translations as simply different “expressions” of the same work.

    The “Source ISTC” metadata field is an optional one and is “Used to identify the original work(s) from which this one is derived (where appropriate). It is recommended that these are provided whenever possible.”  In the case of the three Hamlet “original versions” this field would likely be left blank, since there is no way to distinguish between the “Original” and the “Derivation”.  Each of the three versions could be considered “Original”, but this would get messy if one were not noted as original.   There is a “Derivation type” metadata field with restricted values, although “Unspecified” is one option.  Since there isn’t necessarily a value in the “original” distinction, there isn’t a point arguing about which is original.  In the real world, what will likely be the “original” will be the first version that receives the assignment.
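    To make the Hamlet case concrete, here is a minimal sketch of how the three early texts might be recorded.  It is only an illustration under stated assumptions: the class, the field names and the "PLACEHOLDER" identifiers are hypothetical, and they reflect neither the real ISTC record schema nor the real identifier syntax.  It shows just the two ideas discussed above: identical metadata yields the same identifier, and the source field can stay blank when no version is clearly the original.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TextualWorkRecord:
    """Hypothetical, heavily simplified stand-in for an ISTC metadata record."""
    title: str
    version_note: str                      # which early text this record describes
    derivation_type: str = "Unspecified"   # "Unspecified" is one allowed value
    source_istc: Optional[str] = None      # optional; blank when no clear original


class ToyRegistry:
    """Toy registry: one placeholder identifier per distinct metadata record."""

    def __init__(self) -> None:
        self._by_metadata: Dict[TextualWorkRecord, str] = {}

    def register(self, record: TextualWorkRecord) -> str:
        # Identical metadata -> same textual work -> same identifier.
        if record not in self._by_metadata:
            next_number = len(self._by_metadata) + 1
            # Placeholder format only; NOT the real ISTC syntax.
            self._by_metadata[record] = f"PLACEHOLDER-{next_number:04d}"
        return self._by_metadata[record]


registry = ToyRegistry()
q1 = TextualWorkRecord("Hamlet", "First Quarto (Q1)")
q2 = TextualWorkRecord("Hamlet", "Second Quarto (Q2)")
f1 = TextualWorkRecord("Hamlet", "First Folio (F1)")

# Three different texts receive three different identifiers.
ids = [registry.register(record) for record in (q1, q2, f1)]

# Registering identical metadata again returns the existing identifier.
duplicate_id = registry.register(TextualWorkRecord("Hamlet", "First Quarto (Q1)"))
```

    In this sketch none of the three records names a source, which mirrors the point that arguing over which Hamlet is “original” matters less than keeping the three texts distinct.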

    This same problem will likely arise for a variety of other texts, especially those from distant historical periods.  Focusing on core principles (that we distinguish what is important, and that disambiguation matters) and avoiding the philosophical arguments surrounding “original” versus “derivative”, just as the ISTC community is trying to avoid “ownership” of the record, will help to serve the entire community.

    There is a lot more information about the ISTC provided by NISO.  Members and subscribers can read the article that Andy Weissberg, VP of Identifier Services & Corporate Marketing at Bowker, wrote in Information Standards Quarterly last summer, The International Standard Text Code (ISTC): An Overview and Status Report.  For non-subscribers, Andy Weissberg also presented during the 2009 NISO-BISG Changing Standards Landscape forum prior to ALA’s Annual conference in Chicago.  You can view his presentation slides or watch the video from that meeting.

    The International ISTC Agency Ltd is a not-for-profit company, limited by guarantee and registered in England and Wales. Its sole purpose is to implement and promote the ISO 21047 (ISTC) standard and it is operated by representatives of its founding members, namely RR Bowker, CISAC, IFRRO, and Nielsen Book Services.

    The first edition of “ISO 21047 Information and Documentation – International Standard Text Code (ISTC)” was published by ISO in March 2009. It is available for purchase in separate English and French versions either as an electronic download or printed document from ISO.

    9 Ways that librarians can support standards adoption

    February 15th, 2010

    Last week, I was at the Electronic Resources in Libraries conference in Austin, Texas.  This is the fifth meeting of ER&L and the meeting has grown tremendously, becoming an important destination for librarians and publishers focused on electronic content.  There is a growing energy around this conference that reminds me a lot of the Charleston conference back in the 1990s (or perhaps earlier, but that’s when I first attended Charleston).  The organizer of the meeting, Bonnie Tijerina, Electronic Resources Librarian at the UCLA Library, is full of drive and energy and will, I expect, continue to be a force in the library community for many years to come.  So too is the team of people who stand with Bonnie in making this entire project happen, most of whom wandered about the meeting in t-shirts emblazoned with a welcoming and helpful Texas “Howdee!” in large letters across the chest.

    Generally a relaxed meeting, with attendance capped at ~350 people and a tight schedule of only a few competing sessions, ER&L also involves a lot of participant engagement.  Participants are encouraged to contribute to the conversation via the conference wiki and blog.  Also, the first day included a lightning talk opportunity for anyone to take the stage for five minutes to discuss whatever project they wanted to share.

    I took the opportunity to stand up and discuss briefly an important issue for the library community: the adoption of standards by vendors and publishers.  There is often a chicken and egg problem with the development of systems interoperability standards.  When two parties need to exchange data, both sides of that exchange need to see the value of investing in implementation.  Implementation has to serve the interest of both communities.  In the case of library systems, the interests of the library staff are usually tied to improving end-user access, reducing data entry, more efficient services or better analysis.  For the vendor, this might include simply better customer service and keeping current customers happy, building in response to RFP requests, or possibly a competitive advantage over other systems offerings.  The problem is that in an era when development resources are tight–and they are always tight, only more so now–developing interchange functionality to make the system the supplier has developed work with another system (which was generally not developed by the same supplier) doesn’t often compete well in the list of development priorities.

    How can the library community engage to help this situation?  During my brief talk at ER&L I listed a few ways that librarians can encourage adoption of technical standards by their vendors, such as systems suppliers and publishers:

    1) Educate yourself about the different initiatives that are ongoing in the community. NISO offers a series of educational events throughout the year, ranging from webinars to in-person events.  Also, many of these events are free, such as the Changing Standards Landscape Forum at ALA and the monthly Open Teleconference Series.  Subscribing to NISO’s free Newsline or our magazine, ISQ are also ways to keep abreast of the work ongoing at NISO and elsewhere in the community.

    2) Build compliance language into your RFPs and contracts.  A customer never has more power over the vendor than right before she/he is about to purchase something.  While price is often the first thing people think about when negotiating a contract for a system, there are other important elements tied to service levels that should also be considered.  Does the system conform with existing standards, and what exactly is meant by “conformance”?  Conformance can mean different things to different organizations.  Being as clear as you can about what your needs are from the outset can avoid problems later.  NISO will be updating the NISO RFP Guide later this spring, which will help in this process.

    3) Regularly speak with the product managers or account executives at your suppliers.  The product managers are there to provide input and feedback to their development teams.  Usually, they are a solid source for the company about customer needs and expectations.  They can often advocate for your needs within the company.  However, you need to be realistic about what they can achieve, which is why #8 below is an important channel too!

    4) Participate in user group meetings and discussion groups:  Every successful company will reach out to its customers for feedback and input, especially when new products, services or platform upgrades are under consideration.  Be mindful of exactly what your needs and concerns are.  This is where your work on education (point #1 above) can be so valuable.

    5) Serve on Library Advisory Boards: Most publishers and systems vendors have advisory boards of librarians who provide regular feedback about community conditions and development needs.

    6) Open Source Development – A variety of libraries are working on development of new systems and services using Open Source tools and methods.  Building interoperability standards into these systems is a great way to leverage communities to push adoption by proprietary vendors, since these Open Source systems often require interoperability with proprietary systems to work properly.  In addition, Open Source provides a public forum for the testing and improvement of existing standards.

    7) Find out if your suppliers are engaging in standards development work.  All of the rosters of NISO working groups are available online.  Look through them and see which of your suppliers is participating.  If you find a group that you feel would benefit your library, reach out to your suppliers.  Press them to engage if they are not.

    8) Go to the top – Contacting the executive leadership at supplier companies is a great way to get action on your needs.  Often, the product managers don’t control the development pipeline at an organization, although they are useful as a first and regular point of contact (see #3 above).  The executives can often control a wide variety of resources to get a project moving forward, if you can convince them it is valuable to their customers.  Reaching out to the executives is never a bad idea and can usually bring results if your requests are focused and actionable.

    9) Get involved yourself. – There are many ways that you can engage in standards and best practice work.  You can engage directly with NISO or via any of the variety of mirror groups that exist as part of ALA, ARL, LITA, NFAIS, SLA or MLA.  In addition to building your own skills, the more engaged you are, the more authoritatively you will be able to speak about your needs.  It also gives you an opportunity to have your needs built into the standards or best practices from the outset.  You will be amazed at how similar the issues you face are to those of others in the community.