
Archive for the ‘publishing’ Category

O’Reilly Tools of Change – We Bid you Adieu

Wednesday, May 15th, 2013

Two weeks ago, Tim O’Reilly surprised a large segment of the online publishing world when he announced on the Tools of Change blog that O’Reilly Media would be folding the Tools of Change (TOC) Conference series and blog and disbanding the community that it had fostered over the past 7 years.

I have spoken at several of these meetings over the past few years and have found TOC to be one of the more vibrant and forward-looking conferences in the industry. There were many reasons for this success, but my sense was that it had to do partly with the technology focus of O’Reilly’s publishing program and the tech-savviness of the O’Reilly staff. The group that organized the meeting had a finger on the pulse of the rapidly changing technology impacting and transforming the publishing world. A lot of people contributed to this event, but of particular note was the adroit leadership of Kat Meyer and Joe Wikert, who seemed abreast of each emerging technology and each new entrant in the industry. It often seemed less like a conference than equal parts rock concert and evangelical sermon about the future of our industry. The high-level leadership and commitment to the notion of advancing the landscape of publishing derived from Tim O’Reilly himself. This is evidenced by O’Reilly’s writings and the talks he has given about the value of TOC in transforming the publishing industry.

The criticism of O’Reilly for pulling the plug on a well-attended, seemingly profitable, and passionate community event was, not surprisingly, lightning fast and filled with shock. One vocal and influential commenter was Brian O’Leary, an industry consultant and writer who was a frequent speaker at TOC. On his blog, O’Leary took O’Reilly to task for lacking the commitment to support the community and the meeting that O’Reilly and his staff had fostered. O’Leary made the strong point:

“But once you’ve helped make a community, you have an obligation to nurture and sustain it. If you decide you want to do something else with your resources, you still have to provide for its care and feeding. You don’t shut everything down without making an attempt to at least provide for its welfare.”

In response to O’Reilly’s decision, one friend commented that a meeting is not a community and I couldn’t agree more. It brings the community together, but for the most part the topics, the ideas, and the strategies develop from the bottom up, not from the top down. Nearly every meeting, even TOC and other for-profit meetings, is organized by program committees or advisory boards. The TOC Program Committee included an impressive list of industry thought leaders. Based on the meeting scope, vision, and mission, the leadership groups identify and build a program to serve the expected attendee community. In the case of TOC, this mix coalesced exceptionally well.

Similarly, for years there had been a vibrant community that met around the London Online conference in early December each year. Over time, that conference came to feel more and more like an exhibit hall where vendors put up booths to talk to each other without talking to many customers, since the program attracted a relatively modest number of attendees compared to the size of the exhibit hall, particularly in the last few years. The meeting was terminated in 2011. There wasn’t quite the same outcry when that meeting disbanded, but the rationale seemed similar at the time.

While for-profit meetings and those organized by communities of interest (i.e., professional societies) take similar approaches, there are some important distinctions worth mentioning. The challenge of centering a community of interest around a for-profit business is that the effort is only as stable as the corporate underpinnings. A professional society exists to educate, and there are often policies and admonitions to speakers not to specifically promote their services in presentations; although this rule is often breached, that is generally not the fault of the organizers. TOC held an explicit program track for sponsors and exhibitors, and significant sponsors were often given prime presentation slots. In my personal opinion, these sessions often fell flat, in the same way that an advertisement during a football game or your favorite television show is often the moment when you refresh your beverage, talk with your friends, or press the fast-forward button on the DVR. In fairness, a great deal of sponsorship, advertising, and promotion takes place at association meetings as well; these events are frequently the source of a large portion of an association’s income. One can’t fault an association or a corporation for doing as much as possible to make an event profitable. But association events tend not to have blatant promotional presentations (with emphasis on “blatant”) and encourage that such marketing be left in the exhibit area.

It is also true that no organization, for-profit or non-profit, can sustain a recurring money-losing event. What makes all the difference between the two organization types, and what was highlighted in the decision to cease hosting TOC, is that an association exists to support the community, while a for-profit company does not. An association hosts a meeting to bring a community together to promote its mission, be that education, current awareness, or advancing the mission of the group. A corporation hosts a meeting to generate as large a profit as possible (directly, or indirectly from meeting follow-up sales). Serving a community is not the mission of the corporation, generally speaking, although it might be a tactic to generate profits. Corporations do not—and should not—undertake such efforts in the same way or for the same purposes as non-profit entities. Importantly, I don’t want to argue that one approach is right and one is wrong, or that one is inherently better or worse. But we shouldn’t be surprised, or even feel betrayed, when a corporation on a whim decides to stop supporting an event or a group.

The community of digital publishing specialists continues to exist. While there may be a gap in where that group gathers regularly, perhaps putting faith in a for-profit group to bring the community together was a mistake. I look at the successful efforts of the ER&L community, the Code4Lib community, Force11 (Beyond the PDF), and the THATCamp groups, which have successfully organized energy around their respective spaces. Not surprisingly, these communities are all anchored in non-profit organizations.

I am looking forward to seeing where our community of publishing technologists coalesces in the wake of TOC. Wherever and however that is, I expect it will be equally compelling because there is a need to share and advance the innovative ideas that are transforming publishing. These changes will continue to advance every year and those who are leading the way need a gathering place. When it does coalesce, I plan to be there. Hopefully, we will see you there too.

Introduction to NISO webinar on ebook preservation

Wednesday, May 23rd, 2012

Below are my welcoming remarks to the NISO webinar on Heritage Lost?: Ensuring the Preservation of Ebooks on May 23rd.

“Good afternoon and welcome to the second part of this NISO Two-Part Webinar on Understanding Critical Elements of E-books: Acquiring, Sharing, and Preserving.  This part is entitled Heritage Lost? Ensuring the Preservation of E-books.

Perhaps because electronic journals were adopted much earlier and more rapidly, we are more familiar with the archiving and preservation of e-journal content than of e-book content. However, just as it did in the late 1990s after e-journals became prevalent, the topic of preservation is now rising up in the minds of people deeply concerned with the long-term preservation of cultural materials, this time for e-books.

That is not to say that no one is considering these issues.  Some of the bigger digitization projects involve libraries and as such include preservation as part of their mission.  I’m thinking in particular of the Internet Archive, Portico, and the HathiTrust in this regard, but there are certainly others.  Today we’ll hear from two of these groups about what they are doing to support these efforts.

Another big preservation issue that is frequently overlooked is the distribution model that many publishers are moving toward, which is a license model rather than a sale model.  I won’t get into either the legal or business rationale for this shift, but I do want to focus on its implications for preservation, in particular for publishers.  An important analogy that I make for publishers is that of renting a house versus selling a house.  When a publisher sells a house (in this case a book), it passes on all the responsibility for the house and its upkeep to the new owner.  If a person instead rents that same house, the responsibility for fixing the leaking roof, painting the walls, and repairing the broken windows generally falls back to the landlord.  Obviously, there is money to be made, and the terms of the lease affect who is responsible for what, but in general the owner is still the person primarily responsible for the major upkeep of the house.

In the case of the sale of a book, the publisher is no longer responsible for that item; responsibility for its preservation passes to the new owner, say the library.  It is then up to the library to ensure that the book doesn’t fall apart, that the cover stays clean, and that the pages don’t rip.  However, as we move to a license environment, the long-term responsibility for upgrading file formats and for continuing to provide access and functionality falls back to the publisher.  The publisher is the landlord, renting e-books to the community.  And this responsibility requires a great deal more effort than simply hosting the file.  Publishers will eventually need to repaint, to refurbish, to fix the broken plumbing, so to speak, of this digital collection.  I expect that this will be no small feat, and something that few publishers are prepared to address.

The Library of Congress has begun thinking about this problem from the perspective of the mandatory deposit requirement related to copyright registration for LC’s own collection.  While LC is at the moment focused on electronic-only journals, one can envision that electronic-only books are not that far away.  LC has not explicitly discussed e-book preservation, and its current work is focused only on e-journals.  However, the problems that LC is facing are illustrative of the larger issues it likely will face.  There are standards for journal article formatting using XML, such as the soon-to-be-released Journal Article Tag Suite (JATS), formerly the NLM DTD.  This project, developed by the National Library of Medicine in the US, was specifically focused on developing an archival tagging model for journal article content distribution and preservation.  There is no similar, widely adopted model for books.  If the variation in journal markup is significant, the complexity of book content is exponentially greater.

No archive can sustain a stream of ingest from hundreds or thousands of publishers without standards.  It is simply unmanageable to accept any file in any format from thousands of publishers.  And this is, of course, where standards come in; although standards aren’t the forefront of either of our presentations today, they sit there in the not-so-distant background.

There has also been a great deal of focus over the past year on the adoption of the new EPUB 3.0 specification. This is a great advancement, and it will certainly help speed the adoption of e-books and their overall interoperability with existing systems.  However, it should be clear that EPUB is not designed as an archival format.  Many of the things that would make EPUB 3 archival exist within the structure, but their inclusion by publishers is optional, not mandatory; in the same way, accessibility and archiving functionality is possible within PDF files, but most publishers don’t take advantage of or implement it.  We, as a community, need to develop preservation profiles of EPUB that publishers can target, if not for their distribution, at least for their long-term preservation purposes, both internally and externally.
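
One appeal of such a profile is that conformance could be checked mechanically at ingest time. The sketch below (in Python) reads an EPUB’s package document and reports whether a few archival-relevant metadata elements are present; the specific checks are my own illustrative assumptions, not an actual NISO or IDPF preservation profile.

```python
# A sketch of how a preservation profile might be checked mechanically.
# The four checks below are illustrative assumptions, not an actual
# NISO or IDPF preservation profile.
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

def archival_checklist(epub_path):
    """Report which archival-relevant metadata an EPUB's package declares."""
    with zipfile.ZipFile(epub_path) as z:
        # META-INF/container.xml points at the OPF package document.
        container = ET.fromstring(z.read("META-INF/container.xml"))
        opf_name = container.find(".//c:rootfile", NS).get("full-path")
        package = ET.fromstring(z.read(opf_name))

    meta = package.find("opf:metadata", NS)
    return {
        "has_identifier": meta.find("dc:identifier", NS) is not None,
        "has_language": meta.find("dc:language", NS) is not None,
        "has_source": meta.find("dc:source", NS) is not None,
        "has_modified_date": any(
            e.get("property") == "dcterms:modified" for e in meta.iter()
        ),
    }
```

An archive receiving thousands of files could run a checklist like this on every deposit and reject, or flag, packages that omit the metadata the profile requires.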

This will be a long-term project that we will be engaged in.  And it is something that we need to focus concerted attention on, because preservation isn’t the first thing on content creators’ minds.  However, we should continue to press the issue and make progress on it.”

When is a new thing a new thing?

Thursday, June 10th, 2010

I recently gave a presentation at the National Central Library in Taiwan at a symposium on digital publishing and international standards that they hosted. It was a tremendous meeting, and I am grateful to my hosts, Director General Karl Min Ku and his staff, for a terrific visit.  One of the topics that I discussed was the identification of ebooks. This is increasingly becoming an important issue in our community, and I am serving on a BISG Working Group to explore these issues. Below are some notes from one slide of that presentation, which covers one of the core questions: At what point do changes in a digital file qualify it as a new product?  The full slide deck is here. I’ll be expanding on these ideas in other forums in the near future, but here are some initial thoughts on this question.

——-

In a print world, what made one item different from another was generally its physical form. Was the binding hardcover or softcover? Was the type regular or large-size for the visually impaired, or was the item even printed using Braille instead of ink? Whether the item was a book or a reading of the book (i.e., an audiobook) was about as far afield as the form question went prior to the rise of the internet in the mid-1990s. In a digital environment, what constitutes a new item is considerably more complex. This poses tremendous issues regarding the supply chain, identification, and collections management in libraries.

This is a list of some of the defining characteristics for a digital text that are distinct from those in a print environment.  Each poses a unique challenge to the management and identification of digital items.

  • Encoding structure possibilities (file formats)
  • Platform dependencies (different devices)
  • Reflowable (resize)
  • Mutable (easily changed/updated)
  • Chunked (the entire item or only elements)
  • Networkable (location isn’t applicable)
  • Actionable/interactive
  • Linkable (to other content)
  • Transformable (text to speech)
  • Multimedia capable
  • Extensible (not constrained by page)
  • Operate under license terms (not copyright)
  • Digital Rights Management (DRM)

Even a few of these characteristics pose tremendous issues for the supply chain of ebooks when it comes to fitting current business practices, such as the ISBN, into this environment.

One question is whether a new file format constitutes a form of the ebook that needs a new identifier. If the publisher is distributing a single file format, say an EPUB file, but that item must be transformed into a different file format, that of the Kindle, in order to be displayed on a Kindle, at what point does the transformation of that file become a new thing? Similarly, if you wrap that same EPUB file with a specific form of digital rights management, does that create a new thing? From an end-user perspective, the existence and type of DRM could render a file as useless to a user as a Braille version supplied to someone who can’t read Braille.
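
To make the question concrete, here is a minimal sketch in Python of one possible decision rule. The rule itself is purely an assumption for illustration, not established industry practice: it encodes the intuition that a change of format or DRM scheme changes what a user can actually read, and so yields a new product.

```python
# Illustrative sketch only: one possible rule for deciding whether a
# derived ebook file is a "new thing" for identification purposes.
# The rule is an assumption, not an industry standard.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class EbookFile:
    work_id: str          # identifier of the abstract work
    file_format: str      # "epub", "mobi", "pdf", ...
    drm: Optional[str]    # DRM scheme applied, if any

def is_new_product(original: EbookFile, derived: EbookFile) -> bool:
    """Treat a format change or a DRM change as producing a new product."""
    return (derived.file_format != original.file_format
            or derived.drm != original.drm)

epub = EbookFile("work-1", "epub", drm=None)
kindle = EbookFile("work-1", "mobi", drm="kindle-drm")
# Converting for the Kindle changes both the format and the DRM, so
# under this rule the Kindle file would need its own identifier.
```

Of course, the hard part is exactly what the supply chain has not agreed on: which attributes belong in that comparison at all.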

To take another, even thornier question, let’s consider location. What does location mean in a network environment? While I was in Taiwan, if I wanted to buy a book there using my Kindle, where “am I” and where is the transaction taking place? In the supply chain, this makes a tremendous amount of difference. A book in Taiwan likely has a different ISBN, assigned to a different publisher, because the original publisher might not have worldwide distribution rights. The price might be different; even the content of the book might be slightly different, based on cultural or legal sensitivities. But while I may have been physically located in Taiwan, my Amazon account is based in Maryland, where I live and where my Kindle is registered. Will Amazon recognize me by my US account or by my physical location in Taiwan at the time, given that I traveled back home a week later and live in the US? And this isn’t even considering where the actual transaction takes place, which could be a server farm somewhere in California, Iceland, or Tokyo.  The complexity and potential challenges for rights holders and rights management could be tremendous.

These questions about when a new thing is a new thing are critically important in the identification of objects and the registration systems that underlie them. How we manage this information, and the decisions we make now about what is important, what we should track, and how we should distinguish between these items, will have profound impacts on how we distribute information decades into the future.

Mandatory Copyright Deposit for Electronic-only Materials

Thursday, April 1st, 2010

In late February, the Copyright Office at the Library of Congress published a new rule that expands the mandatory deposit requirement to include items published only in digital format.   The interim regulation, Mandatory Deposit of Published Electronic Works Available Only Online (37 CFR Part 202 [Docket No. RM 2009–3]), was released in the Federal Register.  The Library of Congress will focus its first attention on e-only deposit of journals, since this is the area where electronic-only publishing is most advanced.  Very likely, this will move into the space of digital books as well, but that will likely take some time to coalesce.

I wrote a column about this in Against the Grain last September outlining some of the issues that this change raises.  A free copy of that article is available here.  The Library of Congress is aware of these challenges, and will become painfully more so when this stream of online content begins to flow its way.  To build an understanding of the new regulations, LC is hosting a forum in Washington in May to discuss publishers’ technologies for providing these data on a regular basis.  Below is the description of the meeting that LC provided.

Electronic Deposit Publishers Forum
May 10-11, 2010
Library of Congress — Washington, DC

The mandatory deposit provision of the US Copyright Law requires that published works be deposited with the US Copyright Office for use by the Library of Congress in its collection.  Previously, copyright deposits were required only for works published in a physical form, but recently revised regulations now include the deposit of electronic works published only online.  The purpose of this workshop is to establish a submission process for these works and to explore technical and procedural options that will work for the publishing community and the Library of Congress.

Discussion topics will include:

  • Revised mandatory deposit regulations
  • Metadata elements and file formats to be submitted
  • Proposed transfer mechanisms

Space for this meeting is very limited, but if you’re interested in participating in the meeting, you should contact the Copyright Office.

    ISTC and Ur-Texts

    Thursday, April 1st, 2010
    Tuesday, I attended a meeting on the International Standard Text Code (ISTC), organized by the Book Industry Study Group (BISG) in Manhattan.  The meeting was held in conjunction with the release of a white paper on the ISTC by Michael Holdsworth entitled ISTC: A Work in Progress. This is a terrific paper, well worth reading for those interested in this topic, and I commend it to you all if you haven’t seen it.  The paper provides a detailed introduction to the ISTC and the role this new identifier will play in our community.

    During the meeting, as I was tweeting about the standard, I got into a brief Twitter discussion with John Mark Ockerbloom at the University of Pennsylvania Library.  Unfortunately, as wonderful as Twitter is for instantaneous conversation, it is not at all easy to communicate nuance there.  For that, a longer form is necessary; hence this blog post.

    As a jumping-off point, let us start with the fact that the ISTC has a fairly good definition of what it identifies: the text of a work as a distinct abstract item that may be the same or different across different products or manifestations.  Distinguishing between those changes can be critical, as is tying together the various manifestations for collection development, rights, and product management reasons.

    One of the key principles of the ISTC is that:

    “If two entities share identical ISTC metadata, they shall be treated as the same textual work and shall have the same ISTC.”

    Where to draw this distinction is quite an interesting point.  As John pointed out in his question to me, “How are works with no definitive original text handled? (e.g. Hamlet) Is there an #ISTC for some hypothetical ur-Hamlet?”  The issue here is that there are multiple “original versions” of the text of Hamlet. Quoting from Wikipedia: “Three different early versions of [Hamlet] have survived: these are known as the First Quarto (Q1), the Second Quarto (Q2) and the First Folio (F1). Each has lines, and even scenes, that are missing from the others.”

    In this case, the three versions would each be assigned a different ISTC, since their texts differ.  Each could be noted as related to the other ISTCs (as well as the cascade of other related editions) in the descriptive metadata fields.  Hamlet is a perfect example of where the ISTC could be of critical value: those with an interest in the variances between the three versions would want to know which text is the basis of the copy of Hamlet they are purchasing, since there are significant differences among them.

    Perhaps the most stringent solution, in keeping with the letter of the standard, would be to treat the First Quarto as the source work, since it was the first known to be published: it was the first to appear in the Stationers’ Register, in 1602, although it likely was not published until summer or fall of 1603.  The Second Quarto and First Folio were published later, in 1604 and 1623 respectively.  Although the First Quarto is often considered “inferior” to later versions, assigning it the “Source” ISTC would be no different than if it were published today and subsequently re-published as a revision (which would be assigned a related ISTC).  While controversy about the source text of Hamlet probably began not long after the day it was published, and has certainly grown as the field of Shakespeare scholarship has grown, for the purposes of identification and linking does the “ur-text” matter?

    Certainly, a user would want to know which is the canonical version, be that the Second Quarto or the First Folio.  The critical point is that we identify things differently when there are important reasons to make the distinctions.  In the case of Hamlet, there is a need to make the distinction.  Which copy is considered “original” and which is a derivative isn’t nearly as important as making the distinction.

    It is valuable to note the description in the ISTC User’s Manual in the section on original works and derivations.  Quoting from the manual:

    7.1    What is an “original” work?

    For the purposes of registration on the ISTC database, a work may be regarded as being “original” if it cannot be adequately described using one or more of the controlled values allowed for the “Derivation Type” element (specified elsewhere in this document).

    A work is considered to be “original” for registration purposes unless it replicates a significant proportion of a previously existing work or it is a direct translation of the previously existing one (where all the words may be different but the concepts and their sequence are the same). It should be noted that this is a different approach from that used by FRBR2, which regards translations as simply different “expressions” of the same work.

    The “Source ISTC” metadata field is optional and is “Used to identify the original work(s) from which this one is derived (where appropriate). It is recommended that these are provided whenever possible.”  In the case of the three Hamlet “original versions,” this field would likely be left blank, since there is no way to distinguish between the “original” and the “derivation.”  Each of the three versions could be considered “original,” and this would get messy if one were not so noted.  There is a “Derivation type” metadata field with restricted values, although “Unspecified” is one option.  Since there isn’t necessarily value in the “original” distinction, there isn’t a point in arguing about which is original.  In the real world, the “original” will likely be the first version that receives an assignment.
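
    The identity rule quoted earlier (“if two entities share identical ISTC metadata, they shall be treated as the same textual work”) can be sketched in code. The following Python sketch is purely illustrative: the field names, normalization, and identifier format are my own assumptions, not the actual ISTC registration schema.

```python
# Illustrative sketch of the ISTC identity rule: registrations with
# identical descriptive metadata resolve to the same work identifier.
# Field names, normalization, and the identifier format are assumptions
# for illustration, not the actual ISTC registration schema.

def normalize(record):
    """Reduce a registration to the metadata that defines identity."""
    return (
        record["title"].strip().lower(),
        tuple(sorted(a.strip().lower() for a in record["authors"])),
        record.get("derivation_type", "original"),
        record.get("source_istc"),  # links a derivation to its source work
    )

class IstcRegistry:
    def __init__(self):
        self._works = {}   # normalized metadata -> assigned identifier
        self._next = 1

    def register(self, record):
        key = normalize(record)
        if key not in self._works:
            self._works[key] = f"ISTC-{self._next:04d}"  # fake format
            self._next += 1
        return self._works[key]

registry = IstcRegistry()
q1 = registry.register({"title": "Hamlet", "authors": ["Shakespeare"],
                        "derivation_type": "original"})
q2 = registry.register({"title": "Hamlet", "authors": ["Shakespeare"],
                        "derivation_type": "revision", "source_istc": q1})
dup = registry.register({"title": "hamlet ", "authors": ["shakespeare"],
                         "derivation_type": "original"})
# q1 and dup normalize identically, so they share one identifier;
# q2 differs in its derivation metadata, so it receives its own.
```

    The Hamlet problem shows up here as a modeling choice: whichever Quarto or Folio is registered first simply becomes the first record, and the others are registered as distinct works related to it through the metadata, with no claim about which is the “ur-text.”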

    This same problem will likely arise with a variety of other texts, especially those from distant historical periods.  A focus on core principles (that we distinguish what is important, and that disambiguation matters), while avoiding the philosophical arguments surrounding “original” versus “derivative,” just as the ISTC community is trying to avoid “ownership” of the record, will help serve the entire community.

    There is a lot more information about the ISTC provided by NISO. Members and subscribers can read the article that Andy Weissberg, VP of Identifier Services & Corporate Marketing at Bowker, wrote in Information Standards Quarterly last summer, The International Standard Text Code (ISTC): An Overview and Status Report. For non-subscribers, Andy Weissberg also presented during the 2009 NISO-BISG Changing Standards Landscape forum prior to ALA’s Annual Conference in Chicago.  You can view his presentation slides or watch the video from that meeting.

    The International ISTC Agency Ltd is a not-for-profit company, limited by guarantee and registered in England and Wales. Its sole purpose is to implement and promote the ISO 21047 (ISTC) standard and it is operated by representatives of its founding members, namely RR Bowker, CISAC, IFRRO, and Nielsen Book Services.

    The first edition of “ISO 21047 Information and Documentation – International Standard Text Code (ISTC)” was published by ISO in March 2009. It is available for purchase in separate English and French versions either as an electronic download or printed document from ISO.

    Did the iPad start a publishing revolution yesterday or not? Wait and see

    Thursday, January 28th, 2010

    For Apple and Steve Jobs, yesterday might have been a game-changing day for Apple and, by extension, the entire media world.  I’m not sure the world shook in the way that he had hoped, but it’s possible that in the future we will look back on yesterday as a bigger day than it seems today.  Such is often the nature of revolutions.

    Since very few people have had an iPad in their hands yet, talk of its pros and cons seems to me premature.  As with previous devices, it will be both more and less than the hype of its debut.  As people begin to use it, and as developers push the boundaries of its capabilities, it will mature and improve.  It was wholly unrealistic to presume that Apple (or any other company launching a new product) would make the technological or political leaps necessary to create the “supreme device” that will replace all existing technology.

    A lot of people have made points about the iPad missing this or that technology.  Apple will almost certainly release an iPad 2.0 sometime in early 2011, dropping its price points and adding functionality, both as the underlying display technology (which, despite some false reports, is not OLED) becomes cheaper and, in some small ways, in response to customer demand for functionality.  In this regard, think of copy & paste on the iPhone. As for some software gaps, such as the lack of Adobe Flash support, while some have attributed this to the iPhone OS, I think these are driven by a desire to lock people into apps and to inhibit free, or possibly paid, browser-based web services. It is in Apple’s interest to lock people into proprietary software/apps written specifically for their device.

    From a standards perspective, the iPad could be either a good or a bad thing.  Again, it is too soon to tell, but very initial reactions are worrying.  That the iPad will support .epub as a file format is good on its face.  However, it is very likely that the iPad will contain Apple-specific DRM, since there isn’t at the moment an industry standard.  Getting content into (and out of, for those who want to move away from the iPad) that DRM will be the crucial question.  As far as I am aware, Apple has been publicly silent on that question.  I expect that some of the publishers who agreed to content deals likely discussed this in detail, but those conversations were likely limited to a very small group of executives, all bound by harsh NDAs.  (I note that McGraw Hill was allegedly dropped from the announcement because of comments made by its CEO Tuesday on MSNBC.)

    Also on the standards front, there was an excellent interview last night on the NPR news show Marketplace, during which author Josh Bernoff, also of Forrester Research, made the point that the internet is splintering into a variety of device-specific applications.  The move toward applications in the past two years might reasonably be cause for concern.  It definitely adds cost for content producers, who must create multiple versions of content for multiple platforms. I can’t say that I completely agree with his assessment, however.  There are open platforms available in the marketplace, and competition is forcing developers to open up their systems; note the Google Android phone OS as well as the introduction of the Amazon Kindle Development Kit last week.

    What is most interesting about this new product is its potential.  No one could have predicted three years ago the breadth and depth of the applications that have been developed for the iPhone.  Unleashing that creativity on the space of ebooks will very likely prove to be a boon for our community.  Specifically, this could provide publishers with an opportunity to expand the functionality of the ebook.

    Often, new technology is at first used to replicate the functionality of the old technology.  In the case of books, I’m referring to the technology of paper. We are only now beginning to see people take advantage of the new digital technology’s possibilities.  Perhaps the launch of Amazon’s new development kit and the technology platform of the iPad will spur innovative thinking about how to use ebooks and enhance digital content’s ability to also be an interactive medium.  The one element of yesterday’s presentation that really caught my eye in this regard is the new user interface for reading the New York Times, which seemed the most innovative application of the iPad.  Hopefully in the coming months and years we will see a lot more of that experimentation, user interface design, and multimedia integration.

    If that takes place, then yesterday might have been a big day in the development of ebooks and information distribution.  If not, the jokes about the name will be all that we’ll recall about this new reader.

    Problems with a “Kindle in Every Backpack”

    Wednesday, July 15th, 2009

    Interestingly, on the heels of last week’s ALA conference in Chicago, the Democratic Leadership Council (DLC) has released a proposal: “A Kindle in Every Backpack: A Proposal for eTextbooks in American Schools”.  According to its website, this influential DC-based think tank promotes center-left policies related to education, trade, pro-business tax and economic reform, and health care.  The report was written by Tom Freedman, a policy analyst and lobbyist who worked as a policy adviser to the President in the Clinton administration and as Press Secretary and Policy Director for Senator Schumer (D-NY).  Unfortunately, this is the kind of DC policy report that approaches its subject from the 30,000-foot level, written by an expert who, by the looks of his client list, has no experience with the media or publishing industries and therefore comes to the wrong conclusions.  That perspective produces a report that is light on the business impacts, the pitfalls of the technology at this stage, and the significant problems that would be caused by leaping at once onto still-maturing technology.

    The report does make several good points about the value of e-texts. The functionality, the reduction in manufacturing costs, the updatability of digital versions, and the environmental savings of digital distribution all make the move to ebooks very compelling. I do agree that this is the general direction that textbooks are headed.  However, before we jump headfirst into handing out ebook readers (especially the Kindle) to every child, there’s much more to this topic than Freedman’s report details.
    While a good idea from some perspectives, Freedman misses the trees for the forest.  First of all, while I am incredibly fond of my Kindle, it is not perfectly suited for textbooks.  Here are several concerns I have at this stage, in no particular order.  Many of these topics were themes we covered in the NISO/BISG Forum last Friday on the Changing Standards Landscape for Ebooks.  We’ll be posting video clips of the presentations later this summer.  NISO is also hosting a webinar on ebooks next month.
    The business models for ebook sales are very early in their development.  Many critical questions, such as license terms, digital rights management, file formats and identification, still need to be explored, tested and tweaked.  It took more than a decade for the business model for electronic journals to begin to mature, and ebooks are only at the outset of those changes.  Even now, the market for e-journals is still a tenuous one, tied in many ways to print.  It will be at least a decade before the same models mature for ebooks, which is a larger and in many ways a more complex market.

    While a print book might be inefficient from the perspective of distribution, storage and up-to-date content, print has the distinct advantage that it lasts a long time.  Freedman’s report notes that many school texts are outdated.  A 2008 report from the New York Library Association that Freedman cites highlights that “the average age of books in school libraries ranges from 21 to 25 years old across the six regions of the state surveyed, with the average book year being 1986.”  That NYLA report also found that “the average price of an elementary school book is $20.82 and $23.38 for secondary school books.” So if one text were purchased once and used for 20+ years, the cost per year, per student is less than $1.00.  I seriously doubt that publishers would be willing to license texts for so little on an ongoing annual subscription basis.  Such pricing would reduce the textbook market from $6 billion per year to less than $1 billion (presuming the 56 million K-12 students were each given an ebook reader with 6 books at $2 per book, which is more than twice the current cost per year, per book detailed in the NYLA report).  The problem is that the textbook publishers can’t survive on this reduced revenue stream, and they know it.
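The back-of-envelope arithmetic above is easy to check. Here is a quick sketch using only the figures cited in this post; the $2-per-book annual license is the hypothetical assumption stated above, not a published price:

```python
# Figures cited above: NYLA average book price and 21-25 year lifespan.
avg_price = 20.82            # average elementary school book price ($)
years_in_use = 21            # low end of the 21-25 year range
cost_per_student_year = avg_price / years_in_use
print(f"{cost_per_student_year:.2f}")  # 0.99 -- under $1/year/student

# Hypothetical subscription market: 56M K-12 students, 6 books each,
# at an assumed $2 per book per year.
students = 56_000_000
books_per_student = 6
license_per_book = 2.00
market = students * books_per_student * license_per_book
print(f"${market / 1e9:.2f}B")         # $0.67B -- under $1 billion
```

Both numbers come out where the post says they do: under a dollar per student per year, and well under $1 billion for the whole market.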
    I don’t want to quibble, but the data source that Freedman uses for his cost estimates is simply not accurate about manufacturing as a percentage of overall cost of goods sold, so his estimate of the potential cost savings is way off the mark.  Freedman claims that the savings from moving to digital distribution would be in the range of 45%, which is simply wrong.  Anyone who has dealt with the transition from print to electronic distribution of published information knows this.  To paraphrase NBC Universal’s Jeff Zucker, the publishing world cannot sustain itself moving from “analog dollars to digital pennies.”  The vast majority of a product’s costs are not associated with the creation and distribution of the physical item.  As in most product development, people are often shocked to learn that it costs only pennies to “manufacture” the $3.00 soda they purchase at a restaurant, or only $3 to manufacture a $40 HDMI cable.
    Physical production of books (the actual paper, printing and binding) represents only about 10-15% of the retail price.  Had Freedman (or Tim Conneally, who wrote the cited article) actually understood the business, he would have seen the flaw in the statement that “32.7% of a textbook’s cost comes from paper, printing, and editorial costs.”  The vast majority of that 32.7% is not manufacturing cost; it is editorial and first-copy cost, which does not go away in a digital environment.  Unless people are willing to read straight ASCII text, a book still needs to be edited, formatted, laid out, tagged for production, supplied with images, and so on.  This is especially true of textbooks.  These costs persist in a digital environment and will continue to need support.  If the industry is functioning on $6 billion in annual revenues, reducing marginal costs by even 20% wouldn’t allow it to survive on less than half of present revenues.  This is a problem that many publishers are finding with current Kindle sales, and it has been the subject of a number of posts, conjecture and controversy.
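The distinction between first-copy costs and manufacturing costs can be made concrete with a small illustrative calculation. The percentage splits are the estimates quoted in this post, applied to a hypothetical $100 retail price, not audited publisher figures:

```python
# Split a hypothetical $100 retail price using the shares cited above.
retail = 100.0
physical_share = 0.125        # paper, printing, binding: ~10-15% of retail
cited_share = 0.327           # Conneally's "paper, printing, and editorial"

physical = physical_share * retail
editorial_first_copy = (cited_share - physical_share) * retail

print(round(physical, 1))              # 12.5 -- the only cost digital removes
print(round(editorial_first_copy, 1))  # 20.2 -- survives in a digital edition
```

In other words, of the cited 32.7%, only the roughly 12.5-point physical slice disappears online; the remaining 20 points are first-copy costs that a digital edition still has to carry, which is why a 45% savings claim doesn't hold up.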
    A much better analysis of the actual costs of manufacturing a book than what Freedman uses in his paper is available on the Kindle Review Blog. Even though this analysis is focused on trade publication, the cost splits are roughly equivalent in other publishing specialties.
    Cost is another issue inhibiting the wide adoption of ebook reader technology.  Once reading devices cost less than $100, they will likely begin to become as ubiquitous as the iPod is today. However, they will have to drop well below $100 before most school districts will begin handing them to students, and I doubt that will take place in the next 3-5 years.  The reasons for this are myriad.
    At the moment, the display technology of ebook readers is still developing and improving.  This is one of the main reasons that manufacturing of the Kindle was slow to meet demand even into its first year of sales.  Although the increased demand from a program such as the one proposed would significantly boost manufacturing capacity, it is still an open question whether e-ink is the best possible technology, although it is very promising and one that I’m fond of.  Would it make sense for the government, through a “Kindle for every child” program, to determine that the Kindle and its technology are most appropriate simply because it was first to market with a successful product?  (BTW, I’m sure Sony wouldn’t agree with this point.)  Even now, with several hundred thousand devices produced, the manufacturing costs for the Kindle are reported to be about $185 per device.  It would take a 75% reduction in direct costs to make a sub-$100 price feasible.  That won’t happen quickly, even if the government threw millions of dollars Amazon’s way.
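As a rough sanity check on that 75% figure, assume (hypothetically; the multiplier is this sketch's assumption, not the post's) that a device's retail price must run about twice its unit manufacturing cost to cover margins and distribution:

```python
mfg_cost = 185.0              # reported Kindle manufacturing cost ($)
cut = 0.75                    # the 75% reduction suggested above
retail_multiple = 2.0         # hypothetical retail-to-cost multiplier

new_mfg = mfg_cost * (1 - cut)
print(new_mfg)                     # 46.25
print(new_mfg * retail_multiple)   # 92.5 -- just under the $100 threshold
```

Under that assumption, only a cut on the order of 75% brings the implied retail price below $100, which is why anything smaller leaves the device out of reach for school districts.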
    Other issues I have with this idea are more specific to the Kindle itself and its functionality.  The note-taking feature is clumsy and falls short when compared to writing in the margins or highlighting on paper, although it is better than on other devices (particularly the first-generation Kindle).  Other devices with touch-screen technology appear to handle this better, although these too are several years from mass-market production.  Even then, they would likely be more costly than the current e-ink technology.
    At the moment, the Kindle is also only a black-and-white reading device.  Some color display technologies are being developed, but their power drain is too significant for anything but small, iPhone-sized devices to support long-term use (such as the week or more between Kindle charges).  The fact that the Kindle’s display is black and white would pose significant problems for textbooks, where color pictures are critical, especially in the sciences.  This goes back to the underlying costs of the systems noted above.
    Also, the Kindle is a closed proprietary system completely controlled by Amazon.  While fine for the moment in its current, trade book market space, it would be unlikely for a whole new class of publishers (K-12 textbook publishers) to hand their entire business model over to a single company, who many publishers already consider to be too powerful.
    The rendering of graphics, charts and forms is clumsy on the Kindle, partly because of the Kindle format, but more because the file format standards are still improving in this area.  The reflowable file format, EPUB, is still maturing, and its handling of complex formatting issues, such as would arise in textbooks, is still being improved.  EPUB, an open standard for publishing ebooks, isn’t even natively supported by the Kindle, which relies on its own proprietary file format.
    While a grand vision, I’m sorry to say that even the pilot described in Freedman’s paper isn’t a good idea at this stage.  The ebook market needs time to work through the technical and business-model questions.  The DLC report presumes that dumping millions of dollars into a program of this type will push the industry forward more rapidly. My sense is that the energy in the ebook market is already palpable and that it would progress regardless of any government intervention.  The unintended consequences of this suggestion would radically and irrevocably alter the publishing community in a way that would likely lead to diminished service, even if the program were to be successful.  Eventually, educational textbook publishers will move in the direction of ebooks, as will most other publishers.  Digital versions will not replace print entirely, but they will supplant it in many cases, and textbooks are likely one segment where they will.  However, it will take a lot more time than most people think.

    Atlantic Records posts more digital sales than CDs

    Tuesday, December 2nd, 2008

    Late last week, one of the largest music labels announced that its sales of digital files exceeded the revenue generated by CDs.  As reported in the New York Times, Atlantic Records saw 51% of its sales generated by digital distribution.  This was significantly more than Atlantic’s parent company, Warner Music Group, which reported only 27% of its total sales from digital distribution.

    It should come as no surprise that digital music is quickly replacing physical media.  One need only think of the weight and mess of thousands of CDs versus a nearly unlimited amount of music on an iPod or streamed on demand.  The question is when other media will follow.  Some magazines are slowly getting rid of print in favor of online, though it will be some time before display technology exceeds the user experience of print on paper.  In some ways, scholarly journal publishing is already headed down this path.  The rest of publishing is slower to adapt.  However, several tipping points will likely be reached fairly soon:

    * – Display technology needs to improve, so that the user experience is comparable to print

    * – Standardization around some form of reader, or at least a common file format that works on different devices

    * – A Napster-like social movement among the broader tech-savvy early adopters (not necessarily regarding free distribution) that pushes ebooks and the like toward digital

    * – A breadth and depth of available content to make the purchase of a reader worthwhile

    * – Mass production of readers so that they are no longer $300+

    * – Improved preservation strategies

     

    Many of these issues are consensus-based and awaiting either new standards or the adoption of existing ones.

    Interested in the details of the Google Settlement?

    Wednesday, November 19th, 2008

    Jonathan Band, a DC-based intellectual property lawyer, has produced an excellent distillation of the Google Library/Publisher/Author’s Guild settlement.  For those who are interested but not committed to reading the full 141 pages and 15 attachments, Jonathan’s summary is readable and a much more manageable 21 pages.  Thanks and congratulations to Jonathan for a great summary.

    Magazine publishing going digital only — PC Magazine to cease print

    Wednesday, November 19th, 2008

    Another magazine announced today that it will cease publication of its print edition. In an interview with the website PaidContent.org, Ziff Davis CEO Jason Young announced that PC Magazine will cease distribution of its print edition in January.

    PC Magazine is just one of several mass-market publications that are moving to online only distribution. Earlier this week, Reuters reported that a judge has approved the reorganization of Ziff Davis, which is currently under Chapter 11 bankruptcy protection. There was some speculation about the future of Ziff Davis’ assets.

    From the story:

    The last issue will be dated January 2009; the closure will claim the jobs of about seven employees, all from the print production side. None of the editorial employees, who are now writing for the online sites anyway, will be affected.

    Only a few weeks ago, the Christian Science Monitor announced that it would be ending print distribution. The costs of producing and distributing paper have always been a significant expense for publishers, and in a period of decreasing advertising revenues, lower circulation, and higher production costs, we can expect more publications to head in this direction.

    Within the scholarly world, in particular, I expect that the economics will drive print distribution to print-on-demand for those who want to pay extra, but overall print journals will quickly become a thing of the past. I know a lot of people have projected this for a long time. ARL produced an interesting report written by Rick Johnson last fall on this topic, but it appears we’re nearing the tipping point Rick described in that report.

    This transition makes the ongoing work on preservation, authenticity, reuse, and rights all the more critical, particularly as those issues relate to the differences between print and online distribution.