Archive for the ‘government’ Category

NISO response to the National Science Board on Data Policies

Wednesday, January 18th, 2012

Earlier this month, the National Science Board (NSB) announced it was seeking comments from the public on the report from the Committee on Strategy and Budget Task Force on Data Policies, Digital Research Data Sharing and Management.  That report was distributed last December.

NISO has prepared a response on behalf of the standards development community, which was submitted today.  Here are some excerpts of that response:

The National Science Board’s Task Force on Data Policies comes at a watershed moment in the development of an infrastructure for data-intensive science based on sharing and interoperability. The NISO community applauds this effort and the focused attention on the key issues related to a robust and interoperable data environment.

….

NISO has particular interest in Key Challenge #4: The reproducibility of scientific findings requires that digital research data be searchable and accessible through documented protocols or method. Beyond its historical involvement in these issues, NISO is actively engaged in forward-looking projects related to data sharing and data citation. NISO, in partnership with the National Federation of Advanced Information Services (NFAIS), is nearing completion of a best practice for how publishers should manage supplemental materials that are associated with the journal articles they publish. With a funding award from the Alfred P. Sloan Foundation and in partnership with the Open Archives Initiative, NISO began work on ResourceSync, a web protocol to ensure large-scale data repositories can be replicated and maintained in real-time. We’ve also had conversations with the DataCite group for formal standardization of their IsCitedBy specification. [Todd Carpenter serves] as a member of the ICSTI/CODATA task force working on best practices for data citation and NISO is looking forward to promoting and formalizing any recommendations and best practices that derive from that work.

….

We strongly urge that any further development of data-related best practices and standards take place in neutral forums that engage all relevant stakeholder communities, such as the one that NISO provides for consensus development. As noted in Appendix F of the report, Summary Notes on Expert Panel Discussion on Data Policies, standards for descriptive and structural metadata and persistent identifiers for all people and entities in the data exchange process are critical components of an interoperable data environment. We cannot agree more with this statement from the report of the meeting: “Funding agencies should work with stakeholders and research communities to support the establishment of standards that enable sharing and interoperability internationally.”

There is great potential for NSF to expand its leadership role in fostering well-managed use of data. This would include not only support of the repository community, but also the promulgation of community standards. In partnership with NISO and using the consensus development process, NSF could support the creation of new standards and best practices. More importantly, NSF could, through its funding role, advocate for, or even require, researchers’ use of these broad community standards and best practices in the dissemination of their research. We note that there are more than a dozen references to standards in the Digital Research Data Sharing and Management report, so we are sure that this point is not falling on unreceptive ears.

The engagement of all relevant stakeholders in the establishment of data sharing and management practices as described in Recommendation #1 is critical in today’s environment, at both the national and international levels. While the promotion of individual communities of practice is laudable, it does present problems when it comes to systems interoperability. A robust system of data exchange by default must be one grounded on a core set of interoperable data. More often than not, computational systems will need to act with a minimum of human intervention to be truly successful. This approach will not require a single schema or metadata system for all data, which is of course impossible and unworkable. However, a focus on and inclusion of core data elements and common base-level data standards is critical. For example, geo-location, bibliographic information, identifiers and discoverability data are all things that could be easily standardized and concentrated on to foster interoperability. Domain-specific information can be layered over this base of common and consistent data in a way that maintains domain specificity without sacrificing interoperability.

One of the key problems that the NSB and the NSF should work to avoid is the proliferation of standards for the exchange of information. This is often the butt of standards jokes, but in reality it does create significant problems. It is commonplace for communities of interest to review the landscape of existing standards and determine that existing standards do not meet their exact needs. That community then proceeds to duplicate seventy to eighty percent of existing work to create a specification that is custom-tailored to their specific needs, but which is not necessarily compatible with existing standards. In this way, standards proliferate and complicate interoperability. The NSB is uniquely positioned to help avoid this unnecessary and complicating tendency. Through its funding role, the NSB should promote the application, use and, if necessary, extension of existing standards. It should aggressively work to avoid the creation of new standards, when relevant standards already exist.

The sharing of data on a massive scale is a relatively new activity and we should be cautious in declaring fixed standards at this stage. It is conceivable that standards may not exist to address some of the issues in data sharing or that it may be too early in the lifecycle for standards to be promulgated in the community. In that case, lower-level consensus forms, such as consensus-developed best practices or white papers, could advance the state of the art without inhibiting the advancement of new services, activities or trends. The NSB should promote these forms of activity as well, when standards development is not yet an appropriate path.

We hope that this response is well received by the NSB in the formulation of its data policies. There is terrific potential in creating an interoperable data environment, but that system will need to be based on standards and rely on best practices within the community to be fully functional. The scientific community, in partnership with the library, publisher and systems provider communities can all collectively help to create this important infrastructure. Its potential can only be helped by consensus agreement on base-level technologies. If development continues in a domain-centered path, the goal of interoperability and delivering on its potential will only be delayed and quite possibly harmed.

The full text PDF of the entire response is available here.  Comments from the public related to this document are welcome.

Mandatory Copyright Deposit for Electronic-only Materials

Thursday, April 1st, 2010

In late February, the Copyright Office at the Library of Congress published a new rule that expands the mandatory deposit requirement to include items published only in digital format.   The interim regulation, Mandatory Deposit of Published Electronic Works Available Only Online (37 CFR Part 202 [Docket No. RM 2009–3]) was released in the Federal Register.  The Library of Congress will focus its first attention on e-only deposit of journals, since this is the area where electronic-only publishing is most advanced.  Very likely, this will move into the space of digital books as well, but it will likely take some time to coalesce.

I wrote a column about this in Against the Grain last September outlining some of the issues that this change will raise.  A free copy of that article is available here.  The Library of Congress is aware of these issues, and will become painfully more so when this stream of online content begins to flow its way.  To support an understanding of these new regulations, LC is hosting a forum in Washington in May to discuss publishers’ technology for providing these data on a regular basis.  Below is the description of the meeting that LC provided.

Electronic Deposit Publishers Forum
May 10-11, 2010
Library of Congress — Washington, DC

The Mandatory deposit provision of the US Copyright Law requires that published works be deposited with the US Copyright Office for use by the Library of Congress in its collection.  Previously, copyright deposits were required only for works published in a physical form, but recently revised regulations now include the deposit of electronic works published only online.  The purpose of this workshop is to establish a submission process for these works and to explore technical and procedural options that will work for the publishing community and the Library of Congress.

Discussion topics will include:

  • Revised mandatory deposit regulations
  • Metadata elements and file formats to be submitted
  • Proposed transfer mechanisms

Space for this meeting is very limited, but if you’re interested in participating in the meeting, you should contact the Copyright Office.

Problems with a “Kindle in Every Backpack”

Wednesday, July 15th, 2009

    Interestingly, on the heels of last week’s ALA conference in Chicago, the Democratic Leadership Council (DLC) has released a proposal: “A Kindle in Every Backpack: A Proposal for eTextbooks in American Schools”.  This influential DC-based think tank promotes center-left-leaning policies related to education, trade, pro-business tax and economic reform and health care, according to its website.  The report was issued by Tom Freedman, a policy analyst and lobbyist, who worked as a policy adviser to the President in the Clinton administration and as Press Secretary and Policy Director for Senator Schumer (D-NY).  Unfortunately, this is the kind of DC policy report that approaches these issues from a 30,000-foot level, written by an expert who, by the looks of his client list, has no experience with the media or publishing industries and therefore comes to the wrong conclusion.  This perspective leads to a report that is light on understanding the business impacts, the pitfalls of the technology at this stage, and the significant problems that would be caused by leaping at once behind still-maturing technology.

    The report does make several good points about the value of e-texts. The functionality, the reduction in manufacturing costs, the up-date-ability of digital versions, the environmental impact and savings of digital distribution, all make the move to ebooks very compelling. I do agree that this is the general direction that textbooks are headed.  However, before we jump headfirst into handing out ebook readers (especially the Kindle) to every child, there’s much more to this topic than Freedman’s report details.
    While the proposal is a good idea from some perspectives, Freedman misses the trees for the forest.  First of all, while I am incredibly fond of my Kindle, it is not perfectly suited for textbooks.  Here are several concerns I have at this stage, in no particular order.  Many of these topics were themes we covered in the NISO / BISG Forum last Friday on the Changing Standards Landscape for Ebooks.  We’ll be posting video clips of the presentations later this summer.  NISO is also hosting a webinar on ebooks next month.
    The business models for ebook sales are very early in their development.  Many critical questions, such as license terms, digital rights management, file formats and identification, still need to be explored, tested and tweaked.  It took more than a decade for the business model for electronic journals to begin to mature, and ebooks are only at the outset of these changes.  Even now, the e-journal market is still a tenuous one, still tied in many ways to print.  It will be at least a decade before these same models mature for ebooks, which is a larger and in many ways a more complex market.

    While a print book might be inefficient from the perspective of distribution, storage and up-to-date content, print has the distinct advantage that it also lasts a long time.  Freedman’s report notes that many school texts are outdated.  A 2008 report from the New York Library Association that Freedman cites highlights that “the average age of books in school libraries ranges from 21 to 25 years old across the six regions of the state surveyed, with the average book year being 1986.”  That NYLA report also found that “the average price of an elementary school book is $20.82 and $23.38 for secondary school books.” So if one text were purchased once and used for 20+ years, the cost per year, per student is about $1.00.  I seriously doubt that publishers would be willing to license the texts for so little on an annual ongoing subscription basis.  That would reduce the textbook market from $6 billion per year to less than $1 billion (presuming the 56 million K-12 students were each given an e-book reader with 6 books at $2 per book, which is roughly twice the current cost/year/book detailed in the NYLA report).  The problem is that the textbook publishers can’t survive on this reduced revenue stream and they know it.
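
The arithmetic behind these figures can be written out as a rough sketch in Python. It uses only the numbers quoted above; the function and constant names are illustrative, and taking exactly 20 years of use as the lower bound of "20+" is an assumption.

```python
# Figures quoted in the post (NYLA report plus the author's own assumptions).
ELEMENTARY_PRICE = 20.82          # average elementary school book price, USD
SECONDARY_PRICE = 23.38           # average secondary school book price, USD
YEARS_IN_USE = 20                 # assumption: lower bound of "20+ years"

STUDENTS = 56_000_000             # K-12 students
BOOKS_PER_STUDENT = 6
LICENSE_PER_BOOK = 2.00           # hypothetical annual e-book license fee, USD
CURRENT_MARKET = 6_000_000_000    # current textbook market, USD per year


def cost_per_year(price: float, years: int) -> float:
    """Annual cost of a print book bought once and used for `years` years."""
    return price / years


def licensed_market(students: int, books: int, fee: float) -> float:
    """Total annual revenue if every student licenses `books` titles at `fee`."""
    return students * books * fee


if __name__ == "__main__":
    for label, price in (("elementary", ELEMENTARY_PRICE),
                         ("secondary", SECONDARY_PRICE)):
        print(f"{label}: ${cost_per_year(price, YEARS_IN_USE):.2f} per book per year")

    market = licensed_market(STUDENTS, BOOKS_PER_STUDENT, LICENSE_PER_BOOK)
    print(f"licensed market: ${market:,.0f} per year")
    print(f"share of current market: {market / CURRENT_MARKET:.1%}")
```

With these inputs the licensed market comes to $672 million a year, which is where the "less than $1 billion" figure comes from.
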
    I don’t want to quibble, but the data source that Freedman uses for his cost estimates is simply not accurate for manufacturing as a percentage of overall cost of goods sold.  Therefore his estimate of the potential cost savings is way off the mark.  Freedman’s claim that the savings from moving to digital distribution would be in the range of 45% is simply wrong.  Anyone who has dealt with the transition from print to electronic distribution of published information knows this.  To paraphrase NBC Universal’s Jeff Zucker, the publishing world cannot sustain itself moving from “analog dollars to digital pennies.”  The vast majority of the costs of a product are not associated with the creation and distribution of the physical item.  As with most products, people are shocked to realize that it costs a fraction of a penny to “manufacture” the $3.00 soda that they purchase at a restaurant, or only $3 to manufacture the $40 HDMI cable.
    Physical production of books (the actual paper, print and binding) represents only about 10-15% of the retail price.  Had Freedman (or Tim Conneally, who wrote the cited article) actually understood the business, he would have seen the flaw in the following statement: “32.7% of a textbook’s cost comes from paper, printing, and editorial costs.”  The vast majority of the 32.7% are not manufacturing costs; they are editorial and first copy costs, which do not go away in a digital environment.  Unless people are willing to read straight ASCII text, a book still needs to be edited, formatted, laid out, tagged for production, have images included, etc.  This is especially true of textbooks.  These costs will continue to need support.  If the industry is functioning on $6 billion in annual revenues, reducing marginal costs by even 20% wouldn’t allow it to survive on less than half of present revenues.  This is a problem that many publishers are finding with current Kindle sales and has been the subject of a number of posts, conjecture and controversy.
    A much better analysis of the actual costs of manufacturing a book than what Freedman uses in his paper is available on the Kindle Review Blog. Even though this analysis is focused on trade publication, the cost splits are roughly equivalent in other publishing specialties.
    Costs are another issue inhibiting the wide adoption of ebook reader technology.  Once the cost per reading device drops below $100, the devices will likely begin to become as ubiquitous as the iPod is today. However, the cost will have to drop well below $100 before most school districts will begin handing them to students. I doubt that this will take place in the next 3-5 years.  The reasons for this are myriad.
    At the moment, the display technology of e-book readers is still developing and improving.  This is one of the main reasons that manufacturing of the Kindle was slow to meet the demand even into its first year of sales.  Although increased demand from a program such as the one proposed would significantly boost manufacturing capabilities, it is still an open question as to whether e-ink is the best possible technology, although it is very promising – and one that I’m fond of.  Would it make sense for the government, through a “Kindle for every child” program, to determine that Kindle and its technology are most appropriate, simply because it was first to market with a successful product?  (BTW, I’m sure Sony wouldn’t agree with this point.)  Even recently, with several hundred thousand devices produced, the manufacturing costs for the Kindle are reported to be about $185 per device.  It would take a 75% reduction in direct costs to make a sub-$100 price feasible.  That won’t happen quickly, even if the government threw millions of dollars Amazon’s way.
    Other issues that I have with this idea are more specific to the Kindle itself and its functionality.  The note-taking feature is clumsy and falls short when compared to writing in the margins or highlighting on paper – although it is better than on other devices (particularly the 1st gen Kindle).  Other devices with touch-screen technology appear to handle this better, although these too are several years from mass-market production.   Even so, they would likely be more costly than the current e-ink technology.
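
The 75% figure is easier to read with the numbers written out. This is a rough sketch in Python using only the $185 and 75% figures quoted above. The names are illustrative, and reading the gap between the reduced cost and $100 as room for margin and distribution is an assumption, not something the reported figure states.

```python
REPORTED_UNIT_COST = 185.00   # reported manufacturing cost per Kindle, USD
TARGET_PRICE = 100.00         # the sub-$100 retail threshold discussed above
CLAIMED_REDUCTION = 0.75      # reduction in direct costs cited above


def cost_after_reduction(cost: float, reduction: float) -> float:
    """Unit cost remaining after cutting `reduction` (a fraction) of it."""
    return cost * (1 - reduction)


def reduction_to_reach(cost: float, target: float) -> float:
    """Fractional cost cut needed for the unit cost to equal `target`."""
    return (cost - target) / cost


if __name__ == "__main__":
    reduced = cost_after_reduction(REPORTED_UNIT_COST, CLAIMED_REDUCTION)
    print(f"unit cost after a 75% cut: ${reduced:.2f}")
    print(f"room under the $100 price: ${TARGET_PRICE - reduced:.2f}")

    break_even = reduction_to_reach(REPORTED_UNIT_COST, TARGET_PRICE)
    print(f"cut needed just to bring cost down to $100: {break_even:.1%}")
```

A 75% cut leaves a unit cost of $46.25. Bringing the manufacturing cost down to $100 would take a cut of about 46%, but that would leave nothing for margin or distribution at a $100 price.
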
    At the moment, Kindle is also only a black and white reading device.  Some color display technologies are being developed, but the power drain is too significant for any but small, iPhone-sized devices to support long-term use (such as the week or more between Kindle charges).  The fact that the display on the Kindle is only in black and white would pose significant problems for textbooks where color pictures are critical, especially in the sciences.  This goes back to the underlying costs of the systems noted above.
    Also, the Kindle is a closed proprietary system completely controlled by Amazon.  While that is fine for the moment in its current trade book market space, it is unlikely that a whole new class of publishers (K-12 textbook publishers) would hand their entire business model over to a single company that many publishers already consider to be too powerful.
    The rendering of graphics, charts and forms is clumsy on the Kindle, in part because of the Kindle format, but more because the file format standards are still improving in this area.  Also, the reflowable file format, EPUB, is still maturing, and how it handles complex format issues, as would be the case in textbooks, is still being improved.  EPUB, an open standard for publishing ebooks, isn’t even natively supported by the Kindle, which relies on its own proprietary file format.
    While it is a grand vision, I’m sorry to say that even the pilot described in Freedman’s paper isn’t a good idea at this stage.  The ebook market needs time to work through the technical and business model questions.  The DLC report he wrote presumes that dumping millions of dollars into a program of this type will push the industry forward more rapidly. My sense is that the energy in the ebook market is already palpable and that the market would progress regardless of any government intervention.   The unintended consequences of this suggestion would radically and irrevocably alter the publishing community in a way that would likely lead to diminished service, even if it were to be successful.  Eventually, educational textbook publishers will move in the direction of ebooks, as will most other publishers.  Digital versions will not replace print entirely, but they will supplant it in many cases, and textbooks are likely one segment where they will.  However, it will take a lot more time than most people think.