
NISO Standards Bearer Blog

How the Information World Connects

Interested in the details of the Google Settlement?

November 19th, 2008

Jonathan Band, a DC-based intellectual property lawyer, has produced an excellent distillation of the Google Library/Publisher/Authors Guild settlement.  For those who are interested but not committed to reading the full 141 pages and 15 attachments, Jonathan’s summary is readable and a much more manageable 21 pages.  Thanks and congratulations to Jonathan for a great summary.

Magazine publishing going digital only — PC Magazine to cease print

November 19th, 2008

Another magazine announced today that it will cease publication of its print edition. In an interview with the website PaidContent.org, Jason Young, the CEO of Ziff Davis, announced that PC Magazine will cease distribution of its print edition in January.

PC Magazine is just one of several mass-market publications that are moving to online-only distribution. Earlier this week, Reuters reported that a judge has approved the reorganization of Ziff Davis, which is currently under Chapter 11 bankruptcy protection. There was some speculation about the future of Ziff Davis’ assets.

From the story:

The last issue will be dated January 2009; the closure will claim the jobs of about seven employees, all from the print production side. None of the editorial employees, who are now writing for the online sites anyway, will be affected.

Only a few weeks ago, the Christian Science Monitor announced that it would be ending print distribution. The costs of producing and distributing paper have always been a significant expense for publishers, and in a period of decreasing advertising revenues, lower circulation, and higher production costs, we can expect that more publications will head in this direction.

Within the scholarly world, in particular, I expect that the economics will drive print distribution to print-on-demand for those who want to pay extra, but overall print journals will quickly become a thing of the past. I know a lot of people have projected this for a long time. ARL produced an interesting report written by Rick Johnson last fall on this topic, but it appears we’re nearing the tipping point Rick described in that report.

This transition makes the ongoing work on preservation, authenticity, reuse, and rights all the more critical, particularly as these issues relate to the differences between print and online distribution.

Changing the ideas of a catalog: Do we really need one?

November 19th, 2008

Here’s one last post on thoughts regarding the Charleston Conference.

Friday afternoon during the Charleston meeting, Karen Calhoun, Vice President, WorldCat and Metadata Services at OCLC, and Janet Hawk, Director, Market Analysis and Sales Programs at OCLC, gave a joint presentation entitled: Defining Quality As If End Users Matter: The End of the World As We Know It (link to presentations page – actual presentation not up yet). While this program focused on the needs, expectations and desired functionality of users of WorldCat, there was an underlying theme which stood out to me and could have deep implications for the community.

“Comprehensive, complete and accurate.” I expect that every librarian, catalogers in particular, would strive to achieve these goals with regard to the information about their collection. The management of the library would likely add cost-effective and efficient to this list as well. These goals have driven a tremendous amount of effort at almost every institution when building its catalog. Information is duplicated, entered into systems (be they card catalogs, ILS or ERM systems), maintained, and eventually migrated to new systems. However, is this the best approach?

When you log into the Yahoo web page, for example, or the Washington Post, or a service like Netvibes or Pageflakes, what you are presented with is not information culled from a single source, or even two or three. On my Netvibes landing page, I have information pulled from no fewer than 65 feeds, some mashed up, some straight RSS feeds. Possibly (probably), the information in these feeds is derived from dozens of other systems. Increasingly, what the end-user experiences might seem like an integrated and cohesive experience, but on the back-end the page is drawing from multiple sources, multiple formats, and multiple streams of data. These data streams can be aggregated, merged and mashed up to provide any number of user experiences. And yet, building a catalog has been an effort to build a single all-encompassing system with data integrated and combined into a single system. It is little wonder that developing, populating and maintaining these systems requires tremendous amounts of time and effort.

Karen’s and Janet’s presentation last week provided some interesting data about the enhancements that different types of users would like to see in WorldCat and WorldCatLocal. The key takeaway was that there are different users of the system, with different expectations, needs and problems. Patrons have one set of problems and desired enhancements, while librarians have another. Neither is right or wrong; they represent different sides of the same coin: what a user wants depends entirely on what they need and expect from a service. This is as true for banking and auto repair as it is for ILS systems and metasearch services.

    Putting together the pieces.

Karen’s presentation followed interestingly from another session that I attended on Friday, in which Andreas Biedenbach, eProduct Manager Data Systems & Quality at Springer Science + Business Media, spoke about the challenges of supplying data from a publisher’s perspective. Andreas manages a team that distributes metadata and content to the varied and complicated set of users of Springer data. This includes libraries, but also a diverse range of other organizations such as aggregators, A&I services, preservation services, link resolver suppliers, and even Springer’s own marketing and web site departments. Each of these users of the data that Andreas’ team supplies has its own requirements, formats and business terms, which govern the use of the data. These streams range from complicated feeds of XML structures to simple comma-separated text files, each in its own format, some standardized, some not. It is little wonder there are gaps in the data, non-conformance, or format issues. The problem is not a lack of appropriate or well-developed standards so much as a lack of conformance, use and rationalization. We as a community cannot continue to provide customer-specific responses to every request for data that is distributed into the community.

Perhaps the two problems have a related solution. Rather than the community moving data from place to place, with each organization populating its own systems with data streams from a variety of authoritative sources, could a solution exist where data streams are merged together in a seamless user interface? There was a session at ALA Annual hosted by OCLC on the topic of mashing up library services. Delving deeper, rather than entering or populating library services with gigabytes and terabytes of metadata about holdings, might it be possible to have entire catalogs that are mashed-up combinations of information drawn from a range of other sources? The only critical information that a library might need to hold is an identifier (ISBN, ISSN, DOI, ISTC, etc.) of the item it holds, drawing additional metadata from other sources on demand. Publishers could supply a single authoritative data stream to the community, which could be combined with other data to provide a custom view of the information based on the user’s needs and engagement. Content is regularly manipulated and represented in a variety of ways by many sites. Why can’t we do the same with library holdings and other data?
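To make the idea a little more concrete, here is a minimal sketch in Python of what an identifier-only catalog might look like. Everything in it is an assumption for illustration: the two "sources" are in-memory tables standing in for a publisher feed and a subject service, the identifier is a placeholder, and the function names (`from_table`, `build_record`, `catalog_view`) are hypothetical rather than part of any existing system.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for remote metadata services, keyed by identifier.
# A real system would query a publisher feed or shared catalog instead.
PUBLISHER_FEED = {
    "isbn:9780000000001": {"title": "Example Title", "publisher": "Example Press"},
}
SUBJECT_SERVICE = {
    "isbn:9780000000001": {"subjects": ["Library science"]},
}

# The only data the library keeps locally: the identifier plus
# institution-specific fields such as shelf location.
LOCAL_HOLDINGS = [
    {"id": "isbn:9780000000001", "location": "Stacks, Level 2"},
]

Source = Callable[[str], Dict[str, object]]


def from_table(table: Dict[str, Dict[str, object]]) -> Source:
    """Wrap an in-memory table so it looks like an on-demand lookup service."""
    return lambda identifier: dict(table.get(identifier, {}))


def build_record(holding: Dict[str, object], sources: List[Source]) -> Dict[str, object]:
    """Merge metadata from each source, then overlay the local holding data."""
    record: Dict[str, object] = {}
    for lookup in sources:
        record.update(lookup(str(holding["id"])))
    record.update(holding)  # local, institution-specific fields take precedence
    return record


def catalog_view(holdings: List[Dict[str, object]], sources: List[Source]) -> List[Dict[str, object]]:
    """Assemble the catalog display at request time instead of storing it."""
    return [build_record(h, sources) for h in holdings]


if __name__ == "__main__":
    sources = [from_table(PUBLISHER_FEED), from_table(SUBJECT_SERVICE)]
    for record in catalog_view(LOCAL_HOLDINGS, sources):
        print(record)
```

The design choice worth noticing is that the local record is overlaid last, so institution-specific data such as location is never overwritten by an external stream.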

Of course, there are limitations to how far this could go: what about unique special collections holdings, physical location information, cost and other institution-specific data? However, if the workload of librarians could be reduced in significant measure by mashing up data and not replicating it in hundreds or thousands of libraries, perhaps it would free up time to focus on other services that add greater value for patrons. Similarly, simplifying the information flow out of publishers would reduce errors and incorrect data, as well as reduce costs.

Google Settlement gets tentative court approval

November 18th, 2008

Yesterday, the NY Court judge overseeing the publishers/authors/Google settlement gave tentative approval to the deal.  More details are here.

Digital Repositories meeting: Metrics and assessment

November 18th, 2008

Yesterday and today, I’ve been at the SPARC conference on Digital Repositories. It’s been a good meeting so far. Full disclosure: NISO is a sponsor of the event, although we were not involved in the development of the program.

One topic that has been discussed repeatedly is the need for statistics and measures to assess the quality of materials deposited into IR systems. Yesterday, one of the speakers (sorry, name and hopefully link to presentation coming) noted that they had begun using the COUNTER code of practice to report out usage from their repository. Not surprisingly, when the COUNTER rules are applied to the usage data that comes out of IRs, the usage figures drop quite precipitously. Raw usage figures numbering in the millions dropped to around 80,000 hits (actual figures will be drawn from presentation when posted). Those in the community who know publisher usage data, and how much reported usage falls once COUNTER-conformant reporting is instituted, will be familiar with drops of this size. Perhaps greater application of the COUNTER code for IRs will provide a level playing field upon which people can consider IR traffic on an apples-to-apples basis.

The larger question of assessment is thornier. This ties to my earlier post on metrics for article level usage from the Charleston conference. As yet, there has been no discussion about what these measures will be. Impact Factor is frequently cited, but this is a journal-specific measure. It falls short when the question is how an individual article or resource should be assessed, and there is likely a need for more item-specific measures for IRs.
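For readers wondering why the numbers fall so far, here is a rough sketch in Python of the kind of filtering involved. It is not the COUNTER specification itself: the 10-second double-click window, the list of robot markers, the accepted status codes, and the `Hit` record are all assumptions chosen for illustration, and a real implementation would follow the current code of practice and its published robot list.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

DOUBLE_CLICK_WINDOW = 10  # seconds; assumed value for this sketch
ROBOT_MARKERS = ("bot", "crawler", "spider")  # hypothetical, far from complete
SUCCESS_CODES = (200, 304)  # assumed set of "successful" responses


@dataclass
class Hit:
    timestamp: float   # seconds since some reference point
    user: str          # session or IP-based user key
    item: str          # identifier of the requested item
    user_agent: str
    status: int = 200


def filter_hits(hits: Iterable[Hit]) -> List[Hit]:
    """Drop failed requests, likely robots, and rapid repeat clicks."""
    last_seen: Dict[Tuple[str, str], float] = {}
    kept: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h.timestamp):
        if hit.status not in SUCCESS_CODES:
            continue  # unsuccessful requests are not usage
        agent = hit.user_agent.lower()
        if any(marker in agent for marker in ROBOT_MARKERS):
            continue  # automated traffic is excluded
        key = (hit.user, hit.item)
        previous = last_seen.get(key)
        # Record every click, so a chain of rapid clicks collapses to one.
        last_seen[key] = hit.timestamp
        if previous is not None and hit.timestamp - previous <= DOUBLE_CLICK_WINDOW:
            continue  # treated as a double-click on the same item
        kept.append(hit)
    return kept


if __name__ == "__main__":
    raw = [
        Hit(0, "u1", "a", "Mozilla/5.0"),
        Hit(5, "u1", "a", "Mozilla/5.0"),        # double-click
        Hit(20, "u1", "a", "Mozilla/5.0"),       # separate request
        Hit(1, "u2", "a", "Googlebot/2.1"),      # robot
        Hit(2, "u3", "b", "Mozilla/5.0", 404),   # failed request
    ]
    print(len(raw), "raw hits ->", len(filter_hits(raw)), "counted")
```

Since repositories are heavily crawled, it is easy to see how raw hits could shrink dramatically once rules like these are applied.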

RDA Draft now available

November 17th, 2008

The full draft of the Resource Description and Access (RDA) document is now available on the JSC website.

 From the JSC website:

For information on submitting comments on the draft content, see Making comments on RDA drafts.

The deadline for constituency responses is 2 February, 2009, to allow time for the comments to be compiled for consideration by the JSC at their meeting in March 2009. Note that each constituency committee will be setting their own deadlines for comments in advance of 2 February.    

EU Research Data Preservation Project Seeks Survey Input from Publishers

November 11th, 2008

PARSE.Insight, a European Union project initiated in March 2008 “to highlight the longevity and vulnerability of digital research data,” is conducting an online survey about access and storage of research data.

PARSE.Insight is “concerned with the preservation of digital information in science, from primary data through analysis to the final publications resulting from the research. The problem is how to safeguard this valuable digital material over time, to ensure that it is accessible, usable and understandable in future.”

They are interested in getting publishers’ views included in their survey, in addition to researchers, since publishers play a critical role in the digital preservation of publications and related research data.

The survey is available here:
https://www.surveymonkey.com/s.aspx?sm=VfIpOoxogOv73uWOyaOhoQ_3d_3d

Responses are aggregated for analysis and made anonymous. If you wish to be informed about the results of the survey you can enter your e-mail address at the end of the survey.

Ultimately, PARSE.Insight plans “to develop a roadmap and recommendations for developing the e-infrastructure in order to maintain the long-term accessibility and usability of scientific digital information in Europe.”

Posted by Cynthia Hodgson

Charleston conference: Every librarian need not be a programmer too

November 8th, 2008

Over dinner on Friday with the Swets team and their customers, I had the chance to speak with Mia Brazil at Smith College.  We had a great conversation.  She was telling me her frustration about getting systems to work and she was lamenting the challenges of not understanding programming.  She’d said that she tried learning SQL, but didn’t have much luck.  Now, learning SQL programming is no small feat and I can appreciate her frustrations (years ago, I helped build and implement marketing and circulation databases for publishers).  However, realistically, librarians aren’t programmers and shouldn’t be expected to be.

The systems that publishers and systems providers sell to libraries shouldn’t require that everyone get a master’s in database programming to implement or use.  While the larger libraries are going to have resources to implement and tweak these systems to meet their own needs, the smaller college or public libraries are not going to have the resources to have programmers on staff.  We shouldn’t expect that the staff at those libraries – on top of their other responsibilities – should have to be able to code their own system hacks to get their work done.

In a way, this was what Andrew Pace discussed in his session Friday on moving library services to the grid.  Essentially, Andrew argued that many libraries should consider moving to a software-as-a-service model for their ILS, catalog and other IT needs.  Much as Salesforce.com provides an online platform for customer relationship management, or as Quicken does for accounting software, libraries shouldn’t have to locally load, support and hack systems to manage their work.  Some suppliers are headed in that direction.  While there are pros and cons related to this approach, it certainly is a viable solution for some organizations. I hope for Mia’s sake it happens sooner rather than later.

Charleston Conference: Some Quotes

November 7th, 2008

Here are some interesting paraphrased snippets worthy of consideration from the Charleston Conference:

John Sack, Highwire: Today, readers and browsers are technology applications.  A decade ago readers and browsers were people.

Andrew Pace, OCLC: There are nearly 19 million library transactions worldwide per day.  That averages to  5,265 per second.

James Neal, Columbia University:  How will new intellectual property policies at universities affect scholarly publishing?
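As a quick back-of-the-envelope check on the transaction figures quoted above, the short Python snippet below converts between rates. It only does arithmetic on the two paraphrased numbers; the variable names are mine, and it cannot say which of the two figures was noted down imprecisely.

```python
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR

per_second_quoted = 5_265    # transactions per second, as paraphrased
total_quoted = 19_000_000    # "nearly 19 million" transactions, as paraphrased

# What the per-second rate implies over longer periods.
per_hour_from_rate = per_second_quoted * SECONDS_PER_HOUR   # 18,954,000
per_day_from_rate = per_second_quoted * SECONDS_PER_DAY     # 454,896,000

# What 19 million per day would imply per second.
per_second_if_daily = total_quoted / SECONDS_PER_DAY        # roughly 220

print(f"{per_hour_from_rate:,} per hour at 5,265 per second")
print(f"{per_day_from_rate:,} per day at 5,265 per second")
print(f"{per_second_if_daily:.0f} per second if 19 million is a daily total")
```

The two paraphrased numbers line up closely if the 19 million is read as an hourly total: 5,265 per second works out to just under 19 million per hour.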

From the Charleston conference: On Trust

November 7th, 2008

I’m at the Charleston Library Conference this week.  As always, it’s a great meeting with terrific presentations and hallway conversations.  NISO is well represented on the program, with discussions of SUSHI, I2, ONIX-PL and JAV among others.

The unofficial theme of this week’s meeting seems to be trust.  Wednesday night over dinner, I had a philosophical discussion with Mark Kurtz at BioOne and Pete Binfield at PLOS about the core value-added services that publishers provide.  One point that was made during the conversation was that certification and validation are among the greatest services that publishers add to the publication process.  In a world where the tools and platforms to self-publish are ubiquitous and easily applied so that “publishing” no longer needs to involve a publisher, what value do publishers bring to this process? Validation and certification are critical, and so is the way readers rely on this process to more easily gauge what should be read.

Geoff Bilder spoke yesterday morning about trust heuristics and how readers gauge what is worth reading.  One of his points during the presentation was that with the increasing breadth and depth of published information, researchers need to have quick and easily understood signals regarding quality. This echoes the theme of my post on James J. O’Donnell’s presentation at the ARL members meeting.

Geoff suggested that logos be developed that provide information about the quality of a particular article and the types and stages of review or vetting that an article had gone through. The logo could also contain machine-readable metadata, which would provide information about the type and rigor of the review that was applied in the publication process.  Geoff has been exploring this as a potential new activity at CrossRef. My sense is that there is a great deal of value in this approach and it’s worthy of support in the community.

More from the conference tomorrow.