NISO Forum: Tracking it Back to the Source: Managing and Citing Research Data

NISO on the Road

About the Forum

As data creation increases exponentially across nearly all scholarly disciplines, new roles and requirements are rising to meet the challenges in organization, identification, description, publication, discovery, citation, preservation, and curation to allow these materials to realize their potential in support of data-driven, often interdisciplinary research.  This Forum will focus on several new initiatives to improve community practice on data citation and data discovery.

Event Sessions

Registration Desk Opens

8:00 am

Continental Breakfast

8:00 am - 9:00 am

Introduction

9:00 am - 9:30 am

Opening Keynote: The Many and the One: BCE themes in 21st century data curation

Speaker

Allen Renear

Professor and Interim Dean, Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
University of Illinois at Urbana-Champaign

9:30 am - 10:30 am:

Two scientists can be using "the same data" even though the computer files involved appear to be quite different. This is familiar enough and, for the most part, in small communities with shared practices and familiar datasets, raises few problems. But these informal understandings do not scale to 21st century data curation. To get full value from cyberinfrastructure we must support huge quantities of heterogeneous data developed by diverse communities and used by diverse communities -- often with widely varying methods, tools, and purposes. To accomplish this, our informal practices and understandings must be replaced, or at least supplemented, by a shared framework of standard terminology for describing complex cascades of representational levels and relationships. Fundamental problems in data curation -- and in particular problems involving provenance, identifiers, and data citation -- cannot be fully resolved without such a framework. Although the deepest problems here have ancient origins, useful practical measures are now within reach. Some recent work toward this end, being carried out at the Center for Informatics Research in Science and Scholarship (CIRSS) at the Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign, will be described.

Break

10:30 am - 10:45 am

Speaker

Joan Starr

Chair of the Metadata Working Group at DataCite, and Strategic and Project Planning Manager for the California Digital Library
DataCite / California Digital Library

10:45 am - 11:15 am:

Data and data curation are assuming a growing role in today’s research library. New approaches are needed both to address the resulting challenges and to take advantage of the emerging opportunities. Long-term identifiers represent one such tool. In this presentation, Joan Starr will introduce identifiers and an application designed to make them easy to create and manage: EZID. She will provide a closer look at two identifier types, DOIs and ARKs, and discuss what bringing an identifier service to your institution might mean.

Speaker

Paul Bracke

Associate Dean for Digital Programs and Information Services, Purdue University
Purdue University

11:20 am - 12:00 pm:

Research libraries are increasingly interested in developing data services for their campuses. There are many perspectives, however, on how to develop services that are responsive to the many needs of scientists, sensitive to the concerns of scientists who are not always accustomed to sharing their data, and attractive to campus administrators. This presentation will discuss the development of campus-based data services programs, the centrality of data citation to these efforts, and the ways in which engagement with DataCite can enhance local programs.

Data Equivalence

Speaker

Mark Parsons

Lead Project Manager, Senior Associate Scientist, National Snow and Ice Data Center
National Snow and Ice Data Center

12:00 pm - 12:45 pm:

Data citation, especially using persistent identifiers like Digital Object Identifiers (DOIs), is an increasingly accepted scientific practice. Recently, several respected organizations have developed guidelines for data citation. The different guidelines are largely congruent in that they agree on the basic practice and elements of data citation, especially for relatively static, whole data collections. There is less agreement on the subtler aspects of data citation that are sometimes necessary to ensure precise reference and scientific reproducibility -- the core purpose of data citation. We need to be sure that if you follow a data reference you get to the precise data that were used or at least their scientific equivalent. Identifiers such as DOIs are necessary but not sufficient for the precise, detailed references required. This talk discusses issues around data set versioning, micro-citation, and scientific equivalence. I propose some interim solutions and suggest research strategies for the future.

Lunch

12:45 pm - 1:45 pm

ResourceSync: Web-Based Resource Synchronization. Also for Data.

Speaker

Herbert Van de Sompel

Chief Innovation Officer
DANS (Data Archiving and Networked Services), The Hague, The Netherlands

1:45 pm - 2:30 pm:

Web applications frequently leverage resources made available by remote Web servers. As resources are created, updated, or deleted, these applications face challenges in remaining in lockstep with the server’s change dynamics. Several approaches exist to help meet this challenge for use cases where “good enough” synchronization is acceptable. But when strict resource coverage or low synchronization latency is required, commonly accepted Web-based solutions remain elusive. Motivated by the need to synchronize resources for applications in the realm of cultural heritage and research communication, the National Information Standards Organization (NISO) and the Open Archives Initiative (OAI) have launched the ResourceSync project, which aims to design an approach to resource synchronization that is aligned with the Web architecture and has a fair chance of adoption by different communities. The presentation will discuss some motivating use cases and will provide a perspective on the resource synchronization problem that results from ResourceSync project discussions. It will provide an overview of the ongoing thinking regarding an approach to address the challenges and will pay special attention to aspects that are relevant for the synchronization of data.

Scientific discovery and innovation in an era of data-intensive science

Speaker

William (Bill) Michener

Professor and Director of e-Science Initiatives for University Libraries, University of New Mexico; DataONE Principal Investigator
University of New Mexico

2:30 pm - 3:15 pm:

The scope and nature of biological, environmental and earth sciences research are evolving rapidly in response to environmental challenges such as global climate change, invasive species and emergent diseases. Scientific studies are increasingly focusing on long-term, broad-scale, and complex questions that require massive amounts of diverse data collected by remote sensing platforms and embedded environmental sensor networks; collaborative, interdisciplinary science teams; and new tools that promote scientific data preservation, discovery, and innovation. This talk describes the challenges facing scientists as they transition into this new era of data-intensive science, presents current solutions, and lays out a roadmap to a future in which new information technologies significantly increase the pace of scientific discovery and innovation.

Break

3:15 pm - 3:30 pm

Needs for Data Management & Citation Throughout the Information Lifecycle

Speaker

Micah Altman

Director of Research and Head/Scientist, Program on Information Science for the MIT Libraries
Massachusetts Institute of Technology (MIT)

3:30 pm - 4:15 pm:

This session will examine data management and data citation from an information lifecycle approach. The session will discuss the implications for data management of analyzing the needs, rights, and responsibilities of researchers and other stakeholders at each lifecycle stage. It will also discuss data citation and other related mechanisms that are useful in linking services and aligning incentives across lifecycle stages and among stakeholders.

"Ask Anything" Session

4:15 pm - 4:45 pm:

Bring your questions, comments, and ideas to share with the entire group.

Forum Wrap-up

4:45 pm - 5:00 pm

Additional Information

  • Early bird rates are offered until September 10, 2012.
  • Registration closes September 20, 2012. After that date, a processing fee of $50 will be added. This also applies to any on-site registration.
  • Cancellations made by September 20, 2012 will receive a full refund less a $50 processing fee. After that date, there are no refunds.
  • Registration includes a continental breakfast and lunch. Notify the NISO office if you have any dietary restrictions (301-654-2512).
  • Students should submit proof of enrollment when registering. Please contact the NISO office (301-654-2512) with questions.