How Do You Know What You Don’t Know? Digital Preservation Education

Two Scenarios: Scanning Projects Gone Bad

Imagine this scenario: the director of a local history museum asks a curator to scan some of the photo collections and make an online exhibit. The museum has a web page, and the director suggests the photos be put on that page somewhere. The museum has a flatbed scanner, and the curator goes to work. Scanning the collection of 100 photographs takes quite a bit of time, but within a couple of weeks the work is done. The curator has some experience with webpages and places low-resolution copies of the images on a webpage linked from the museum’s main page. The JPEG copies are on the hard drive of the computer attached to the scanner and are numbered sequentially starting with IMG001.jpg. The curator realizes that the images should be preserved and so copies the files onto gold CDs so they will be safe. In reality, the curator clearly does not understand archival file formats, the intricacies of content management systems, issues with file naming conventions, or that CDs are an unstable and impermanent storage medium.

In another scenario, the Press Association of a medium-sized state is interested in having the state’s newspapers made available online. Its members are aware that the state historical society holds large runs of microfilm. In addition, there are large numbers of other state documents that would also be useful. The historical society recently purchased a state-of-the-art microfilm scanner and has tested it enough to know that the scanner is very fast and very good. When approached by the Press Association about scanning the film, the society's staff estimate how long it will take to scan all 200,000 reels and realize that, with the new scanner, it will not take very long at all. They agree to do the job for $200,000. Once they start the project, they quickly realize that the files created are quite large, so large that the society cannot afford the storage for the TIFF images. They also realize that they have not planned for a way to present the pages to users other than as a series of JPEG images. There is also no preservation plan for the images. Rather than go back to the Press Association to re-scope the project, the director of the society decides to do the best they can now and make improvements later; after all, it is digital access, and that is better than nothing.

In reality, a poorly conceived plan is not better than nothing. Spending limited resources on projects that have little hope of being sustainable is a tremendous waste that serves no one well. Unfortunately, scenarios similar to these are playing out all across the country. Yes, there are many well-thought-out projects with preservation plans in place, but in many organizations a little knowledge about scanning and webpages can be a dangerous thing. Every institution with responsibility for the stewardship of materials in digital form has some interest in long-term digital preservation. How are staff members in organizations across the country expected to acquire the knowledge and skills to ensure that their projects and programs are well conceived, feasible, and backed by a solid sustainability plan? In short, how do staff members know what they do not know?
