Guidelines for Digital Newspaper Preservation Readiness Page: 14
This book is part of the collection entitled: General Collection and was provided to UNT Digital Library by the UNT Libraries.
Case Study: Boston College
Boston College has digitized several of its campus newspapers in accordance with the National Digital
Newspaper Technical Guidelines. This has provided Boston College with several high-quality archival
page scans in both TIFF and JPEG2000 formats.
To conserve storage space, Boston College has opted to prioritize its JPEG2000 images as preservation
masters (TIFFs can be quite large). This choice retains the legibility of text and graphics. Because of the
amount of white space they included, the images were eligible for a small amount of compression. While
JPEG2000 is not as widely adopted as TIFF, Boston College believes this will change, and the format still
satisfies the criteria for being non-proprietary and open source.
Boston College has also tested the conversion from JPEG2000 back to TIFF with satisfactory results.
readiness step in the short term. If the
institution identifies obsolescence issues for any
of the formats it manages, however, it should
strongly consider migrating the at-risk files to a
stable file format (remember that this does not
need to lead to removing support for the
original format).
File identification is the first step in format
management, and there are a number of ways
to fulfill this step. One lightweight way an
institution with limited time or modest
technical skills may accumulate basic
knowledge about its file formats is to work with
a technical staff person or system administrator
to install a tool like DROID that has a graphical
user interface (GUI) and has direct links
to PRONOM, one of the longer-standing format
registries. Once the institution understands the
format types it holds and their associated risk
factors, the institution may make policy-based
decisions regarding what normalizing and
migration activities it must take and what
staffing/resources will be required or
partnerships it will need to form in order to
accomplish this further work. Xena (see above)
is a well-documented format normalization tool
that also has a GUI that should be relatively
easy for an institution to begin working with.
Optimal Readiness
An institution with more time, expertise, and
resources to expend should pursue a multi-step
workflow to identify and address problematic
formats in its collections.
Institutions with more technical staffing might
prefer to use more advanced command-line
approaches. Unix programs such as
the find and file commands (or similar tools in
other OS environments) can be used in concert
with a shell script to create a per-file list of
MIME type values at a top-level or sub-directory
level. This list can then be exported to a tabular
format (e.g., TXT, TSV, or CSV) for further
analysis and format tracking. The institution can
store this output file and/or any derivations
(e.g., XLS, TXT, DOCX, PDF, etc.) in a sub-
folder(s) along with the corresponding directory
of analyzed files. Ideally, the filename of each
output file should include the directory name
and the date. If files are added to the collection
over time, the commands can be re-run and a
new set of outputs stored. Tools such as FITS
go a step further: they not only identify file
formats but also validate each file's
conformance to its format. They can also
provide report outputs in several tabular
formats such as those mentioned above, as
well as in XML.
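The command-line inventory described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed workflow: the function names, the report columns, and the "_formats_" filename pattern are assumptions made for the example. The sketch calls the Unix file command when it is installed and otherwise falls back to guessing from the file extension, which is less reliable than inspecting file contents.

```python
import csv
import mimetypes
import shutil
import subprocess
from datetime import date
from pathlib import Path
from typing import Optional


def detect_mime(path: Path) -> str:
    """Return a MIME type for one file.

    Uses `file --mime-type -b` when the `file` program is available;
    otherwise falls back to an extension-based guess.
    """
    if shutil.which("file"):
        result = subprocess.run(
            ["file", "--mime-type", "-b", str(path)],
            capture_output=True, text=True, check=False,
        )
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout.strip()
    guess, _ = mimetypes.guess_type(path.name)
    return guess or "application/octet-stream"


def inventory(root: Path, exclude: Optional[Path] = None) -> list:
    """List (relative path, MIME type) for every regular file under root.

    Files inside `exclude` (e.g. the reports sub-folder) are skipped.
    """
    rows = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        if exclude is not None and path.is_relative_to(exclude):
            continue
        rows.append((path.relative_to(root).as_posix(), detect_mime(path)))
    return rows


def write_report(root: Path, out_dir: Path) -> Path:
    """Write a TSV report whose name carries the directory name and date."""
    rows = inventory(root, exclude=out_dir)  # collect before creating the report
    out_dir.mkdir(parents=True, exist_ok=True)
    report = out_dir / f"{root.name}_formats_{date.today().isoformat()}.tsv"
    with report.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["path", "mime_type"])
        writer.writerows(rows)
    return report
```

Calling write_report(Path("campus_paper"), Path("campus_paper/_format_reports")), where both paths are hypothetical, would store a dated TSV in a sub-folder alongside the analyzed files. Re-running it on a later date produces a new report without listing the earlier reports as collection files.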
Skinner, Katherine & Schultz, Matt. Guidelines for Digital Newspaper Preservation Readiness, book, March 4, 2014; Atlanta, GA. (https://digital.library.unt.edu/ark:/67531/metadc282586/m1/28/: accessed April 25, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu.