Workflow Tools for Digital Curation

Andrew James Weidner
University of North Texas Libraries
Andrew.Weidner@unt.edu
Daniel Gelaw Alemneh
University of North Texas Libraries
Daniel.Alemneh@unt.edu
Abstract
Maintaining usable and sustainable digital collections requires a complex set of actions that address the
many challenges at various stages of the digital object lifecycle. Digital curation activities enhance
access and retrieval, maintain quality, add value, and facilitate use and re-use over time. Digital resource
lifecycle management is becoming an increasingly important topic as digital curators actively explore
software tools that perform metadata curation and file management tasks. Accordingly, the University of
North Texas (UNT) Libraries develop tools and workflows that streamline production and quality
assurance activities. This article demonstrates two open source software tools, AutoHotkey and
Selenium IDE, which the UNT Digital Libraries Division has adopted for use during the pre-ingest and
post-ingest stages of the digital resource lifecycle.
Introduction
Digital curation is the continuous activity of managing and enhancing the use of digital resources
throughout their lifecycle. Digital curation starts when an item is created (born-digital) or selected for
digitization (analog) and continues through image processing, metadata capture, derivative creation, and
preservation for long-term access (Alemneh, 2010). High-quality metadata is necessary for implementing
reliable, usable, and sustainable digital collections (Sumner & Custard, 2005). Recognizing the important
role of standardized metadata in digital repositories, the University of North Texas (UNT) Libraries actively
promote metadata-based digital resource lifecycle management.
The UNT Digital Libraries Division manages content for The Portal to Texas History and the UNT Digital
Library. The Portal to Texas History provides access to cultural heritage materials related to the history of
Texas. The UNT Digital Library showcases the scholarly and creative output of the university and
highlights some of the Libraries' research holdings. In managing these repositories, the Digital Libraries
Division utilizes various tools and mechanisms to enhance metadata consistency and precision across all
digital resources. Before ingesting digital objects, Web-based metadata creation templates draw terms
from locally controlled vocabularies to ensure standardized data entry values. After objects have been
published online, the metadata records are analyzed with Python scripts and command line tools for
quality review (Phillips, 2013).
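The kind of post-publication quality review described above can be illustrated with a minimal Python sketch. It is not the Division's actual script: the field name, the vocabulary terms, the record identifiers, and both helper functions are hypothetical, chosen only to show how values in published records might be counted and compared against a locally controlled vocabulary.

```python
from collections import Counter

# Hypothetical controlled vocabulary and sample records, for illustration only.
RESOURCE_TYPES = {"text", "image", "map", "sound"}

records = [
    {"id": "rec-1", "resource_type": "text"},
    {"id": "rec-2", "resource_type": "Text "},  # stray capital and trailing space
    {"id": "rec-3", "resource_type": "photo"},  # term not in the vocabulary
    {"id": "rec-4", "resource_type": "map"},
]


def value_counts(records, field):
    """Count how often each value of `field` occurs across the records."""
    return Counter(rec.get(field, "") for rec in records)


def out_of_vocabulary(records, field, vocabulary):
    """Return (record id, value) pairs whose value is not an exact vocabulary term."""
    return [
        (rec["id"], rec.get(field, ""))
        for rec in records
        if rec.get(field, "") not in vocabulary
    ]


# Print a frequency summary, then the records that need review.
print(value_counts(records, "resource_type"))
print(out_of_vocabulary(records, "resource_type", RESOURCE_TYPES))
```

In this sketch the comparison is deliberately exact, so a value that differs from a vocabulary term only in capitalization or whitespace is flagged for review rather than silently normalized.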
This article describes specialized tools and workflows developed by the UNT Digital Libraries Division that
use AutoHotkey and Selenium IDE open source software to manage files and create and edit metadata.
AutoHotkey is especially useful for pre-ingest activities such as file management, data entry, and digital
object quality control. Post-ingest metadata enhancements automated with Selenium IDE facilitate the
use, reuse, and preservation of digital objects. AutoHotkey and Selenium IDE provide quick and accurate
digital resource management capabilities with minimal human intervention.
Automated File Management: AutoHotkey
AutoHotkey is free, open source software for the Windows operating system that allows users to create
automation scripts. Users write scripts that send multiple keystrokes to the operating system with a single
key combination, or hotkey. The AutoHotkey scripting language supports programming constructs (e.g.,
variables, loops, conditionals), dynamic GUIs, and direct interaction with the Windows API. While
AutoHotkey provides a convenient platform for quickly developing tools to assist with digital curation

Weidner, Andrew & Alemneh, Daniel Gelaw. Workflow Tools for Digital Curation. UNT Digital Library. http://digital.library.unt.edu/ark:/67531/metadc157307/. Accessed July 11, 2014.