Metadata Quality Enhancement for Large Digital Collections: Web Browser Automation with Selenium IDE

Description:

Poster presented at the 2012 TCDL Annual Conference. This poster discusses metadata quality enhancement for large digital collections.

Creator(s):
Creation Date: May 24, 2012
Partner(s):
UNT Libraries
Collection(s):
UNT Scholarly Works
Creator (Author):
Weidner, Andrew

University of North Texas

Creator (Author):
Alemneh, Daniel Gelaw

University of North Texas

Note:

Abstract: Creating and maintaining accurate descriptive metadata for digital objects is one of the best ways to connect with digital library users and maintain those connections over the long term. Good metadata empowers users not only to discover exactly what they searched for, but also to locate relevant resources that they did not expect to find. Metadata quality characteristics for digital libraries depend on many factors, including the types of resources the repository offers and the users' needs, which vary across user communities. The metadata quality issue is particularly acute when multiple institutions participating in collaborative digital projects employ diverse naming schemes for their documents and files. Furthermore, harvesting large sets of documents from open repositories presents a number of challenges for creating accurate descriptive metadata. For example, metadata schemas do not always map well to one another, creating disconnects when harvested records are published in the local repository. In both cases, substantial rework is usually required to create descriptive data that meets local repository standards. The University of North Texas (UNT) digital libraries group uses various tools and mechanisms to ensure metadata consistency and precision across all digital resources. Pre-populated controlled vocabulary terms in its Web-based dashboard editing interface enable metadata operators to easily select standard values via drop-down menus and auto-suggest for text input fields. In addition, careful mapping prior to ingest facilitates accurate conversions among various metadata element sets. Crosswalks also facilitate exporting metadata records to other systems. To support these activities in cases where post-ingest metadata normalization will enhance recall and precision for its digital objects, the UNT Libraries recently implemented Selenium IDE as a tool for streamlining the process of editing large sets of metadata records.
Created by the Web development community to simplify the process of testing Web applications, Selenium IDE is a Firefox browser plug-in that provides an integrated development environment for creating, debugging, and running Web browser automation scripts. This poster discusses the complex set of tools and actions required to maintain usable and sustainable digital collections and demonstrates how Selenium IDE facilitates metadata editing for large digital collections by automating a range of data entry tasks. Any institution that employs a content management system with a Web-based metadata editing interface can potentially benefit from Selenium IDE's automation capabilities.
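
The post-ingest normalization the abstract mentions can be illustrated with a minimal Python sketch. It is not the poster's Selenium IDE script, and it does not reflect the UNT editing interface. The vocabulary entries, the `publisher` field name, and the record IDs are all hypothetical and chosen only for illustration. The sketch maps variant values to a preferred controlled-vocabulary form and reports which records would need editing. A list like that is the sort of input a browser automation script could then step through.

```python
# Hypothetical controlled vocabulary: variant spellings (lowercased)
# mapped to the preferred form. Not taken from the poster.
PREFERRED_FORMS = {
    "unt": "University of North Texas",
    "univ. of north texas": "University of North Texas",
    "university of north texas": "University of North Texas",
}


def normalize_value(value, vocabulary=PREFERRED_FORMS):
    """Return the preferred form of a value, or the whitespace-cleaned original."""
    cleaned = " ".join(value.split())  # collapse runs of whitespace
    return vocabulary.get(cleaned.lower(), cleaned)


def normalize_records(records, field):
    """Return (updated_records, changed_ids) without mutating the input."""
    updated, changed = [], []
    for record in records:
        new_record = dict(record)  # shallow copy so the input stays intact
        old = record.get(field)
        if old is not None:
            new = normalize_value(old)
            if new != old:
                new_record[field] = new
                changed.append(record["id"])
        updated.append(new_record)
    return updated, changed


if __name__ == "__main__":
    # Illustrative records only; IDs and values are made up.
    sample = [
        {"id": "rec-1", "publisher": "UNT"},
        {"id": "rec-2", "publisher": "University of North Texas"},
    ]
    fixed, changed_ids = normalize_records(sample, "publisher")
    print(changed_ids)
```

Returning the changed IDs separately keeps the decision about what to edit apart from the act of editing, so the list can be reviewed before any automated data entry runs.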

Physical Description:

1 p.

Language(s):
Subject(s):
Keyword(s): metadata | Selenium IDE | automation | digital collections
Source: Texas Conference on Digital Libraries (TCDL), 2012, Austin, Texas, United States
Identifier:
  • ARK: ark:/67531/metadc86138
Resource Type: Poster
Format: Image
Rights:
Access: Public