Metadata Analysis at the Command-Line
Metadata
Metadata describes a digital item, providing (if known) such information as creator, publisher, contents, size, relationship to other resources, and more. Metadata may also contain "preservation" components that help us to maintain the integrity of digital files over time.
- Main Title: Metadata Analysis at the Command-Line
- Author: Phillips, Mark Edward
- Creator Type: Personal
- Creator Info: University of North Texas
- Creation: 2013-01-15
- Content Description: This article discusses metadata analysis tools, processes, and methodologies aimed at helping to focus limited quality control resources on the areas of the collection where they might have the most benefit.
- Physical Description: 12 p.
- Keyword: metadata analysis
- Keyword: digital collections
- Keyword: quality control
- Keyword: command-line tools
- Journal: Code4Lib Journal, 2013, Code4Lib
- Publication Title: Code4Lib
- Issue: 19
- Peer Reviewed: True
- Name: UNT Scholarly Works; Code: UNTSW
- Name: UNT Libraries; Code: UNT
- Rights Access: public
- ISSN: 1940-5758
- Archival Resource Key: ark:/67531/metadc157309
- Academic Department: Digital Projects Unit
- Display Note: Abstract: Over the past few years the University of North Texas Libraries' Digital Projects Unit (DPU) has developed a set of metadata analysis tools, processes, and methodologies aimed at helping to focus limited quality control resources on the areas of the collection where they might have the most benefit. The key to this work lies in its simplicity: records harvested from OAI-PMH-enabled digital repositories are transformed into a format that makes them easily parsable using traditional Unix/Linux-based command-line tools. This article describes the overall methodology, introduces two simple open-source tools developed to help with the aforementioned harvesting and breaking, and provides example commands to demonstrate some common metadata analysis requests. All software tools described in the article are available with an open-source license via the author's GitHub account.
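The transformation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's tool: the function name `flatten_record`, the tab-delimited `identifier, field, value` layout, and the simplified `<record>` wrapper are all assumptions made for illustration, and the sample values are taken from this record. It only shows the general idea of turning one Dublin Core record into one line per field value so that line-oriented tools can process it.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace, as used in OAI-PMH oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical, simplified record; a real OAI-PMH response wraps this
# in additional envelope elements that are omitted here.
SAMPLE = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>ark:/67531/metadc157309</dc:identifier>
  <dc:title>Metadata Analysis at the Command-Line</dc:title>
  <dc:subject>metadata analysis</dc:subject>
  <dc:subject>quality control</dc:subject>
</record>"""


def flatten_record(xml_text):
    """Return one 'identifier<TAB>field<TAB>value' line per DC element."""
    root = ET.fromstring(xml_text)
    ident_el = root.find(f"{{{DC_NS}}}identifier")
    has_ident = ident_el is not None and ident_el.text
    ident = ident_el.text.strip() if has_ident else "unknown"

    lines = []
    for el in root:
        # Skip anything that is not a Dublin Core element.
        if not el.tag.startswith(f"{{{DC_NS}}}"):
            continue
        field = el.tag.split("}", 1)[1]
        # Collapse internal whitespace so each value stays on one line.
        value = " ".join((el.text or "").split())
        lines.append(f"{ident}\t{field}\t{value}")
    return lines


if __name__ == "__main__":
    for line in flatten_record(SAMPLE):
        print(line)
```

With output in this shape, a question such as "which subject values occur most often" becomes a matter of filtering on the second column and counting the third with standard tools like `cut`, `sort`, and `uniq`.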