| Description: | In this document, the authors present some aspects of data normalization of the decomposed records to improve the results of analysis. The data normalization processes use pattern-matching techniques to eliminate and/or generalize anomalous characters and terms. Since the unit of analysis in preparing the test dataset of 400,000 MARC 21 records is a "word," there was a need for data normalization to provide reliability in the subsequent analysis. |
|---|---|
| Creator(s): | |
| Creation Date: | October 25, 2001 |
| Partner(s): | UNT College of Information |
| Collection(s): | UNT Scholarly Works |
| Creator (Author): | Kim, Ed (University of North Texas; Z-Interop Research Assistant) |
| Creator (Author): | Moen, William E. (University of North Texas; Principal Investigator) |
| Original Creation Date: | October 25, 2001 | |
| Degree: | Department: Library and Information Science; Department: Texas Center for Digital Knowledge |
| Physical Description: | 5 p. |
| Language(s): | ||
| Subject(s): |
|
|
| Keyword(s): | MARC 21; data normalization; pattern-matching |
| Contributor(s): | ||
| Series Title: | Z-Interop | |
| Identifier: |
|
|
| Resource Type: | Text | |
| Format: | Text | |
| Rights: | Access: Public |