User Searches in IMLS DCC Collection Registry: Transaction Log Analysis Page: 8
36 p.
categorization of subject searches lies in the ambiguity of actual searches, discussed further in
the Findings and Discussion section.
Queries that gave no clue as to their type (e.g., "aF", "beyond",
"LU+65") were grouped together in an eighth category: unknown.
The second stage of analysis involved searching three controlled vocabularies (GEM, LCSH,
and the Art and Architecture Thesaurus) for semantic matches to actual user queries from the
IMLS Collection Registry transaction log. Library of Congress Subject Headings (LCSH) was selected
because almost half of the digital collections participating in the
Collection Registry use it for item-level description, and some of the
surveyed collections are considering it as an alternative to GEM for collection-level description. Two features of the OCLC Connexion
database, the LCSH authority file and the Web Dewey search for editorially mapped LCSH
headings, were used to match user queries with LCSH. The Art and Architecture Thesaurus
(AAT) was selected as another plausible alternative for describing cultural heritage materials,
and possibly collections. A number of collections participating in the registry use AAT for
item-level description. Moreover, AAT is narrower in scope than
LCSH but significantly more detailed than GEM.
Only exact/abbreviated and synonymous matches (e.g., "inoculation" and "vaccination") were
treated as semantic matches for the purposes of this analysis. Abbreviated queries were matched
with the full terms in controlled vocabularies, e.g., "ilgwu" with "International Ladies' Garment
Workers' Union". The order of the terms in the query, as well as the presence or absence of
prepositions and conjunctions, was ignored (e.g., "French art" was matched with
"Art, French"; "epistemology" with "knowledge, theory of"; "children that are abused" with
"abused children"). Word endings were also disregarded, as long as they did not affect
the meaning (e.g., "automated speech recognition" was matched with "automatic speech
recognition"). Both preferred terms and variant terms in a controlled vocabulary were considered
legitimate matches. For example, both the 150 MARC field (USE) and the 450 field (USE FOR) in
LCSH authority records were analyzed to find a semantic match to a user query. Simple user
queries were in some cases matched with compound LCSH subject headings; for instance,
"housing for shipyard workers" was matched with "Shipbuilding industry--Employees--Housing".
Zavalina, Oksana. User Searches in IMLS DCC Collection Registry: Transaction Log Analysis, report, 2006; (https://digital.library.unt.edu/ark:/67531/metadc77123/m1/8/: accessed April 24, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT College of Information.