345 Matching Results

Search Results

New Term Weighting Formulas for the Vector Space Method in Information Retrieval

Description: The goal in information retrieval is to enable users to automatically and accurately find data relevant to their queries. One possible approach to this problem is to use the vector space model, which models documents and queries as vectors in the term space. The components of the vectors are determined by the term weighting scheme, a function of the frequencies of the terms in the document or query as well as throughout the collection. We discuss popular term weighting schemes and present several new schemes that offer improved performance.
Date: March 1, 1999
Creator: Chisholm, E. & Kolda, T.G.
Partner: UNT Libraries Government Documents Department

[Dataset of Web Archiving Research Articles]

Description: Datasets used in the presentation, "Towards Building a Collection of Web Archiving Research Articles." The files included here were used to conduct several Machine Learning classification experiments that result in a corpus of scholarly research articles on the topic of web archiving.
Date: August 2014
Creator: Reyes Ayala, Brenda & Caragea, Cornelia
Partner: UNT College of Information

Shifts of focus among dimensions of user information problems as represented during interactive information retrieval

Description: The goal of this study is to increase understanding of information problems as they are revealed in interactions among users and search intermediaries during information retrieval. Specifically, this study seeks to: (a) investigate how interaction between users and search intermediaries reveals aspects of user information problems; (b) explore the concept of representation with respect to information problems in interactive information retrieval; and (c) investigate how users and search intermediaries focus on aspects of user information problems during the course of searches.
Date: May 1998
Creator: Robins, David B. (David Bruce)
Partner: UNT Libraries

Using Encyclopedic Knowledge for Automatic Topic Identification

Description: This paper presents a method for automatic topic identification using an encyclopedic graph derived from Wikipedia. The system is found to exceed the performance of previously proposed machine learning algorithms for topic identification, with an annotation consistency comparable to human annotations.
Date: May 2009
Creator: Coursey, Kino High; Mihalcea, Rada, 1974- & Moen, William E.
Partner: UNT College of Engineering

Book Reviews in an Electronic Age

Description: Poster presented at the 2012 IAMSLIC Annual Conference. This poster discusses research on book reviews in marine and aquatic journals to gain insight into the characteristics of the reviews and their value to librarians and research.
Date: August 2012
Creator: Avery, Elizabeth Fuseler; Heil, Kathy & Wiest, Natalie H., 1948-
Partner: UNT Libraries

UNT 2005 TREC QA Participation: Using Lemur as IR Search Engine

Description: This paper reports the authors' TREC 2005 QA participation. The authors' QA system, Eagle QA, developed last year, was expanded and modified for this year's QA experiments. In particular, the authors used Lemur 4.1 as the Information Retrieval (IR) engine this year to find documents in the collection that may contain answers to the test questions. The authors' results show that Lemur did a reasonable job of finding relevant documents, but there is certainly room for further improvement.
Date: 2005
Creator: Chen, Jiangping; Yu, Ping & Ge, He
Partner: UNT College of Information

Accessing Information on the World Wide Web: Predicting Usage Based on Involvement

Description: Advice for Web designers often includes an admonition to use short, scannable, bullet-pointed text, reflecting the common belief that browsing the Web most often involves scanning rather than reading. Literature from several disciplines focuses on the myriad combinations of factors related to online reading but studies of the users' interests and motivations appear to offer a more promising avenue for understanding how users utilize information on Web pages. This study utilized the modified Personal Involvement Inventory (PII), a ten-item instrument used primarily in the marketing and advertising fields, to measure interest and motivation toward a topic presented on the Web. Two sites were constructed from Reader's Digest Association, Inc. online articles and a program written to track students' use of the site. Behavior was measured by the initial choice of short versus longer versions of the main page, the number of pages visited and the amount of time spent on the site. Data were gathered from students at a small, private university in the southwest part of the United States to answer six hypotheses which posited that subjects with higher involvement in a topic presented on the Web and a more positive attitude toward the Web would tend to select the longer text version, visit more pages, and spend more time on the site. While attitude toward the Web did not correlate significantly with any of the behavioral factors, the level of involvement was associated with the use of the sites in two of three hypotheses, but only partially in the manner hypothesized. Increased involvement with a Web topic did correlate with the choice of a longer, more detailed initial Web page, but was inversely related to the number of pages viewed so that the higher the involvement, the fewer pages visited. An additional indicator of usage, the average amount ...
Date: May 2003
Creator: Langford, James David
Partner: UNT Libraries

Improving Z39.50 Interoperability: Z39.50 Profiles and Testbeds for Library Applications

Description: An operating assumption for the networked environment is that many different information systems need to interoperate for users to successfully discover and retrieve distributed resources. Meaningful interoperability is often elusive. In the library community, the Z39.50 standard protocol (ISO 23950/ANSI/NISO Z39.50) for information retrieval promised seamless and transparent networked access to library resources. Too often, the reality has not lived up to the promise. This paper discusses two efforts that offer solution paths to Z39.50 interoperability.
Date: August 2001
Creator: Moen, William E.
Partner: UNT College of Information

Integrating Folksonomies into Cultural Heritage Digital Collections: The Challenges and Opportunities of Web 2.0

Description: In this presentation, the author defines folksonomy and outlines its advantages and disadvantages. He begins with a background on information retrieval and changing technologies, discusses trends in technologies, and explains the use of tags and folksonomy.
Date: 2008
Creator: Alemneh, Daniel Gelaw & Hastings, Samantha Kelly
Partner: UNT Libraries

Three-dimensional Information Space: An Exploration of a World Wide Web-based, Three-dimensional, Hierarchical Information Retrieval Interface Using Virtual Reality Modeling Language

Description: This study examined the differences between a 3-D, VRML search interface, similar to Cone Trees, as a front-end to Yahoo on the World Wide Web and a conventional text-based, 1-D interface to the same database. The study sought to determine how quickly users could find information using both interfaces, their degree of satisfaction with both search interfaces, and which interface they preferred.
Date: December 1997
Creator: Scannell, Peter
Partner: UNT Libraries

Smoothing the information seeking path: Removing representational obstacles in the middle-school digital library.

Description: Middle school students' interaction within a digital library is explored. Issues of interface features used, obstacles encountered, search strategies and search techniques used, and representation obstacles are examined. A mechanism for evaluating users' descriptors is tested, and the effects on retrieval of augmenting the system's resource descriptions with these descriptors are explored. Transaction log data analysis (TLA) was used, with external corroborating achievement data provided by teachers. Analysis was conducted using quantitative and qualitative methods. Coding schemes were developed for the failure analysis, the search strategies and techniques analysis, the extent-of-match analysis between terms in students' questions and their search terms, and the extent-of-match analysis between search terms and controlled vocabulary. There are five chapters with twelve supporting appendixes. Chapter One presents an introduction to the problem and reviews the pilot study. Chapter Two presents the literature review and theoretical basis for the study. Chapter Three describes the research questions, hypotheses and methods. Chapter Four presents findings. Chapter Five presents a summary of the findings and their support of the hypotheses. Unanticipated findings, limitations, speculations, and areas of further research are indicated. Findings indicate that middle school users interact with the system in various sequences of patterns. User groups' interactions and scaffold use are influenced by the teacher's objectives for using the ADL. Users preferred to use single word searches over Boolean, phrase or natural language searches. Users tended to use a strategy of repeating the same exact search, instead of using the advanced scaffolds. A high percentage of users attempted at least one search that included spelling or typographical errors, punctuation, or sequentially repeated searches. Search terms matched the DQ's in some instantiation in 54% of all searches. Terms used by the system to represent the resources do not adequately represent the user groups' information needs, however, ...
Date: May 2002
Creator: Abbas, June M.
Partner: UNT Libraries

Cross Language Information Retrieval for Languages with Scarce Resources

Description: Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic. However, most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources. To build the parallel text I use the Bible. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of the CLIR output; (3) the impact on accuracy when translating the query versus translating the collection of documents; and finally (4) how the results are affected by the use of different dialects. The results show that all these variables have a direct impact on the quality of the CLIR system.
Date: May 2009
Creator: Loza, Christian
Partner: UNT Libraries

Building an Intelligent Filtering System Using Idea Indexing

Description: The widely used vector model maintains its popularity because of its simplicity, speed, and the appeal of using spatial proximity for semantic proximity. However, this model has a disadvantage: vagueness caused by overlapping keywords. Efforts have been made to improve the vector model. Research on improving document representation has focused on four areas, namely, statistical co-occurrence of related items, forming term phrases, grouping of related words, and representing the content of documents. In this thesis, we propose the idea-indexing model to improve document representation for the filtering task in IR. The idea-indexing model matches document terms with the ideas they express and indexes the document with these ideas. This indexing scheme represents the document with its semantics instead of sets of independent terms. We show in this thesis that indexing with ideas leads to better performance.
Date: August 2003
Creator: Yang, Li
Partner: UNT Libraries

ETDEWEB versus the World-Wide-Web: a specific database/web comparison

Description: A study was performed comparing user search results from the specialized scientific database on energy-related information, ETDEWEB, with search results from the internet search engines Google and Google Scholar. The primary objective of the study was to determine if ETDEWEB (the Energy Technology Data Exchange – World Energy Base) continues to bring the user search results that are not being found by Google and Google Scholar. As a multilateral information exchange initiative, ETDE’s member countries and partners contribute cost- and task-sharing resources to build the largest database of energy-related information in the world. As of early 2010, the ETDEWEB database has 4.3 million citations to world-wide energy literature. One of ETDEWEB’s strengths is its focused scientific content and direct access to full text for its grey literature (over 300,000 documents in PDF available for viewing from the ETDE site and over a million additional links to where the documents can be found at research organizations and major publishers globally). Google and Google Scholar are well-known for the wide breadth of the information they search, with Google bringing in news, factual and opinion-related information, and Google Scholar also emphasizing scientific content across many disciplines. The analysis compared the results of 15 energy-related queries performed on all three systems using identical words/phrases. A variety of subjects was chosen, although the topics were mostly in renewable energy areas due to broad international interest. Over 40,000 search result records from the three sources were evaluated. The study concluded that ETDEWEB is a significant resource to energy experts for discovering relevant energy information. For the 15 topics in this study, ETDEWEB was shown to bring the user unique results not shown by Google or Google Scholar 86.7% of the time. Much was learned from the study beyond just metric comparisons. Observations about the strengths of ...
Date: June 28, 2010
Creator: Cutler, Debbie
Partner: UNT Libraries Government Documents Department

The Effect of Personality Type on the Use of Relevance Criteria for Purposes of Selecting Information Sources.

Description: Even though information scientists generally recognize that relevance judgments are multidimensional and dynamic, there is still discussion and debate regarding the degree to which certain internal (cognition, personality) and external (situation, social relationships) factors affect the use of criteria in reaching those judgments. Much of the debate centers on the relationship of those factors to the criteria and reliable methods for measuring those relationships. This study researched the use of relevance criteria to select an information source by undergraduate students whose task was to create a course schedule for a semester. During registration periods, when creating their semester schedules, students filled out a two-part questionnaire. After completing the questionnaire, the students completed a Myers-Briggs Type Indicator (MBTI) instrument to determine their personality type. Data were analyzed using one-way ANOVAs and chi-square tests. A positive correlation exists between personality type as expressed by the MBTI and the information source selected as most important by the subject. A correlation also exists between personality type and relevance criteria use. The correlation is stronger for some criteria than for others. Therefore, one can expect personality type to have an effect on the use of relevance criteria while selecting information sources.
Date: December 2002
Creator: Sims, Dale B.
Partner: UNT Libraries