Cross Language Information Retrieval for Languages with Scarce Resources

PDF Version Also Available for Download.

Description

Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic. However, most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources. To build the parallel text I use the Bible. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of ... continued below

Creation Information

Loza, Christian May 2009.

Context

This thesis is part of the collection entitled: UNT Theses and Dissertations and was provided by UNT Libraries to Digital Library, a digital repository hosted by the UNT Libraries. It has been viewed 307 times . More information about this thesis can be viewed below.

Who

People and organizations associated with either the creation of this thesis or its content.

Chair

Committee Members

Publisher

Rights Holder

For guidance see Citations, Rights, Re-Use.

  • Loza, Christian

Provided By

UNT Libraries

With locations on the Denton campus of the University of North Texas and one in Dallas, UNT Libraries serves the school and the community by providing access to physical and online collections; The Portal to Texas History and UNT Digital Libraries; academic research, and much, much more.

Contact Us

What

Descriptive information to help identify this thesis. Follow the links below to find similar items on the Digital Library.

Description

Our generation has experienced one of the most dramatic changes in how society communicates. Today, we have online information on almost any imaginable topic. However, most of this information is available in only a few dozen languages. In this thesis, I explore the use of parallel texts to enable cross-language information retrieval (CLIR) for languages with scarce resources. To build the parallel text I use the Bible. I evaluate different variables and their impact on the resulting CLIR system, specifically: (1) the CLIR results when using different amounts of parallel text; (2) the role of paraphrasing on the quality of the CLIR output; (3) the impact on accuracy when translating the query versus translating the collection of documents; and finally (4) how the results are affected by the use of different dialects. The results show that all these variables have a direct impact on the quality of the CLIR system.

Language

Identifier

Unique identifying numbers for this thesis in the Digital Library or other systems.

Collections

This thesis is part of the following collection of related materials.

UNT Theses and Dissertations

Theses and dissertations represent a wealth of scholarly and artistic content created by masters and doctoral students in the degree-seeking process. Some ETDs in this collection are restricted to use by the UNT community.

What responsibilities do I have when using this thesis?

When

Dates and time periods associated with this thesis.

Creation Date

  • May 2009

Added to The UNT Digital Library

  • March 17, 2010, 11:40 a.m.

Description Last Updated

  • Jan. 21, 2014, 1:59 p.m.

Usage Statistics

When was this thesis last used?

Yesterday: 0
Past 30 days: 3
Total Uses: 307

Interact With This Thesis

Here are some suggestions for what to do next.

Start Reading

PDF Version Also Available for Download.

Citations, Rights, Re-Use

Loza, Christian. Cross Language Information Retrieval for Languages with Scarce Resources, thesis, May 2009; Denton, Texas. (digital.library.unt.edu/ark:/67531/metadc12157/: accessed October 21, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; .