Toward a Unified Retrieval Outcome Analysis Framework for Cross-Language Information Retrieval
This paper is part of the collection entitled: UNT Scholarly Works and was provided to UNT Digital Library by the UNT College of Information.
ASIST 2005 Contributed Paper - Jiangping Chen
Table 5. Summary of Query Translation using the LKB and the LDC Dictionary

                                    lkb_tdn         ldc_tdn
Total Terms Evaluated               1538            1610
Number of Correct Translations      1204 (78.3%)    1185 (73.6%)
Number of Incorrect Translations    260 (16.9%)     282 (17.5%)
Number of Missing Translations      74 (4.8%)       143 (8.9%)
Individual Query Analysis
In this component, the researcher was interested in what contributed to the good performance of the LKB on
certain queries, and what caused its failure on others. The first factor considered was translation
effectiveness. The analysis above revealed that EC-CLIR using the LKB achieved better retrieval performance
than EC-CLIR using the LDC dictionary, and that query translation using the LKB was better as well. On the
surface, one would expect a correlation between the difference in EC-CLIR performance and the difference in
the percentages of correct, incorrect, and missing translations. However, this did not hold for the queries
tested in this study. A correlation analysis using Spearman's rho found that the difference in average
precision between lkb_tdn and ldc_tdn had no correlation with the difference in the percentage of correct,
incorrect, or missing translations.³
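The correlation test described above can be sketched in a few lines of Python. The sketch below computes Spearman's rho from ranks using only the standard library; the per-topic numbers are illustrative placeholders, not the study's actual data.

```python
# Sketch of a Spearman's rho test: Pearson correlation of rank vectors,
# using only the standard library. The per-topic data are illustrative
# placeholders, not the values from the study.
from statistics import mean

def ranks(values):
    """Assign 1-based average ranks, handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied rank positions
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors of x and y."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) *
           sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

# Illustrative per-topic differences (lkb_tdn minus ldc_tdn):
ap_diff = [0.05, -0.02, 0.10, 0.01, -0.07, 0.03]          # average precision
correct_pct_diff = [4.7, -1.2, 0.5, 6.3, -3.1, 2.0]       # % correct translations

print(round(spearman_rho(ap_diff, correct_pct_diff), 3))  # prints 0.486
```

A rho near zero, as the study found, would indicate no monotone association between the two difference series.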
The researcher then examined two types of topics, Hard & Stable and Hard & Unstable, to explore the reasons
behind the above results and other major factors affecting system performance. Analysis of Hard & Stable
topics may reveal the causes of generally hard topics, and analysis of Hard & Unstable topics may suggest
ways to improve the performance of the CLIR system using the LKB.
Eight topics fell into the Hard & Stable category, with an average precision score lower than 0.17 in all
four runs: topics 1, 5, 6, 13, 14, 18, 34, and 46. These queries were resistant to translation errors; the
query translation results had little effect on their retrieval performance. Among them, topics 1, 5, 14, and
18 had also proved difficult for TREC-5 monolingual participating systems, with median average precision
lower than 0.15. To investigate the reasons, the top 10 documents returned by the best-performing of the
four runs were examined. Table 6 presents characteristics of these topics, including the run that returned
the highest average precision (AP), the value of that AP, the query length, the number of relevant
documents, and the number of relevant documents returned in the top 10 by that run. For most of the Hard &
Stable topics, very few relevant documents appeared in the top 10.
Table 6. Hard & Stable Topics

Topic  Run returning   Highest   Original English query  # of relevant  # of relevant
ID     the highest AP  AP score  length (in words)       documents      documents in top 10
1      mono_tdn        0.1502    56                      13             1
5      ldc_tdn         0.1069    70                      28             3
6      mono_tdn        0.1325    37                      77             4
13     ldc_tdn         0.0787    36                      110            0
14     mono_tdn        0.0558    45                      57             2
18     lkb_tdn         0.1214    93                      102            1
34     ldc_tdn         0.1632    65                      95             5
46     mono_tdn        0.1443    68                      166            6
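Average precision, the metric reported in Table 6, rewards systems that rank relevant documents near the top of the list. A minimal sketch of the standard computation follows; the ranked list and relevance judgments are illustrative, not drawn from the study's data.

```python
def average_precision(ranked_ids, relevant_ids):
    """Mean of precision@k over the ranks k where a relevant document appears,
    divided by the total number of judged-relevant documents.

    `ranked_ids` is the system's ranked list of document IDs;
    `relevant_ids` is the set of IDs judged relevant for the topic.
    """
    relevant_ids = set(relevant_ids)
    hits = 0
    precision_sum = 0.0
    for k, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            hits += 1
            precision_sum += hits / k  # precision at this rank
    return precision_sum / len(relevant_ids) if relevant_ids else 0.0

# Illustrative: relevant documents retrieved at ranks 1 and 4,
# out of three judged relevant in total.
ap = average_precision(["d1", "d7", "d3", "d9"], {"d1", "d9", "d5"})
print(round(ap, 3))  # (1/1 + 2/4) / 3 = 0.5
```

Because each hit is weighted by its rank, a topic with relevant documents buried deep in the ranking, as with most Hard & Stable topics here, yields a low AP even when some relevant documents are eventually retrieved.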
A manual inspection of the top 10 retrieved documents (both relevant and irrelevant) for each topic was
conducted. Table 7 summarizes the observations from comparing the relevant and irrelevant documents in the
top 10 sets. Most of the 8 topics appear to need a retrieval strategy different from the traditional tf-idf
IR model applied by the system. For example, the query from topic 6, "International support of
³ The values of Spearman's rho between the difference in average precision between lkb_tdn and ldc_tdn and the
differences in the percentages of correct, incorrect, and missing translations are 0.144, -0.082, and -0.018,
respectively. None is significant.
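The traditional tf-idf model mentioned above scores a document by term frequency discounted by how common each term is across the collection. A textbook sketch follows; the toy corpus is illustrative, and this is not the exact weighting scheme used by the paper's system.

```python
import math
from collections import Counter

def tfidf_score(query_terms, doc_terms, corpus):
    """Score one tokenized document against a query with basic tf-idf.

    tf  = raw term count in the document;
    idf = log(N / df), where N is the corpus size and df is the number
    of documents containing the term. A textbook sketch, not the exact
    weighting used by the study's retrieval system.
    """
    n_docs = len(corpus)
    tf = Counter(doc_terms)
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)
        if df == 0:
            continue  # a term absent from the corpus contributes nothing
        score += tf[term] * math.log(n_docs / df)
    return score

# Illustrative toy corpus of three tokenized documents.
corpus = [
    ["tigers", "asia", "conservation"],
    ["asia", "economy", "trade"],
    ["tigers", "habitat", "loss"],
]
query = ["tigers", "asia"]
scores = [tfidf_score(query, doc, corpus) for doc in corpus]
best = max(range(len(corpus)), key=lambda i: scores[i])  # doc 0 matches both terms
```

A model like this ranks purely on term overlap, which is why topics whose relevance hinges on concepts not stated literally in the query can defeat it regardless of translation quality.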