Date: January 2001
Creator: Mihalcea, Rada, 1974- & Moldovan, Dan I.
Description: This article discusses document indexing using named entities. Abstract: Current text indexing and retrieval techniques have their roots in the field of Information Retrieval, where the task is to extract the documents that best match a query. With an ever-increasing number of documents easily accessible through the Internet, the challenge is to provide users with concise and relevant information. The authors propose a novel yet simple approach that indexes the named entities in documents in order to improve the relevance of the documents retrieved. Experiments in finding information related to a set of 75 input questions, from a large collection of 125,000 documents, show that this new technique reduces the number of retrieved documents by a factor of 2 while still retrieving the relevant documents.
Contributing Partner: UNT College of Engineering