Accessibility and Integrity of Networked Information Collections
113 p. : ill. ; 28 cm.
even attempts at actual language understanding). Recently, a major focus on the
application of these sophisticated hybrid methods in large production textual databases
by the DARPA TIPSTER [Harman, 1992] and TREC [Harman, 1993a; Harman, 1993b]
projects has produced some impressive successes and may encourage their transition
from research efforts to more broadly deployed systems. The difficulty with all of these
sophisticated methods, however, is that their operation is incomprehensible to almost
all users. It is very difficult to predict what they will retrieve and what they will ignore.
Some critics of these approaches have termed them "information retrieval as magic".
These technologies raise very real integrity and access issues in that they work
reasonably well often enough to be useful but seldom work perfectly; worse, they fail
drastically in a reasonable number of cases. And information seekers not only have no
idea what these retrieval systems are doing, but very little sense of when they are or
are not working right; and, as they move from one system to another (as will be
increasingly common in a networked information environment) they also have no sense
of the specific features and idiosyncrasies of a given retrieval system. And,
unfortunately, little effort seems to have been invested in researching effective means
for these systems to explain and document their processes to their users; such features
would help a great deal.
To some extent, these sophisticated "voodoo" retrieval systems have been kept from
the general public by groups like librarians who are sufficiently information-retrieval
literate to recognize the problems and be alarmed by them. The general public won't
care; as soon as these developing technologies become effective enough to provide a
useful answer most of the time, the public will accept them (and swear at the "stupid
computers" in cases where they don't work), unless we see an unprecedented rise in
public literacy about information and information retrieval techniques. The unreliability
of probabilistic and statistically based retrieval algorithms is not a problem that
the public understands today; without such understanding, people may well fall victim to
the limitations of these algorithms simply because the algorithms are easier to use than
more traditional, deterministic approaches.
6. Access to and Integrity of the Historical and Scholarly Record
One can consider a printed work as knowledge bound at a given time. For example, an
encyclopedia published on a certain date represents the common wisdom of society
about a number of topics as of some point in time. Indeed, old encyclopedias, obsolete
textbooks, out-of-date subject heading classification guides and other literature
represent primary databases for cultural research [37] and for understanding our culture's
view of the world at a given time. The scholarly record in any given area, taken as a
series of frozen artifacts narrowly spaced in time, can be viewed as such a historical
record.
The same issue applies to mass media. The daily, weekly and monthly publications of
popular journals provide a nearly continuous chronology of the shifting perceptions of
any number of cultural issues. The selection criteria for what is published are
themselves a very important part of the cultural record, and represent very definite
[37] I am indebted to Professor Michael Buckland for illuminating this point.
United States. Congress. Office of Technology Assessment. Accessibility and Integrity of Networked Information Collections, report, August 1993; [Washington D.C.]. (https://digital.library.unt.edu/ark:/67531/metadc39703/m1/42/: accessed April 24, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT Libraries Government Documents Department.