The enhancement of machine translation for low-density languages using Web-gathered parallel texts.
Mohler, Michael Augustine Gaylord, The enhancement of machine
translation for low-density languages using Web-gathered parallel texts. Master
of Science (Computer Science), December 2007, 61 pp., 12 tables, 9
illustrations, bibliography, 25 titles.
The majority of the world's languages are poorly represented in
informational media like radio, television, newspapers, and the Internet.
Translation into and out of these languages may offer a way for speakers of
these languages to interact with the wider world, but current statistical machine
translation models are only effective with a large corpus of parallel texts - texts in
two languages that are translations of one another - which most languages lack.
This thesis describes the Babylon project, which attempts to alleviate this
shortage by supplementing existing parallel texts with texts gathered
automatically from the Web -- specifically targeting pages that contain text in a
pair of languages. Results indicate that parallel texts gathered from the Web can
be effectively used as a source of training data for machine translation and can
significantly improve the translation quality for text in a similar domain. However,
the small quantity of high-quality low-density language parallel texts on the Web
remains a significant obstacle.
Mohler, Michael Augustine Gaylord. The enhancement of machine translation for low-density languages using Web-gathered parallel texts, thesis, December 2007; Denton, Texas. (https://digital.library.unt.edu/ark:/67531/metadc5140/m1/2/: accessed April 18, 2019), University of North Texas Libraries, Digital Library, https://digital.library.unt.edu.