The UNT College of Engineering strives to educate and train engineers and technologists who have the vision to recognize and solve the problems of society. The college comprises six degree-granting departments of instruction and research.
This paper discusses letter-level learning for language-independent diacritics restoration.
Physical Description
7 p.
Notes
Abstract: This paper presents a method for diacritics restoration based on learning mechanisms that act at letter level. The method requires no additional tagging tools or resources other than raw text, which makes it independent of the language, and particularly appealing for languages for which there are few resources available. The algorithm was evaluated on four different languages, namely Czech, Hungarian, Polish, and Romanian, and an average accuracy of over 98% was observed.
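The letter-level idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the context width of two letters, the padding symbol, and above all the learner. The abstract does not name the learning algorithm, so a plain frequency table over letter contexts stands in for it. The sketch only shows that training instances can be derived from raw text alone: strip the diacritics, then learn to map each ambiguous letter and its neighbouring letters back to the original form.

```python
import unicodedata
from collections import Counter, defaultdict

PAD = "_"  # hypothetical padding symbol for positions beyond the text edges


def strip_char(ch):
    """Return the base letter of a precomposed character, e.g. 'ă' -> 'a'."""
    return unicodedata.normalize("NFD", ch)[0]


def strip_text(text):
    """Remove diacritics letter by letter, so positions stay aligned."""
    text = unicodedata.normalize("NFC", text)
    return "".join(strip_char(ch) for ch in text)


def context(plain, i, width):
    """Letters to the left and right of position i in the stripped text."""
    left = plain[max(0, i - width):i].rjust(width, PAD)
    right = plain[i + 1:i + 1 + width].ljust(width, PAD)
    return left, right


def train(text, width=2):
    """Build a frequency table from raw text only (no tagging tools).

    A simple stand-in learner, not the method evaluated in the paper.
    """
    text = unicodedata.normalize("NFC", text)
    plain = strip_text(text)
    # A base letter is ambiguous if it appears with a diacritic somewhere.
    ambiguous = {p for p, t in zip(plain, text) if p != t}
    table = defaultdict(Counter)
    for i, (p, t) in enumerate(zip(plain, text)):
        if p in ambiguous:
            table[(p, context(plain, i, width))][t] += 1
            table[(p, None)][t] += 1  # context-free fallback counts
    return ambiguous, table, width


def restore(plain, model):
    """Pick the most frequent original letter for each ambiguous letter."""
    ambiguous, table, width = model
    out = []
    for i, ch in enumerate(plain):
        if ch in ambiguous:
            counts = table.get((ch, context(plain, i, width))) or table[(ch, None)]
            out.append(counts.most_common(1)[0][0])
        else:
            out.append(ch)
    return "".join(out)


model = train("țară și masă")
restored = restore(strip_text("țară și masă"), model)
```

The sketch relies on Unicode decomposition to strip diacritics, so letters with no canonical decomposition (for example Polish "ł") pass through unchanged and would need an explicit mapping.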
This paper is part of the following collection of related materials.
UNT Scholarly Works
Materials from the UNT community's research, creative, and scholarly activities and UNT's Open Access Repository. Access to some items in this collection may be restricted.
Mihalcea, Rada, 1974- & Nastase, Vivi. Letter Level Learning for Language Independent Diacritics Restoration, paper, September 2002; (https://digital.library.unt.edu/ark:/67531/metadc30944/: accessed April 24, 2024), University of North Texas Libraries, UNT Digital Library, https://digital.library.unt.edu; crediting UNT College of Engineering.