Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

Physical Description

20 p.

Creation Information

Hogden, J. November 5, 1996.

Context

This report is part of the collection entitled Office of Scientific & Technical Information Technical Reports and was provided by the UNT Libraries Government Documents Department to the UNT Digital Library, a digital repository hosted by the UNT Libraries. More information about this report can be viewed below.

Who

People and organizations associated with either the creation of this report or its content.

Author

  • Hogden, J.

Provided By

UNT Libraries Government Documents Department

Serving as both a federal and a state depository library, the UNT Libraries Government Documents Department maintains millions of items in a variety of formats. The department is a member of the FDLP Content Partnerships Program and an Affiliated Archive of the National Archives.

What

Descriptive information to help identify this report. Follow the links below to find similar items on the Digital Library.

Description

The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that, by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model-based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge of articulation, or in the formulation of that knowledge, may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and, in fact, unique) quality of Malcom is that, even though it makes use of a mapping between acoustics and articulation, it can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation from acoustic data alone, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments demonstrating that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.

Notes

OSTI as DE97002800

Source

  • Other Information: PBD: 5 Nov 1996

Identifier

Unique identifying numbers for this report in the Digital Library or other systems.

  • Other: DE97002800
  • Report No.: LA-UR--96-3945
  • Grant Number: W-7405-ENG-36
  • DOI: 10.2172/431136
  • Office of Scientific & Technical Information Report Number: 431136
  • Archival Resource Key: ark:/67531/metadc679293

Collections

This report is part of the following collection of related materials.

Office of Scientific & Technical Information Technical Reports

When

Dates and time periods associated with this report.

Creation Date

  • November 5, 1996

Added to The UNT Digital Library

  • July 25, 2015, 2:20 a.m.

Description Last Updated

  • Feb. 29, 2016, 9:25 p.m.

Citations, Rights, Re-Use

Hogden, J. Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding, report, November 5, 1996; New Mexico. (digital.library.unt.edu/ark:/67531/metadc679293/: accessed August 21, 2017), University of North Texas Libraries, Digital Library, digital.library.unt.edu; crediting UNT Libraries Government Documents Department.