Phone based acoustic modeling for automatic speech recognition for Punjabi language

Keywords

Acoustic model
Phone
Triphones
Gemination
Speech corpus
Pronunciation dictionary

How to Cite

1.
Ghai W, Singh N. Phone based acoustic modeling for automatic speech recognition for Punjabi language. J. of Speech Sci. [Internet]. 2021 Feb. 5 [cited 2024 Apr. 27];3(1):68-83. Available from: https://econtents.bc.unicamp.br/inpec/index.php/joss/article/view/15040

Abstract

Punjabi is a tonal language of the Indo-Aryan family with speakers around the world. It has gained acceptance in media and communication and therefore deserves a place in the growing field of automatic speech recognition (ASR), which has already been explored successfully for a number of other Indian and foreign languages. Some work has been done on isolated-word speech recognition for Punjabi, but only with whole-word acoustic models; a phone-based approach has yet to be applied. This paper describes an automatic speech recognizer that recognizes isolated-word and connected-word speech using a triphone-based acoustic model built with the HTK 3.4.1 toolkit, and compares its performance with an ASR system based on whole-word acoustic models. Word recognition accuracy for isolated-word speech was 92.05% with the whole-word model and 97.14% with the triphone model; for connected-word speech it was 87.75% with the whole-word model and 91.62% with the triphone model.
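
As an illustration only, the sketch below shows what a triphone is and how the reported accuracies translate into error reduction. It expands a phone sequence into word-internal triphones written in the `l-p+r` label convention used by HTK, and computes the relative reduction in word error implied by the accuracy figures in the abstract. The function names and the sample phone sequence `k a m` are hypothetical; they are not taken from the paper's phone set or pronunciation dictionary, and the paper itself does not report relative error reduction.

```python
def to_triphones(phones):
    """Expand a monophone sequence into word-internal triphones (l-p+r labels)."""
    triphones = []
    for i, phone in enumerate(phones):
        # Left context is omitted for the first phone of the word.
        left = phones[i - 1] + "-" if i > 0 else ""
        # Right context is omitted for the last phone of the word.
        right = "+" + phones[i + 1] if i < len(phones) - 1 else ""
        triphones.append(left + phone + right)
    return triphones


def relative_error_reduction(baseline_acc, new_acc):
    """Relative reduction in word error (percent), given two accuracies in percent."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return 100.0 * (baseline_err - new_err) / baseline_err


# Hypothetical phone sequence, for illustration only.
print(to_triphones(["k", "a", "m"]))  # ['k+a', 'k-a+m', 'a-m']

# Accuracy figures as reported in the abstract.
isolated = relative_error_reduction(92.05, 97.14)
connected = relative_error_reduction(87.75, 91.62)
print(f"isolated words:  {isolated:.1f}% fewer word errors")
print(f"connected words: {connected:.1f}% fewer word errors")
```

Under these figures the triphone model removes roughly 64% of the word errors on isolated words and roughly 32% on connected words, relative to the whole-word baseline.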

https://doi.org/10.20396/joss.v3i1.15040


Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Copyright (c) 2013 W. Ghai, N. Singh
