Wavelet Based Noise Robust Features for Speaker Recognition

Vibha Tiwari; Jyoti Singhai

Creative Commons Attribution NonCommercial 4.0 International License

Pages - 52 - 64 | Revised - 01-05-2011 | Published - 31-05-2011

Published in Signal Processing: An International Journal (SPIJ)

Volume - 5 Issue - 2 | Publication Date - May / June 2011

KEYWORDS

Speaker Recognition, Mel Frequency Cepstral Coefficients (MFCC), Amplitude Modulation (AM), Wavelet Filterbank.

ABSTRACT

Extracting and selecting the best parametric representation of the acoustic signal is the most important task in designing any speaker recognition system. A wide range of possibilities exists for parametrically representing the speech signal, such as Linear Prediction Coding (LPC), Mel Frequency Cepstral Coefficients (MFCC), and others. MFCC are currently the most popular choice for speaker recognition systems. One shortcoming of MFCC is that the signal is assumed to be stationary within the given time frame, so MFCC cannot analyze non-stationary signals and are therefore not well suited to noisy speech. To overcome this problem, several researchers have used different types of AM-FM modulation/demodulation techniques for extracting features from the speech signal. Other approaches propose using wavelet filterbanks for extracting the features. This paper proposes a feature extraction technique that combines the two approaches: features are extracted from the envelope of the signal and then passed through a wavelet filterbank. The proposed method is found to outperform the existing feature extraction techniques.
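The combination described in the abstract (an amplitude envelope followed by a wavelet filterbank) can be illustrated with a minimal sketch. The abstract does not state which envelope detector, wavelet, decomposition depth, or feature statistic the authors use, so everything below is an assumption chosen for brevity: a rectify-and-smooth envelope, a Haar wavelet, three dyadic levels, and log subband energies. The function names `envelope`, `haar_step`, and `wavelet_envelope_features` are hypothetical and do not come from the paper.

```python
import math


def envelope(signal, window=8):
    """Crude AM envelope (assumed detector): full-wave rectify, then moving average."""
    rectified = [abs(x) for x in signal]
    out = []
    running = 0.0
    for i, value in enumerate(rectified):
        running += value
        if i >= window:
            running -= rectified[i - window]  # drop the sample leaving the window
        out.append(running / min(i + 1, window))
    return out


def haar_step(x):
    """One orthonormal Haar analysis level: (approximation, detail), each half length."""
    s = math.sqrt(2.0)
    pairs = range(0, len(x) - 1, 2)
    approx = [(x[i] + x[i + 1]) / s for i in pairs]
    detail = [(x[i] - x[i + 1]) / s for i in pairs]
    return approx, detail


def wavelet_envelope_features(frame, levels=3, window=8):
    """Log mean energy of each Haar subband of the frame's envelope.

    Illustrative only: the frame needs at least 2**levels samples.
    """
    approx = envelope(frame, window)
    bands = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)   # high-pass branch at this level
    bands.append(approx)       # final low-pass residue
    eps = 1e-12                # avoids log(0) for silent subbands
    return [math.log(sum(c * c for c in band) / len(band) + eps) for band in bands]


# Usage: an amplitude-modulated tone standing in for one speech frame.
frame = [math.sin(0.3 * n) * (1.0 + 0.5 * math.sin(0.03 * n)) for n in range(256)]
features = wavelet_envelope_features(frame)
print(len(features))  # one value per subband: levels + 1
```

In practice a Hilbert-transform or energy-operator envelope and a longer wavelet would likely replace the stand-ins used here; the sketch only shows how the two stages chain together on a single frame.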

CITED BY (5)

1	Faek, F. K. (2015). Objective Gender and Age Recognition from Speech Sentences.

2	Farouk, M. H. (2014). Speaker Recognition. In Application of Wavelets in Speech Processing (pp. 33-35). Springer International Publishing.

3	Vignolo, L. D., Milone, D. H., & Rufiner, H. L. (2013). Genetic wavelet packets for speech recognition. Expert Systems with Applications, 40(6), 2350-2359.

4	Karamangala, N., & Kumaraswamy, R. (2013). Speaker Recognition in Uncontrolled Environment: A Review. Journal of Intelligent Systems, 22(1), 49-65.

5	Faek, F. K., & Al-Talabani, A. K. (2013). Speaker Recognition from Noisy Spoken Sentences. International Journal of Computer Applications, 70(20), 11-14.

ABSTRACTING & INDEXING

1	Google Scholar

2	CiteSeerX

3	refSeek

4	iSEEK

5	Scribd

6	SlideShare

7	PdfSR

MANUSCRIPT AUTHORS

Mr. Vibha Tiwari

Gyan Ganga Institute of Technology and Management, Bhopal, India

vibhatiwari19@gmail.com

Dr. Jyoti Singhai

Maulana Azad National Institute of Technology, Bhopal, India
