Lip Reading by Using 3-D Discrete Wavelet Transform with Dmey Wavelet
Sunil S. Morade, Suprava Patnaik
Pages - 384 - 396 | Revised - 10-09-2014 | Published - 10-10-2014
Volume - 8 Issue - 5 | Publication Date - September / October 2014
Keywords: 2-D DWT, 3-D DWT, Dmey Wavelet, BPNN, SVM, Lip Reading.
Lip movement is a useful way to communicate with machines, and it is especially helpful in noisy environments. However, recognizing lip motion is difficult because the region of interest (ROI) is nonlinear and noisy. The proposed lip reading method uses a two-stage feature extraction mechanism that is precise, discriminative, and computationally efficient. The first stage converts the video frame data into a 3-dimensional space, and the second stage trims down the raw information space by using the 3-D Discrete Wavelet Transform (DWT). The resulting features are small in size and form the basis of a novel lip reading system. In addition to the novel feature extraction technique, we compare the performance of a Back Propagation Neural Network (BPNN) classifier and a Support Vector Machine (SVM) classifier. The CUAVE and Tulips databases are used for experimentation. Experimental results show that 3-D DWT feature mining is better than 2-D DWT, and that 3-D DWT with the Dmey wavelet gives better results than 3-D DWT with Db4. The results also show that 3-D DWT-Dmey with the BPNN classifier outperforms the SVM classifier.
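The two-stage feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the ROI frames are equal-sized grayscale arrays with even dimensions, it substitutes the two-tap Haar filter for the Dmey wavelet to keep the example short, and it keeps only the low-pass (LLL) subband as the feature vector, which the abstract does not confirm. The function names `stack_frames`, `dwt3_lll`, and `extract_features` are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_low(seq):
    """Low-pass Haar analysis of an even-length sequence (approximation coefficients)."""
    return [(seq[i] + seq[i + 1]) / SQRT2 for i in range(0, len(seq), 2)]


def stack_frames(frames):
    """Stage 1: stack equal-sized 2-D ROI frames into a volume indexed [t][y][x]."""
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must have the same size")
    return [[list(row) for row in frame] for frame in frames]


def dwt3_lll(volume):
    """Stage 2: one-level separable 3-D Haar DWT, keeping only the LLL subband.

    Assumption: Haar stands in for the Dmey filter bank used in the paper.
    """
    # Filter along x (within each row).
    vol = [[haar_low(row) for row in frame] for frame in volume]
    # Filter along y (down each column): transpose, filter, transpose back.
    vol = [
        [list(row) for row in zip(*[haar_low(col) for col in zip(*frame)])]
        for frame in vol
    ]
    # Filter along t (across consecutive frame pairs) at every pixel position.
    n_t, n_y, n_x = len(vol), len(vol[0]), len(vol[0][0])
    return [
        [
            [(vol[t][y][x] + vol[t + 1][y][x]) / SQRT2 for x in range(n_x)]
            for y in range(n_y)
        ]
        for t in range(0, n_t, 2)
    ]


def extract_features(frames):
    """Run both stages and flatten the LLL subband into a feature vector."""
    lll = dwt3_lll(stack_frames(frames))
    return [value for frame in lll for row in frame for value in row]
```

For example, four 4x4 frames (64 raw values) reduce to 8 features after one decomposition level, since each axis is halved. A classifier such as a BPNN or SVM would then be trained on these vectors.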
CITED BY (1)  
1 Morade, S. S., & Patnaik, S. (2015, January). A Genetic Algorithm-based 3D feature selection for lip reading. In Pervasive Computing (ICPC), 2015 International Conference on (pp. 1-6). IEEE.
Mr. Sunil S. Morade
SVNIT, Surat - India
Professor Suprava Patnaik
Ex-Professor, SVNIT, Surat - India