Designing of vowel articulation training system for hearing impaired children as an assistive tool.


ITSI Transactions on Electrical and Electronics Engineering (ITSI-TEEE)


ISSN (PRINT): 2320-8945, Volume 2, Issue 1, 2014, p. 41


¹Rupali S. Kad, ²R. P. Mudhalwadkar

¹PG Student, ²Associate Professor, Dept. of Instrumentation & Control, College of Engineering, Pune, Maharashtra, India

Email: ¹[email protected], ²[email protected]

Abstract: A vowel articulation training system for hearing impaired children has been developed. It consists of a MATLAB-based GUI interfaced with a microcontroller. The system gives visual feedback on spoken vowels, i.e., whether a vowel is pronounced correctly or not. In this paper, we discuss the development of the vowel training system for hearing impaired children, specifically children aged between 5 and 10 years whose mother tongue is Marathi.

The formants produced by the vocal tract depend on the position of the jaw and tongue and on the shape of the mouth opening. Vowels in English are determined by how far the mouth is opened and by where the tongue constricts the passage through the mouth: at the front, at the back, or in between. The formant range of the same vowel differs between accents. To form a normative data set, 750 vowel samples from normal speakers were collected. We discuss the formation of the vowel database and the vowel recognition results obtained using the linear predictive coding (LPC) method. The correct recognition obtained from this system is over 80%.

Keywords: Feature extraction, LPC, Vowel Database, GUI

I. INTRODUCTION

Children use their hearing to develop the language skills they need to communicate, so a hearing loss can make communication difficult. If a child has a hearing loss, the basic development of language will often be delayed. Children with mild to severe hearing loss can develop understandable speech with the right intervention and amplification, so the earlier the hearing loss is detected and treated, the better. Such children can receive special speech and language therapy. Sign language is one of the methods used in speech training, but the majority of people in society cannot read or use sign language. This creates a situation in which children who cannot communicate verbally are excluded from society and miss a large part of their social learning experiences.

A vowel is a speech sound made by the vocal cords. Vowels are the basic building blocks of word formation and pronunciation. The airflow for a vowel comes from the lungs, passes through the vocal cords, and is not blocked, so there is no friction. All English words have vowels. Each spoken word is created from a phonetic combination of a limited set of vowels and consonants. Vowel training is therefore an important part of speech therapy.

Representations of the speech signal in the frequency domain are of great importance in studying the nature of the speech signal and its acoustic properties. Vowels, that is, /a/, /e/, /i/, /o/, /u/, are voiced components of speech. The excitation is periodic: it is generated by the vocal cords vibrating at the fundamental frequency, and the sound is modulated as it passes through the vocal tract. Many researchers have worked on this topic. Some commercial speech recognition software is also available, but mainly for American English or European languages. The proposed system is an assistive tool for the speech training of hearing impaired children aged between 5 and 10 years whose mother tongue is Marathi.

This paper is divided into five sections, followed by references. Section I gives the introduction. Section II details the formation of the vowel database. Section III focuses on system implementation, Section IV covers the results, and Section V gives the conclusion.

II. VOWEL DATABASE FORMATION

The database was formed from a total of 50 individuals, children of both genders from municipal schools and apartments. The speakers were children with no obvious speech defects. The recordings were made using an omnidirectional microphone and a laptop running a sound wave recorder, at a sampling frequency of 8000 Hz. The samples were recorded in a closed room without background noise. The speakers were seated facing the microphone at a distance of about 1-3 cm.

• Vowels: a, e, i, o, u

• Female vowel samples: 675

• Male vowel samples: 575

The database was formed with samples from 23 male and 27 female normal speakers aged 5-10 years. The mother tongue of all speakers was Marathi. Each speaker was asked to speak the 5 vowels with 5 utterances of each vowel, so a total of 25 utterances were recorded for each speaker.
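The sample counts listed above follow directly from the recording protocol. The short Python sketch below reproduces that arithmetic. The variable and function names are illustrative only, and Python is used here for convenience although the system itself is built in MATLAB.

```python
# Speaker counts and recording protocol as stated in the text.
SPEAKERS = {"male": 23, "female": 27}
VOWELS = ["a", "e", "i", "o", "u"]
UTTERANCES_PER_VOWEL = 5


def samples_per_speaker():
    """Number of recorded utterances contributed by one speaker."""
    return len(VOWELS) * UTTERANCES_PER_VOWEL


def samples_by_gender():
    """Total recorded utterances for each gender group."""
    per_speaker = samples_per_speaker()
    return {gender: count * per_speaker for gender, count in SPEAKERS.items()}


counts = samples_by_gender()
total = sum(counts.values())
print(counts, total)
```

The per-gender totals agree with the 675 female and 575 male samples listed for the database.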

III. SYSTEM DESIGN

The vowel training system is designed to record, analyze, and display the pronounced vowel. MATLAB software is used for the analysis of the vowel and for the graphical user interface. In addition, a microcontroller is interfaced with MATLAB to display the pronounced vowel.

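Fig 1. Block diagram of vowel training system (blocks: vowel sound, microphone, PC with vowel processing and GUI, line driver, microcontroller, output)

A. Vowel recognition process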

Vowel utterances were recorded with an omnidirectional microphone, stored in the MATLAB workspace, and then processed using MATLAB software. The audio signal, sampled at 8 kHz, is processed for feature extraction, for which LPC is used. The processing steps are as follows.

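Fig 2. Vowel recognition process (blocks: vowel sound, sampling, pre-emphasis, frame blocking, Hamming windowing, autocorrelation, LPC analysis)

1. Recording of vowel: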

All vowel samples were recorded with an omnidirectional microphone. To avoid noise while recording, the optimal distance between microphone and speaker was determined. To do this, the power spectrum of the audio signal at the output of an audio amplifier was observed at different distances.

Fig 3. Audio amplifier set up for generation of audio signal

Fig 4. Power spectrum of 1 kHz audio signal at 1 cm distance

Fig 5. Power spectrum of 2 kHz audio signal at 4 cm distance

Fig 5 shows distortion at the fundamental compared with Fig 4. The speakers were seated facing the microphone at a distance of about 1-3 cm.

2. Sampling:

Sampling is the process of converting a continuous-time signal into a discrete-time signal. The sampling rate selected is 8000 samples/second. The speech signal is considered to occupy 300 to 3000 Hz. A sampling rate of 8000 samples/second gives a Nyquist frequency of 4000 Hz, which is adequate for a 3000 Hz voice signal.

3. Pre-emphasis:

Pre-emphasis boosts the magnitude of the higher frequencies relative to the lower frequencies. Its purpose is to improve the signal-to-noise ratio by reducing the adverse effects of attenuation distortion, and to shape the voice signal so that the amplitudes of low and high frequencies are more nearly equal before further processing. To do this, a pre-emphasis filter of the form H(z) = 1 - 0.99 z^-1 is normally used.
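In the time domain, the filter 1 - 0.99 z^-1 corresponds to the difference equation y[n] = x[n] - 0.99 x[n-1]. The following is a minimal Python sketch of that equation, not the MATLAB code used in the system. The function name and the choice to pass the first sample through unchanged are assumptions made for illustration.

```python
def pre_emphasis(signal, alpha=0.99):
    """Apply y[n] = x[n] - alpha * x[n-1] to a list of samples.

    The first sample has no predecessor, so it is passed through
    unchanged (an assumption; other conventions exist).
    """
    if not signal:
        return []
    output = [signal[0]]
    for n in range(1, len(signal)):
        output.append(signal[n] - alpha * signal[n - 1])
    return output


# A constant (low-frequency) input is almost cancelled, while an
# alternating (high-frequency) input is nearly doubled in amplitude.
print(pre_emphasis([1.0, 1.0, 1.0]))
print(pre_emphasis([1.0, -1.0, 1.0]))
```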

4. Frame Blocking & Windowing

The vocal resonances and their variation over time carry phonetic information. This information is analyzed in the short-time spectrum of the signal. The speech signal is divided into frames (Fig 5), and each frame can be analyzed separately. So that the signal within each frame can be treated as stationary, a 20-25 ms window is applied at 10 ms intervals.
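At the 8000 Hz sampling rate used here, a 25 ms frame holds 200 samples and a 10 ms interval is a hop of 80 samples. The Python sketch below illustrates frame blocking with a Hamming window under those figures. The function names, the 25 ms default, and the decision to drop a trailing partial frame are assumptions rather than details taken from the system.

```python
import math


def hamming(n_points):
    """Symmetric Hamming window of n_points samples."""
    if n_points == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def frame_signal(signal, fs=8000, frame_ms=25, hop_ms=10):
    """Split a signal into overlapping Hamming-windowed frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    frame_len = int(fs * frame_ms / 1000)  # 200 samples at 8 kHz
    hop = int(fs * hop_ms / 1000)          # 80 samples at 8 kHz
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, window)])
    return frames


# One second of a constant signal, just to show the frame geometry.
frames = frame_signal([1.0] * 8000)
print(len(frames), len(frames[0]))
```

Each windowed frame can then be passed to the LPC analysis step on its own.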

5. LPC Analysis

Linear predictive coding (LPC) is a digital method for encoding an analog signal in which a particular value is predicted by a linear function of the past values of the signal. It was proposed as a method for encoding human speech by the United States Department of Defense in Federal Standard 1015, published in 1984. LPC determines the coefficients of a forward linear predictor by minimizing the prediction error in the least squares sense. It is widely used in filter design and speech coding.

In MATLAB, [a,g] = lpc(x,p) finds the coefficients of a pth-order linear predictor (FIR filter) that predicts the current value of the real-valued time series x based on past samples. Here p is the order of the prediction filter polynomial, a = [1 a(2) ... a(p+1)]. If p is not specified, lpc takes p = length(x)-1 as the default value. If x is a matrix containing a separate signal in each column, lpc returns a model estimate for each column in the rows of matrix a and a column vector of prediction error variances g. The order p must be less than or equal to the length of x. The lpc function uses the autocorrelation method of autoregressive (AR) modeling to find the filter coefficients.
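To make the autocorrelation method concrete, the sketch below computes LPC coefficients with the Levinson-Durbin recursion in plain Python. It is an illustration of the general technique, not a reimplementation of MATLAB's lpc, and its scaling and edge-case handling may differ. The paper does not state how formants are obtained from the coefficients, so the `envelope_peaks` helper, which picks peaks of the LPC spectral envelope on a frequency grid, is an assumed approach. All function names are hypothetical.

```python
import math


def autocorrelation(x, max_lag):
    """Biased autocorrelation estimates r[0..max_lag] of a non-empty frame."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) / n
            for lag in range(max_lag + 1)]


def lpc(x, order):
    """Return (a, err) with a = [1, a1, ..., ap] via Levinson-Durbin.

    The predictor is x_hat[n] = -(a1*x[n-1] + ... + ap*x[n-p]) and
    err is the final prediction error power.
    """
    r = autocorrelation(x, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err


def envelope_peaks(a, fs=8000, step_hz=10.0):
    """Frequencies (Hz) of local maxima of 1/|A(e^jw)| on a grid.

    Assumed formant estimator; the grid step limits the resolution.
    """
    n_points = int((fs / 2) / step_hz) + 1
    mags = []
    for i in range(n_points):
        w = 2.0 * math.pi * (i * step_hz) / fs
        re = sum(c * math.cos(w * k) for k, c in enumerate(a))
        im = -sum(c * math.sin(w * k) for k, c in enumerate(a))
        mags.append(1.0 / max(math.hypot(re, im), 1e-12))
    return [i * step_hz for i in range(1, n_points - 1)
            if mags[i - 1] < mags[i] > mags[i + 1]]


# Example: a decaying exponential behaves like a first-order AR signal.
coeffs, error = lpc([0.9 ** n for n in range(200)], order=2)
print(coeffs, error)
```

In practice `lpc` would be applied to each pre-emphasized, windowed frame, and the resulting peak frequencies compared with the formant ranges for the selected vowel.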

B. Graphical User Interface Development

A graphical user interface (GUI) is a type of user interface that allows users to interact with electronic devices through graphical icons and visual indicators such as labels or text. MATLAB applications are self-contained MATLAB programs with GUI front ends that automate a task or calculation. A GUI typically contains controls such as menus, toolbars, buttons, and sliders. A MATLAB-based GUI was developed to record a .wav file or load a .wav file from a chosen location. Toggle buttons are used to select the vowel to be analyzed. If the utterance of the vowel is correct, the vowel is displayed in the vowel text box; otherwise a message indicating an incorrect utterance is displayed. The structure of the graphical user interface is shown in Fig 6.

Fig 6. Graphical user interface

C. Hardware Environment

The hardware environment for this work consists of a P89V51 microcontroller, a PC, an RS232 driver/receiver, and a DB-9 serial cable. The Philips microcontroller development board is interfaced with the PC. The PC is used to write user-specified embedded programs to be executed by the Philips microcontroller. Furthermore, the PC hosts an interactive GUI with which the user records and loads audio files and visualizes the pronounced vowel. The microcontroller and the PC communicate over a serial interface. In this work, we use a P89V51RD2BN, a 40-pin, 8-bit CMOS flash microcontroller in a dual inline package. To facilitate serial communication between the microcontroller and the PC, we interface an RS232 driver/receiver with the P89V51RD2BN. The effectiveness of our MATLAB-based GUI environment in interacting with the microcontroller is demonstrated by exporting the analyzed vowel of the speaker from the MATLAB GUI on the PC to the microcontroller.

IV. EXPERIMENTAL RESULTS

Fig 7. Recognition of vowel /a/

The .wav files of the vowels are analyzed and the formant ranges for the vowels are determined. Fig 7 and Fig 8 show that vowel /a/ and vowel /e/ are pronounced correctly.

Fig 8. Recognition of vowel /e/

Fig 9. Incorrect pronunciation of vowel /e/

Fig 9 shows the result for an utterance of vowel /e/ by a hearing impaired child. Since the formants differ from those of a normal speaker, the system displays that the vowel is not pronounced correctly.

Fig 10. Simulation result of microcontroller-PC interfacing

Fig 11. Microcontroller board

Fig 10 shows the Proteus simulation result of the microcontroller interfacing with the PC. Fig 11 shows that vowel /e/ is pronounced correctly.

TABLE I. Recognition of vowels

Vowel   Number of samples   Samples recognized   Samples not recognized   Correct recognition (%)
/a/     135                 109                  26                       80.7
/e/     130                 107                  23                       82.30
/i/     135                 105                  30                       77.77
/o/     135                 113                  22                       83.7
/u/     150                 124                  26                       82.66
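The correct recognition percentages in Table I can be recomputed from the sample counts. The Python sketch below does so and also derives an overall rate across all vowels, which is a derived figure not given in the table. The names used are illustrative only.

```python
# (number of samples, samples recognized) for each vowel, from Table I.
TABLE_I = {
    "a": (135, 109),
    "e": (130, 107),
    "i": (135, 105),
    "o": (135, 113),
    "u": (150, 124),
}


def recognition_rate(total, recognized):
    """Percentage of samples that were recognized."""
    return 100.0 * recognized / total


rates = {vowel: recognition_rate(total, recognized)
         for vowel, (total, recognized) in TABLE_I.items()}
overall = recognition_rate(sum(t for t, _ in TABLE_I.values()),
                           sum(r for _, r in TABLE_I.values()))
best = max(rates, key=rates.get)

for vowel, rate in rates.items():
    print(f"/{vowel}/ {rate:.2f}%")
print(f"overall {overall:.2f}%, best /{best}/")
```

The recomputed values match the table to within rounding (the table truncates some entries), and the overall rate is consistent with the claim of over 80% correct recognition.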

TABLE II. Formant Ranges

Vowel   Formant range (Hz)
/a/     450-700
/e/     290-400
/i/     500-900
/o/     400-550
/u/     350-470
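One way to read Table II is as a per-vowel acceptance test: the user selects the target vowel, and the utterance is accepted if the measured formant lies within that vowel's range. The Python sketch below illustrates this reading. It assumes a single measured formant frequency in Hz, since the table does not say which formant is meant, and the function names and feedback strings are hypothetical. Because several ranges overlap, the check needs the target vowel as an input and cannot identify a vowel on its own.

```python
# Formant ranges in Hz, taken from Table II.
FORMANT_RANGES_HZ = {
    "a": (450, 700),
    "e": (290, 400),
    "i": (500, 900),
    "o": (400, 550),
    "u": (350, 470),
}


def is_pronounced_correctly(target_vowel, formant_hz):
    """True if the measured formant lies within the target vowel's range."""
    low, high = FORMANT_RANGES_HZ[target_vowel]
    return low <= formant_hz <= high


def feedback(target_vowel, formant_hz):
    """Text to display for the selected vowel (hypothetical wording)."""
    if is_pronounced_correctly(target_vowel, formant_hz):
        return f"/{target_vowel}/"
    return "Incorrect utterance"


print(feedback("e", 350))
print(feedback("e", 600))
```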

Fig 12. Recognition of vowels (bar chart of % correct recognition for each vowel)

V. CONCLUSION

From the experimental results, it can be concluded that LPC can recognize the speech signal well. The highest correct recognition achieved is 83.7%, for vowel /o/ (Table I). In further work, to obtain better recognition, another recognition method such as an ANN or a neuro-fuzzy method can be applied in this system. The low cost and good performance of this system indicate that it will be useful as an assistive tool in the vowel training of hearing impaired children. Compared with a common multimedia sound card, which adds significant noise, the above system has potential for speech training at home for the hearing impaired.

ACKNOWLEDGMENT

It is my pleasure to take this opportunity to thank my respected guide, Dr. R. P. Mudhalwadkar, who imparted valuable knowledge for the development of this system.

REFERENCES

[1] M. Y. Shahrul Azmi, “An improved feature extraction method for Malay vowel recognition based on spectrum data,” International Journal of Software Engineering and Its Applications, vol. 8, no. 1, 2014, pp. 413-426.

[2] Y. A. Alotaibi and A. Hussain, “Comparative analysis of Arabic vowels using formants and an automatic speech recognition system,” International Journal of Signal Processing, Image Processing and Pattern Recognition, vol. 3, 2010.

[3] L. Rabiner and B. Juang, Fundamentals of Speech Recognition. Prentice Hall, 1993.

[4] A. Keerio, B. K. Mitra, P. Birch, R. Young, and C. Chatwin, “On preprocessing of speech signals,” International Journal of Signal Processing, vol. 5, 2009, pp. 216-222.

[5] L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals, 1st ed. Prentice-Hall.

[6] X. Huang, A. Acero, and H. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development. Upper Saddle River, NJ, USA: Prentice Hall PTR, 2001.


