Week 2, Ch. 3: Feature extraction from audio signals
A. Filtering
B. Linear predictive coding (LPC)
C. Cepstrum
Feature extraction, v.9a
(A) Filtering
• Ways to find the spectral envelope
– Filter banks: uniform
– Filter banks can also be non-uniform
– LPC and cepstral LPC parameters
• Vector quantization: a method to represent data more efficiently
[Figure: spectral envelope, energy vs. frequency, with the outputs of filters 1 to 4 marked along the frequency axis]
You can see the filter band output for a frame using Windows Media Player
• Try to look at it
– Run Windows Media Player and play some music
– Right-click and select Visualization / Bars and Waves
• Video demo
[Figure: spectral envelope, energy vs. frequency]
Speech recognition idea using 4 linear filters, each with a bandwidth of 2.5 kHz
• Two sounds with two spectral envelopes (SE), SEar and SEei, e.g. the spectral envelope of "ar" and the spectral envelope of "ei"
[Figure: two panels, energy vs. frequency from 0 to 10 kHz. Spectrum A: spectral envelope SEar = "ar", with filter outputs v1, v2, v3, v4 for filters 1 to 4. Spectrum B: spectral envelope SEei = "ei", with filter outputs w1, w2, w3, w4 for filters 1 to 4.]
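The idea of reducing a spectrum to four filter outputs can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `band_energies`, the choice to sum (rather than average) the energy inside each band, and the sample spectrum are all hypothetical and not taken from the slides.

```python
def band_energies(freqs_hz, energies, n_filters=4, max_freq_hz=10_000):
    """Sum spectral energy into n_filters equal-width bands covering 0..max_freq_hz.

    Assumption: each filter output is the total energy that falls inside its band.
    """
    bandwidth = max_freq_hz / n_filters  # 2.5 kHz for 4 filters over 10 kHz
    outputs = [0.0] * n_filters
    for f, e in zip(freqs_hz, energies):
        if 0 <= f < max_freq_hz:
            outputs[int(f // bandwidth)] += e
    return outputs

# Hypothetical spectrum samples (frequency in Hz, energy in arbitrary units).
freqs = [500, 3000, 3500, 9000]
energy = [1.0, 2.0, 3.0, 4.0]
v = band_energies(freqs, energy)  # one value per filter: v1, v2, v3, v4
```

Each sound is then represented by its short vector of filter outputs instead of its full spectrum.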
Difference between two sounds (or spectral envelopes SE, SE')
• Difference between two sounds, e.g.
– SEar = {v1, v2, v3, v4} = "ar"
– SEei = {w1, w2, w3, w4} = "ei"
• A simple measure of the difference is
$$\mathrm{Dist} = \sqrt{|v_1 - w_1|^2 + |v_2 - w_2|^2 + |v_3 - w_3|^2 + |v_4 - w_4|^2}$$
• where |x| = magnitude of x
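The distance measure above is the Euclidean distance between the two vectors of filter outputs. A minimal sketch follows; the function name `envelope_distance` and the numeric filter outputs are made up for illustration and do not come from the slides.

```python
import math

def envelope_distance(v, w):
    """Euclidean distance between two equal-length vectors of filter outputs."""
    if len(v) != len(w):
        raise ValueError("both envelopes need the same number of filter outputs")
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(v, w)))

# Hypothetical filter outputs, for illustration only.
se_ar = [3.0, 1.0, 0.0, 2.0]  # v1..v4
se_ei = [0.0, 1.0, 4.0, 2.0]  # w1..w4
dist = envelope_distance(se_ar, se_ei)
```

A small distance suggests the two frames have similar spectral envelopes; a distance of zero means the filter outputs are identical.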
Filtering method
• For each frame (10 to 30 ms), a set of filter outputs is calculated (frame overlap: 5 ms).
• There are many different methods for setting the filter bandwidths: uniform or non-uniform.
[Figure: input waveform divided into 30 ms time frames i, i+1, i+2, with adjacent frames overlapping by 5 ms; filter outputs (v1, v2, …) for frame i, (v'1, v'2, …) for frame i+1, (v''1, v''2, …) for frame i+2]
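The framing step can be sketched as below, using 30 ms frames with a 5 ms overlap as in the slide. The function name `split_into_frames`, the 10 kHz sampling rate, the placeholder signal, and the decision to drop a trailing partial frame are assumptions made for this example.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=30, overlap_ms=5):
    """Cut a sample sequence into fixed-length frames that overlap by overlap_ms.

    Assumption: a trailing partial frame is dropped rather than zero-padded.
    """
    frame_len = sample_rate_hz * frame_ms // 1000
    hop = sample_rate_hz * (frame_ms - overlap_ms) // 1000  # advance between frame starts
    if hop <= 0:
        raise ValueError("overlap must be shorter than the frame")
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(samples[start:start + frame_len])
        start += hop
    return frames

# Placeholder signal: 100 ms of dummy samples at an assumed 10 kHz sampling rate.
signal = list(range(1000))
frames = split_into_frames(signal, 10_000)
```

With these numbers each frame holds 300 samples and consecutive frames share 50 samples. A set of filter outputs would then be computed for each frame separately.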
How to determine filter band ranges
• The previous example of using 4 linear filters is too simple and primitive.
• We will discuss
– Uniform filter banks
– Log frequency banks
– Mel filter bands
Uniform filter banks
• Uniform filter banks
– Bandwidth B = sampling frequency (Fs) / number of banks (N)
– For example, Fs = 10 kHz, N = 20, then B = 500 Hz
– Simple to implement but not too useful
[Figure: filter output V vs. frequency (Hz); filters 1, 2, 3, 4, 5, …, Q with band edges at 500, 1K, 1.5K, 2K, 2.5K, 3K, …; outputs v1, v2, v3, …]
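The band edges of a uniform filter bank follow directly from the slide's formula B = Fs / N. The sketch below reproduces the slide's example; the function name `uniform_bands` is hypothetical.

```python
def uniform_bands(fs_hz, n_banks):
    """Return (low, high) edges in Hz for n_banks equal-width bands, using B = Fs / N."""
    bandwidth = fs_hz / n_banks
    return [(k * bandwidth, (k + 1) * bandwidth) for k in range(n_banks)]

# The slide's example: Fs = 10 kHz and N = 20 give B = 500 Hz.
bands = uniform_bands(10_000, 20)
first_band = bands[0]   # (0.0, 500.0)
sixth_band = bands[5]   # (2500.0, 3000.0)
```

Note that a signal sampled at Fs only carries content up to Fs/2, so with this formula only the lower half of the bands can receive energy.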
Non-uniform filter banks: log frequency
• Log frequency scale: close to the human ear
                    filter 1   filter 2   filter 3   filter 4
Center freq. (Hz)   300        600        1200       2400
Bandwidth (Hz)      200        400        800        1600
[Figure: filter output V vs. frequency (Hz), with band edges at 200, 400, 800, 1600, 3200; outputs v1, v2, v3, …]
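The values in the table are consistent with band edges that double from one filter to the next (200, 400, 800, 1600, 3200 Hz), with each center taken as the midpoint of its band. The sketch below rebuilds the table under that reading; the function name `octave_bands` and the dictionary layout are assumptions for illustration.

```python
def octave_bands(first_edge_hz, n_filters):
    """Build bands whose upper edge is twice the lower edge (one octave each).

    Assumption: the center frequency is the arithmetic midpoint of the band,
    which matches the values in the slide's table.
    """
    bands = []
    low = first_edge_hz
    for _ in range(n_filters):
        high = 2 * low
        bands.append({
            "low": low,
            "high": high,
            "center": (low + high) / 2,
            "bandwidth": high - low,
        })
        low = high
    return bands

bands = octave_bands(200, 4)
centers = [b["center"] for b in bands]
bandwidths = [b["bandwidth"] for b in bands]
```

Compared with a uniform bank, the low-frequency bands are narrow and the high-frequency bands are wide.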
Inner ear and the cochlea (the human ear also has filter bands)
• Ear and cochlea
[Figure: structure of the ear, and the basilar membrane of the cochlea with positions labeled by frequency]
http://universe-review.ca/I10-85-cochlea2.jpg
http://www.edu.ipa.go.jp/chiyo/HuBEd/HTML1/en/3D/ear.html