Preface | |
The Nature of Speech | p. 1 |
Mechanism of speech production | p. 3 |
Source-filter model of speech production | p. 5 |
Speech sounds | p. 7 |
Co-articulation and prosody | p. 9 |
Time waveforms and frequency spectra | p. 11 |
The human auditory system | p. 13 |
Digital Speech | p. 17 |
Sampling | p. 18 |
Pre-sampling filter | p. 21 |
Quantisation | p. 22 |
Uniform quantisation | p. 23 |
Logarithmic quantisation | p. 25 |
Adaptive quantisation | p. 28 |
Differential quantisation (DPCM) | p. 29 |
Adaptive differential quantisation (ADPCM) | p. 30 |
Delta modulation | p. 31 |
Parametric Speech Analysis | p. 35 |
Pre-emphasis | p. 36 |
Filter banks for short-time spectral analysis | p. 37 |
Discrete Fourier transform (DFT) | p. 44 |
Fast Fourier transform (FFT) | p. 50 |
Cepstral analysis of speech | p. 53 |
The autocorrelation function | p. 58 |
Linear predictive analysis (LPA) | p. 59 |
Pitch-synchronous analysis | p. 67 |
Feature Extraction | p. 70 |
Short-time energy function | p. 70 |
Zero-crossing rate | p. 71 |
Endpoint detection | p. 72 |
Vector quantisation | p. 72 |
Formant tracking | p. 74 |
Pitch extraction | p. 78 |
Gold-Rabiner pitch extractor | p. 79 |
Autocorrelation methods | p. 81 |
The SIFT algorithm | p. 84 |
Post-processing of pitch contours | p. 85 |
Phonetic analysis | p. 85 |
Speech Synthesis | p. 88 |
History of speech synthesis | p. 88 |
Formant synthesisers | p. 92 |
Linear predictive synthesisers | p. 100 |
Copy synthesis | p. 101 |
Phoneme synthesis | p. 102 |
Concatenation of multi-phonemic units | p. 107 |
Text-to-speech synthesis | p. 108 |
Articulatory speech synthesis | p. 111 |
Speech Coding | p. 122 |
Sub-band coding | p. 123 |
Transform coding | p. 125 |
Channel vocoder | p. 127 |
Formant vocoder | p. 129 |
Cepstral vocoder | p. 130 |
Linear predictive vocoders | p. 130 |
The LPC-10 algorithm | p. 132 |
Multi-pulse and RELP vocoders | p. 134 |
Vector quantiser coders | p. 136 |
Automatic Speech Recognition | p. 138 |
Problems in ASR | p. 138 |
Dynamic time-warping (DTW) | p. 140 |
Isolated word recognition | p. 141 |
Pattern matching | p. 141 |
Speaker-independent recognition | p. 146 |
Pattern classification (decision rule) | p. 147 |
Connected-word recognition | p. 147 |
Hidden Markov models | p. 152 |
Word recognition using HMMs | p. 155 |
Training hidden Markov models | p. 162 |
Speaker identification/verification | p. 163 |
Future trends | p. 166 |
Front-end processing | p. 167 |
Hidden Markov models | p. 168 |
Neural networks | p. 169 |
Speech understanding | p. 171 |
References | p. 174 |
Index | p. 176 |