Speech Synthesis and Recognition

ISBN-10: 0748408576
ISBN-13: 9780748408573
Edition: 2nd 2001 (Revised)
List price: $71.95

Book details

List price: $71.95
Edition: 2nd
Copyright year: 2001
Publisher: CRC Press LLC
Publication date: 12/6/2001
Binding: Paperback
Pages: 320
Size: 6.00" wide x 9.00" long x 0.75" tall
Weight: 1.100
Language: English

With the growing impact of information technology on daily life, speech is becoming increasingly important for providing a natural means of communication between humans and machines. This extensively reworked and updated new edition of Speech Synthesis and Recognition is an easy-to-read introduction to current speech technology.

Aimed at advanced undergraduates and graduates in electronic engineering, computer science and information technology, the book is also relevant to professional engineers who need to understand enough about speech technology to be able to apply it successfully and to work effectively with speech experts. No advanced mathematical ability is required and no specialist prior knowledge of phonetics or of the properties of speech signals is assumed.

Speech Synthesis and Recognition:
Explains the complexity of speech communication
Describes mechanisms and models of human speech production and perception
Covers concatenative synthesis techniques and formant synthesis by rule, as well as the processing required for synthesis from text
Introduces methods for automatic speech recognition by whole-word template matching and by statistical pattern matching using hidden Markov models
Describes practical techniques that contribute to the successful implementation of speech recognition systems, including those for recognizing very large vocabularies
Includes chapters covering the related technologies of digital speech coding and automatic recognition of speaker characteristics
Discusses applications and performance of current speech technology

Throughout the book the emphasis is on explaining underlying principles with sufficient but not unnecessary detail, so as to provide the reader with a thorough grounding in the problems and techniques in speech synthesis and recognition. This book is therefore ideal as an introduction before tackling more advanced texts.

Preface to the First Edition
Preface to the Second Edition
List of Abbreviations
Human Speech Communication
Value of speech for human-machine communication
Ideas and language
Relationship between written and spoken language
Phonetics and phonology
The acoustic signal
Phonemes, phones and allophones
Vowels, consonants and syllables
Phonemes and spelling
Prosodic features
Language, accent and dialect
Supplementing the acoustic signal
The complexity of speech processing
Chapter 1 summary
Chapter 1 exercises
Mechanisms and Models of Human Speech Production
Introduction
Sound sources
The resonant system
Interaction of laryngeal and vocal tract functions
Radiation
Waveforms and spectrograms
Speech production models
Excitation models
Vocal tract models
Chapter 2 summary
Chapter 2 exercises
Mechanisms and Models of the Human Auditory System
Introduction
Physiology of the outer and middle ears
Structure of the cochlea
Neural response
Psychophysical measurements
Analysis of simple and complex signals
Models of the auditory system
Mechanical filtering
Models of neural transduction
Higher-level neural processing
Chapter 3 summary
Chapter 3 exercises
Digital Coding of Speech
Introduction
Simple waveform coders
Pulse code modulation
Deltamodulation
Analysis/synthesis systems (vocoders)
Channel vocoders
Sinusoidal coders
LPC vocoders
Formant vocoders
Efficient parameter coding
Vocoders based on segmental/phonetic structure
Intermediate systems
Sub-band coding
Linear prediction with simple coding of the residual
Adaptive predictive coding
Multipulse LPC
Code-excited linear prediction
Evaluating speech coding algorithms
Subjective speech intelligibility measures
Subjective speech quality measures
Objective speech quality measures
Choosing a coder
Chapter 4 summary
Chapter 4 exercises
Message Synthesis from Stored Human Speech Components
Introduction
Concatenation of whole words
Simple waveform concatenation
Concatenation of vocoded words
Limitations of concatenating word-size units
Concatenation of sub-word units: general principles
Choice of sub-word unit
Recording and selecting data for the units
Varying durations of concatenative units
Synthesis by concatenating vocoded sub-word units
Synthesis by concatenating waveform segments
Pitch modification
Timing modification
Performance of waveform concatenation
Variants of concatenative waveform synthesis
Hardware requirements
Chapter 5 summary
Chapter 5 exercises
Phonetic Synthesis by Rule
Introduction
Acoustic-phonetic rules
Rules for formant synthesizers
Table-driven phonetic rules
Simple transition calculation
Overlapping transitions
Using the tables to generate utterances
Optimizing phonetic rules
Automatic adjustment of phonetic rules
Rules for different speaker types
Incorporating intensity rules
Current capabilities of phonetic synthesis by rule
Chapter 6 summary
Chapter 6 exercises
Speech Synthesis from Textual or Conceptual Input
Introduction
Emulating the human speaking process
Converting from text to speech
TTS system architecture
Overview of tasks required for TTS conversion
Text analysis
Text pre-processing
Morphological analysis
Phonetic transcription
Syntactic analysis and prosodic phrasing
Assignment of lexical stress and pattern of word accents
Prosody generation
Timing pattern
Fundamental frequency contour
Implementation issues
Current TTS synthesis capabilities
Speech synthesis from concept
Chapter 7 summary
Chapter 7 exercises
Introduction to Automatic Speech Recognition: Template Matching
Introduction
General principles of pattern matching
Distance metrics
Filter-bank analysis
Level normalization
End-point detection for isolated words
Allowing for timescale variations
Dynamic programming for time alignment
Refinements to isolated-word DP matching
Score pruning
Allowing for end-point errors
Dynamic programming for connected words
Continuous speech recognition
Syntactic constraints
Training a whole-word recognizer
Chapter 8 summary
Chapter 8 exercises
Introduction to Stochastic Modelling
Feature variability in pattern matching
Introduction to hidden Markov models
Probability calculations in hidden Markov models
The Viterbi algorithm
Parameter estimation for hidden Markov models
Forward and backward probabilities
Parameter re-estimation with forward and backward probabilities
Viterbi training
Vector quantization
Multi-variate continuous distributions
Use of normal distributions with HMMs
Probability calculations
Estimating the parameters of a normal distribution
Baum-Welch re-estimation
Viterbi training
Model initialization
Gaussian mixtures
Calculating emission probabilities
Baum-Welch re-estimation
Re-estimation using the most likely state sequence
Initialization of Gaussian mixture distributions
Tied mixture distributions
Extension of stochastic models to word sequences
Implementing probability calculations
Using the Viterbi algorithm with probabilities in logarithmic form
Adding probabilities when they are in logarithmic form
Relationship between DTW and a simple HMM
State durational characteristics of HMMs
Chapter 9 summary
Chapter 9 exercises
Introduction to Front-End Analysis for Automatic Speech Recognition
Introduction
Pre-emphasis
Frames and windowing
Filter banks, Fourier analysis and the mel scale
Cepstral analysis
Analysis based on linear prediction
Dynamic features
Capturing the perceptually relevant information
General feature transformations
Variable-frame-rate analysis
Chapter 10 summary
Chapter 10 exercises
Practical Techniques for Improving Speech Recognition Performance
Introduction
Robustness to environment and channel effects
Feature-based techniques
Model-based techniques
Dealing with unknown or unpredictable noise corruption
Speaker-independent recognition
Speaker normalization
Model adaptation
Bayesian methods for training and adaptation of HMMs
Adaptation methods based on linear transforms
Discriminative training methods
Maximum mutual information training
Training criteria based on reducing recognition errors
Robustness of recognizers to vocabulary variation
Chapter 11 summary
Chapter 11 exercises
Automatic Speech Recognition for Large Vocabularies
Introduction
Historical perspective
Speech transcription and speech understanding
Speech transcription
Challenges posed by large vocabularies
Acoustic modelling
Context-dependent phone modelling
Training issues for context-dependent models
Parameter tying
Training procedure
Methods for clustering model parameters
Constructing phonetic decision trees
Extensions beyond triphone modelling
Language modelling
N-grams
Perplexity and evaluating language models
Data sparsity in language modelling
Discounting
Backing off in language modelling
Interpolation of language models
Choice of more general distribution for smoothing
Improving on simple N-grams
Decoding
Efficient one-pass Viterbi decoding for large vocabularies
Multiple-pass Viterbi decoding
Depth-first decoding
Evaluating LVCSR performance
Measuring errors
Controlling word insertion errors
Performance evaluations
Speech understanding
Measuring and evaluating speech understanding performance
Chapter 12 summary
Chapter 12 exercises
Neural Networks for Speech Recognition
Introduction
The human brain
Connectionist models
Properties of ANNs
ANNs for speech recognition
Hybrid HMM/ANN methods
Chapter 13 summary
Chapter 13 exercises
Recognition of Speaker Characteristics
Characteristics of speakers
Verification versus identification
Assessing performance
Measures of verification performance
Speaker recognition
Text dependence
Methods for text-dependent/text-prompted speaker recognition
Methods for text-independent speaker recognition
Acoustic features for speaker recognition
Evaluations of speaker recognition performance
Language recognition
Techniques for language recognition
Acoustic features for language recognition
Chapter 14 summary
Chapter 14 exercises
Applications and Performance of Current Technology
Introduction
Why use speech technology?
Speech synthesis technology
Examples of speech synthesis applications
Aids for the disabled
Spoken warning signals, instructions and user feedback
Education, toys and games
Telecommunications
Speech recognition technology
Characterizing speech recognizers and recognition tasks
Typical recognition performance for different tasks
Achieving success with ASR in an application
Examples of ASR applications
Command and control
Education, toys and games
Dictation
Data entry and retrieval
Telecommunications
Applications of speaker and language recognition
The future of speech technology applications
Chapter 15 summary
Chapter 15 exercises
Future Research Directions in Speech Synthesis and Recognition
Introduction
Speech synthesis
Speech sound generation
Prosody generation and higher-level linguistic processing
Automatic speech recognition
Advantages of statistical pattern-matching methods
Limitations of HMMs for speech recognition
Developing improved recognition models
Relationship between synthesis and recognition
Automatic speech understanding
Chapter 16 summary
Chapter 16 exercises
Further Reading
Books
Journals
Conferences and workshops
The Internet
Reading for individual chapters
References
Solutions to Exercises
Glossary
Index
