
Learning to Classify Text Using Support Vector Machines: Methods, Theory and Algorithms


ISBN-10: 079237679X

ISBN-13: 9780792376798

Edition: 2002

Authors: Thorsten Joachims

List price: $109.99

This volume provides a description of the support vector machine approach to learning text classifiers, including training algorithms, transductive text classification, and efficient performance estimation.

Book details

Copyright year: 2002
Publisher: Springer
Publication date: 4/30/2002
Binding: Hardcover
Pages: 205
Size: 6.10" wide x 9.25" long x 1.00" tall
Weight: 2.420 lbs
Language: English

Foreword
Preface
Acknowledgments
Notation
Introduction
Challenges
Goals
Overview and Structure of the Argument
Theory
Methods
Algorithms
Summary
Text Classification
Learning Task
Binary Setting
Multi-Class Setting
Multi-Label Setting
Representing Text
Word Level
Sub-Word Level
Multi-Word Level
Semantic Level
Feature Selection
Feature Subset Selection
Feature Construction
Term Weighting
Conventional Learning Methods
Naive Bayes Classifier
Rocchio Algorithm
k-Nearest Neighbors
Decision Tree Classifier
Other Methods
Performance Measures
Error Rate and Asymmetric Cost
Precision and Recall
Precision/Recall Breakeven Point and Fβ-Measure
Micro- and Macro-Averaging
Experimental Setup
Test Collections
Design Choices
Support Vector Machines
Linear Hard-Margin SVMs
Soft-Margin SVMs
Non-Linear SVMs
Asymmetric Misclassification Cost
Other Maximum-Margin Methods
Further Work and Further Information
Part I: Theory
A Statistical Learning Model of Text Classification for SVMs
Properties of Text-Classification Tasks
High-Dimensional Feature Space
Sparse Document Vectors
Heterogeneous Use of Terms
High Level of Redundancy
Frequency Distribution of Words and Zipf's Law
A Discriminative Model of Text Classification
Step 1: Bounding the Expected Error Based on the Margin
Step 2: Homogeneous TCat-Concepts as a Model of Text-Classification Tasks
Step 3: Learnability of TCat-Concepts
Comparing the Theoretical Model with Experimental Results
Sensitivity Analysis: Difficult and Easy Learning Tasks
Influence of Occurrence Frequency
Discriminative Power of Term Sets
Level of Redundancy
Noisy TCat-Concepts
Limitations of the Model and Open Questions
Related Work
Summary and Conclusions
Efficient Performance Estimators for SVMs
Generic Performance Estimators
Training Error
Hold-Out Testing
Bootstrap and Jackknife
Cross-Validation and Leave-One-Out
ξα-Estimators
Error Rate
Recall, Precision, and F1
Fast Leave-One-Out Estimation
Experiments
How Large are Bias and Variance of the ξα-Estimators?
What is the Influence of the Training Set Size?
How Large is the Efficiency Improvement for Exact Leave-One-Out?
Summary and Conclusions
Part II: Methods
Inductive Text Classification
Learning Task
Automatic Model and Parameter Selection
Leave-One-Out Estimator of the PRBEP
ξα-Estimator of the PRBEP
Model-Selection Algorithm
Experiments
Word Weighting, Stemming and Stopword Removal
Trading Off Training Error vs. Complexity
Non-Linear Classification Rules
Comparison with Conventional Methods
Related Work
Summary and Conclusions
Transductive Text Classification
Learning Task
Transductive Support Vector Machines
What Makes TSVMs Well Suited for Text Classification?
An Intuitive Example
Transductive Learning of TCat-Concepts
Experiments
Constraints on the Transductive Hyperplane
Relation to Other Approaches Using Unlabeled Data
Probabilistic Approaches using EM
Co-Training
Other Work on Transduction
Summary and Conclusions
Part III: Algorithms
Training Inductive Support Vector Machines
Problem and Approach
General Decomposition Algorithm
Selecting a Good Working Set
Convergence
How to Compute the Working Set
Shrinking: Reducing the Number of Variables
Efficient Implementation
Termination Criteria
Computing the Gradient and the Termination Criteria Efficiently
What are the Computational Resources Needed in each Iteration?
Caching Kernel Evaluations
How to Solve the QP on the Working Set
Related Work
Experiments
Training Times for Reuters, WebKB, and Ohsumed
How does Training Time Scale with the Number of Training Examples?
What is the Influence of the Working-Set-Selection Strategy?
What is the Influence of Caching?
What is the Influence of Shrinking?
Summary and Conclusions
Training Transductive Support Vector Machines
Problem and Approach
The TSVM Algorithm
Analysis of the Algorithm
How does the Algorithm work?
Convergence
Experiments
Does the Algorithm Effectively Maximize Margin?
Training Times for Reuters, WebKB, and Ohsumed
How does Training Time Scale with the Number of Training Examples?
How does Training Time Scale with the Number of Test Examples?
Related Work
Summary and Conclusions
Conclusions
Open Questions
Bibliography
Appendices
SVM-Light Commands and Options
Index