Series Foreword
Preface

Introduction to Semi-Supervised Learning
    Supervised, Unsupervised, and Semi-Supervised Learning
    When Can Semi-Supervised Learning Work?
    Classes of Algorithms and Organization of This Book

Generative Models

A Taxonomy for Semi-Supervised Learning Methods
    The Semi-Supervised Learning Problem
    Paradigms for Semi-Supervised Learning
    Examples
    Conclusions

Semi-Supervised Text Classification Using EM
    Introduction
    A Generative Model for Text
    Experimental Results with Basic EM
    Using a More Expressive Generative Model
    Overcoming the Challenges of Local Maxima
    Conclusions and Summary

Risks of Semi-Supervised Learning
    Do Unlabeled Data Improve or Degrade Classification Performance?
    Understanding Unlabeled Data: Asymptotic Bias
    The Asymptotic Analysis of Generative Semi-Supervised Learning
    The Value of Labeled and Unlabeled Data
    Finite Sample Effects
    Model Search and Robustness
    Conclusion

Probabilistic Semi-Supervised Clustering with Constraints
    Introduction
    HMRF Model for Semi-Supervised Clustering
    HMRF-KMeans Algorithm
    Active Learning for Constraint Acquisition
    Experimental Results
    Related Work
    Conclusions

Low-Density Separation

Transductive Support Vector Machines
    Introduction
    Transductive Support Vector Machines
    Why Use Margin on the Test Set?
    Experiments and Applications of TSVMs
    Solving the TSVM Optimization Problem
    Connection to Related Approaches
    Summary and Conclusions

Semi-Supervised Learning Using Semi-Definite Programming
    Relaxing SVM Transduction
    An Approximation for Speedup
    General Semi-Supervised Learning Settings
    Empirical Results
    Summary and Outlook
    Appendix
    The Extended Schur Complement Lemma

Gaussian Processes and the Null-Category Noise Model
    Introduction
    The Noise Model
    Process Model and the Effect of the Null-Category
    Posterior Inference and Prediction
    Results
    Discussion

Entropy Regularization
    Introduction
    Derivation of the Criterion
    Optimization Algorithms
    Related Methods
    Experiments
    Conclusion
    Appendix
    Proof of Theorem 9.1

Data-Dependent Regularization
    Introduction
    Information Regularization on Metric Spaces
    Information Regularization and Relational Data
    Discussion

Graph-Based Models

Label Propagation and Quadratic Criterion
    Introduction
    Label Propagation on a Similarity Graph
    Quadratic Cost Criterion
    From Transduction to Induction
    Incorporating Class Prior Knowledge
    Curse of Dimensionality for Semi-Supervised Learning
    Discussion

The Geometric Basis of Semi-Supervised Learning
    Introduction
    Incorporating Geometry in Regularization
    Algorithms
    Data-Dependent Kernels for Semi-Supervised Learning
    Linear Methods for Large-Scale Semi-Supervised Learning
    Connections to Other Algorithms and Related Work
    Future Directions

Discrete Regularization
    Introduction
    Discrete Analysis
    Discrete Regularization
    Conclusion

Semi-Supervised Learning with Conditional Harmonic Mixing
    Introduction
    Conditional Harmonic Mixing
    Learning in CHM Models
    Incorporating Prior Knowledge
    Learning the Conditionals
    Model Averaging
    Experiments
    Conclusions

Change of Representation

Graph Kernels by Spectral Transforms
    The Graph Laplacian
    Kernels by Spectral Transforms
    Kernel Alignment
    Optimizing Alignment Using QCQP for Semi-Supervised Learning
    Semi-Supervised Kernels with Order Constraints
    Experimental Results
    Conclusion

Spectral Methods for Dimensionality Reduction
    Introduction
    Linear Methods
    Graph-Based Methods
    Kernel Methods
    Discussion

Modifying Distances
    Introduction
    Estimating DBD Metrics
    Computing DBD Metrics
    Semi-Supervised Learning Using Density-Based Metrics
    Conclusions and Future Work

Semi-Supervised Learning in Practice

Large-Scale Algorithms
    Introduction
    Cost Approximations
    Subset Selection
    Discussion

Semi-Supervised Protein Classification Using Cluster Kernels
    Introduction
    Representation and Kernels for Protein Sequences
    Semi-Supervised Kernels for Protein Sequences
    Experiments
    Discussion

Prediction of Protein Function from Networks
    Introduction
    Graph-Based Semi-Supervised Learning
    Combining Multiple Graphs
    Experiments on Function Prediction of Proteins
    Conclusion and Outlook

Analysis of Benchmarks
    The Benchmark
    Application of SSL Methods
    Results and Discussion

Perspectives

An Augmented PAC Model for Semi-Supervised Learning
    Introduction
    A Formal Framework
    Sample Complexity Results
    Algorithmic Results
    Related Models and Discussion

Metric-Based Approaches for Semi-Supervised Regression and Classification
    Introduction
    Metric Structure of Supervised Learning
    Model Selection
    Regularization
    Classification
    Conclusion

Transductive Inference and Semi-Supervised Learning
    Problem Settings
    Problem of Generalization in Inductive and Transductive Inference
    Structure of the VC Bounds and Transductive Inference
    The Symmetrization Lemma and Transductive Inference
    Bounds for Transductive Inference
    The Structural Risk Minimization Principle for Induction and Transduction
    Combinatorics in Transductive Inference
    Measures of Size of Equivalence Classes
    Algorithms for Inductive and Transductive SVMs
    Semi-Supervised Learning
    Conclusion: Transductive Inference and the New Problems of Inference
    Beyond Transduction: Selective Inference

A Discussion of Semi-Supervised Learning and Transduction

References
Notation and Symbols
Contributors
Index
Online Index