Introduction
A Natural Representation of Data Similarity
Goals
Outline
Basic Structure of High-Dimensional Spaces
Comparing Attributes
Comparing Records
Similarity
High-Dimensional Spaces
Summary
Algorithms
Improving the Natural Geometry
Projection
Singular Value Decompositions
Random Projections
Algorithms that Find Standalone Clusters
Clusters Based on Density
Parallel Coordinates
Independent Component Analysis
Latent Dirichlet Allocation
Algorithms that Find Clusters and Their Relationships
Clusters Based on Distance
Clusters Based on Distribution
Semidiscrete Decomposition
Hierarchical Clustering
Minimum Spanning Tree with Collapsing
Overall Process for Constructing a Skeleton
Algorithms that Wrap Clusters
Distance-Based
Distribution-Based
1-Class Support Vector Machines
Autoassociative Neural Networks
Covers
Algorithms to Place Boundaries Between Clusters
Support Vector Machines
Random Forests
Overall Process for Constructing Empty Space
Summary
Spaces with a Single Center
Using Distance
Using Density
Understanding the Skeleton
Understanding Empty Space
Summary
Spaces with Multiple Centers
What is a Cluster?
Identifying Clusters
Clusters Known Already
Finding Clusters
Finding the Skeleton
Empty Space
An Outer Boundary and Novel Data
Interesting Data
One-Cluster Boundaries
One-Cluster-Against-the-Rest Boundaries
Summary
Representation by Graphs
Building a Graph from Records
Local Similarities
Embedding Choices
Using the Embedding for Clustering
Summary
Using Models of High-Dimensional Spaces
Understanding Clusters
Structure in the Set of Clusters
Semantic Stratified Sampling
Ranking Using the Skeleton
Ranking Using Empty Space
Applications to Streaming Data
Concealment
Summary
Including Contextual Information
What is Context?
Changing Data
Changing Analyst and Organizational Properties
Changing Algorithmic Properties
Letting Context Change the Models
Recomputing the View
Recomputing Derived Structures
Recomputing the Clustering
Summary
Conclusions
References
Index