| |
| |
Preface | |
| |
| |
| |
Introduction: Data-Analytic Thinking | |
| |
| |
The Ubiquity of Data Opportunities | |
| |
| |
Example: Hurricane Frances | |
| |
| |
Example: Predicting Customer Churn | |
| |
| |
Data Science, Engineering, and Data-Driven Decision Making | |
| |
| |
Data Processing and "Big Data" | |
| |
| |
From Big Data 1.0 to Big Data 2.0 | |
| |
| |
Data and Data Science Capability as a Strategic Asset | |
| |
| |
Data-Analytic Thinking | |
| |
| |
This Book | |
| |
| |
Data Mining and Data Science, Revisited | |
| |
| |
Chemistry Is Not About Test Tubes: Data Science Versus the Work of the Data Scientist | |
| |
| |
Summary | |
| |
| |
| |
Business Problems and Data Science Solutions | |
| |
| |
Fundamental concepts: A set of canonical data mining tasks; The data mining process; Supervised versus unsupervised data mining. | |
| |
| |
From Business Problems to Data Mining Tasks | |
| |
| |
Supervised Versus Unsupervised Methods | |
| |
| |
Data Mining and Its Results | |
| |
| |
The Data Mining Process | |
| |
| |
Business Understanding | |
| |
| |
Data Understanding | |
| |
| |
Data Preparation | |
| |
| |
Modeling | |
| |
| |
Evaluation | |
| |
| |
Deployment | |
| |
| |
Implications for Managing the Data Science Team | |
| |
| |
Other Analytics Techniques and Technologies | |
| |
| |
Statistics | |
| |
| |
Database Querying | |
| |
| |
Data Warehousing | |
| |
| |
Regression Analysis | |
| |
| |
Machine Learning and Data Mining | |
| |
| |
Answering Business Questions with These Techniques | |
| |
| |
Summary | |
| |
| |
| |
Introduction to Predictive Modeling: From Correlation to Supervised Segmentation | |
| |
| |
Fundamental concepts: Identifying informative attributes; Segmenting data by progressive attribute selection. | |
| |
| |
Exemplary techniques: Finding correlations; Attribute/variable selection; Tree induction. | |
| |
| |
Models, Induction, and Prediction | |
| |
| |
Supervised Segmentation | |
| |
| |
Selecting Informative Attributes | |
| |
| |
Example: Attribute Selection with Information Gain | |
| |
| |
Supervised Segmentation with Tree-Structured Models | |
| |
| |
Visualizing Segmentations | |
| |
| |
Trees as Sets of Rules | |
| |
| |
Probability Estimation | |
| |
| |
Example: Addressing the Churn Problem with Tree Induction | |
| |
| |
Summary | |
| |
| |
| |
Fitting a Model to Data | |
| |
| |
Fundamental concepts: Finding "optimal" model parameters based on data; Choosing the goal for data mining; Objective functions; Loss functions. | |
| |
| |
Exemplary techniques: Linear regression; Logistic regression; Support-vector machines. | |
| |
| |
Classification via Mathematical Functions | |
| |
| |
Linear Discriminant Functions | |
| |
| |
Optimizing an Objective Function | |
| |
| |
An Example of Mining a Linear Discriminant from Data | |
| |
| |
Linear Discriminant Functions for Scoring and Ranking Instances | |
| |
| |
Support Vector Machines, Briefly | |
| |
| |
Regression via Mathematical Functions | |
| |
| |
Class Probability Estimation and Logistic "Regression" | |
| |
| |
Logistic Regression: Some Technical Details | |
| |
| |
Example: Logistic Regression Versus Tree Induction | |
| |
| |
Nonlinear Functions, Support Vector Machines, and Neural Networks | |
| |
| |
Summary | |
| |
| |
| |
Overfitting and Its Avoidance | |
| |
| |
Fundamental concepts: Generalization; Fitting and overfitting; Complexity control. | |
| |
| |
Exemplary techniques: Cross-validation; Attribute selection; Tree pruning; Regularization. | |
| |
| |
Generalization | |
| |
| |
Overfitting | |
| |
| |
Overfitting Examined | |
| |
| |
Holdout Data and Fitting Graphs | |
| |
| |
Overfitting in Tree Induction | |
| |
| |
Overfitting in Mathematical Functions | |
| |
| |
Example: Overfitting Linear Functions | |
| |
| |
Example: Why Is Overfitting Bad? | |
| |
| |
From Holdout Evaluation to Cross-Validation | |
| |
| |
The Churn Dataset Revisited | |
| |
| |
Learning Curves | |
| |
| |
Overfitting Avoidance and Complexity Control | |
| |
| |
Avoiding Overfitting with Tree Induction | |
| |
| |
A General Method for Avoiding Overfitting | |
| |
| |
Avoiding Overfitting for Parameter Optimization | |
| |
| |
Summary | |
| |
| |
| |
Similarity, Neighbors, and Clusters | |
| |
| |
Fundamental concepts: Calculating similarity of objects described by data; Using similarity for prediction; Clustering as similarity-based segmentation. | |
| |
| |
Exemplary techniques: Searching for similar entities; Nearest neighbor methods; Clustering methods; Distance metrics for calculating similarity. | |
| |
| |
Similarity and Distance | |
| |
| |
Nearest-Neighbor Reasoning | |
| |
| |
Example: Whiskey Analytics | |
| |
| |
Nearest Neighbors for Predictive Modeling | |
| |
| |
How Many Neighbors and How Much Influence? | |
| |
| |
Geometric Interpretation, Overfitting, and Complexity Control | |
| |
| |
Issues with Nearest-Neighbor Methods | |
| |
| |
Some Important Technical Details Relating to Similarities and Neighbors | |
| |
| |
Heterogeneous Attributes | |
| |
| |
Other Distance Functions | |
| |
| |
Combining Functions: Calculating Scores from Neighbors | |
| |
| |
Clustering | |
| |
| |
Example: Whiskey Analytics Revisited | |
| |
| |
Hierarchical Clustering | |
| |
| |
Nearest Neighbors Revisited: Clustering Around Centroids | |
| |
| |
Example: Clustering Business News Stories | |
| |
| |
Understanding the Results of Clustering | |
| |
| |
Using Supervised Learning to Generate Cluster Descriptions | |
| |
| |
Stepping Back: Solving a Business Problem Versus Data Exploration | |
| |
| |
Summary | |
| |
| |
| |
Decision Analytic Thinking I: What Is a Good Model? | |
| |
| |
Fundamental concepts: Careful consideration of what is desired from data science results; Expected value as a key evaluation framework; Consideration of appropriate comparative baselines. | |
| |
| |
Exemplary techniques: Various evaluation metrics; Estimating costs and benefits; Calculating expected profit; Creating baseline methods for comparison. | |
| |
| |
Evaluating Classifiers | |
| |
| |
Plain Accuracy and Its Problems | |
| |
| |
The Confusion Matrix | |
| |
| |
Problems with Unbalanced Classes | |
| |
| |
Problems with Unequal Costs and Benefits | |
| |
| |
Generalizing Beyond Classification | |
| |
| |
A Key Analytical Framework: Expected Value | |
| |
| |
Using Expected Value to Frame Classifier Use | |
| |
| |
Using Expected Value to Frame Classifier Evaluation | |
| |
| |
Evaluation, Baseline Performance, and Implications for Investments in Data | |
| |
| |
Summary | |
| |
| |
| |
Visualizing Model Performance | |
| |
| |
Fundamental concepts: Visualization of model performance under various kinds of uncertainty; Further consideration of what is desired from data mining results. | |
| |
| |
Exemplary techniques: Profit curves; Cumulative response curves; Lift curves; ROC curves. | |
| |
| |
Ranking Instead of Classifying | |
| |
| |
Profit Curves | |
| |
| |
ROC Graphs and Curves | |
| |
| |
The Area Under the ROC Curve (AUC) | |
| |
| |
Cumulative Response and Lift Curves | |
| |
| |
Example: Performance Analytics for Churn Modeling | |
| |
| |
Summary | |
| |
| |
| |
Evidence and Probabilities | |
| |
| |
Fundamental concepts: Explicit evidence combination with Bayes' Rule; Probabilistic reasoning via assumptions of conditional independence. | |
| |
| |
Exemplary techniques: Naive Bayes classification; Evidence lift. | |
| |
| |
Example: Targeting Online Consumers with Advertisements | |
| |
| |
Combining Evidence Probabilistically | |
| |
| |
Joint Probability and Independence | |
| |
| |
Bayes' Rule | |
| |
| |
Applying Bayes' Rule to Data Science | |
| |
| |
Conditional Independence and Naive Bayes | |
| |
| |
Advantages and Disadvantages of Naive Bayes | |
| |
| |
A Model of Evidence "Lift" | |
| |
| |
Example: Evidence Lifts from Facebook "Likes" | |
| |
| |
Evidence in Action: Targeting Consumers with Ads | |
| |
| |
Summary | |
| |
| |
| |
Representing and Mining Text | |
| |
| |
Fundamental concepts: The importance of constructing mining-friendly data representations; Representation of text for data mining. | |
| |
| |
Exemplary techniques: Bag of words representation; TFIDF calculation; N-grams; Stemming; Named entity extraction; Topic models. | |
| |
| |
Why Text Is Important | |
| |
| |
Why Text Is Difficult | |
| |
| |
Representation | |
| |
| |
Bag of Words | |
| |
| |
Term Frequency | |
| |
| |
Measuring Sparseness: Inverse Document Frequency | |
| |
| |
Combining Them: TFIDF | |
| |
| |
Example: Jazz Musicians | |
| |
| |
The Relationship of IDF to Entropy | |
| |
| |
Beyond Bag of Words | |
| |
| |
N-gram Sequences | |
| |
| |
Named Entity Extraction | |
| |
| |
Topic Models | |
| |
| |
Example: Mining News Stories to Predict Stock Price Movement | |
| |
| |
The Task | |
| |
| |
The Data | |
| |
| |
Data Preprocessing | |
| |
| |
Results | |
| |
| |
Summary | |
| |
| |
| |
Decision Analytic Thinking II: Toward Analytical Engineering | |
| |
| |
Fundamental concept: Solving business problems with data science starts with analytical engineering: designing an analytical solution, based on the data, tools, and techniques available. | |
| |
| |
Exemplary technique: Expected value as a framework for data science solution design. | |
| |
| |
Targeting the Best Prospects for a Charity Mailing | |
| |
| |
The Expected Value Framework: Decomposing the Business Problem and Recomposing the Solution Pieces | |
| |
| |
A Brief Digression on Selection Bias | |
| |
| |
Our Churn Example Revisited with Even More Sophistication | |
| |
| |
The Expected Value Framework: Structuring a More Complicated Business Problem | |
| |
| |
Assessing the Influence of the Incentive | |
| |
| |
From an Expected Value Decomposition to a Data Science Solution | |
| |
| |
Summary | |
| |
| |
| |
Other Data Science Tasks and Techniques | |
| |
| |
Fundamental concepts: Our fundamental concepts as the basis of many common data science techniques; The importance of familiarity with the building blocks of data science. | |
| |
| |
Exemplary techniques: Association and co-occurrences; Behavior profiling; Link prediction; Data reduction; Latent information mining; Movie recommendation; Bias-variance decomposition of error; Ensembles of models; Causal reasoning from data. | |
| |
| |
Co-occurrences and Associations: Finding Items That Go Together | |
| |
| |
Measuring Surprise: Lift and Leverage | |
| |
| |
Example: Beer and Lottery Tickets | |
| |
| |
Associations Among Facebook Likes | |
| |
| |
Profiling: Finding Typical Behavior | |
| |
| |
Link Prediction and Social Recommendation | |
| |
| |
Data Reduction, Latent Information, and Movie Recommendation | |
| |
| |
Bias, Variance, and Ensemble Methods | |
| |
| |
Data-Driven Causal Explanation and a Viral Marketing Example | |
| |
| |
Summary | |
| |
| |
| |
Data Science and Business Strategy | |
| |
| |
Fundamental concepts: Our principles as the basis of success for a data-driven business; Acquiring and sustaining competitive advantage via data science; The importance of careful curation of data science capability. | |
| |
| |
Thinking Data-Analytically, Redux | |
| |
| |
Achieving Competitive Advantage with Data Science | |
| |
| |
Sustaining Competitive Advantage with Data Science | |
| |
| |
Formidable Historical Advantage | |
| |
| |
Unique Intellectual Property | |
| |
| |
Unique Intangible Collateral Assets | |
| |
| |
Superior Data Scientists | |
| |
| |
Superior Data Science Management | |
| |
| |
Attracting and Nurturing Data Scientists and Their Teams | |
| |
| |
Examine Data Science Case Studies | |
| |
| |
Be Ready to Accept Creative Ideas from Any Source | |
| |
| |
Be Ready to Evaluate Proposals for Data Science Projects | |
| |
| |
Example Data Mining Proposal | |
| |
| |
Flaws in the Big Red Proposal | |
| |
| |
A Firm's Data Science Maturity | |
| |
| |
| |
Conclusion | |
| |
| |
The Fundamental Concepts of Data Science | |
| |
| |
Applying Our Fundamental Concepts to a New Problem: Mining Mobile Device Data | |
| |
| |
Changing the Way We Think about Solutions to Business Problems | |
| |
| |
What Data Can't Do: Humans in the Loop, Revisited | |
| |
| |
Privacy, Ethics, and Mining Data About Individuals | |
| |
| |
Is There More to Data Science? | |
| |
| |
Final Example: From Crowd-Sourcing to Cloud-Sourcing | |
| |
| |
Final Words | |
| |
| |
| |
Proposal Review Guide | |
| |
| |
| |
Another Sample Proposal | |
| |
| |
Glossary | |
| |
| |
Bibliography | |
| |
| |
Index | |