| |
| |
Preface | |
| |
| |
Acknowledgments | |
| |
| |
| |
Introduction | |
| |
| |
Reference | |
| |
| |
| |
Processing the Information and Getting to Know Your Data | |
| |
| |
| |
Example 1: 2006 Birth Data | |
| |
| |
| |
Example 2: Alumni Donations | |
| |
| |
| |
Example 3: Orange Juice | |
| |
| |
References | |
| |
| |
| |
Standard Linear Regression | |
| |
| |
| |
Estimation in R | |
| |
| |
| |
Example 1: Fuel Efficiency of Automobiles | |
| |
| |
| |
Example 2: Toyota Used-Car Prices | |
| |
| |
| |
The Effects of Model Overfitting on the Average Mean Square Error of the Regression Prediction | |
| |
| |
References | |
| |
| |
| |
Local Polynomial Regression: a Nonparametric Regression Approach | |
| |
| |
| |
Model Selection | |
| |
| |
| |
Application to Density Estimation and the Smoothing of Histograms | |
| |
| |
| |
Extension to the Multiple Regression Model | |
| |
| |
| |
Examples and Software | |
| |
| |
References | |
| |
| |
| |
Importance of Parsimony in Statistical Modeling | |
| |
| |
| |
How Do We Guard Against False Discovery | |
| |
| |
References | |
| |
| |
| |
Penalty-Based Variable Selection in Regression Models with Many Parameters (LASSO) | |
| |
| |
| |
Example 1: Prostate Cancer | |
| |
| |
| |
Example 2: Orange Juice | |
| |
| |
References | |
| |
| |
| |
Logistic Regression | |
| |
| |
| |
Building a Linear Model for Binary Response Data | |
| |
| |
| |
Interpretation of the Regression Coefficients in a Logistic Regression Model | |
| |
| |
| |
Statistical Inference | |
| |
| |
| |
Classification of New Cases | |
| |
| |
| |
Estimation in R | |
| |
| |
| |
Example 1: Death Penalty Data | |
| |
| |
| |
Example 2: Delayed Airplanes | |
| |
| |
| |
Example 3: Loan Acceptance | |
| |
| |
| |
Example 4: German Credit Data | |
| |
| |
References | |
| |
| |
| |
Binary Classification, Probabilities, and Evaluating Classification Performance | |
| |
| |
| |
Binary Classification | |
| |
| |
| |
Using Probabilities to Make Decisions | |
| |
| |
| |
Sensitivity and Specificity | |
| |
| |
| |
Example: German Credit Data | |
| |
| |
| |
Classification Using a Nearest Neighbor Analysis | |
| |
| |
| |
The k-Nearest Neighbor Algorithm | |
| |
| |
| |
Example 1: Forensic Glass | |
| |
| |
| |
Example 2: German Credit Data | |
| |
| |
Reference | |
| |
| |
| |
The Naïve Bayesian Analysis: a Model for Predicting a Categorical Response from Mostly Categorical Predictor Variables | |
| |
| |
| |
Example: Delayed Airplanes | |
| |
| |
Reference | |
| |
| |
| |
Multinomial Logistic Regression | |
| |
| |
| |
Computer Software | |
| |
| |
| |
Example 1: Forensic Glass | |
| |
| |
| |
Example 2: Forensic Glass Revisited | |
| |
| |
| |
Specification of a Simple Triplet Matrix | |
| |
| |
References | |
| |
| |
| |
More on Classification and a Discussion on Discriminant Analysis | |
| |
| |
| |
Fisher's Linear Discriminant Function | |
| |
| |
| |
Example 1: German Credit Data | |
| |
| |
| |
Example 2: Fisher Iris Data | |
| |
| |
| |
Example 3: Forensic Glass Data | |
| |
| |
| |
Example 4: MBA Admission Data | |
| |
| |
Reference | |
| |
| |
| |
Decision Trees | |
| |
| |
| |
Example 1: Prostate Cancer | |
| |
| |
| |
Example 2: Motorcycle Acceleration | |
| |
| |
| |
Example 3: Fisher Iris Data Revisited | |
| |
| |
| |
Further Discussion on Regression and Classification Trees, Computer Software, and Other Useful Classification Methods | |
| |
| |
| |
R Packages for Tree Construction | |
| |
| |
| |
Chi-Square Automatic Interaction Detection (CHAID) | |
| |
| |
| |
Ensemble Methods: Bagging, Boosting, and Random Forests | |
| |
| |
| |
Support Vector Machines (SVM) | |
| |
| |
| |
Neural Networks | |
| |
| |
| |
The R Package Rattle: A Useful Graphical User Interface for Data Mining | |
| |
| |
References | |
| |
| |
| |
Clustering | |
| |
| |
| |
k-Means Clustering | |
| |
| |
| |
Another Way to Look at Clustering: Applying the Expectation-Maximization (EM) Algorithm to Mixtures of Normal Distributions | |
| |
| |
| |
Hierarchical Clustering Procedures | |
| |
| |
References | |
| |
| |
| |
Market Basket Analysis: Association Rules and Lift | |
| |
| |
| |
Example 1: Online Radio | |
| |
| |
| |
Example 2: Predicting Income | |
| |
| |
References | |
| |
| |
| |
Dimension Reduction: Factor Models and Principal Components | |
| |
| |
| |
Example 1: European Protein Consumption | |
| |
| |
| |
Example 2: Monthly US Unemployment Rates | |
| |
| |
| |
Reducing the Dimension in Regressions with Multicollinear Inputs: Principal Components Regression and Partial Least Squares | |
| |
| |
| |
Three Examples | |
| |
| |
References | |
| |
| |
| |
Text as Data: Text Mining and Sentiment Analysis | |
| |
| |
| |
Inverse Multinomial Logistic Regression | |
| |
| |
| |
Example 1: Restaurant Reviews | |
| |
| |
| |
Example 2: Political Sentiment | |
| |
| |
| |
Relationship Between the Gentzkow-Shapiro Estimate of "Slant" and Partial Least Squares | |
| |
| |
References | |
| |
| |
| |
Network Data | |
| |
| |
| |
Example 1: Marriage and Power in Fifteenth Century Florence | |
| |
| |
| |
Example 2: Connections in a Friendship Network | |
| |
| |
References | |
| |
| |
| |
Exercises | |
| |
| |
References | |
| |
| |
Index | |