| |
| |
List of Figures | |
| |
| |
List of Tables | |
| |
| |
Preface | |
| |
| |
| |
Introduction | |
| |
| |
| |
What Is a Model? | |
| |
| |
| |
What Is a Statistical Model? | |
| |
| |
| |
The Modeling Process | |
| |
| |
| |
Modeling Pitfalls | |
| |
| |
| |
Characteristics of Good Modelers | |
| |
| |
| |
The Future of Predictive Analytics | |
| |
| |
| |
Properties of Statistical Distributions | |
| |
| |
| |
Fundamental Distributions | |
| |
| |
| |
Uniform Distribution | |
| |
| |
| |
Details of the Normal (Gaussian) Distribution | |
| |
| |
| |
Lognormal Distribution | |
| |
| |
| |
Γ Distribution | |
| |
| |
| |
Chi-Squared Distribution | |
| |
| |
| |
Non-Central Chi-Squared Distribution | |
| |
| |
| |
Student's t-Distribution | |
| |
| |
| |
Multivariate t-Distribution | |
| |
| |
| |
F-Distribution | |
| |
| |
| |
Binomial Distribution | |
| |
| |
| |
Poisson Distribution | |
| |
| |
| |
Exponential Distribution | |
| |
| |
| |
Geometric Distribution | |
| |
| |
| |
Hypergeometric Distribution | |
| |
| |
| |
Negative Binomial Distribution | |
| |
| |
| |
Inverse Gaussian (IG) Distribution | |
| |
| |
| |
Normal Inverse Gaussian (NIG) Distribution | |
| |
| |
| |
Central Limit Theorem | |
| |
| |
| |
Estimate of Mean, Variance, Skewness, and Kurtosis from Sample Data | |
| |
| |
| |
Estimate of the Standard Deviation of the Sample Mean | |
| |
| |
| |
(Pseudo) Random Number Generators | |
| |
| |
| |
Mersenne Twister Pseudorandom Number Generator | |
| |
| |
| |
Box-Muller Transform for Generating a Normal Distribution | |
| |
| |
| |
Transformation of a Distribution Function | |
| |
| |
| |
Distribution of a Function of Random Variables | |
| |
| |
| |
Z = X + Y | |
| |
| |
| |
Z = XY | |
| |
| |
| |
(Z<sub>1</sub>,Z<sub>2</sub>,…,Z<sub>n</sub>) = (X<sub>1</sub>,X<sub>2</sub>,…,X<sub>n</sub>) Y | |
| |
| |
| |
Z = X/Y | |
| |
| |
| |
Z = max(X,Y) | |
| |
| |
| |
Z = min(X,Y) | |
| |
| |
| |
Moment Generating Function | |
| |
| |
| |
Moment Generating Function of Binomial Distribution | |
| |
| |
| |
Moment Generating Function of Normal Distribution | |
| |
| |
| |
Moment Generating Function of the Γ Distribution | |
| |
| |
| |
Moment Generating Function of Chi-Square Distribution | |
| |
| |
| |
Moment Generating Function of the Poisson Distribution | |
| |
| |
| |
Cumulant Generating Function | |
| |
| |
| |
Characteristic Function | |
| |
| |
| |
Relationship between Cumulative Function and Characteristic Function | |
| |
| |
| |
Characteristic Function of Normal Distribution | |
| |
| |
| |
Characteristic Function of Γ Distribution | |
| |
| |
| |
Chebyshev's Inequality | |
| |
| |
| |
Markov's Inequality | |
| |
| |
| |
Gram-Charlier Series | |
| |
| |
| |
Edgeworth Expansion | |
| |
| |
| |
Cornish-Fisher Expansion | |
| |
| |
| |
Lagrange Inversion Theorem | |
| |
| |
| |
Cornish-Fisher Expansion | |
| |
| |
| |
Copula Functions | |
| |
| |
| |
Gaussian Copula | |
| |
| |
| |
t-Copula | |
| |
| |
| |
Archimedean Copula | |
| |
| |
| |
Important Matrix Relationships | |
| |
| |
| |
Pseudo-Inverse of a Matrix | |
| |
| |
| |
A Lemma of Matrix Inversion | |
| |
| |
| |
Identity for a Matrix Determinant | |
| |
| |
| |
Inversion of Partitioned Matrix | |
| |
| |
| |
Determinant of Partitioned Matrix | |
| |
| |
| |
Matrix Sweep and Partial Correlation | |
| |
| |
| |
Singular Value Decomposition (SVD) | |
| |
| |
| |
Diagonalization of a Matrix | |
| |
| |
| |
Spectral Decomposition of a Positive Semi-Definite Matrix | |
| |
| |
| |
Normalization in Vector Space | |
| |
| |
| |
Conjugate Decomposition of a Symmetric Definite Matrix | |
| |
| |
| |
Cholesky Decomposition | |
| |
| |
| |
Cauchy-Schwarz Inequality | |
| |
| |
| |
Relationship of Correlation among Three Variables | |
| |
| |
| |
Linear Modeling and Regression | |
| |
| |
| |
Properties of Maximum Likelihood Estimators | |
| |
| |
| |
Likelihood Ratio Test | |
| |
| |
| |
Wald Test | |
| |
| |
| |
Lagrange Multiplier Statistic | |
| |
| |
| |
Linear Regression | |
| |
| |
| |
Ordinary Least Squares (OLS) Regression | |
| |
| |
| |
Interpretation of the Coefficients of Linear Regression | |
| |
| |
| |
Regression on Weighted Data | |
| |
| |
| |
Incrementally Updating a Regression Model with Additional Data | |
| |
| |
| |
Partitioned Regression | |
| |
| |
| |
How Does the Regression Change When Adding One More Variable? | |
| |
| |
| |
Linearly Restricted Least Squares Regression | |
| |
| |
| |
Significance of the Correlation Coefficient | |
| |
| |
| |
Partial Correlation | |
| |
| |
| |
Ridge Regression | |
| |
| |
| |
Fisher's Linear Discriminant Analysis | |
| |
| |
| |
Principal Component Regression (PCR) | |
| |
| |
| |
Factor Analysis | |
| |
| |
| |
Partial Least Squares Regression (PLSR) | |
| |
| |
| |
Generalized Linear Model (GLM) | |
| |
| |
| |
Logistic Regression: Binary | |
| |
| |
| |
Logistic Regression: Multiple Nominal | |
| |
| |
| |
Logistic Regression: Proportional Multiple Ordinal | |
| |
| |
| |
Fisher Scoring Method for Logistic Regression | |
| |
| |
| |
Tobit Model: A Censored Regression Model | |
| |
| |
| |
Some Properties of the Normal Distribution | |
| |
| |
| |
Formulation of the Tobit Model | |
| |
| |
| |
Nonlinear Modeling | |
| |
| |
| |
Naive Bayesian Classifier | |
| |
| |
| |
Neural Network | |
| |
| |
| |
Back Propagation Neural Network | |
| |
| |
| |
Segmentation and Tree Models | |
| |
| |
| |
Segmentation | |
| |
| |
| |
Tree Models | |
| |
| |
| |
Sweeping to Find the Best Cutpoint | |
| |
| |
| |
Impurity Measure of a Population: Entropy and Gini Index | |
| |
| |
| |
Chi-Square Splitting Rule | |
| |
| |
| |
Implementation of Decision Trees | |
| |
| |
| |
Additive Models | |
| |
| |
| |
Boosted Tree | |
| |
| |
| |
Least Squares Regression Boosting Tree | |
| |
| |
| |
Binary Logistic Regression Boosting Tree | |
| |
| |
| |
Support Vector Machine (SVM) | |
| |
| |
| |
Wolfe Dual | |
| |
| |
| |
Linearly Separable Problem | |
| |
| |
| |
Linearly Inseparable Problem | |
| |
| |
| |
Constructing Higher-Dimensional Space and Kernel | |
| |
| |
| |
Model Output | |
| |
| |
| |
C-Support Vector Classification (C-SVC) for Classification | |
| |
| |
| |
ε-Support Vector Regression (ε-SVR) for Regression | |
| |
| |
| |
The Probability Estimate | |
| |
| |
| |
Fuzzy Logic System | |
| |
| |
| |
A Simple Fuzzy Logic System | |
| |
| |
| |
Clustering | |
| |
| |
| |
K Means, Fuzzy C Means | |
| |
| |
| |
Nearest Neighbor, K Nearest Neighbor (KNN) | |
| |
| |
| |
Comments on Clustering Methods | |
| |
| |
| |
Time Series Analysis | |
| |
| |
| |
Fundamentals of Forecasting | |
| |
| |
| |
Box-Cox Transformation | |
| |
| |
| |
Smoothing Algorithms | |
| |
| |
| |
Convolution of Linear Filters | |
| |
| |
| |
Linear Difference Equation | |
| |
| |
| |
The Autocovariance Function and Autocorrelation Function | |
| |
| |
| |
The Partial Autocorrelation Function | |
| |
| |
| |
ARIMA Models | |
| |
| |
| |
MA(q) Process | |
| |
| |
| |
AR(p) Process | |
| |
| |
| |
ARMA(p, q) Process | |
| |
| |
| |
Survival Data Analysis | |
| |
| |
| |
Sampling Method | |
| |
| |
| |
Exponentially Weighted Moving Average (EWMA) and GARCH(1, 1) | |
| |
| |
| |
Exponentially Weighted Moving Average (EWMA) | |
| |
| |
| |
ARCH and GARCH Models | |
| |
| |
| |
Data Preparation and Variable Selection | |
| |
| |
| |
Data Quality and Exploration | |
| |
| |
| |
Variable Scaling and Transformation | |
| |
| |
| |
How to Bin Variables | |
| |
| |
| |
Equal Interval | |
| |
| |
| |
Equal Population | |
| |
| |
| |
Tree Algorithms | |
| |
| |
| |
Interpolation in One and Two Dimensions | |
| |
| |
| |
Weight of Evidence (WOE) Transformation | |
| |
| |
| |
Variable Selection Overview | |
| |
| |
| |
Missing Data Imputation | |
| |
| |
| |
Stepwise Selection Methods | |
| |
| |
| |
Forward Selection in Linear Regression | |
| |
| |
| |
Forward Selection in Logistic Regression | |
| |
| |
| |
Mutual Information, KL Distance | |
| |
| |
| |
Detection of Multicollinearity | |
| |
| |
| |
Model Goodness Measures | |
| |
| |
| |
Training, Testing, Validation | |
| |
| |
| |
Continuous Dependent Variable | |
| |
| |
| |
Example: Linear Regression | |
| |
| |
| |
Binary Dependent Variable (Two-Group Classification) | |
| |
| |
| |
Kolmogorov-Smirnov (KS) Statistic | |
| |
| |
| |
Confusion Matrix | |
| |
| |
| |
Concordant and Discordant | |
| |
| |
| |
R<sup>2</sup> for Logistic Regression | |
| |
| |
| |
AIC and SBC | |
| |
| |
| |
Hosmer-Lemeshow Goodness-of-Fit Test | |
| |
| |
| |
Example: Logistic Regression | |
| |
| |
| |
Population Stability Index Using Relative Entropy | |
| |
| |
| |
Optimization Methods | |
| |
| |
| |
Lagrange Multiplier | |
| |
| |
| |
Gradient Descent Method | |
| |
| |
| |
Newton-Raphson Method | |
| |
| |
| |
Conjugate Gradient Method | |
| |
| |
| |
Quasi-Newton Method | |
| |
| |
| |
Genetic Algorithms (GA) | |
| |
| |
| |
Simulated Annealing | |
| |
| |
| |
Linear Programming | |
| |
| |
| |
Nonlinear Programming (NLP) | |
| |
| |
| |
General Nonlinear Programming (GNLP) | |
| |
| |
| |
Lagrange Dual Problem | |
| |
| |
| |
Quadratic Programming (QP) | |
| |
| |
| |
Linear Complementarity Programming (LCP) | |
| |
| |
| |
Sequential Quadratic Programming (SQP) | |
| |
| |
| |
Nonlinear Equations | |
| |
| |
| |
Expectation-Maximization (EM) Algorithm | |
| |
| |
| |
Optimal Design of Experiment | |
| |
| |
| |
Miscellaneous Topics | |
| |
| |
| |
Multidimensional Scaling | |
| |
| |
| |
Simulation | |
| |
| |
| |
Odds Normalization and Score Transformation | |
| |
| |
| |
Reject Inference | |
| |
| |
| |
Dempster-Shafer Theory of Evidence | |
| |
| |
| |
Some Properties in Set Theory | |
| |
| |
| |
Basic Probability Assignment, Belief Function, and Plausibility Function | |
| |
| |
| |
Dempster-Shafer's Rule of Combination | |
| |
| |
| |
Applications of Dempster-Shafer Theory of Evidence: Multiple Classifier Function | |
| |
| |
| |
Useful Mathematical Relations | |
| |
| |
| |
Information Inequality | |
| |
| |
| |
Relative Entropy | |
| |
| |
| |
Saddle-Point Method | |
| |
| |
| |
Stirling's Formula | |
| |
| |
| |
Convex Function and Jensen's Inequality | |
| |
| |
| |
DataMinerXL - Microsoft Excel Add-In for Building Predictive Models | |
| |
| |
| |
Overview | |
| |
| |
| |
Utility Functions | |
| |
| |
| |
Data Manipulation Functions | |
| |
| |
| |
Basic Statistical Functions | |
| |
| |
| |
Modeling Functions for All Models | |
| |
| |
| |
Weight of Evidence Transformation Functions | |
| |
| |
| |
Linear Regression Functions | |
| |
| |
| |
Partial Least Squares Regression Functions | |
| |
| |
| |
Logistic Regression Functions | |
| |
| |
| |
Time Series Analysis Functions | |
| |
| |
| |
Naive Bayes Classifier Functions | |
| |
| |
| |
Tree-Based Model Functions | |
| |
| |
| |
Clustering and Segmentation Functions | |
| |
| |
| |
Neural Network Functions | |
| |
| |
| |
Support Vector Machine Functions | |
| |
| |
| |
Optimization Functions | |
| |
| |
| |
Matrix Operation Functions | |
| |
| |
| |
Numerical Integration Functions | |
| |
| |
| |
Excel Built-in Statistical Distribution Functions | |
| |
| |
Bibliography | |
| |
| |
Index | |