Foundations of Predictive Analytics

Availability: InStock
ISBN-10: 1439869464

ISBN-13: 9781439869468

Edition: 2012

Authors: James Wu, Stephen Coggeshall

Semester Rental: $68.98 Due date: 8/15/2024

Rental notice: supplementary materials (access codes, CDs, etc.) are not guaranteed with rental orders.

Buy used: $91.96

Buy new: $131.98

Marketplace

3 new & used from $98.07

Description:

Written by industry experts, this book introduces the various concepts, theorems, and algorithms widely used in statistical data analysis and data mining. It covers important topics in data mining, machine learning, and statistical pattern recognition, including linear and nonlinear regression models, time series analysis, and variable selection. The text also explores key topics that are not extensively covered in similar books, such as copula functions, incremental regression, censored data models, Dempster-Shafer theory, survival data analysis, and GARCH.

Book details

Copyright year: 2012
Publisher: Taylor & Francis Group
Publication date: 3/19/2012
Binding: Hardcover
Pages: 338
Size: 6.25" wide x 9.25" long x 0.75" tall
Weight: 1.298
Language: English



List of Figures


List of Tables


Preface



Introduction



What Is a Model?



What Is a Statistical Model?



The Modeling Process



Modeling Pitfalls



Characteristics of Good Modelers



The Future of Predictive Analytics



Properties of Statistical Distributions



Fundamental Distributions



Uniform Distribution



Details of the Normal (Gaussian) Distribution



Lognormal Distribution



Γ Distribution



Chi-Squared Distribution



Non-Central Chi-Squared Distribution



Student's t-Distribution



Multivariate t-Distribution



F-Distribution



Binomial Distribution



Poisson Distribution



Exponential Distribution



Geometric Distribution



Hypergeometric Distribution



Negative Binomial Distribution



Inverse Gaussian (IG) Distribution



Normal Inverse Gaussian (NIG) Distribution



Central Limit Theorem



Estimate of Mean, Variance, Skewness, and Kurtosis from Sample Data



Estimate of the Standard Deviation of the Sample Mean



(Pseudo) Random Number Generators



Mersenne Twister Pseudorandom Number Generator



Box-Muller Transform for Generating a Normal Distribution



Transformation of a Distribution Function



Distribution of a Function of Random Variables



Z = X + Y



Z = XY



(Z₁, Z₂, …, Zₙ) = (X₁, X₂, …, Xₙ) Y



Z = X/Y



Z = max(X,Y)



Z = min(X,Y)



Moment Generating Function



Moment Generating Function of Binomial Distribution



Moment Generating Function of Normal Distribution



Moment Generating Function of the Γ Distribution



Moment Generating Function of Chi-Square Distribution



Moment Generating Function of the Poisson Distribution



Cumulant Generating Function



Characteristic Function



Relationship between Cumulative Function and Characteristic Function



Characteristic Function of Normal Distribution



Characteristic Function of Γ Distribution



Chebyshev's Inequality



Markov's Inequality



Gram-Charlier Series



Edgeworth Expansion



Cornish-Fisher Expansion



Lagrange Inversion Theorem



Cornish-Fisher Expansion



Copula Functions



Gaussian Copula



t-Copula



Archimedean Copula



Important Matrix Relationships



Pseudo-Inverse of a Matrix



A Lemma of Matrix Inversion



Identity for a Matrix Determinant



Inversion of Partitioned Matrix



Determinant of Partitioned Matrix



Matrix Sweep and Partial Correlation



Singular Value Decomposition (SVD)



Diagonalization of a Matrix



Spectral Decomposition of a Positive Semi-Definite Matrix



Normalization in Vector Space



Conjugate Decomposition of a Symmetric Definite Matrix



Cholesky Decomposition



Cauchy-Schwarz Inequality



Relationship of Correlation among Three Variables



Linear Modeling and Regression



Properties of Maximum Likelihood Estimators



Likelihood Ratio Test



Wald Test



Lagrange Multiplier Statistic



Linear Regression



Ordinary Least Squares (OLS) Regression



Interpretation of the Coefficients of Linear Regression



Regression on Weighted Data



Incrementally Updating a Regression Model with Additional Data



Partitioned Regression



How Does the Regression Change When Adding One More Variable?



Linearly Restricted Least Squares Regression



Significance of the Correlation Coefficient



Partial Correlation



Ridge Regression



Fisher's Linear Discriminant Analysis



Principal Component Regression (PCR)



Factor Analysis



Partial Least Squares Regression (PLSR)



Generalized Linear Model (GLM)



Logistic Regression: Binary



Logistic Regression: Multiple Nominal



Logistic Regression: Proportional Multiple Ordinal



Fisher Scoring Method for Logistic Regression



Tobit Model: A Censored Regression Model



Some Properties of the Normal Distribution



Formulation of the Tobit Model



Nonlinear Modeling



Naive Bayesian Classifier



Neural Network



Back Propagation Neural Network



Segmentation and Tree Models



Segmentation



Tree Models



Sweeping to Find the Best Cutpoint



Impurity Measure of a Population: Entropy and Gini Index



Chi-Square Splitting Rule



Implementation of Decision Trees



Additive Models



Boosted Tree



Least Squares Regression Boosting Tree



Binary Logistic Regression Boosting Tree



Support Vector Machine (SVM)



Wolfe Dual



Linearly Separable Problem



Linearly Inseparable Problem



Constructing Higher-Dimensional Space and Kernel



Model Output



C-Support Vector Classification (C-SVC) for Classification



ε-Support Vector Regression (ε-SVR) for Regression



The Probability Estimate



Fuzzy Logic System



A Simple Fuzzy Logic System



Clustering



K Means, Fuzzy C Means



Nearest Neighbor, K Nearest Neighbor (KNN)



Comments on Clustering Methods



Time Series Analysis



Fundamentals of Forecasting



Box-Cox Transformation



Smoothing Algorithms



Convolution of Linear Filters



Linear Difference Equation



The Autocovariance Function and Autocorrelation Function



The Partial Autocorrelation Function



ARIMA Models



MA(q) Process



AR(p) Process



ARMA(p, q) Process



Survival Data Analysis



Sampling Method



Exponentially Weighted Moving Average (EWMA) and GARCH(1, 1)



Exponentially Weighted Moving Average (EWMA)



ARCH and GARCH Models



Data Preparation and Variable Selection



Data Quality and Exploration



Variable Scaling and Transformation



How to Bin Variables



Equal Interval



Equal Population



Tree Algorithms



Interpolation in One and Two Dimensions



Weight of Evidence (WOE) Transformation



Variable Selection Overview



Missing Data Imputation



Stepwise Selection Methods



Forward Selection in Linear Regression



Forward Selection in Logistic Regression



Mutual Information, KL Distance



Detection of Multicollinearity



Model Goodness Measures



Training, Testing, Validation



Continuous Dependent Variable



Example: Linear Regression



Binary Dependent Variable (Two-Group Classification)



Kolmogorov-Smirnov (KS) Statistic



Confusion Matrix



Concordant and Discordant



R² for Logistic Regression



AIC and SBC



Hosmer-Lemeshow Goodness-of-Fit Test



Example: Logistic Regression



Population Stability Index Using Relative Entropy



Optimization Methods



Lagrange Multiplier



Gradient Descent Method



Newton-Raphson Method



Conjugate Gradient Method



Quasi-Newton Method



Genetic Algorithms (GA)



Simulated Annealing



Linear Programming



Nonlinear Programming (NLP)



General Nonlinear Programming (GNLP)



Lagrange Dual Problem



Quadratic Programming (QP)



Linear Complementarity Programming (LCP)



Sequential Quadratic Programming (SQP)



Nonlinear Equations



Expectation-Maximization (EM) Algorithm



Optimal Design of Experiment



Miscellaneous Topics



Multidimensional Scaling



Simulation



Odds Normalization and Score Transformation



Reject Inference



Dempster-Shafer Theory of Evidence



Some Properties in Set Theory



Basic Probability Assignment, Belief Function, and Plausibility Function



Dempster-Shafer's Rule of Combination



Applications of Dempster-Shafer Theory of Evidence: Multiple Classifier Function



Useful Mathematical Relations



Information Inequality



Relative Entropy



Saddle-Point Method



Stirling's Formula



Convex Function and Jensen's Inequality



DataMinerXL - Microsoft Excel Add-In for Building Predictive Models



Overview



Utility Functions



Data Manipulation Functions



Basic Statistical Functions



Modeling Functions for All Models



Weight of Evidence Transformation Functions



Linear Regression Functions



Partial Least Squares Regression Functions



Logistic Regression Functions



Time Series Analysis Functions



Naive Bayes Classifier Functions



Tree-Based Model Functions



Clustering and Segmentation Functions



Neural Network Functions



Support Vector Machine Functions



Optimization Functions



Matrix Operation Functions



Numerical Integration Functions



Excel Built-in Statistical Distribution Functions


Bibliography


Index