Data Mining for Business Intelligence Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner

ISBN-10: 0470084855

ISBN-13: 9780470084854

Edition: 2007

Authors: Nitin R. Patel, Galit Shmueli, Peter C. Bruce

List price: $126.50

Description:

Providing a theoretical and practical understanding of the key methods of classification, prediction, reduction and exploration that are at the heart of data mining, this book also presents a business decision-making context for these methods and uses cases and data to illustrate their application.

Book details

Copyright year: 2007
Publisher: John Wiley & Sons, Incorporated
Publication date: 12/11/2006
Binding: Hardcover
Pages: 298
Size: 7.25" wide x 10.25" long x 1.00" tall
Weight: 1.496
Language: English

Foreword
Preface
Acknowledgments
Introduction
What Is Data Mining?
Where Is Data Mining Used?
The Origins of Data Mining
The Rapid Growth of Data Mining
Why Are There So Many Different Methods?
Terminology and Notation
Road Maps to This Book
Overview of the Data Mining Process
Introduction
Core Ideas in Data Mining
Supervised and Unsupervised Learning
The Steps in Data Mining
Preliminary Steps
Building a Model: Example with Linear Regression
Using Excel for Data Mining
Problems
Data Exploration and Dimension Reduction
Introduction
Practical Considerations
House Prices in Boston
Data Summaries
Data Visualization
Correlation Analysis
Reducing the Number of Categories in Categorical Variables
Principal Components Analysis
Breakfast Cereals
Principal Components
Normalizing the Data
Using Principal Components for Classification and Prediction
Problems
Evaluating Classification and Predictive Performance
Introduction
Judging Classification Performance
Accuracy Measures
Cutoff for Classification
Performance in Unequal Importance of Classes
Asymmetric Misclassification Costs
Oversampling and Asymmetric Costs
Classification Using a Triage Strategy
Evaluating Predictive Performance
Problems
Multiple Linear Regression
Introduction
Explanatory vs. Predictive Modeling
Estimating the Regression Equation and Prediction
Example: Predicting the Price of Used Toyota Corolla Automobiles
Variable Selection in Linear Regression
Reducing the Number of Predictors
How to Reduce the Number of Predictors
Problems
Three Simple Classification Methods
Introduction
Predicting Fraudulent Financial Reporting
Predicting Delayed Flights
The Naive Rule
Naive Bayes
Conditional Probabilities and Pivot Tables
A Practical Difficulty
A Solution: Naive Bayes
Advantages and Shortcomings of the Naive Bayes Classifier
k-Nearest Neighbors
Riding Mowers
Choosing k
k-NN for a Quantitative Response
Advantages and Shortcomings of k-NN Algorithms
Problems
Classification and Regression Trees
Introduction
Classification Trees
Recursive Partitioning
Example 1: Riding Mowers
Measures of Impurity
Evaluating the Performance of a Classification Tree
Acceptance of Personal Loan
Avoiding Overfitting
Stopping Tree Growth: CHAID
Pruning the Tree
Classification Rules from Trees
Regression Trees
Prediction
Measuring Impurity
Evaluating Performance
Advantages, Weaknesses, and Extensions
Problems
Logistic Regression
Introduction
The Logistic Regression Model
Example: Acceptance of Personal Loan
Model with a Single Predictor
Estimating the Logistic Model from Data: Computing Parameter Estimates
Interpreting Results in Terms of Odds
Why Linear Regression Is Inappropriate for a Categorical Response
Evaluating Classification Performance
Variable Selection
Evaluating Goodness of Fit
Example of Complete Analysis: Predicting Delayed Flights
Data Preprocessing
Model Fitting and Estimation
Model Interpretation
Model Performance
Goodness of Fit
Variable Selection
Logistic Regression for More Than Two Classes
Ordinal Classes
Nominal Classes
Problems
Neural Nets
Introduction
Concept and Structure of a Neural Network
Fitting a Network to Data
Tiny Dataset
Computing Output of Nodes
Preprocessing the Data
Training the Model
Classifying Accident Severity
Avoiding Overfitting
Using the Output for Prediction and Classification
Required User Input
Exploring the Relationship Between Predictors and Response
Advantages and Weaknesses of Neural Networks
Problems
Discriminant Analysis
Introduction
Example 1: Riding Mowers
Example 2: Personal Loan Acceptance
Distance of an Observation from a Class
Fisher's Linear Classification Functions
Classification Performance of Discriminant Analysis
Prior Probabilities
Unequal Misclassification Costs
Classifying More Than Two Classes
Medical Dispatch to Accident Scenes
Advantages and Weaknesses
Problems
Association Rules
Introduction
Discovering Association Rules in Transaction Databases
Example 1: Synthetic Data on Purchases of Phone Faceplates
Generating Candidate Rules
The Apriori Algorithm
Selecting Strong Rules
Support and Confidence
Lift Ratio
Data Format
The Process of Rule Selection
Interpreting the Results
Statistical Significance of Rules
Example 2: Rules for Similar Book Purchases
Summary
Problems
Cluster Analysis
Introduction
Example: Public Utilities
Measuring Distance Between Two Records
Euclidean Distance
Normalizing Numerical Measurements
Other Distance Measures for Numerical Data
Distance Measures for Categorical Data
Distance Measures for Mixed Data
Measuring Distance Between Two Clusters
Hierarchical (Agglomerative) Clustering
Minimum Distance (Single Linkage)
Maximum Distance (Complete Linkage)
Group Average (Average Linkage)
Dendrograms: Displaying Clustering Process and Results
Validating Clusters
Limitations of Hierarchical Clustering
Nonhierarchical Clustering: The k-Means Algorithm
Initial Partition into k Clusters
Problems
Cases
Charles Book Club
German Credit
Tayko Software Cataloger
Segmenting Consumers of Bath Soap
Direct-Mail Fundraising
Catalog Cross-Selling
Predicting Bankruptcy
References
Index