R and Data Mining Examples and Case Studies

ISBN-10: 0123969638

ISBN-13: 9780123969637

Edition: 2012

Authors: Yanchang Zhao

List price: $79.95

Description:

This book introduces the use of R for data mining. Data mining techniques are widely used in government agencies, banks, insurance, retail, telecom, medicine and research. Recently, there has been an increasing tendency to do data mining with R, a free software environment for statistical computing and graphics. According to a poll by KDnuggets.com in early 2011, R was the second most popular tool for data mining work. Because it introduces R for data mining, this book should appeal to a broad audience in both academia and industry. It targets researchers in the field of data mining, postgraduate students who are interested in data mining, and data miners and analysts from industry. For example,…
Book details

Copyright year: 2012
Publisher: Elsevier Science & Technology
Publication date: 1/31/2013
Binding: Hardcover
Pages: 256
Size: 5.94" wide x 9.00" long x 1.00" tall
Weight: 1.232
Language: English

Yanchang Zhao has been a Senior Data Mining Analyst in the Australian Government since 2009. Before joining the public sector, he was an Australian Postdoctoral Fellow (Industry) in the Faculty of Engineering & Information Technology at the University of Technology, Sydney, Australia. His research interests include clustering, association rules, time series, outlier detection and data mining applications, and he has published over forty papers in journals and conference proceedings. He is a member of the IEEE and of the Institute of Analytics Professionals of Australia, and has served as a program committee member for more than thirty international conferences.

List of Figures
List of Abbreviations
Introduction
Data Mining
R
Datasets
The Iris Dataset
The Bodyfat Dataset
Data Import and Export
Save and Load R Data
Import from and Export to .CSV Files
Import Data from SAS
Import/Export via ODBC
Read from Databases
Output to and Input from EXCEL Files
Data Exploration
Have a Look at Data
Explore Individual Variables
Explore Multiple Variables
More Explorations
Save Charts into Files
Decision Trees and Random Forest
Decision Trees with Package party
Decision Trees with Package rpart
Random Forest
Regression
Linear Regression
Logistic Regression
Generalized Linear Regression
Non-Linear Regression
Clustering
The k-Means Clustering
The k-Medoids Clustering
Hierarchical Clustering
Density-Based Clustering
Outlier Detection
Univariate Outlier Detection
Outlier Detection with LOF
Outlier Detection by Clustering
Outlier Detection from Time Series
Discussions
Time Series Analysis and Mining
Time Series Data in R
Time Series Decomposition
Time Series Forecasting
Time Series Clustering
Dynamic Time Warping
Synthetic Control Chart Time Series Data
Hierarchical Clustering with Euclidean Distance
Hierarchical Clustering with DTW Distance
Time Series Classification
Classification with Original Data
Classification with Extracted Features
k-NN Classification
Discussions
Further Readings
Association Rules
Basics of Association Rules
The Titanic Dataset
Association Rule Mining
Removing Redundancy
Interpreting Rules
Visualizing Association Rules
Discussions and Further Readings
Text Mining
Retrieving Text from Twitter
Transforming Text
Stemming Words
Building a Term-Document Matrix
Frequent Terms and Associations
Word Cloud
Clustering Words
Clustering Tweets
Clustering Tweets with the k-Means Algorithm
Clustering Tweets with the k-Medoids Algorithm
Packages, Further Readings, and Discussions
Social Network Analysis
Network of Terms
Network of Tweets
Two-Mode Network
Discussions and Further Readings
Case Study I: Analysis and Forecasting of House Price Indices
Importing HPI Data
Exploration of HPI Data
Trend and Seasonal Components of HPI
HPI Forecasting
The Estimated Price of a Property
Discussion
Case Study II: Customer Response Prediction and Profit Optimization
Introduction
The Data of KDD Cup 1998
Data Exploration
Training Decision Trees
Model Evaluation
Selecting the Best Tree
Scoring
Discussions and Conclusions
Case Study III: Predictive Modeling of Big Data with Limited Memory
Introduction
Methodology
Data and Variables
Random Forest
Memory Issue
Train Models on Sample Data
Build Models with Selected Variables
Scoring
Print Rules
Print Rules in Text
Print Rules for Scoring with SAS
Conclusions and Discussion
Online Resources
R Reference Cards
R
Data Mining
Data Mining with R
Classification/Prediction with R
Time Series Analysis with R
Association Rule Mining with R
Spatial Data Analysis with R
Text Mining with R
Social Network Analysis with R
Data Cleansing and Transformation with R
Big Data and Parallel Computing with R
R Reference Card for Data Mining
Bibliography
General Index
Package Index
Function Index