
Data Analysis: What Can Be Learned from the Past 50 Years


ISBN-10: 1118010647

ISBN-13: 9781118010648

Edition: 2011

Authors: Peter J. Huber

List price: $132.00

Description:

This book explores many provocative questions concerning the fundamentals of data analysis. It is based on the time-tested experience of one of the gurus of the subject. Why should one study data analysis? How should it be taught? What techniques work best, and for whom? How valid are the results? How much data should be tested? Which machine languages should be used, if any? Peter J. Huber's tools in this volume are apprenticeship (through hands-on case studies) and anecdotes (through real-life applications). Specific statistical techniques are not the immediate concern; rather, the book addresses questions of strategy: when to use which technique. Central to the discussion is an understanding of the significance of massive (or robust) data sets, the implementation of languages, and the use of models. Each topic is illustrated with ample examples and case studies. Personal practices, various pitfalls, and existing controversies are presented when applicable. The book serves as an excellent philosophical and historical companion to any present-day text in data analysis, robust statistics, data mining, statistical learning, or computational statistics.

Book details

Copyright year: 2011
Publisher: John Wiley & Sons, Incorporated
Publication date: 4/11/2011
Binding: Hardcover
Pages: 234
Size: 6.75" wide x 10.00" long x 0.50" tall
Weight: 0.990
Language: English

Preface
What is Data Analysis?
Tukey's 1962 paper
The Path of Statistics
Strategy Issues in Data Analysis
Strategy in Data Analysis
Philosophical issues
On the theory of data analysis and its teaching
Science and data analysis
Economy of forces
Issues of size
Strategic planning
Planning the data collection
Choice of data and methods
Systematic and random errors
Strategic reserves
Human factors
The stages of data analysis
Inspection
Error checking
Modification
Comparison
Modeling and Model fitting
Simulation
What-if analyses
Interpretation
Presentation of conclusions
Tools required for strategy reasons
Ad hoc programming
Graphics
Record keeping
Creating and keeping order
Massive Data Sets
Introduction
Disclosure: Personal experiences
What is massive? A classification of size
Obstacles to scaling
Human limitations: visualization
Human-machine interactions
Storage requirements
Computational complexity
Conclusions
On the structure of large data sets
Types of data
How do data sets grow?
On data organization
Derived data sets
Data base management and related issues
Data archiving
The stages of a data analysis
Planning the data collection
Actual collection
Data access
Initial data checking
Data analysis proper
The final product: presentation of arguments and conclusions
Examples and some thoughts on strategy
Volume reduction
Supercomputers and software challenges
When do we need a Concorde?
General Purpose Data Analysis and Supercomputers
Languages, Programming Environments and Data-based Prototyping
Summary of conclusions
Languages for Data Analysis
Goals and purposes
Natural languages and computing languages
Natural languages
Batch languages
Immediate languages
Language and literature
Object orientation and related structural issues
Extremism and compromises, slogans and reality
Some conclusions
Interface issues
The command line interface
The menu interface
The batch interface and programming environments
Some personal experiences
Miscellaneous issues
On building blocks
On the scope of names
On notation
Book-keeping problems
Requirements for a general purpose immediate language
Approximate Models
Models
Bayesian modeling
Mathematical statistics and approximate models
Statistical significance and physical relevance
Judicious use of a wrong model
Composite models
Modeling the length of day
The role of simulation
Summary of conclusions
Pitfalls
Simpson's paradox
Missing data
The Case of the Babylonian Lunar Six
X-ray crystallography
Regression of Y on X or of X on Y?
Create order in data
General considerations
Principal component methods
Principal component methods: Jury data
Multidimensional scaling
Multidimensional scaling: the method
Multidimensional scaling: a synthetic example
Multidimensional scaling: map reconstruction
Correspondence analysis
Correspondence analysis: the method
Kültepe eponyms
Further examples: marketing and Shakespearean plays
Multidimensional scaling vs. Correspondence analysis
Hodson's grave data
Plato data
More case studies
A nutshell example
Shape invariant modeling
Comparison of point configurations
The cyclodecane conformation
The Thomson problem
Notes on numerical optimization
References
Index