Agile Data Science: Building Data Analytics Applications with Hadoop

ISBN-10: 1449326269
ISBN-13: 9781449326265
Edition: 2012
Authors: Russell Jurney

Book details

List price: $39.99
Copyright year: 2012
Publisher: O'Reilly Media, Incorporated
Publication date: 10/28/2013
Binding: Paperback
Pages: 250
Size: 6.75" wide x 9.00" long x 0.50" tall
Weight: 0.660
Language: English

Mining data requires a deep investment in people and time. How can you be sure you’re building the right models? What tools help you connect with the customer’s needs? With this hands-on book, you’ll learn a flexible toolset and methodology for building effective analytics applications.

Agile Data shows you how to create an environment for exploring data, using lightweight tools such as Ruby, Python, Apache Pig, and the D3.js (Data-Driven Documents) JavaScript library. You’ll learn an iterative approach that allows you to quickly change the kind of analysis you’re doing, as you discover what the data is telling you. All the example code in this book is available as working Heroku apps.

Build an application to mine your own email inbox
Use several data structures to extract multiple features from a single dataset, and learn how different perspectives can yield insight
Rapidly boot your applications as simple front-ends to key/value stores
Add features driven by descriptive and inferential statistics, machine learning, and data visualization
Gather usage data and talk to real users to help guide your data-driven exploration

You can provide constructive comments on the manuscript through O’Reilly’s Open Feedback Publishing System (OFPS). Learn more at http://labs.oreilly.com/ofps.html.

Russell Jurney cut his data teeth in casino gaming, building web apps to analyze the performance of slot machines in the US and Mexico. After dabbling in entrepreneurship, interactive media, and journalism, he moved to Silicon Valley to build analytics applications at scale at Ning and LinkedIn. He lives on the ocean in Pacifica, California, with his wife Kate and two fuzzy dogs.

Preface
Setup
Theory
Agile Big Data
Big Words Defined
Agile Big Data Teams
Recognizing the Opportunity and Problem
Adapting to Change
Agile Big Data Process
Code Review and Pair Programming
Agile Environments: Engineering Productivity
Collaboration Space
Private Space
Personal Space
Realizing Ideas with Large-Format Printing
Data
Email
Working with Raw Data
Raw Email
Structured Versus Semistructured Data
SQL
NoSQL
Serialization
Extracting and Exposing Features in Evolving Schemas
Data Pipelines
Data Perspectives
Networks
Time Series
Natural Language
Probability
Conclusion
Agile Tools
Scalability = Simplicity
Agile Big Data Processing
Setting Up a Virtual Environment for Python
Serializing Events with Avro
Avro for Python
Collecting Data
Data Processing with Pig
Installing Pig
Publishing Data with MongoDB
Installing MongoDB
Installing MongoDB's Java Driver
Installing mongo-hadoop
Pushing Data to MongoDB from Pig
Searching Data with ElasticSearch
Installation
ElasticSearch and Pig with Wonderdog
Reflecting on our Workflow
Lightweight Web Applications
Python and Flask
Presenting Our Data
Installing Bootstrap
Booting Bootstrap
Visualizing Data with D3.js and nvd3.js
Conclusion
To the Cloud!
Introduction
GitHub
dotCloud
Echo on dotCloud
Python Workers
Amazon Web Services
Simple Storage Service
Elastic MapReduce
MongoDB as a Service
Instrumentation
Google Analytics
Mortar Data
Climbing the Pyramid
Collecting and Displaying Records
Putting It All Together
Collect and Serialize Our Inbox
Process and Publish Our Emails
Presenting Emails in a Browser
Serving Emails with Flask and pymongo
Rendering HTML5 with Jinja2
Agile Checkpoint
Listing Emails
Listing Emails with MongoDB
Anatomy of a Presentation
Searching Our Email
Indexing Our Email with Pig, ElasticSearch, and Wonderdog
Searching Our Email on the Web
Conclusion
Visualizing Data with Charts
Good Charts
Extracting Entities: Email Addresses
Extracting Emails
Visualizing Time
Conclusion
Exploring Data with Reports
Building Reports with Multiple Charts
Linking Records
Extracting Keywords from Emails with TF-IDF
Conclusion
Making Predictions
Predicting Response Rates to Emails
Personalization
Conclusion
Driving Actions
Properties of Successful Emails
Better Predictions with Naive Bayes
P(Reply From & To)
P(Reply Token)
Making Predictions in Real Time
Logging Events
Conclusion
Index
