Advanced Analytics with Spark: Patterns for Learning from Data at Scale

ISBN-10: 1491912766

ISBN-13: 9781491912768

Edition: 2015

Authors: Sandy Ryza, Uri Laserson, Sean Owen, Josh Wills

List price: $39.99

Description:

Apache Spark is emerging as one of the most popular technologies for performing analytics on huge datasets, and this practical guide shows you how to harness Spark’s power for approaching a variety of analytics problems. You’ll learn how to apply common techniques, such as classification, clustering, collaborative filtering, anomaly detection, dimensionality reduction, and Monte Carlo simulation, to fields such as genomics, security, and finance. Advanced Analytics with Spark supplies complete implementations that analyze large public datasets, and acts as an introduction to using these techniques and other best practices in Spark programming. Become familiar with the Spark programming model…

Book details

Copyright year: 2015
Publisher: O'Reilly Media, Incorporated
Publication date: 4/14/2015
Binding: Paperback
Pages: 276
Size: 7.13" wide x 9.17" long x 0.56" tall
Weight: 1.012
Language: English

Sandy Ryza is a data scientist at Cloudera and an active contributor to the Apache Spark project. He recently led Spark development at Cloudera and now spends his time helping customers with a variety of analytic use cases on Spark. He is also a member of the Hadoop Project Management Committee.

Uri Laserson is a data scientist at Cloudera, where he focuses on Python in the Hadoop ecosystem. He also helps customers deploy Hadoop on a wide range of problems, focusing on life sciences and health care. Previously, Uri cofounded Good Start Genetics, a next-generation diagnostics company, while working toward a PhD in biomedical engineering at MIT.

Sean Owen has been a practicing software engineer for 9 years, most recently at Google, where he helped build and launch Mobile Web search. He joined Apache's Mahout machine learning project in 2008 as a primary committer and works as a Mahout consultant.

Robin Anil joined Apache's Mahout project as a Google Summer of Code student in 2008 and contributed to the Classifier and Frequent Pattern Mining packages with algorithms that run on the Hadoop Map/Reduce platform. Since 2009, he has been a committer at Mahout and works as a full-time Software Engineer at Google.

Josh Wills is Cloudera's Senior Director of Data Science, working with customers and engineers to develop Hadoop-based solutions across a wide range of industries. He is the founder and VP of the Apache Crunch project for creating optimized MapReduce and Spark pipelines in Java. Prior to joining Cloudera, Josh worked at Google, where he worked on the ad auction system and then led the development of the analytics infrastructure used in Google+.