Hadoop is an open source MapReduce platform designed to query and analyze data distributed across large clusters. Especially effective for big data systems, Hadoop powers mission-critical software at Apple, eBay, LinkedIn, Yahoo, and Facebook. It offers developers handy ways to store, manage, and analyze data.
Hadoop in Practice collects 85 battle-tested examples and presents them in a problem/solution format. It balances conceptual foundations with practical recipes for key problem areas like data ingress and egress, serialization, and LZO compression. You'll explore each technique step by step, learning how to build a specific solution along with the thinking that went into it. As a bonus, the book's examples create a well-structured and understandable codebase you can tweak to meet your own needs.
Table of Contents
PART 1 BACKGROUND AND FUNDAMENTALS
Hadoop in a heartbeat
PART 2 DATA LOGISTICS
Moving data in and out of Hadoop
Data serialization: working with text and beyond
PART 3 BIG DATA PATTERNS
Applying MapReduce patterns to big data
Streamlining HDFS for big data
Diagnosing and tuning performance problems
more
Predictive analytics with Mahout
DREAMTECH PRESS
19-A, Ansari Road, Daryaganj New Delhi-110 002, INDIA Tel: +91-11-2324 3463-73, Fax: +91-11-2324 3078 Email: feedback@dreamtechpress.com Website: www.dreamtechpress.com
Regional Offices: Bangalore: Tel: +91-80-2313 2383, Fax: +91-80-2312 4319, Email: blrsales@wiley.com Mumbai: Tel: +91-22-2788 9263, 2788 9272, Telefax: +91-22-2788 9263, Email: mumsales@wiley.com