Big Data for Chimps: A Guide to Massive-Scale Data Processing in Practice (Paperback)

Philip Kromer, Russell Jurney

  • Publisher: O'Reilly
  • Publication date: 2015-11-17
  • List price: $1,320
  • Sale price: $1,056 (20% off)
  • Language: English
  • Pages: 220
  • Binding: Paperback
  • ISBN: 1491923946
  • ISBN-13: 9781491923948
  • Related category: Big Data

Product description

Finding patterns in massive event streams can be difficult, but learning how to find them doesn’t have to be. This unique hands-on guide shows you how to solve this and many other problems in large-scale data processing with simple, fun, and elegant tools that leverage Apache Hadoop. You’ll gain a practical, actionable view of big data by working with real data and real problems.

Perfect for beginners, this book’s approach will also appeal to experienced practitioners who want to brush up on their skills. Part I explains how Hadoop and MapReduce work, while Part II covers many analytic patterns you can use to process any data. As you work through several exercises, you’ll also learn how to use Apache Pig to process data.

  • Learn the necessary mechanics of working with Hadoop, including how data and computation move around the cluster
  • Dive into map/reduce mechanics and build your first map/reduce job in Python
  • Understand how to run chains of map/reduce jobs in the form of Pig scripts
  • Use a real-world dataset—baseball performance statistics—throughout the book
  • Work with examples of several analytic patterns, and learn when and where you might use them
