Field Guide to Hadoop: An Introduction to Hadoop, Its Ecosystem, and Aligned Technologies (Paperback)

Kevin Sitto, Marshall Presser

  • Publisher: O'Reilly
  • Publication date: 2015-04-21
  • List price: $1,320
  • Sale price: $1,056 (20% off)
  • Language: English
  • Pages: 132
  • Binding: Paperback
  • ISBN: 1491947934
  • ISBN-13: 9781491947937
  • Category: Hadoop
  • Ships immediately (limited quantity; 4 in stock)

Product description

If your organization is about to enter the world of big data, you not only need to decide whether Apache Hadoop is the right platform to use, but also which of its many components are best suited to your task. This field guide makes the exercise manageable by breaking down the Hadoop ecosystem into short, digestible sections. You’ll quickly understand how Hadoop’s projects, subprojects, and related technologies work together.

Each chapter introduces a different topic—such as core technologies or data transfer—and explains why certain components may or may not be useful for particular needs. When it comes to data, Hadoop is a whole new ballgame, but with this handy reference, you’ll have a good grasp of the playing field.

Topics include:

  • Core technologies: Hadoop Distributed File System (HDFS), MapReduce, YARN, and Spark
  • Database and data management: Cassandra, HBase, MongoDB, and Hive
  • Serialization: Avro, JSON, and Parquet
  • Management and monitoring: Puppet, Chef, ZooKeeper, and Oozie
  • Analytic helpers: Pig, Mahout, and MLlib
  • Data transfer: Sqoop, Flume, distcp, and Storm
  • Security, access control, auditing: Sentry, Kerberos, and Knox
  • Cloud computing and virtualization: Serengeti, Docker, and Whirr
