Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset

Michael Frampton

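  • Publisher: Apress
  • Publication date: 2014-12-24
  • List price: $2,000
  • VIP price: $1,900 (95% of list price)
  • Language: English
  • Pages: 392
  • Binding: Paperback
  • ISBN: 1484200950
  • ISBN-13: 9781484200957
  • Related categories: Hadoop, Big Data
  • Overseas special-order title (must be checked out separately)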

Product description

Many corporations are finding that their data sets are outgrowing the capability of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system.

As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers a scalable, fault-tolerant system for storing and processing data in parallel. It has a very rich toolset that allows for storage (Hadoop), configuration (YARN and ZooKeeper), collection (Nutch and Solr), processing (Storm, Pig, and MapReduce), scheduling (Oozie), moving (Sqoop and Avro), monitoring (Chukwa, Ambari, and Hue), testing (Bigtop), and analysis (Hive).

The problem is that the Internet offers IT pros wading into big data many versions of the truth and some outright falsehoods born of ignorance. What is needed is a book just like this one: a wide-ranging but easily understood set of instructions to explain where to get Hadoop tools, what they can do, how to install them, how to configure them, how to integrate them, and how to use them successfully. And you need an expert who has worked in this area for a decade—someone just like author and big data expert Mike Frampton.

Big Data Made Easy approaches the problem of managing massive data sets from a systems perspective. It explains the roles within each project (architect and tester, for example) and shows how the Hadoop toolset can be used at each system stage. It explains, in an easily understood manner and through numerous examples, how to use each tool. The book also explains the sliding scale of tools available depending upon data size, and when and how to use them. Big Data Made Easy shows developers and architects, as well as testers and project managers, how to:

  • Store big data
  • Configure big data
  • Process big data
  • Schedule processes
  • Move data among SQL and NoSQL systems
  • Monitor data
  • Perform big data analytics
  • Report on big data processes and projects
  • Test big data systems

Big Data Made Easy also explains the best part, which is that this toolset is free. Anyone can download it and—with the help of this book—start to use it within a day. With the skills this book will teach you under your belt, you will add value to your company or client immediately, not to mention your career.
