Hadoop 2 Essentials: An End-to-End Approach

Dr. Henry H Liu

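  • Publisher: CreateSpace Independ
  • Publication date: 2014-02-09
  • Price: $2,390
  • VIP price: $2,271 (5% off)
  • Language: English
  • Pages: 308
  • Binding: Paperback
  • ISBN: 1495496120
  • ISBN-13: 9781495496127
  • Related category: Hadoop
  • Overseas special-order title (requires separate checkout)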

Product Description

Updated on Feb 22, 2015: All examples have been updated from Hadoop 2.2.0 to 2.6.0, the latest stable version, with only minimal changes. The other major update is a set of detailed instructions for using the free VMware Player virtualization software to build your 4-node Linux YARN cluster on a Windows laptop. A similar procedure is given for building a 4-node Linux YARN cluster with VMware Fusion on a Mac OS X machine.
   
This textbook adopts a unique approach to helping developers and CS students learn Hadoop MapReduce programming quickly on an easy-to-set-up, virtual 4-node Linux YARN cluster running on a Windows or Mac OS X laptop. Rather than being filled with disjointed, piecemeal code snippets that show Hadoop MapReduce programming features one at a time, it places the entire learning process in a single application context: mining customer spending patterns buried in large volumes of credit card transaction records. Precise, end-to-end procedures help you set up your Hadoop MapReduce development environment quickly on Eclipse with Maven on Windows. Step-by-step procedures are also given for setting up a Linux cluster of at least four nodes, so that you can run your MapReduce programs not only in local mode but also in standalone and fully distributed modes on a real cluster. In fact, all MapReduce programs presented in the book have been tested and verified on such a Linux cluster.

This textbook focuses on teaching Hadoop MapReduce programming with a scientific, objective, quantitative approach. Rather than relying heavily on subjective, verbose (and sometimes even pompous) textual descriptions with sparse code snippets, it uses Hadoop Java APIs, Hadoop configuration parameters, and complete MapReduce programs with their execution logs and outputs to demonstrate how the Hadoop MapReduce framework works and how to write MapReduce programs. Specifically, this text covers the following subjects:

  • Introduction to Hadoop
  • Setting up a Linux Hadoop Cluster
  • The Hadoop Distributed FileSystem
  • MapReduce Job Orchestration and Workflows
  • Basic MapReduce Programming
  • Advanced MapReduce Programming
  • Hadoop Streaming
  • Hadoop Administration

No matter what role you play on your team, this text can help you gain truly applicable Hadoop skills effectively and efficiently. The book can also be used as a supplementary textbook for a distributed computing or Hadoop course offered to upper-division CS students.

