Apache Flume: Distributed Log Collection for Hadoop (What You Need to Know)

Name: Apache Flume: Distributed Log Collection for Hadoop (What You Need to Know)
Price: 1625 TWD
Availability: OnlineOnly
Author: Steve Hoffman
ISBN: 1782167919

Publisher: Packt Publishing
Publication date: 2013-07-04
List price: $1,710
Member price: 5% off, $1,625
Language: English
Pages: 108
Binding: Paperback
ISBN: 1782167919
ISBN-13: 9781782167914
Related category: Hadoop

Overseas special-order title (must be checked out separately)

Product description

If your role includes moving datasets into Hadoop, this book will help you do it more efficiently using Apache Flume. From installation to customization, it's a complete step-by-step guide on making the service work for you.

Overview

Integrate Flume with your data sources
Transcode your data en route in Flume
Route and separate your data using regular expression matching
Configure failover paths and load balancing to remove single points of failure
Use Gzip compression for files written to HDFS
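
As a rough illustration of the regular-expression routing mentioned in the list above, the following Python sketch matches each record against a pattern and uses the captured value to pick a destination. It shows the idea only and is not Flume code or Flume's API; the names `LEVEL_PATTERN`, `ROUTES`, `DEFAULT_ROUTE`, and `route`, along with the destination names, are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical pattern: capture a log level at the start of each record.
LEVEL_PATTERN = re.compile(r"^(ERROR|WARN|INFO)\b")

# Hypothetical mapping from a captured value to a destination name.
ROUTES = {"ERROR": "errors", "WARN": "warnings"}
DEFAULT_ROUTE = "everything_else"


def route(records):
    """Group records by destination, chosen by a regex match on the payload."""
    destinations = defaultdict(list)
    for record in records:
        match = LEVEL_PATTERN.match(record)
        key = match.group(1) if match else None
        # Unmatched records and unmapped levels fall through to the default.
        destinations[ROUTES.get(key, DEFAULT_ROUTE)].append(record)
    return dict(destinations)
```

Records that match no pattern, or whose captured value has no explicit mapping, land in the default destination, so nothing is silently dropped.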

In Detail

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows. It is robust and fault tolerant with many failover and recovery mechanisms.

Apache Flume: Distributed Log Collection for Hadoop covers the problems of writing streaming data and logs to HDFS and how Flume resolves them. It explains Flume's general architecture, including moving data to and from databases and NoSQL-style data stores, as well as performance tuning. It also includes real-world Flume implementation scenarios.

Apache Flume: Distributed Log Collection for Hadoop starts with an architectural overview of Flume and then discusses each component in detail. It guides you through the complete process of installing and compiling Flume.

It introduces channels and channel selectors. For each architectural component (sources, channels, sinks, channel processors, sink groups, and so on), it covers the available implementations and their configuration options in detail, so you can customize Flume to your specific needs. It also gives pointers on writing your own custom implementations.

By the end, you should be able to construct a series of Flume agents to transport your streaming data and logs from your systems into Hadoop in near real time.

What you will learn from this book

Understand the Flume architecture
Download and install open source Flume from Apache
Discover when to use a memory or file-backed channel
Understand and configure the Hadoop File System (HDFS) sink
Learn how to use sink groups to create redundant data flows
Configure and use various sources for ingesting data
Inspect data records and route them to different or multiple destinations based on payload content
Transform data en route to Hadoop
Monitor your data flows
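
The redundancy idea behind sink groups and failover paths can be sketched in a few lines of Python: try destinations in priority order and fall back to the next one when delivery fails. This is a conceptual sketch under stated assumptions, not Flume's implementation; `deliver_with_failover`, `unavailable`, and the sink names are hypothetical, and a failed delivery is assumed to raise `ConnectionError`.

```python
def unavailable(event):
    """Hypothetical sink that always fails, standing in for a dead host."""
    raise ConnectionError("sink is down")


def deliver_with_failover(event, sinks):
    """Try each (name, priority, send) sink, highest priority first.

    Returns the name of the first sink that accepted the event.
    """
    ordered = sorted(sinks, key=lambda sink: sink[1], reverse=True)
    for name, _priority, send in ordered:
        try:
            send(event)
            return name
        except ConnectionError:
            continue  # Assumed failure signal: move on to the next sink.
    raise RuntimeError("all sinks failed")


# Usage: the primary sink is down, so the event lands in the backup.
received = []
sinks = [("backup", 5, received.append), ("primary", 10, unavailable)]
chosen = deliver_with_failover("event-1", sinks)
```

With only one destination, a single outage stops the flow; adding a lower-priority backup removes that single point of failure at the cost of running a second sink.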

Approach

A starter guide that covers Apache Flume in detail.

Who this book is written for

Apache Flume: Distributed Log Collection for Hadoop is intended for people responsible for moving datasets into Hadoop in a timely and reliable manner, such as software engineers, database administrators, and data warehouse administrators.
