The Definitive Guide to Apache Flink: Next Generation Data Processing
Stefan Papp
- Publisher: Apress
- Publication date: 2016-06-08
- List price: $1,770
- Member price: 5% off, $1,682
- Language: English
- Pages: 400
- Binding: Paperback
- ISBN: 1484214080
- ISBN-13: 9781484214084
Overseas special-order title (must be checked out separately)
Product description
Data processing is one of the core functions of distributed and cloud computing. Data processing engines face high demand for low-latency, high-performance computing, as well as for support of abstract processing methods such as SQL querying, analytic frameworks, and graph processing.
The Definitive Guide to Apache Flink by Papp starts with the history of Big Data processing with Hadoop and explains the shortcomings of MapReduce. It shows how YARN and Hadoop 2.x changed the game and how new technologies started to compete to become the successor to MapReduce.
After detailed coverage of Tez and Spark and how they try to address the shortcomings of MapReduce, the book turns to architectural patterns for building a solid data processing engine, such as advanced pipelining methods and in-memory caching, and shows how Flink uses these concepts.
Flink programming is introduced with a hands-on approach. The book starts with how to create a ten-minute build and how to run a first "Word Count" with Flink, then continues with more advanced topics such as writing more complex programs. All samples are written in Java or Scala.
The book shows that Apache Flink has the potential to become one of the key technologies for distributed computing. Flink aims to replace many small technologies with a single, more powerful one that covers many aspects of Hadoop programming.