Large-Scale Graph Processing Using Apache Giraph
Sherif Sakr, Faisal Moeen Orakzai, Ibrahim Abdelaziz, Zuhair Khayyat
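- Publisher: Springer
- Publication date: 2017-01-12
- List price: $2,610
- VIP price: $2,480 (5% off)
- Language: English
- Pages: 197
- Binding: Hardcover
- ISBN: 3319474308
- ISBN-13: 9783319474304
Overseas special-order title (must be checked out separately)
Product description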
This book takes its reader on a journey through Apache Giraph, a popular distributed graph processing platform designed to bring the power of big data processing to graph data. Designed as a step-by-step self-study guide for everyone interested in large-scale graph processing, it describes the fundamental abstractions of the system, its programming models and various techniques for using the system to process graph data at scale, including the implementation of several popular and advanced graph analytics algorithms.
The book is organized as follows: Chapter 1 starts by providing a general background of the big data phenomenon and a general introduction to the Apache Giraph system, its abstraction, programming model and design architecture. Next, Chapter 2 focuses on Giraph as a platform and how to use it. Using a sample job, it also explains more advanced topics such as monitoring the Giraph application lifecycle and the different methods for monitoring Giraph jobs. Chapter 3 then provides an introduction to Giraph programming, introduces the basic Giraph graph model and explains how to write Giraph programs. In turn, Chapter 4 discusses in detail the implementation of some popular graph algorithms including PageRank, connected components, shortest paths and triangle closing. Chapter 5 focuses on advanced Giraph programming, discussing common Giraph algorithmic optimizations, tunable Giraph configurations that determine the system’s utilization of the underlying resources, and how to write a custom graph input and output format. Lastly, Chapter 6 highlights two systems that have been introduced to tackle the challenge of large-scale graph processing, GraphX and GraphLab, and explains the main commonalities and differences between these systems and Apache Giraph.
This book serves as an essential reference guide for students, researchers and practitioners in the domain of large-scale graph processing. It offers step-by-step guidance, with several code examples and the complete source code available in the related GitHub repository. Students will find a comprehensive introduction to and hands-on practice with tackling large-scale graph processing problems using the Apache Giraph system, while researchers will discover thorough coverage of the emerging and ongoing advancements in big graph processing systems.