Spark Cookbook (Paperback)
Rishi Yadav
- Publisher: Packt Publishing
- Publication date: 2015-06-19
- List price: $1,470
- Sale price: $1,323 (10% off)
- Language: English
- Pages: 221
- Binding: Paperback
- ISBN: 1783987065
- ISBN-13: 9781783987061
- Related categories: Spark
Ships immediately (1 in stock)
Product Description
Over 60 recipes on Spark, covering Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX libraries
About This Book
- Become an expert at graph processing using GraphX
- Use Apache Spark as your single big data compute platform and master its libraries
- Learn with recipes that can be run on a single machine as well as on a production cluster of thousands of machines
Who This Book Is For
If you are a data engineer, an application developer, or a data scientist who would like to leverage the power of Apache Spark to get better insights from big data, then this is the book for you.
What You Will Learn
- Install and configure Apache Spark with various cluster managers
- Set up development environments
- Perform interactive queries using Spark SQL
- Get to grips with real-time streaming analytics using Spark Streaming
- Master supervised learning and unsupervised learning using MLlib
- Build a recommendation engine using MLlib
- Develop a set of common applications or project types, and solutions that solve complex big data problems
- Use Apache Spark as your single big data compute platform and master its libraries
In Detail
By introducing in-memory persistent storage, Apache Spark eliminates the need to store intermediate data in filesystems, thereby increasing processing speed by up to 100 times.
This book focuses on how to analyze large and complex data sets. It starts with installing and configuring Apache Spark with various cluster managers, then moves on to setting up development environments. Next come recipes for performing interactive queries using Spark SQL and for real-time streaming from sources such as Twitter Stream and Apache Kafka. The book then turns to machine learning, including supervised learning, unsupervised learning, and recommendation engine algorithms. After graph processing using GraphX, it closes with recipes for cluster optimization and troubleshooting.