Mastering Apache Spark 2.x - Second Edition
Romeo Kienzler
- Publisher: Packt Publishing
- Publication date: 2017-07-20
- List price: $1,650
- Sale price: 20% off, $1,320
- Language: English
- Pages: 354
- Binding: Paperback
- ISBN: 1786462745
- ISBN-13: 9781786462749
- Related categories: Spark
Ships immediately (stock: 1)
Product Description
Advanced analytics on your Big Data with the latest Apache Spark 2.x
About This Book
- An advanced guide that combines instructions and practical examples to extend the most up-to-date Spark functionality.
- Extend your data processing capabilities to process huge chunks of data in minimal time using advanced concepts in Spark.
- Master the art of real-time processing with the help of Apache Spark 2.x.
Who This Book Is For
If you are a developer with some experience with Spark and want to strengthen your knowledge of how to get around in the world of Spark, then this book is ideal for you. Basic knowledge of Linux, Hadoop and Spark is assumed. Reasonable knowledge of Scala is expected.
What You Will Learn
- Examine advanced Machine Learning and Deep Learning with MLlib, SparkML, SystemML, H2O and DeepLearning4J
- Study highly optimised unified batch and real-time data processing using SparkSQL and Structured Streaming
- Evaluate large-scale Graph Processing and Analysis using GraphX and GraphFrames
- Apply Apache Spark in Elastic deployments using Jupyter and Zeppelin Notebooks, Docker, Kubernetes and the IBM Cloud
- Understand the internal details of the cost-based optimizers used in Catalyst, SystemML and GraphFrames
- Learn how specific parameter settings affect the overall performance of an Apache Spark cluster
- Leverage Scala, R and Python for your data science projects
In Detail
Apache Spark is an in-memory cluster-based parallel processing system that provides a wide range of functionalities such as graph processing, machine learning, stream processing, and