Mastering Apache Spark 2.x - Second Edition
Romeo Kienzler
- Publisher: Packt Publishing
- Publication date: 2017-07-20
- List price: $1,650
- Sale price: $1,320 (20% off)
- Language: English
- Pages: 354
- Binding: Paperback
- ISBN: 1786462745
- ISBN-13: 9781786462749
- Related categories: Spark
Product Description
Advanced analytics on your Big Data with the latest Apache Spark 2.x
About This Book
- An advanced guide that combines instructions and practical examples to help you extend the most up-to-date Spark functionality.
- Extend your data processing capabilities to process huge chunks of data in minimum time using advanced concepts in Spark.
- Master the art of real-time processing with the help of Apache Spark 2.x.
Who This Book Is For
If you are a developer with some experience with Spark and want to strengthen your knowledge of how to get around in the world of Spark, then this book is ideal for you. Basic knowledge of Linux, Hadoop and Spark is assumed. Reasonable knowledge of Scala is expected.
What You Will Learn
- Examine advanced Machine Learning and Deep Learning with MLlib, SparkML, SystemML, H2O and DeepLearning4J
- Study highly optimised unified batch and real-time data processing using SparkSQL and Structured Streaming
- Evaluate large-scale Graph Processing and Analysis using GraphX and GraphFrames
- Apply Apache Spark in elastic deployments using Jupyter and Zeppelin Notebooks, Docker, Kubernetes and the IBM Cloud
- Understand internal details of the cost-based optimizers used in Catalyst, SystemML and GraphFrames
- Learn how specific parameter settings affect overall performance of an Apache Spark cluster
- Leverage Scala, R and Python for your data science projects
In Detail
Apache Spark is an in-memory cluster-based parallel processing system that provides a wide range of functionalities such as graph processing, machine learning, stream processing, and