Machine Learning with Spark Second Edition
Provisional Chinese title: 使用 Spark 的機器學習 (第二版)
Rajdeep Dua, Manpreet Singh Ghotra, Nick Pentreath
- Publisher: Packt Publishing
- Publication date: 2017-04-28
- List price: $1,570
- Sale price: $1,256 (20% off)
- Language: English
- Pages: 532
- Binding: Paperback
- ISBN: 1785889931
- ISBN-13: 9781785889936
- Related categories: Spark, Machine Learning
- Related translation: Spark機器學習 (第2版) (Simplified Chinese edition)
Product Description
Key Features
- Get to grips with the latest version of Apache Spark
- Utilize Spark's machine learning library to implement predictive analytics
- Leverage Spark's powerful tools to load, analyze, clean, and transform your data
Book Description
Spark ML is the machine learning module of Spark. It uses in-memory RDDs to speed up machine learning tasks such as clustering, classification, and regression.
This book will teach you about popular machine learning algorithms and their implementation. You will learn how various machine learning concepts are implemented in the context of Spark ML. You will start by installing Spark on single-node and multinode clusters. Next, you'll see how to execute Scala- and Python-based programs for Spark ML. Then you will work with a few datasets and go deeper into clustering, classification, and regression. Toward the end, you will also cover text processing using Spark ML.
Once you have learned the concepts, you can apply them either to implement algorithms in green-field projects or to migrate existing systems to this new platform. You can migrate from Mahout or scikit-learn to Spark ML.
What you will learn
- Get hands-on with the latest version of Spark ML
- Create your first Spark program with Scala and Python
- Set up and configure a development environment for Spark on your own computer, as well as on Amazon EC2
- Access public machine learning datasets and use Spark to load, process, clean, and transform data
- Use Spark's machine learning library to implement programs by utilizing well-known machine learning models
- Deal with large-scale text data, including feature extraction and using text data as input to your machine learning models
- Write Spark functions to evaluate the performance of your machine learning models