Apache Spark 2.x Machine Learning Cookbook

Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei

Product Description

Simplify machine learning model implementations with Spark

About This Book

  • Solve the day-to-day problems of data science with Spark
  • This unique cookbook consists of exciting and intuitive numerical recipes
  • Optimize your work by acquiring, cleaning, analyzing, predicting, and visualizing your data

Who This Book Is For

This book is for Scala developers who have fairly good exposure to and understanding of machine learning techniques but lack practical experience implementing them with Spark. A solid knowledge of machine learning algorithms is assumed, as well as hands-on experience implementing ML algorithms with Scala. However, you do not need to be acquainted with the Spark ML libraries and ecosystem.

What You Will Learn

  • See how Scala and Spark work hand in hand when developing ML systems
  • Build a recommendation engine that scales with Spark
  • Find out how to build unsupervised clustering systems to group data in Spark
  • Build machine learning systems with decision tree and ensemble models in Spark
  • Deal with the curse of dimensionality in big data using Spark
  • Implement text analytics for search engines in Spark
  • Implement streaming machine learning systems using Spark

In Detail

Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability, and optimization. Learning algorithms enable a wide range of applications, from everyday tasks such as product recommendations and spam filtering to cutting-edge applications such as self-driving cars and personalized medicine. You will gain hands-on experience applying these principles using Apache Spark, a resilient cluster computing system well suited for large-scale machine learning tasks.

This book begins with a quick overview of setting up the IDEs needed to run the code examples covered in the various chapters. It also highlights some key issues developers face while working with machine learning algorithms on the Spark platform. We then explore the various Spark APIs and implement ML algorithms by developing classification systems, recommendation engines, text analytics, clustering, and learning systems. Toward the final chapters, we'll focus on building high-end applications and explain various unsupervised methodologies and the challenges to tackle when implementing big data ML systems.

Style and approach

This book is packed with intuitive recipes supported by line-by-line explanations to help you understand how to optimize your workflow and resolve problems when working with complex data modeling tasks and predictive algorithms. This is a valuable resource for data scientists and those working on large-scale data projects.
