Machine Learning with Apache Spark Quick Start Guide: Uncover patterns, derive actionable insights, and learn from big data using MLlib

Jillur Quddus

  • Publisher: Packt Publishing
  • Publication date: 2018-12-22
  • List price: $1,400
  • VIP price: $1,330 (5% off)
  • Language: English
  • Pages: 240
  • Binding: Paperback
  • ISBN: 1789346568
  • ISBN-13: 9781789346565
  • Related categories: Spark, Big Data, Machine Learning
  • Overseas special-order title (requires separate checkout)

Product Description

Combine advanced analytics, including Machine Learning, Deep Learning Neural Networks, and Natural Language Processing, with modern scalable technologies, including Apache Spark, to derive actionable insights from Big Data in real time.

Key Features

  • Make a hands-on start in the fields of Big Data, Distributed Technologies and Machine Learning
  • Learn how to design, develop and interpret the results of common Machine Learning algorithms
  • Uncover hidden patterns in your data in order to derive real actionable insights and business value

Book Description

Every person and every organization in the world manages data, whether they realize it or not. Data is used to describe the world around us and can be used for almost any purpose, from analyzing consumer habits to fighting disease and serious organized crime. Ultimately, we manage data in order to derive value from it, and many organizations around the world have traditionally invested in technology to help process their data faster and more efficiently.

But we now live in an interconnected world driven by mass data creation and consumption where data is no longer rows and columns restricted to a spreadsheet, but an organic and evolving asset in its own right. With this realization come major challenges for organizations: how do we manage the sheer volume of data being created every second (think not only spreadsheets and databases, but also social media posts, images, videos, music, blogs and so on)? And once we can manage all of this data, how do we derive real value from it?

The focus of Machine Learning with Apache Spark is to help us answer these questions in a hands-on manner. We introduce the latest scalable technologies to help us manage and process big data. We then introduce advanced analytical algorithms applied to real-world use cases in order to uncover patterns, derive actionable insights, and learn from this big data.

What you will learn

  • Understand how Spark fits in the context of the big data ecosystem
  • Understand how to deploy and configure a local development environment using Apache Spark
  • Understand how to design supervised and unsupervised learning models
  • Build models to perform NLP, deep learning, and cognitive services using Spark ML libraries
  • Design real-time machine learning pipelines in Apache Spark
  • Become familiar with advanced techniques for processing a large volume of data by applying machine learning algorithms

Who this book is for

This book is aimed at Business Analysts, Data Analysts and Data Scientists who wish to make a hands-on start in order to take advantage of modern Big Data technologies combined with Advanced Analytics.

Table of Contents

  1. The Big Data Ecosystem
  2. Setting up a Local Development Environment
  3. Artificial Intelligence and Machine Learning
  4. Supervised Learning Using Apache Spark
  5. Unsupervised Learning Using Apache Spark
  6. Natural Language Processing Using Apache Spark
  7. Deep Learning Using Apache Spark
  8. Real-Time Machine Learning Using Apache Spark
