Modern Scala Projects

Ilango Gurusamy

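  • Publisher: Packt Publishing
  • Publication date: 2018-07-30
  • Price: $2,180
  • Member price: $2,071 (5% off)
  • Language: English
  • Pages: 334
  • Binding: Paperback
  • ISBN: 1788624114
  • ISBN-13: 9781788624114
  • Related category: JVM languages
  • Overseas special-order title (requires separate checkout)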

Product description

Develop robust, Scala-powered projects with the help of machine learning libraries such as SparkML to harvest meaningful insight

Key Features

  • Gain hands-on experience in building data science projects with Scala
  • Exploit powerful functionalities of machine learning libraries
  • Use machine learning algorithms and decision tree models for enterprise apps

Book Description

Scala, together with the Spark framework, forms a rich and powerful data processing ecosystem. Modern Scala Projects is a journey into the depths of this ecosystem. The machine learning (ML) projects presented in this book enable you to create practical, robust data analytics solutions, with an emphasis on automating data workflows with the Spark ML pipeline API. The book carefully selects from Scala's functional libraries and other constructs to help readers roll out their own scalable data processing frameworks. Its projects enable data practitioners across all industries to gain insights into data that will help organizations achieve a strategic and competitive advantage.

Modern Scala Projects focuses on the application of supervised learning ML techniques that classify data and make predictions. You'll begin by working on a project to predict a class of flower by implementing a simple machine learning model. Next, you'll create a cancer diagnosis classification pipeline, followed by projects delving into stock price prediction, spam filtering, fraud detection, and a recommendation engine.

By the end of this book, you will be able to build efficient data science projects that fulfil your software requirements.

What you will learn

  • Create pipelines to extract data or to produce analytics and visualizations
  • Automate your process pipeline with jobs that are reproducible
  • Extract intelligent data efficiently from large, disparate datasets
  • Automate the extraction, transformation, and loading of data
  • Develop tools that collate, model, and analyze data
  • Maintain the integrity of data as data flows become more complex
  • Develop tools that predict outcomes based on “pattern discovery”
  • Build fast and accurate machine learning models in Scala

Who this book is for

Modern Scala Projects is for Scala developers who would like to gain hands-on experience with interesting real-world projects. Prior programming experience with Scala is necessary.

Table of Contents

  1. Predict the class of a flower from the Iris Dataset
  2. Build a Breast Cancer Prognosis Pipeline with the Power of Spark and Scala
  3. Stock Price Predictions
  4. Build a Spam Classification Pipeline
  5. Build a Fraud Detection System
  6. Build Flights Performance Prediction Model
  7. Building a Recommendation Engine
