Building Machine Learning Powered Applications: Going from Idea to Product

Ameisen, Emmanuel

Description

Learn the skills necessary to design, build, and deploy applications powered by machine learning. Over the course of this hands-on book, you'll build an example ML-driven application from initial idea to deployed product. Data scientists, software engineers, and product managers with little or no ML experience will learn the tools, best practices, and challenges involved in building a real-world ML application step by step.

Author Emmanuel Ameisen, who worked as a data scientist at Zipcar and led Insight Data Science's AI program, demonstrates key ML concepts with code snippets, illustrations, and screenshots from the book's example application.

Part I of this guide shows you how to plan an ML application and measure its success. Part II shows you how to build a working ML model, and Part III explains how to improve the model until it fulfills your original vision. Part IV covers deployment and monitoring strategies.

This book will help you:

  • Determine your product goal and set up a machine learning problem
  • Build your first end-to-end pipeline quickly and acquire an initial dataset (see the sketch after this list)
  • Train and evaluate your ML model and address performance bottlenecks
  • Deploy and monitor models in a production environment
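
To give a concrete flavor of that workflow, here is a minimal sketch in Python of the idea-to-product loop the book walks through: acquire an initial dataset, build the simplest end-to-end pipeline, evaluate it beyond a single accuracy number, and serialize the result for serving. The specific library choices (scikit-learn, joblib) and the public 20 Newsgroups corpus are illustrative assumptions for this sketch, not the book's exact code.

    from sklearn.datasets import fetch_20newsgroups
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline
    import joblib

    # Acquire an initial dataset (a small public text corpus stands in here).
    data = fetch_20newsgroups(subset="train", categories=["sci.med", "sci.space"])
    X_train, X_test, y_train, y_test = train_test_split(
        data.data, data.target, test_size=0.2, random_state=42
    )

    # The simplest end-to-end pipeline: text features plus a simple model.
    pipeline = Pipeline([
        ("features", TfidfVectorizer(max_features=5000)),
        ("model", LogisticRegression(max_iter=1000)),
    ])
    pipeline.fit(X_train, y_train)

    # Evaluate beyond a single accuracy number: per-class precision and recall.
    print(classification_report(y_test, pipeline.predict(X_test)))

    # Serialize the trained pipeline so a serving layer (an API or a batch
    # job) can load, run, and monitor it in production.
    joblib.dump(pipeline, "model_v1.joblib")

Each of the book's four parts deepens one of these steps, from framing the problem before any model code is written to monitoring the pipeline once it is live.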

About the Author

Emmanuel Ameisen has worked for years as a data scientist. He implemented and deployed predictive analytics and machine learning solutions for Local Motion and Zipcar. Recently, Emmanuel led Insight Data Science's AI program, where he oversaw more than a hundred machine learning projects. Emmanuel holds graduate degrees in artificial intelligence, computer engineering, and management from three of France's top schools.

Table of Contents

How to Contact Us
Acknowledgments
I. Find the Correct ML Approach
1. From Product Goal to ML Framing
Estimate What Is Possible
Models
Data
Framing the ML Editor
Trying to Do It All with ML: An End-to-End Framework
The Simplest Approach: Being the Algorithm
Middle Ground: Learning from Our Experience
Monica Rogati: How to Choose and Prioritize ML Projects
Conclusion

2. Create a Plan
Measuring Success
Business Performance
Model Performance
Freshness and Distribution Shift
Speed
Estimate Scope and Challenges
Leverage Domain Expertise
Stand on the Shoulders of Giants
ML Editor Planning
Initial Plan for an Editor
Always Start with a Simple Model
To Make Regular Progress: Start Simple
Start with a Simple Pipeline
Pipeline for the ML Editor
Conclusion

II. Build a Working Pipeline
3. Build Your First End-to-End Pipeline
The Simplest Scaffolding
Prototype of an ML Editor
Parse and Clean Data
Tokenizing Text
Generating Features
Test Your Workflow
User Experience
Modeling Results
ML Editor Prototype Evaluation
Model
User Experience
Conclusion

4. Acquire an Initial Dataset
Iterate on Datasets
Do Data Science
Explore Your First Dataset
Be Efficient, Start Small
Insights Versus Products
A Data Quality Rubric
Label to Find Data Trends
Summary Statistics
Explore and Label Efficiently
Be the Algorithm
Data Trends
Let Data Inform Features and Models
Build Features Out of Patterns
ML Editor Features
Robert Munro: How Do You Find, Label, and Leverage Data?
Conclusion

III. Iterate on Models
5. Train and Evaluate Your Model
The Simplest Appropriate Model
Simple Models
From Patterns to Models
Split Your Dataset
ML Editor Data Split
Judge Performance
Evaluate Your Model: Look Beyond Accuracy
Contrast Data and Predictions
Confusion Matrix
ROC Curve
Calibration Curve
Dimensionality Reduction for Errors
The Top-k Method
Other Models
Evaluate Feature Importance
Directly from a Classifier
Black-Box Explainers
Conclusion

6. Debug Your ML Problems
Software Best Practices
ML-Specific Best Practices
Debug Wiring: Visualizing and Testing
Start with One Example
Test Your ML Code
Debug Training: Make Your Model Learn
Task Difficulty
Optimization Problems
Debug Generalization: Make Your Model Useful
Data Leakage
Overfitting
Consider the Task at Hand
Conclusion

7. Using Classifiers for Writing Recommendations
Extracting Recommendations from Models
What Can We Achieve Without a Model?
Extracting Global Feature Importance
Using a Model’s Score
Extracting Local Feature Importance
Comparing Models
Version 1: The Report Card
Version 2: More Powerful, More Unclear
Version 3: Understandable Recommendations
Generating Editing Recommendations
Conclusion

IV. Deploy and Monitor
8. Considerations When Deploying Models
Data Concerns
Data Ownership
Data Bias
Systemic Bias
Modeling Concerns
Feedback Loops
Inclusive Model Performance
Considering Context
Adversaries
Abuse Concerns and Dual-Use
Chris Harland: Shipping Experiments
Conclusion

9. Choose Your Deployment Option
Server-Side Deployment
Streaming Application or API
Batch Predictions
Client-Side Deployment
On Device
Browser Side
Federated Learning: A Hybrid Approach
Conclusion

10. Build Safeguards for Models
Engineer Around Failures
Input and Output Checks
Model Failure Fallbacks
Engineer for Performance
Scale to Multiple Users
Model and Data Life Cycle Management
Data Processing and DAGs
Ask for Feedback
Chris Moody: Empowering Data Scientists to Deploy Models
Conclusion

11. Monitor and Update Models
Monitoring Saves Lives
Monitoring to Inform Refresh Rate
Monitor to Detect Abuse
Choose What to Monitor
Performance Metrics
Business Metrics
CI/CD for ML
A/B Testing and Experimentation
Other Approaches
Conclusion
Index
