Active Machine Learning with Python: Refine and elevate data quality over quantity with active learning

Masson-Forsythe, Margaux

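  • Publisher: Packt Publishing
  • Publication date: 2024-03-29
  • List price: $1,840
  • VIP price: 5% off, $1,748
  • Language: English
  • Pages: 176
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1835464947
  • ISBN-13: 9781835464946
  • Related categories: Python programming language, Machine Learning
  • Overseas special-order book (requires separate checkout)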

Product description

Use active machine learning with Python to improve the accuracy of predictive models, streamline the data analysis process, and adapt to evolving data trends, fostering innovation and progress across diverse fields

Key Features

  • Learn how to implement a pipeline for optimal model creation from large datasets and at lower costs
  • Gain profound insights within your data while achieving greater efficiency and speed
  • Apply your knowledge to real-world use cases and solve complex ML problems
  • Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Building accurate machine learning models requires quality data, and lots of it. However, for most teams, assembling massive datasets is time-consuming, expensive, or downright impossible. Written by Margaux Masson-Forsythe, a seasoned ML engineer and advocate for surgical data science and climate AI advancements, this hands-on guide to active machine learning demonstrates how to train robust models with just a fraction of the data using Python's powerful active learning tools.

You’ll master the fundamental techniques of active learning, such as membership query synthesis, stream-based sampling, and pool-based sampling, and gain insights for designing and implementing active learning algorithms with query strategy and human-in-the-loop frameworks. Exploring various active machine learning techniques, you’ll learn how to enhance the performance of computer vision models for tasks such as image classification, object detection, and semantic segmentation. You’ll also delve into an active ML method for selecting the most informative frames to label in large videos, which addresses duplicated data. Finally, you’ll assess the effectiveness and efficiency of active machine learning systems through performance evaluation.
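
To make pool-based sampling concrete, here is a minimal sketch of one common query strategy, least-confidence uncertainty sampling. It is not taken from the book: the function name `least_confidence_query` and the probability values are hypothetical and chosen only for illustration. The sketch assumes a model has already produced class probabilities for every sample in the unlabeled pool.

```python
import numpy as np


def least_confidence_query(probs, n_queries=1):
    """Pick the pool samples whose most likely class has the lowest probability.

    probs: array of shape (n_samples, n_classes) with predicted class
    probabilities for the unlabeled pool (assumed to come from any classifier).
    """
    probs = np.asarray(probs, dtype=float)
    # Least-confidence score: 1 minus the probability of the top class.
    uncertainty = 1.0 - probs.max(axis=1)
    # Most uncertain first; a stable sort keeps ties in pool order.
    order = np.argsort(-uncertainty, kind="stable")
    return order[:n_queries]


# Hypothetical predictions for a pool of four unlabeled samples.
pool_probs = np.array([
    [0.95, 0.05],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.51, 0.49],
])

# Indices of the two samples to send to a human labeler.
query_idx = least_confidence_query(pool_probs, n_queries=2)
print(query_idx)
```

In a full loop, the selected samples would be labeled, moved from the pool into the training set, and the model retrained before the next query round.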

By the end of the book, you’ll be able to enhance your active learning projects by leveraging Python libraries, frameworks, and commonly used tools.

What you will learn

  • Master the fundamentals of active machine learning
  • Understand query strategies for optimal model training with minimal data
  • Tackle class imbalance, concept drift, and other data challenges
  • Evaluate and analyze active learning model performance
  • Integrate active learning libraries into workflows effectively
  • Optimize workflows for human labelers
  • Explore the finest active learning tools available today

Who this book is for

Ideal for data scientists and ML engineers aiming to maximize model performance while minimizing costly data labeling, this book is your guide to optimizing ML workflows and prioritizing quality over quantity. Whether you’re a technical practitioner or team lead, you’ll benefit from the proven methods presented in this book to slash data requirements and iterate faster.

Basic Python proficiency and familiarity with machine learning concepts such as datasets and convolutional neural networks are all you need to get started.

Table of contents

  1. Introducing Active Machine Learning
  2. Designing Query Strategy Frameworks
  3. Managing the Human in the Loop
  4. Applying Active Learning to Computer Vision
  5. Leveraging Active Learning for Big Data
  6. Evaluating and Enhancing Efficiency
  7. Utilizing Tools and Packages for Active Learning
