Data-Centric Machine Learning with Python: The ultimate guide to engineering and deploying high-quality models based on good data
Jonas Christensen, Nakul Bajaj, Manmohan Gosada
- Publisher: Packt Publishing
- Publication date: 2024-02-29
- Price: $2,040
- VIP price: 5% off, $1,938
- Language: English
- Pages: 378
- Binding: Quality Paper (also called trade paper)
- ISBN: 1804618128
- ISBN-13: 9781804618127
Related categories:
Python, Programming Languages, Machine Learning
Overseas special-order book (requires separate checkout)
Product Description
Join the data-centric revolution and master the concepts, techniques, and algorithms shaping the future of AI and ML development, using Python
Key Features:
- Grasp the principles of data centricity and apply them to real-world scenarios
- Gain experience with quality data collection, labeling, and synthetic data creation using Python
- Develop essential skills for building reliable, responsible, and ethical machine learning solutions
- Purchase of the print or Kindle book includes a free PDF eBook
Book Description:
In the rapidly advancing data-driven world where data quality is pivotal to the success of machine learning and artificial intelligence projects, this critically timed guide provides a rare, end-to-end overview of data-centric machine learning (DCML), along with hands-on applications of technical and non-technical approaches to generating deeper and more accurate datasets.
This book will help you understand what data-centric ML/AI is and how it can help you to realize the potential of 'small data'. Delving into the building blocks of data-centric ML/AI, you'll explore the human aspects of data labeling, tackle ambiguity in labeling, and understand the role of synthetic data. From strategies to improve data collection to techniques for refining and augmenting datasets, you'll learn everything you need to elevate your data-centric practices. Through applied examples and insights for overcoming challenges, you'll get a roadmap for implementing data-centric ML/AI in diverse applications in Python.
By the end of this book, you'll have a thorough understanding of data-centric ML/AI and be able to integrate common data-centric approaches into the model development lifecycle, prioritizing data quality and reliability to get the most out of your machine learning projects.
What You Will Learn:
- Understand the impact of input data quality compared to model selection and tuning
- Recognize the crucial role of subject-matter experts in effective model development
- Implement data cleaning, labeling, and augmentation best practices
- Explore common synthetic data generation techniques and their applications
- Apply synthetic data generation techniques using common Python packages
- Detect and mitigate bias in a dataset using best-practice techniques
- Understand the importance of reliability, responsibility, and ethical considerations in ML/AI
Who this book is for:
This book is for data science professionals and machine learning enthusiasts looking to understand the concept of data-centricity, its benefits over a model-centric approach, and the practical application of a best-practice data-centric approach in their work. This book is also for other data professionals and senior leaders who want to explore the tools and techniques to improve data quality and create opportunities for small data ML/AI in their organizations.