Building Data Science Solutions with Anaconda: A comprehensive starter guide to building robust and complete models

Meador, Dan

  • Publisher: Packt Publishing
  • Publication date: 2022-05-27
  • List price: $1,620
  • VIP price: $1,539 (5% off)
  • Language: English
  • Pages: 330
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1800568789
  • ISBN-13: 9781800568785
  • Related categories: Data Science
  • Ships immediately (in stock: 1)

Product Description

Key Features

  • Learn from an AI patent-holding engineering manager with deep experience in Anaconda tools and OSS
  • Get to grips with critical aspects of data science such as bias in datasets and interpretability of models
  • Gain a deeper understanding of the AI/ML landscape through real-world examples and practical analogies

Book Description

You might already know that there's a wealth of data science and machine learning resources available on the market, but what you might not know is how much most of these AI resources leave out. This book not only covers what you need to know about algorithm families but also builds your expertise in critical areas, from avoiding bias in data to model interpretability, which have now become must-have skills.

In this book, you'll learn how using Anaconda as the easy button can give you a complete view of the capabilities of tools such as conda, including how to specify new channels to pull in any package you want and how to discover new open source tools at your disposal. You'll also get a clear picture of how to evaluate which model to train and how to identify when a model has become unusable due to drift. Finally, you'll learn about the powerful yet simple techniques that you can use to explain how your model works.
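As a rough illustration of what identifying drift can look like in practice, the sketch below compares a feature's training-time distribution against newer data using a population stability index (PSI). This is not taken from the book: the function name `population_stability_index`, the bin count, and the synthetic data are all assumptions made for illustration, and a larger score simply suggests that the two samples differ more.

```python
import math
import random


def population_stability_index(expected, actual, bins=10):
    """Compare two samples of one numeric feature; larger values suggest more drift."""
    lo, hi = min(expected), max(expected)
    # Fall back to a width of 1.0 if every expected value is identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Values outside the expected range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # A small floor avoids log(0) when a bin is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    p, q = proportions(expected), proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


# Hypothetical data: one feature at training time, then two later samples.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.5, 1.0) for _ in range(5000)]

psi_same = population_stability_index(training, same)
psi_shifted = population_stability_index(training, shifted)
print(f"PSI, unchanged data: {psi_same:.4f}")
print(f"PSI, shifted data:   {psi_shifted:.4f}")
```

The unchanged sample should score close to zero, while the shifted sample scores noticeably higher. Where to set an alerting threshold is a judgment call that depends on the feature and the model.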

By the end of this book, you'll feel confident using conda and Anaconda Navigator to manage dependencies and gain a thorough understanding of the end-to-end data science workflow.

What you will learn

  • Install packages and create virtual environments using conda
  • Understand the landscape of open source software and assess new tools
  • Use scikit-learn to train and evaluate model approaches
  • Detect types of bias in your data and learn what you can do to prevent them
  • Grow your skillset with tools such as NumPy, pandas, and Jupyter Notebooks
  • Solve common dataset issues, such as imbalanced and missing data
  • Use LIME and SHAP to interpret and explain black-box models
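To make the dataset issues in the list above concrete, here is a minimal sketch of two common remedies: filling missing values with the column mean, and weighting classes inversely to their frequency. It uses only the Python standard library so that it runs anywhere. The helper names `impute_mean` and `balanced_class_weights` and the toy data are hypothetical, and the book itself may take a different approach. The weight formula `n_samples / (n_classes * class_count)` is the same heuristic scikit-learn applies for `class_weight="balanced"`.

```python
from collections import Counter
from statistics import mean


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Hypothetical toy data: a column with gaps and a heavily skewed label.
ages = [34, None, 29, 41, None, 36]
labels = ["ok", "ok", "ok", "ok", "fraud", "ok"]

filled = impute_mean(ages)
weights = balanced_class_weights(labels)
print(filled)
print(weights)
```

Mean imputation is only one option. It can distort a skewed feature, so a median or a model-based fill may suit some columns better.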

Who this book is for

If you're a data analyst or data science professional looking to make the most of Anaconda's capabilities and deepen your understanding of data science workflows, then this book is for you. You don't need any prior experience with Anaconda, but a working knowledge of Python and data science basics is a must.

About the Author

Dan Meador is an Engineering Manager at Anaconda and is the creator of Conda as well as a champion of open source at Anaconda. With a history of engineering and client-facing roles, he has the ability to jump into any position. He has a track record of delivering as both a leader and a follower at companies ranging from the Fortune 10 to startups.

Table of Contents

  1. Understanding the AI/ML Landscape
  2. Analyzing Open Source Software
  3. Using Anaconda Distribution to Manage Packages
  4. Working with Jupyter Notebooks and NumPy
  5. Cleaning and Visualizing Data
  6. Overcoming Bias in AI/ML
  7. Choosing the Best AI Algorithm
  8. Dealing with Common Data Problems
  9. Building a Regression Model with scikit-learn
  10. Explainable AI - Using LIME and SHAP
  11. Tuning Hyperparameters and Versioning Your Model
