Python Feature Engineering Cookbook : Over 70 recipes for creating, engineering, and transforming features to build machine learning models (Paperback)

Galli, Soledad

  • Publisher: Packt Publishing
  • Publication date: 2022-10-31
  • List price: $1,650
  • VIP price: $1,568 (95% of list price)
  • Language: English
  • Pages: 386
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1804611301
  • ISBN-13: 9781804611302
  • Related categories: Python, Machine Learning
  • Ships immediately (stock: 1)

Product Description

Create end-to-end, reproducible feature engineering pipelines that can be deployed into production using open-source Python libraries

Key Features

  • Learn and implement feature engineering best practices
  • Reinforce your learning with the help of multiple hands-on recipes
  • Build end-to-end feature engineering pipelines that are performant and reproducible

Book Description

Feature engineering, the process of transforming variables and creating features, is time-consuming, but it helps ensure that your machine learning models perform well. This second edition of Python Feature Engineering Cookbook takes the struggle out of feature engineering by showing you how to use open-source Python libraries to accelerate the process through practical, hands-on recipes.

This updated edition begins by addressing fundamental data challenges such as missing data and categorical values, before moving on to strategies for dealing with skewed distributions and outliers. The concluding chapters show you how to develop new features from various types of data, including text, time series, and relational databases. With the help of numerous open-source Python libraries, you'll learn how to implement each feature engineering method in a performant, reproducible, and elegant manner.

By the end of this Python book, you will have the tools and expertise needed to confidently build end-to-end and reproducible feature engineering pipelines that can be deployed into production.

What you will learn

  • Impute missing data using various univariate and multivariate methods
  • Encode categorical variables with one-hot, ordinal, and count encoding
  • Handle highly cardinal categorical variables
  • Transform, discretize, and scale your variables
  • Create variables from date and time with pandas and Feature-engine
  • Combine variables into new features
  • Extract features from text as well as from transactional data with Featuretools
  • Create features from time series data with tsfresh
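
As a rough illustration of a few of the techniques listed above (univariate imputation, count encoding, and date features), here is a minimal sketch using plain pandas. It is not taken from the book: the toy data and the column names `age`, `city`, and `signup` are hypothetical, chosen only to make the ideas concrete.

```python
import pandas as pd

# Hypothetical toy data; the column names are made up for illustration.
df = pd.DataFrame(
    {
        "age": [25.0, None, 40.0, 31.0],
        "city": ["Lima", "Oslo", "Lima", "Lima"],
        "signup": ["2022-01-03", "2022-02-14", "2022-02-20", "2022-10-31"],
    }
)

# Univariate imputation: replace missing ages with the column median.
df["age"] = df["age"].fillna(df["age"].median())

# Count encoding: replace each category with the number of times it occurs.
df["city_count"] = df["city"].map(df["city"].value_counts())

# Date features: parse the strings, then extract calendar parts.
signup = pd.to_datetime(df["signup"])
df["signup_month"] = signup.dt.month
df["signup_dayofweek"] = signup.dt.dayofweek  # Monday = 0

print(df)
```

In a real pipeline, the median and the category counts would normally be learned from the training set only and then applied to new data, which is the kind of fit/transform workflow that dedicated libraries are designed to handle.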

Who this book is for

This book is for machine learning and data science students and professionals, as well as software engineers working on machine learning model deployment, who want to learn how to transform their data and create new features to train machine learning models more effectively.

Table of Contents

  1. Imputing Missing Data
  2. Encoding Categorical Variables
  3. Transforming Numerical Variables
  4. Performing Variable Discretization
  5. Working with Outliers
  6. Extracting Features from Date and Time
  7. Performing Feature Scaling
  8. Creating New Features
  9. Extracting Features from Relational Data with Featuretools
  10. Creating Features from Time Series with tsfresh
  11. Extracting Features from Text Variables
