Machine Learning for Tabular Data: XGBoost, Deep Learning, and AI

Mark Ryan, Luca Massaron

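  • Publisher: Manning
  • Publication date: 2025-03-25
  • List price: $2,340
  • VIP price: $2,223 (5% off)
  • Language: English
  • Pages: 504
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 1633438546
  • ISBN-13: 9781633438545
  • Related categories: Artificial Intelligence, Machine Learning, Deep Learning
  • Overseas special-order title (must be checked out separately)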

Product description

Business runs on tabular data in databases, spreadsheets, and logs. Crunch that data using deep learning, gradient boosting, and other machine learning techniques.

Machine Learning for Tabular Data teaches you to train insightful machine learning models on common tabular business data sources such as spreadsheets, databases, and logs. You'll discover how to use XGBoost and LightGBM on tabular data, optimize deep learning libraries like TensorFlow and PyTorch for tabular data, and use cloud tools like Vertex AI to create an automated MLOps pipeline.

Machine Learning for Tabular Data will teach you how to:

- Pick the right machine learning approach for your data
- Apply deep learning to tabular data
- Deploy tabular machine learning locally and in the cloud
- Build pipelines to automatically train and maintain a model

Machine Learning for Tabular Data covers classic machine learning techniques like gradient boosting, and more contemporary deep learning approaches. By the time you're finished, you'll be equipped with the skills to apply machine learning to the kinds of data you work with every day.

Foreword by Antonio Gulli.

Purchase of the print book includes a free eBook in PDF and ePub formats from Manning Publications.

About the technology

Machine learning can accelerate everyday business chores like account reconciliation, demand forecasting, and customer service automation--not to mention more exotic challenges like fraud detection, predictive maintenance, and personalized marketing. This book shows you how to unlock the vital information stored in spreadsheets, ledgers, databases and other tabular data sources using gradient boosting, deep learning, and generative AI.

About the book

Machine Learning for Tabular Data delivers practical ML techniques to upgrade every stage of the business data analysis pipeline. In it, you'll explore examples like using XGBoost and Keras to predict short-term rental prices, deploying a local ML model with Python and Flask, and streamlining workflows using large language models (LLMs). Along the way, you'll learn to make your models both more powerful and more explainable.

What's inside

- Master XGBoost
- Apply deep learning to tabular data
- Deploy models locally and in the cloud
- Build pipelines to train and maintain models

About the reader

For readers experienced with Python and the basics of machine learning.

About the authors

Mark Ryan is the AI Lead of the Developer Knowledge Platform at Google. A three-time Kaggle Grandmaster, Luca Massaron is a Google Developer Expert (GDE) in machine learning and AI. He has published 17 other books.

Table of Contents

Part 1
1 Understanding tabular data
2 Exploring tabular datasets
3 Machine learning vs. deep learning
Part 2
4 Classical algorithms for tabular data
5 Decision trees and gradient boosting
6 Advanced feature processing methods
7 An end-to-end example using XGBoost
Part 3
8 Getting started with deep learning with tabular data
9 Deep learning best practices
10 Model deployment
11 Building a machine learning pipeline
12 Blending gradient boosting and deep learning
A Hyperparameters for classical machine learning models
B K-nearest neighbors and support vector machines

Author biographies

Mark Ryan is a Data Science Manager at Intact Insurance. He holds a Master's degree in Computer Science from the University of Toronto.

Luca Massaron is a data scientist with more than a decade of experience in transforming data into smarter artifacts, solving real-world problems, and generating value for businesses and stakeholders. He is the author of bestselling books on AI, machine learning, and algorithms.
