The Data Science Workshop - Second Edition: Learn how you can build machine learning models and create your own real-world data science projects
Anthony So, Thomas V. Joseph, Robert Thas John
- Publisher: Packt Publishing
- Publication date: 2020-08-28
- List price: $1,920
- VIP price: 5% off, $1,824
- Language: English
- Pages: 824
- Binding: Quality Paper (also called trade paper)
- ISBN: 1800566921
- ISBN-13: 9781800566927
- Related categories: Machine Learning, Data Science
Overseas special-order book (requires separate checkout)
Product Description
Key Features
- Gain a full understanding of the model production and deployment process
- Build your first machine learning model in just five minutes and gain hands-on machine learning experience
- Understand how to deal with common challenges in data science projects
Book Description
Where there's data, there's insight. With so much data being generated, there is immense scope to extract meaningful information that'll boost business productivity and profitability. By learning to convert raw data into game-changing insights, you'll open new career paths and opportunities.
The Data Science Workshop begins by introducing different types of projects and showing you how to incorporate machine learning algorithms in them. You'll learn to select a relevant metric and even assess the performance of your model. To tune the hyperparameters of an algorithm and improve its accuracy, you'll get hands-on with approaches such as grid search and random search.
Next, you'll learn dimensionality reduction techniques for handling many variables at once, before exploring how to use model ensembling techniques and create new features to enhance model performance. To help you automatically create new features that improve your model, the book demonstrates how to use an automated feature engineering tool. You'll also learn how to use an orchestration and scheduling workflow to deploy machine learning models in batch.
By the end of this book, you'll have the skills to start working on data science projects confidently.
What you will learn
- Explore the key differences between supervised learning and unsupervised learning
- Manipulate and analyze data using the scikit-learn and pandas libraries
- Understand key concepts such as regression, classification, and clustering
- Discover advanced techniques to improve the accuracy of your model
- Understand how to speed up the process of adding new features
- Simplify your machine learning workflow for production
Who this book is for
This is one of the most useful data science books for aspiring data analysts, data scientists, database engineers, and business analysts. It is aimed at those who want to kick-start their careers in data science by quickly learning data science techniques without going through all the mathematics behind machine learning algorithms. Basic knowledge of the Python programming language will help you easily grasp the concepts explained in this book.
Table of Contents
- Introduction to Data Science in Python
- Regression
- Binary Classification
- Multiclass Classification with RandomForest
- Performing Your First Cluster Analysis
- How to Assess Performance
- The Generalization of Machine Learning Models
- Hyperparameter Tuning
- Interpreting a Machine Learning Model
- Analyzing a Dataset
- Data Preparation
- Feature Engineering
- Imbalanced Datasets
- Dimensionality Reduction
- Ensemble Learning