Python Data Analysis: Perform data collection, data processing, wrangling, visualization, and model building using Python, 3/e (Paperback)
Avinash Navlani, Armando Fandango, Ivan Idris
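- Publisher: Packt Publishing
- Publication date: 2021-02-05
- List price: $1,670
- VIP price: 5% off, $1,587
- Language: English
- Pages: 478
- Binding: Quality Paper (also called trade paper)
- ISBN: 1789955246
- ISBN-13: 9781789955248
- Related categories: Python, Programming Languages, Data Science
- Related translations: Python Data Analysis (3rd Edition) (Simplified Chinese edition)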
Product Description
Understand data analysis pipelines using machine learning algorithms and techniques with this practical guide
Key Features
- Prepare and clean your data to use it for exploratory analysis, data manipulation, and data wrangling
- Discover supervised, unsupervised, probabilistic, and Bayesian machine learning methods
- Get to grips with graph processing and sentiment analysis
Book Description
Data analysis enables you to generate value from small and big data by discovering new patterns and trends, and Python is one of the most popular tools for analyzing a wide variety of data. With this book, you'll get up and running using Python for data analysis by exploring the different phases and methodologies used in data analysis and learning how to use modern libraries from the Python ecosystem to create efficient data pipelines.
Starting with the essential statistical and data analysis fundamentals using Python, you'll perform complex data analysis and modeling, data manipulation, data cleaning, and data visualization using easy-to-follow examples. You'll then understand how to conduct time series analysis and signal processing using ARMA models. As you advance, you'll get to grips with smart processing and data analytics using machine learning algorithms such as regression, classification, Principal Component Analysis (PCA), and clustering. In the concluding chapters, you'll work on real-world examples to analyze textual and image data using natural language processing (NLP) and image analytics techniques, respectively. Finally, the book will demonstrate parallel computing using Dask.
By the end of this data analysis book, you'll be equipped with the skills you need to prepare data for analysis and create meaningful data visualizations for forecasting values from data.
What you will learn
- Explore data science and its various process models
- Perform data manipulation using NumPy and pandas for aggregating, cleaning, and handling missing values
- Create interactive visualizations using Matplotlib, Seaborn, and Bokeh
- Retrieve, process, and store data in a wide range of formats
- Understand data preprocessing and feature engineering using pandas and scikit-learn
- Perform time series analysis and signal processing using sunspot cycle data
- Analyze textual data and image data to perform advanced analysis
- Get up to speed with parallel computing using Dask
Who this book is for
This book is for data analysts, business analysts, statisticians, and data scientists looking to learn how to use Python for data analysis. Students and faculty members will also find this book useful for learning and teaching Python data analysis using a hands-on approach. A basic understanding of math and a working knowledge of the Python programming language will help you get started with this book.
About the Authors
Avinash Navlani has over 8 years of experience working in data science and AI. Currently, he is working as a senior data scientist, improving products and services for customers by using advanced analytics, deploying big data analytical tools, creating and maintaining models, and onboarding compelling new datasets. Previously, he was a university lecturer, where he trained and educated people in data science subjects such as Python for analytics, data mining, machine learning, database management, and NoSQL. Avinash has been involved in research activities in data science and has been a keynote speaker at many conferences in India.
Armando Fandango creates AI-empowered products by leveraging his expertise in deep learning, machine learning, distributed computing, and computational methods, and has held thought-leadership roles as chief data scientist and director at start-ups and large enterprises. He has advised high-tech AI-based start-ups. Armando has authored books such as Python Data Analysis - Second Edition and Mastering TensorFlow, both from Packt Publishing. He has also published research in international journals and conferences.
Ivan Idris has an MSc in experimental physics. His graduation thesis had a strong emphasis on applied computer science. After graduating, he worked for several companies as a Java developer, data warehouse developer, and QA analyst. His main professional interests are business intelligence, big data, and cloud computing. He enjoys writing clean, testable code and interesting technical articles. He is the author of NumPy 1.5 Beginner's Guide and NumPy Cookbook, both from Packt Publishing.
Table of Contents
- Getting Started with Python Libraries
- NumPy and Pandas
- Statistics
- Linear Algebra
- Data Visualization
- Retrieving, Processing, and Storing Data
- Cleaning Messy Data
- Signal Processing and Time Series
- Supervised Learning – Regression Analysis
- Supervised Learning – Classification Techniques
- Unsupervised Learning – PCA and Clustering
- Analyzing Textual Data
- Analyzing Image Data
- Parallel Computing using Dask