Python Data Science Essentials - Second Edition

Alberto Boschetti, Luca Massaron

Product Description

Key Features

  • Quickly get familiar with data science using Python 3.5
  • Save time (and effort) with all the essential tools explained
  • Create effective data science projects and avoid common pitfalls with the help of examples and hints dictated by experience

Book Description

Fully expanded and upgraded, the second edition of Python Data Science Essentials takes you through all you need to know to succeed in data science using Python. Get modern insight into the core of Python data, including the latest versions of Jupyter notebooks, NumPy, pandas, and scikit-learn. Look beyond the fundamentals with beautiful data visualizations using Seaborn and ggplot, web development with Bottle, and even the new frontiers of deep learning with Theano and TensorFlow.

Dive into building your essential Python 3.5 data science toolbox, using a single-source approach that will allow you to work with Python 2.7 as well. Get to grips fast with data munging and preprocessing, and all the techniques you need to load, analyse, and process your data. Finally, get a complete overview of the principal machine learning algorithms, graph analysis techniques, and all the visualization and deployment instruments that make it easier to present your results to an audience of both data science experts and business users.

What you will learn

  • Set up your data science toolbox using a Python scientific environment on Windows, Mac, and Linux
  • Get data ready for your data science project
  • Manipulate, fix, and explore data in order to solve data science problems
  • Set up an experimental pipeline to test your data science hypotheses
  • Choose the most effective and scalable learning algorithm for your data science tasks
  • Optimize your machine learning models to get the best performance
  • Explore and cluster graphs, taking advantage of interconnections and links in your data

About the Author

Alberto Boschetti is a data scientist with expertise in signal processing and statistics. He holds a PhD in telecommunication engineering and currently lives and works in London. In his work projects, he faces challenges ranging from natural language processing (NLP), behavioral analysis, and machine learning to distributed processing. He is very passionate about his job and keeps up to date with the latest developments in data science technologies by attending meetups, conferences, and other events.

Luca Massaron is a data scientist and marketing research director specializing in multivariate statistical analysis, machine learning, and customer insight, with over a decade of experience in solving real-world problems and generating value for stakeholders by applying reasoning, statistics, data mining, and algorithms. From being a pioneer of web audience analysis in Italy to achieving the rank of a top ten Kaggler, he has always been very passionate about every aspect of data and its analysis, and also about demonstrating the potential of data-driven knowledge discovery to both experts and non-experts. Favoring simplicity over unnecessary sophistication, Luca believes that a lot can be achieved in data science just by doing the essentials.

Table of Contents

  1. First Steps
  2. Data Munging
  3. The Data Pipeline
  4. Machine Learning
  5. Social Network Analysis
  6. Visualization, Insights, and Results
  7. Strengthen Your Python Foundations
