Thoughtful Data Science: Working with data by creating visually intuitive insights with Jupyter and Pixiedust

David Taieb

  • Publisher: Packt Publishing
  • Publication date: 2018-07-30
  • List price: $1,540
  • Sale price: $1,232 (20% off)
  • Language: English
  • Pages: 490
  • Binding: Paperback
  • ISBN: 178883996X
  • ISBN-13: 9781788839969
  • Related category: Data Science
  • Ships immediately (stock < 3)

Product Description

Approaching the practice of data science by scripting your own data pipeline and dashboards

Key Features

  • David teaches how to build a new data pipeline using Pixiedust
  • How to get the most out of Jupyter notebooks
  • Think about the data and their visualisations, before worrying about the algorithms

Book Description

Data science has become the one scientific endeavor that every business has to contend with today. We need to learn why data algorithms work but, even more importantly, we need to be able to create new insights from our data that we can actually work with. The why is addressed in many publications today. What is harder is creating insights without the data scientist looking like a mountebank who writes opaque notebook code before getting to the visually compelling parts of data science. The data science process itself has to be transparent, easy to understand, and straightforward to optimise.

David Taieb created Pixiedust in Python to teach non-data scientists to use Jupyter notebooks without having to slog through the considerable amount of Jupyter code otherwise required to create simple, and sometimes not-so-simple, insights into data. Pixiedust can be used by writing just a few lines of HTML and CSS, while retaining the ability to drop in or remove algorithms and visualisation options, adjust the data pipeline to the requirements posed by the data, or just get some very quick results. The case studies form a carefully graded ladder of progress, ranging from data mined from social media to geo-analytical data helpful in business decision making.

It is, however, possible to use both Python and Scala to add features to the Pixiedust data pipeline, and ultimately, to bring the power of the Spark big data framework to the data scientist.

What you will learn

  • How to write basic Pixiedust dashboards
  • Building your own data pipelines without writing connecting pipeline code
  • Learn how to use Jupyter notebooks without the pain
  • Create compelling data visualisations in Pixiedust
  • Write applications running on Spark, without writing Spark code

Who This Book Is For

To produce a functioning Pixiedust dashboard, only a modicum of HTML and CSS is required. Fluency in data interpretation and visualization is also necessary, since this book is addressed to data professionals, e.g. business and general data analysts. The later chapters also have much to offer to the budding data scientist, and to developers on a path to becoming data scientists, since they get to play with Python code running in Jupyter notebooks.
