Principles of Data Wrangling: Practical Techniques for Data Preparation
Tentative Chinese title: 數據整理原則:數據準備的實用技術

Tye Rattenbury, Joseph M. Hellerstein, Jeffrey Heer, Sean Kandel, Connor Carreras

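  • Publisher: O'Reilly
  • Publication date: 2017-08-08
  • List price: $1,560
  • VIP price: $1,482 (5% off)
  • Language: English
  • Pages: 94
  • Binding: Paperback
  • ISBN: 1491938927
  • ISBN-13: 9781491938928
  • Overseas special-order title (requires separate checkout)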

Product Description

A key task that any aspiring data-driven organization needs to learn is data wrangling, the process of converting raw data into something truly useful. This practical guide provides business analysts with an overview of various data wrangling techniques and tools, and puts the practice of data wrangling into context by asking, "What are you trying to do and why?"

Wrangling data consumes roughly 50-80% of an analyst’s time before any kind of analysis is possible. Written by key executives at Trifacta, this book walks you through the wrangling process by exploring several factors (time, granularity, scope, and structure) that you need to consider as you begin to work with data. You’ll gain a shared language and a comprehensive understanding of data wrangling, with an emphasis on recent agile analytic processes used by many of today’s data-driven organizations.

Appreciate the importance—and the satisfaction—of wrangling data the right way.

  • Understand what kind of data is available
  • Choose which data to use and at what level of detail
  • Meaningfully combine multiple sources of data
  • Decide how to distill the results to a size and shape that can drive downstream analysis
