Bad Data Handbook: Cleaning Up The Data So You Can Get Back To Work (Paperback)

Q. Ethan McCallum

Product Description

What is bad data? Some people consider it a technical phenomenon, like missing values or malformed records, but bad data includes a lot more. In this handbook, data expert Q. Ethan McCallum has gathered 19 colleagues from every corner of the data arena to reveal how they’ve recovered from nasty data problems.

From cranky storage to poor representation to misguided policy, there are many paths to bad data. Bottom line? Bad data is data that gets in the way. This book explains effective ways to get around it.

Among the many topics covered, you’ll discover how to:

  • Test drive your data to see if it’s ready for analysis
  • Work spreadsheet data into a usable form
  • Handle encoding problems that lurk in text data
  • Develop a successful web-scraping effort
  • Use NLP tools to reveal the real sentiment of online reviews
  • Address cloud computing issues that can impact your analysis effort
  • Avoid policies that create data analysis roadblocks
  • Take a systematic approach to data quality analysis

