Agile Data Science: Building Data Analytics Applications with Hadoop (Paperback)

Russell Jurney




Mining big data requires a deep investment in people and time. How can you be sure you’re building the right models? With this hands-on book, you’ll learn a flexible toolset and methodology for building effective analytics applications with Hadoop.

Using lightweight tools such as Python, Apache Pig, and the D3.js library, your team will create an agile environment for exploring data, starting with an example application to mine your own email inboxes. You’ll learn an iterative approach that enables you to quickly change the kind of analysis you’re doing, depending on what the data is telling you. All example code in this book is available as working Heroku apps.

  • Create analytics applications by using the agile big data development methodology
  • Build value from your data in a series of agile sprints, using the data-value stack
  • Gain insight by using several data structures to extract multiple features from a single dataset
  • Visualize data with charts, and expose different aspects through interactive reports
  • Use historical data to predict the future, and translate predictions into action
  • Get feedback from users after each sprint to keep your project on track



