Practical Data Analysis, 2/e (Paperback)
Tentative Chinese title: 實用數據分析 (第二版)

Hector Cuesta, Dr. Sampath Kumar

Product Description

Key Features

  • Learn to use various data analysis tools and algorithms to classify, cluster, visualize, simulate, and forecast your data
  • Apply Machine Learning algorithms to different kinds of data such as social networks, time series, and images
  • A hands-on guide to understanding the nature of data and how to turn it into insight

Book Description

Beyond buzzwords like Big Data or Data Science, there are great opportunities to innovate in many businesses by using data analysis to build data-driven products. Data analysis involves asking many questions about data in order to discover insights and generate value for a product or a service.

This book explains the basic data algorithms without the theoretical jargon, and you'll get hands-on experience turning data into insights using machine learning techniques. We will work through data-driven processing of several types of data, such as text, images, social network graphs, documents, and time series, and show you how to implement large-scale data processing with MongoDB and Apache Spark.

What you will learn

  • Acquire, format, and visualize your data
  • Build an image-similarity search engine
  • Generate meaningful visualizations anyone can understand
  • Get started with analyzing social network graphs
  • Find out how to implement sentiment text analysis
  • Install data analysis tools such as Pandas, MongoDB, and Apache Spark
  • Get to grips with Apache Spark
  • Implement machine learning algorithms such as classification or forecasting

About the Author

Hector Cuesta is founder and Chief Data Scientist at Dataxios, a machine intelligence research company. He holds a BA in Informatics and an M.Sc. in Computer Science. He provides consulting services for data-driven product design, with experience in a variety of industries including financial services, retail, fintech, e-learning, and human resources. He is a robotics enthusiast in his spare time.

Dr. Sampath Kumar works as an assistant professor and head of the Department of Applied Statistics at Telangana University. He has completed an M.Sc., an M.Phil., and a Ph.D. in statistics. He has five years of experience teaching postgraduate courses and more than four years of experience in the corporate sector. His expertise is in statistical data analysis using SPSS, SAS, R, Minitab, MATLAB, and so on. He is an advanced programmer in SAS and MATLAB. He has taught various applied and pure statistics subjects to M.Sc. students, such as forecasting models, applied regression analysis, multivariate data analysis, and operations research. He is currently supervising Ph.D. scholars.

Table of Contents

  1. Getting Started
  2. Preprocessing Data
  3. Getting to Grips with Visualization
  4. Text Classification
  5. Similarity-Based Image Retrieval
  6. Simulation of Stock Prices
  7. Predicting Gold Prices
  8. Working with Support Vector Machines
  9. Modeling Infectious Diseases with Cellular Automata
  10. Working with Social Graphs
  11. Working with Twitter Data
  12. Data Processing and Aggregation with MongoDB
  13. Working with MapReduce
  14. Online Data Analysis with Jupyter and Wakari
  15. Understanding Data Processing using Apache Spark
