Data Mining Methods and Models

Daniel T. Larose

  • Publisher: Wiley
  • Publication date: 2006-01-30
  • Price: $980
  • Language: English
  • Pages: 344
  • Binding: Hardcover
  • ISBN: 0471666564
  • ISBN-13: 9780471666561
  • Related categories: Data-mining

Description  

Apply Powerful Data Mining Methods and Models to Leverage Your Data for Actionable Results

Data Mining Methods and Models provides:

  • The latest techniques for uncovering hidden nuggets of information
  • The insight into how the data mining algorithms actually work
  • The hands-on experience of performing data mining on large data sets

Data Mining Methods and Models:

  • Applies a "white box" methodology, emphasizing an understanding of the model structures underlying the software
  • Walks the reader through the various algorithms and provides examples of the operation of the algorithms on actual large data sets, including a detailed case study, "Modeling Response to Direct-Mail Marketing"
  • Tests the reader's level of understanding of the concepts and methodologies, with over 110 chapter exercises
  • Demonstrates the Clementine data mining software suite, WEKA open source data mining software, SPSS statistical software, and Minitab statistical software
  • Includes a companion Web site, www.dataminingconsultant.com, where the data sets used in the book may be downloaded, along with a comprehensive set of data mining resources. Faculty adopters of the book have access to an array of helpful resources, including solutions to all exercises, a PowerPoint® presentation of each chapter, sample data mining course projects and accompanying data sets, and multiple-choice chapter quizzes.

With its emphasis on learning by doing, this is an excellent textbook for students in business, computer science, and statistics, as well as a problem-solving reference for data analysts and professionals in the field.

 

Table of Contents

Preface.

1. Dimension Reduction Methods.

Need for Dimension Reduction in Data Mining.

Principal Components Analysis.

Factor Analysis.

User-Defined Composites.

2. Regression Modeling.

Example of Simple Linear Regression.

Least-Squares Estimates.

Coefficient of Determination.

Correlation Coefficient.

The ANOVA Table.

Outliers, High Leverage Points, and Influential Observations.

The Regression Model.

Inference in Regression.

Verifying the Regression Assumptions.

An Example: The Baseball Data Set.

An Example: The California Data Set.

Transformations to Achieve Linearity.

3. Multiple Regression and Model Building.

An Example of Multiple Regression.

The Multiple Regression Model.

Inference in Multiple Regression.

Regression with Categorical Predictors.

Multicollinearity.

Variable Selection Methods.

An Application of Variable Selection Methods.

Mallows’ Cp Statistic.

Variable Selection Criteria.

Using the Principal Components as Predictors in Multiple Regression.

4. Logistic Regression.

A Simple Example of Logistic Regression.

Maximum Likelihood Estimation.

Interpreting Logistic Regression Output.

Inference: Are the Predictors Significant?

Interpreting the Logistic Regression Model.

Interpreting a Logistic Regression Model for a Dichotomous Predictor.

Interpreting a Logistic Regression Model for a Polychotomous Predictor.

Interpreting a Logistic Regression Model for a Continuous Predictor.

The Assumption of Linearity.

The Zero-Cell Problem.

Multiple Logistic Regression.

Introducing Higher-Order Terms to Handle Non-Linearity.

Validating the Logistic Regression Model.

WEKA: Hands-On Analysis Using Logistic Regression.

5. Naïve Bayes and Bayesian Networks.

The Bayesian Approach.

The Maximum a Posteriori (MAP) Classification.

The Posterior Odds Ratio.

Balancing the Data.

Naïve Bayes Classification.

Numeric Predictors for Naïve Bayes Classification.

WEKA: Hands-On Analysis Using Naïve Bayes.

Bayesian Belief Networks.

Using the Bayesian Network to Find Probabilities.

WEKA: Hands-On Analysis Using Bayes Net.

6. Genetic Algorithms.

Introduction to Genetic Algorithms.

The Basic Framework of a Genetic Algorithm.

A Simple Example of Genetic Algorithms at Work.

Modifications and Enhancements: Selection.

Modifications and Enhancements: Crossover.

Genetic Algorithms for Real-Valued Variables.

Using Genetic Algorithms to Train a Neural Network.

WEKA: Hands-On Analysis Using Genetic Algorithms.

7. Case Study: Modeling Response to Direct-Mail Marketing.

The Cross-Industry Standard Process for Data Mining: CRISP-DM.

Business Understanding Phase.

Data Understanding and Data Preparation Phases.

The Modeling Phase and the Evaluation Phase.

Index.
