Semi-Supervised Learning (Hardcover)
Chinese title (tentative): 半監督學習(精裝版)
Olivier Chapelle, Bernhard Schölkopf, Alexander Zien
- Publisher: MIT
- Publication date: 2006-09-22
- List price: $890
- Member price: $846 (5% off)
- Language: English
- Pages: 528
- Binding: Hardcover
- ISBN: 0262033585
- ISBN-13: 9780262033589
Related categories:
Artificial Intelligence, Data Science, Machine Learning
Not available for order
Customers who bought this item also bought...
- Bioinformatics: The Machine Learning Approach, 2/e (Hardcover)
- Data Analysis with Open Source Tools (Paperback)
- Java Deep Learning Essentials (深度學習:Java語言實現)
- Learning Deep Architectures for AI (人工智能中的深度結構學習)
Description
In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no labeled data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics. This first comprehensive overview of SSL presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research.
Semi-Supervised Learning first presents the key assumptions and ideas underlying the field: smoothness, cluster or low-density separation, manifold structure, and transduction. The core of the book is the presentation of SSL methods, organized according to algorithmic strategies. After an examination of generative models, the book describes algorithms that implement the low-density separation assumption, graph-based methods, and algorithms that perform two-step learning. The book then discusses SSL applications and offers guidelines for SSL practitioners by analyzing the results of extensive benchmark experiments. Finally, the book looks at interesting directions for SSL research. The book closes with a discussion of the relationship between semi-supervised learning and transduction.
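To make the semi-supervised setting concrete, here is a minimal sketch of graph-based label propagation, one family of SSL methods the book surveys. The toy chain graph, the `propagate_labels` helper, and all parameters below are invented for illustration; they are not taken from the book, which develops these algorithms rigorously.

```python
# A minimal, illustrative sketch of graph-based label propagation.
# The toy graph and helper below are hypothetical, not from the book.
import numpy as np

def propagate_labels(W, y, labeled, n_iter=100):
    """Repeatedly replace each node's score by a weighted average of its
    neighbors' scores, clamping labeled nodes to their known labels.
    W: symmetric nonnegative affinity matrix; y: +1/-1 labels (0 = unknown).
    """
    P = W / W.sum(axis=1, keepdims=True)  # row-normalized transition matrix
    f = y.astype(float).copy()
    for _ in range(n_iter):
        f = P @ f                 # diffuse scores along graph edges
        f[labeled] = y[labeled]   # clamp the labeled nodes
    return np.sign(f)

# A 6-node chain graph; only the two endpoints carry labels (+1 and -1).
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

y = np.array([1, 0, 0, 0, 0, -1])
labeled = y != 0
preds = propagate_labels(W, y, labeled)
print(preds)  # labels spread inward from the two labeled endpoints
```

Clamped propagation of this kind converges to the harmonic solution on the graph: here the chain splits in the middle, with the three nodes nearest each labeled endpoint inheriting that endpoint's label.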
Olivier Chapelle and Alexander Zien are Research Scientists and Bernhard Schölkopf is Professor and Director at the Max Planck Institute for Biological Cybernetics in Tübingen; Schölkopf was also Program Chair of the 2005 NIPS Conference. Schölkopf is coauthor of Learning with Kernels (MIT Press, 2002) and a coeditor of Advances in Kernel Methods: Support Vector Learning (1998), Advances in Large-Margin Classifiers (2000), and Kernel Methods in Computational Biology (2004), all published by The MIT Press.
Table of Contents
Series Foreword xi
Preface xiii
1. Introduction to Semi-Supervised Learning
1.1 Supervised, Unsupervised, and Semi-Supervised Learning 1
1.2 When Can Semi-Supervised Learning Work? 4
1.3 Classes of Algorithms and Organization of This Book 8
I. Generative Models 13
2. A Taxonomy for Semi-Supervised Learning Methods
Matthias Seeger 15
2.1 The Semi-Supervised Learning Problem 15
2.2 Paradigms for Semi-Supervised Learning 17
2.3 Examples 22
2.4 Conclusions 31
3. Semi-Supervised Text Classification Using EM
Kamal Nigam, Andrew McCallum and Tom Mitchell 33
3.1 Introduction 33
3.2 A Generative Model for Text 35
3.3 Experimental Results with Basic EM 41
3.4 Using a More Expressive Generative Model 43
3.5 Overcoming the Challenges of Local Maxima 49
3.6 Conclusions and Summary 54
4. Risks of Semi-Supervised Learning
Fabio Cozman and Ira Cohen 57
4.1 Do Unlabeled Data Improve or Degrade Classification Performance? 57
4.2 Understanding Unlabeled Data: Asymptotic Bias 59
4.3 The Asymptotic Analysis of Generative Semi-Supervised Learning 63
4.4 The Value of Labeled and Unlabeled Data 67
4.5 Finite Sample Effects 69
4.6 Model Search and Robustness 70
4.7 Conclusion 71
5. Probabilistic Semi-Supervised Clustering with Constraints
Sugato Basu, Mikhail Bilenko, Arindam Banerjee and Raymond J. Mooney 73
5.1 Introduction 74
5.2 HMRF Model for Semi-Supervised Clustering 75
5.3 HMRF-KMeans Algorithm 81
5.4 Active Learning for Constraint Acquisition 93
5.5 Experimental Results 96
5.6 Related Work 100
5.7 Conclusions 101
II. Low-Density Separation 103
6. Transductive Support Vector Machines
Thorsten Joachims 105
6.1 Introduction 105
6.2 Transductive Support Vector Machines 108
6.3 Why Use Margin on the Test Set? 111
6.4 Experiments and Applications of TSVMs 112
6.5 Solving the TSVM Optimization Problem 114
6.6 Connection to Related Approaches 116
6.7 Summary and Conclusions 116
7. Semi-Supervised Learning Using Semi-Definite Programming
Tijl De Bie and Nello Cristianini 119
7.1 Relaxing SVM Transduction 119
7.2 An Approximation for Speedup 126
7.3 General Semi-Supervised Learning Settings 128
7.4 Empirical Results 129
7.5 Summary and Outlook 133
Appendix: The Extended Schur Complement Lemma 134
8. Gaussian Processes and the Null-Category Noise Model
Neil D. Lawrence and Michael I. Jordan 137
8.1 Introduction 137
8.2 The Noise Model 141
8.3 Process Model and the Effect of the Null-Category 143
8.4 Posterior Inference and Prediction 145
8.5 Results 147
8.6 Discussion 149
9. Entropy Regularization
Yves Grandvalet and Yoshua Bengio 151
9.1 Introduction 151
9.2 Derivation of the Criterion 152
9.3 Optimization Algorithms 155
9.4 Related Methods 158
9.5 Experiments 160
9.6 Conclusion 166
Appendix: Proof of Theorem 9.1 166
10. Data-Dependent Regularization
Adrian Corduneanu and Tommi S. Jaakkola 169
10.1 Introduction 169
10.2 Information Regularization on Metric Spaces 174
10.3 Information Regularization and Relational Data 182
10.4 Discussion 189
III. Graph-Based Models 191
11. Label Propagation and Quadratic Criterion
Yoshua Bengio, Olivier Delalleau and Nicolas Le Roux 193
11.1 Introduction 193
11.2 Label Propagation on a Similarity Graph 194
11.3 Quadratic Cost Criterion 198
11.4 From Transduction to Induction 205
11.5 Incorporating Class Prior Knowledge 205
11.6 Curse of Dimensionality for Semi-Supervised Learning 206
11.7 Discussion 215
12. The Geometric Basis of Semi-Supervised Learning
Vikas Sindhwani, Misha Belkin and Partha Niyogi 217
12.1 Introduction 217
12.2 Incorporating Geometry in Regularization 220
12.3 Algorithms 224
12.4 Data-Dependent Kernels for Semi-Supervised Learning 229
12.5 Linear Methods for Large-Scale Semi-Supervised Learning 231
12.6 Connections to Other Algorithms and Related Work 232
12.7 Future Directions 234
13. Discrete Regularization
Dengyong Zhou and Bernhard Schölkopf 237
13.1 Introduction 237
13.2 Discrete Analysis 239
13.3 Discrete Regularization 245
13.4 Conclusion 249
14. Semi-Supervised Learning with Conditional Harmonic Mixing
Christopher J. C. Burges and John C. Platt 251
14.1 Introduction 251
14.2 Conditional Harmonic Mixing 255
14.3 Learning in CHM Models 256
14.4 Incorporating Prior Knowledge 261
14.5 Learning the Conditionals 261
14.6 Model Averaging 262
14.7 Experiments 263
14.8 Conclusions 273
IV. Change of Representation 275
15. Graph Kernels by Spectral Transforms
Xiaojin Zhu, Jaz Kandola, John Lafferty and Zoubin Ghahramani 277
15.1 The Graph Laplacian 278
15.2 Kernels by Spectral Transforms 280
15.3 Kernel Alignment 281
15.4 Optimizing Alignment Using QCQP for Semi-Supervised Learning 282
15.5 Semi-Supervised Kernels with Order Constraints 283
15.6 Experimental Results 285
15.7 Conclusion 289
16. Spectral Methods for Dimensionality Reduction
Lawrence K. Saul, Kilian Weinberger, Fei Sha and Jihun Ham 293
16.1 Introduction 293
16.2 Linear Methods 295
16.3 Graph-Based Methods 297
16.4 Kernel Methods 303
16.5 Discussion 306
17. Modifying Distances
Alon Orlitsky and Sajama 309
17.1 Introduction 309
17.2 Estimating DBD Metrics 312
17.3 Computing DBD Metrics 321
17.4 Semi-Supervised Learning Using Density-Based Metrics 327
17.5 Conclusions and Future Work 329
V. Semi-Supervised Learning in Practice 331
18. Large-Scale Algorithms
Olivier Delalleau, Yoshua Bengio and Nicolas Le Roux 333
18.1 Introduction 333
18.2 Cost Approximations 334
18.3 Subset Selection 337
18.4 Discussion 340
19. Semi-Supervised Protein Classification Using Cluster Kernels
Jason Weston, Christina Leslie, Eugene Ie and William S. Noble 343
19.1 Introduction 343
19.2 Representation and Kernels for Protein Sequences 345
19.3 Semi-Supervised Kernels for Protein Sequences 348
19.4 Experiments 352
19.5 Discussion 358
20. Prediction of Protein Function from Networks
Hyunjung Shin and Koji Tsuda 361
20.1 Introduction 361
20.2 Graph-Based Semi-Supervised Learning 364
20.3 Combining Multiple Graphs 366
20.4 Experiments on Function Prediction of Proteins 369
20.5 Conclusion and Outlook 374
21. Analysis of Benchmarks 377
21.1 The Benchmark 377
21.2 Application of SSL Methods 383
21.3 Results and Discussion 390
VI. Perspectives 395
22. An Augmented PAC Model for Semi-Supervised Learning
Maria-Florina Balcan and Avrim Blum 397
22.1 Introduction 398
22.2 A Formal Framework 400
22.3 Sample Complexity Results 403
22.4 Algorithmic Results 412
22.5 Related Models and Discussion 416
23. Metric-Based Approaches for Semi-Supervised Regression and Classification
Dale Schuurmans, Finnegan Southey, Dana Wilkinson and Yuhong Guo 421
23.1 Introduction 421
23.2 Metric Structure of Supervised Learning 423
23.3 Model Selection 426
23.4 Regularization 436
23.5 Classification 445
23.6 Conclusion 449
24. Transductive Inference and Semi-Supervised Learning
Vladimir Vapnik 453
24.1 Problem Settings 453
24.2 Problem of Generalization in Inductive and Transductive Inference 455
24.3 Structure of the VC Bounds and Transductive Inference 457
24.4 The Symmetrization Lemma and Transductive Inference 458
24.5 Bounds for Transductive Inference 459
24.6 The Structural Risk Minimization Principle for Induction and Transduction 460
24.7 Combinatorics in Transductive Inference 462
24.8 Measures of Size of Equivalence Classes 463
24.9 Algorithms for Inductive and Transductive SVMs 465
24.10 Semi-Supervised Learning 470
24.11 Conclusion: Transductive Inference and the New Problems of Inference 470
24.12 Beyond Transduction: Selective Inference 471
25. A Discussion of Semi-Supervised Learning and Transduction 473
References 479
Notation and Symbols 499
Contributors 503
Index 509