Python 機器學習手冊:從數據預處理到深度學習 (Machine Learning with Python Cookbook: Practical Solutions from Preprocessing to Deep Learning)
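By Chris Albon; translated by 韓慧昌 (Han Huichang), 林然 (Lin Ran), and 徐江 (Xu Jiang)
- Publisher: 電子工業 (Publishing House of Electronics Industry)
- Publication date: 2019-07-01
- List price: $534
- Sale price: $299 (56% of list price)
- Language: Simplified Chinese
- ISBN: 7121369621
- ISBN-13: 9787121369629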
- Related categories: Machine Learning, Deep Learning
- Translated from: Machine Learning with Python Cookbook: Practical Solutions from Preprocessing to Deep Learning
- Related translation: Python 機器學習錦囊妙計 (Machine Learning with Python Cookbook) (Traditional Chinese edition)
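Product Description
This book takes a task-based approach to using Python for machine learning. It contains nearly 200 self-contained solutions, each with code that readers can copy and paste into their own programs. The solutions target the most common tasks that data scientists and machine learning engineers encounter when building models, ranging from the simplest matrix and vector operations to feature engineering and building neural networks. It is not an introduction to machine learning. It is a desk reference for readers who already know the theory and concepts of machine learning and who can borrow its code to quickly solve challenges in day-to-day machine learning development.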
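About the Translators
韓慧昌 (Han Huichang) graduated from the University of Science and Technology Beijing and is a senior consultant at ThoughtWorks with experience on AI projects at several large enterprises.
林然 (Lin Ran) has more than six years of development experience, including more than four years of Python development, and has applied machine learning algorithms in aviation, retail, logistics, automotive, telecommunications, and other industries.
徐江 (Xu Jiang) graduated from KTH Royal Institute of Technology in Sweden with a degree in systems biology and previously worked at Thoughtworks Software Technology Co., Ltd.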
Table of Contents
Chapter 1: Vectors, Matrices, and Arrays
1.0 Introduction
1.1 Creating a Vector
1.2 Creating a Matrix
1.3 Creating a Sparse Matrix
1.4 Selecting Elements
1.5 Describing the Properties of a Matrix
1.6 Applying an Operation to Multiple Elements at Once
1.7 Finding the Maximum and Minimum Values
1.8 Calculating the Mean, Variance, and Standard Deviation
1.9 Reshaping a Matrix
1.10 Transposing a Vector or Matrix
1.11 Flattening a Matrix
1.12 Calculating the Rank of a Matrix
1.13 Calculating the Determinant
1.14 Getting the Diagonal Elements of a Matrix
1.15 Calculating the Trace of a Matrix
1.16 Calculating Eigenvalues and Eigenvectors
1.17 Calculating the Dot Product
1.18 Adding or Subtracting Matrices
1.19 Multiplying Matrices
1.20 Calculating the Inverse of a Matrix
1.21 Generating Random Numbers
Chapter 2: Loading Data
2.0 Introduction
2.1 Loading a Sample Dataset
2.2 Creating a Simulated Dataset
2.3 Loading a CSV File
2.4 Loading an Excel File
2.5 Loading a JSON File
2.6 Querying a SQL Database
Chapter 3: Data Wrangling
3.0 Introduction
3.1 Creating a Data Frame
3.2 Describing the Data
3.3 Navigating a Data Frame
3.4 Selecting Rows Based on Conditionals
3.5 Replacing Values
3.6 Renaming Columns
3.7 Calculating the Minimum, Maximum, Sum, Mean, and Count
3.8 Finding Unique Values
3.9 Handling Missing Values
3.10 Deleting a Column
3.11 Deleting a Row
3.12 Dropping Duplicate Rows
3.13 Grouping Rows by Values
3.14 Grouping Rows by Time Period
3.15 Looping Over a Column
3.16 Applying a Function to All Elements in a Column
3.17 Applying a Function to All Groups
3.18 Concatenating Multiple Data Frames
3.19 Merging Two Data Frames
Chapter 4: Handling Numerical Data
4.0 Introduction
4.1 Rescaling Features
4.2 Standardizing Features
4.3 Normalizing Observations
4.4 Generating Polynomial and Interaction Features
4.5 Transforming Features
4.6 Detecting Outliers
4.7 Handling Outliers
4.8 Discretizing Features
4.9 Grouping Observations Using Clustering
4.10 Deleting Observations with Missing Values
4.11 Imputing Missing Values
Chapter 5: Handling Categorical Data
5.0 Introduction
5.1 Encoding Nominal Categorical Features
5.2 Encoding Ordinal Categorical Features
5.3 Encoding Dictionaries of Features
5.4 Imputing Missing Class Values
5.5 Handling Imbalanced Classes
Chapter 6: Handling Text
6.0 Introduction
6.1 Cleaning Text
6.2 Parsing and Cleaning HTML
6.3 Removing Punctuation
6.4 Tokenizing Text
6.5 Removing Stop Words
6.6 Stemming Words
6.7 Tagging Parts of Speech
6.8 Encoding Text as a Bag of Words
6.9 Weighting Words by Importance
Chapter 7: Handling Dates and Times
7.0 Introduction
7.1 Converting Strings to Dates
7.2 Handling Time Zones
7.3 Selecting Dates and Times
7.4 Breaking Up Date Data into Multiple Features
7.5 Calculating the Difference Between Two Dates
7.6 Encoding Days of the Week
7.7 Creating a Lagged Feature
7.8 Using Rolling Time Windows
7.9 Handling Missing Values in Time Series
Chapter 8: Image Processing
8.0 Introduction
8.1 Loading Images
8.2 Saving Images
8.3 Resizing Images
8.4 Cropping Images
8.5 Smoothing Images
8.6 Sharpening Images
8.7 Enhancing Contrast
8.8 Isolating Colors
8.9 Binarizing Images
8.10 Removing Backgrounds
8.11 Detecting Edges
8.12 Detecting Corners
8.13 Creating Features for Machine Learning
8.14 Encoding Mean Color as a Feature
8.15 Encoding Color Histograms as Features
Chapter 9: Dimensionality Reduction Using Feature Extraction
9.0 Introduction
9.1 Reducing Features Using Principal Components
9.2 Reducing Features When Data Is Linearly Inseparable
9.3 Reducing Features by Maximizing Class Separability
9.4 Reducing Features Using Matrix Factorization
9.5 Reducing Features on Sparse Data
Chapter 10: Dimensionality Reduction Using Feature Selection
10.0 Introduction
10.1 Thresholding Numerical Feature Variance
10.2 Thresholding Binary Feature Variance
10.3 Handling Highly Correlated Features
10.4 Removing Features Irrelevant to the Classification Task
10.5 Recursively Eliminating Features
Chapter 11: Model Evaluation
11.0 Introduction
11.1 Cross-Validating Models
11.2 Creating a Baseline Regression Model
11.3 Creating a Baseline Classification Model
11.4 Evaluating Binary Classifiers
11.5 Evaluating Binary Classifier Thresholds
11.6 Evaluating Multiclass Classifiers
11.7 Visualizing Classifier Performance
11.8 Evaluating Regression Models
11.9 Evaluating Clustering Models
11.10 Creating a Custom Evaluation Metric
11.11 Visualizing the Effect of Training Set Size
11.12 Generating a Report of Evaluation Metrics
11.13 Visualizing the Effect of Hyperparameter Values
Chapter 12: Model Selection
12.0 Introduction
12.1 Selecting the Best Model Using Exhaustive Search
12.2 Selecting the Best Model Using Randomized Search
12.3 Selecting the Best Model from Multiple Learning Algorithms
12.4 Including Data Preprocessing in Model Selection
12.5 Speeding Up Model Selection with Parallelization
12.6 Speeding Up Model Selection Using Algorithm-Specific Methods
12.7 Evaluating Performance After Model Selection
Chapter 13: Linear Regression
13.0 Introduction
13.1 Fitting a Line
13.2 Handling Interaction Effects Between Features
13.3 Fitting a Nonlinear Relationship
13.4 Reducing Variance with Regularization
13.5 Reducing Features with Lasso Regression
Chapter 14: Trees and Forests
14.0 Introduction
14.1 Training a Decision Tree Classifier
14.2 Training a Decision Tree Regression Model
14.3 Visualizing a Decision Tree Model
14.4 Training a Random Forest Classifier
14.5 Training a Random Forest Regression Model
14.6 Identifying Important Features in Random Forests
14.7 Selecting Important Features in Random Forests
14.8 Handling Imbalanced Classes
14.9 Controlling Decision Tree Size
14.10 Improving Performance Through Boosting
14.11 Evaluating Random Forests with Out-of-Bag Error
Chapter 15: KNN
15.0 Introduction
15.1 Finding an Observation's Nearest Neighbors
15.2 Creating a KNN Classifier
15.3 Identifying the Best Neighborhood Size
15.4 Creating a Radius-Based Nearest Neighbor Classifier
Chapter 16: Logistic Regression
16.0 Introduction
16.1 Training a Binary Classifier
16.2 Training a Multiclass Classifier
16.3 Reducing Variance Through Regularization
16.4 Training a Classifier on Very Large Datasets
16.5 Handling Imbalanced Classes
Chapter 17: Support Vector Machines
17.0 Introduction
17.1 Training a Linear Classifier
17.2 Handling Linearly Inseparable Data Using Kernels
17.3 Calculating Predicted Class Probabilities
17.4 Identifying Support Vectors
17.5 Handling Imbalanced Classes
Chapter 18: Naive Bayes
18.0 Introduction
18.1 Training a Classifier for Continuous Data
18.2 Training a Classifier for Discrete and Count Data
18.3 Training a Naive Bayes Classifier for Data with Binary Features
18.4 Calibrating Predicted Probabilities
Chapter 19: Clustering
19.0 Introduction
19.1 Using the K-Means Clustering Algorithm
19.2 Speeding Up K-Means Clustering
19.3 Using the Meanshift Clustering Algorithm
19.4 Using the DBSCAN Clustering Algorithm
19.5 Using the Hierarchical Agglomerative Clustering Algorithm
Chapter 20: Neural Networks
20.0 Introduction
20.1 Preprocessing Data for Neural Networks
20.2 Designing a Neural Network
20.3 Training a Binary Classifier
20.4 Training a Multiclass Classifier
20.5 Training a Regression Model
20.6 Making Predictions
20.7 Visualizing Training History
20.8 Reducing Overfitting with Weight Regularization
20.9 Reducing Overfitting with Early Stopping
20.10 Reducing Overfitting with Dropout
20.11 Saving Model Training Progress
20.12 Evaluating Neural Networks with k-Fold Cross-Validation
20.13 Tuning Neural Networks
20.14 Visualizing Neural Networks
20.15 Classifying Images
20.16 Improving Convolutional Neural Network Performance with Image Augmentation
20.17 Classifying Text
Chapter 21: Saving and Loading Trained Models
21.0 Introduction
21.1 Saving and Loading a scikit-learn Model
21.2 Saving and Loading a Keras Model