Parallel Programming (並行程序設計)
Liu Yi (劉軼), Yang Hailong (楊海龍)
Table of Contents
Chapter 1 Overview of Parallel Programming
1.1 Overview of Parallelism
1.2 How to Measure Computing Speed
1.3 Fundamentals of Parallel Computing Systems
1.3.1 Flynn's Taxonomy
1.3.2 Shared-Memory Systems and Message-Passing Systems
1.3.3 Several Common Parallel Computing Systems
1.3.4 Interconnection Networks
1.3.5 Multilevel Memory Hierarchy
1.4 Classification of Parallel Programming Languages and Interfaces
1.5 Floating-Point Formats
1.6 Example Programs
1.6.1 Matrix Multiplication
1.6.2 Reduction and Scan
1.7 Summary
Exercises
Chapter 2 Parallel Programming for Shared-Memory Systems
2.1 Parallel Models in Shared-Memory Systems
2.1.1 Overview of Multithreaded Parallelism
2.1.2 Concepts of Synchronization and Mutual Exclusion
2.2 OpenMP Programming
2.2.1 Overview
2.2.2 Basic OpenMP Commands
2.2.3 Worksharing Constructs and Their Combinations
2.2.4 Synchronization and Mutual Exclusion Between Threads
2.2.5 Commonly Used Clauses
2.2.6 OpenMP Example Program: Computing Pi by Series
2.2.7 The task Construct
2.3 Pthreads Programming
2.3.1 Introduction to Pthreads
2.3.2 Thread Creation and Termination
2.3.3 Thread Mutual Exclusion
2.3.4 Pthreads Example Program: Computing Pi by Series
2.3.5 Thread Synchronization
2.3.6 Pthreads Example Program: Producer-Consumer
2.3.7 Thread Deadlock and Lock Granularity
2.4 New Programming Languages and Interfaces for Multicore Systems
2.4.1 Cilk and Cilk++
2.4.2 TBB
2.5 Summary
Exercises
Chapter 3 Parallel Programming for Message-Passing Systems
3.1 Introduction to MPI
3.1.1 What Is MPI?
3.1.2 The MPI Parallel Model
3.1.3 A Simple MPI Program
3.1.4 The Basic MPI Environment
3.1.5 Communicators, Process Groups, and Process Ranks
3.1.6 MPI Data Types
3.1.7 Introduction to MPI Communication
3.2 Point-to-Point Communication
3.2.1 Standard Communication Mode
3.2.2 Buffered Communication Mode
3.2.3 Synchronous Communication Mode
3.2.4 Ready Communication Mode
3.2.5 Summary of the Four Communication Modes
3.2.6 Combined Send-Receive
3.2.7 Non-Blocking Communication
3.3 Collective Communication
3.3.1 Overview of Collective Communication
3.3.2 Broadcast: MPI_Bcast
3.3.3 Scatter: MPI_Scatter
3.3.4 Gather: MPI_Gather
3.3.5 All-Gather: MPI_Allgather
3.3.6 All-to-All Exchange: MPI_Alltoall
3.3.7 Reduction: MPI_Reduce
3.3.8 All-Reduce: MPI_Allreduce
3.3.9 Scan: MPI_Scan
3.3.10 Barrier: MPI_Barrier
3.4 An MPI Example Program
3.4.1 Numerical Integration
3.4.2 A Pi Computation Program Based on Numerical Integration
3.4.3 MPI Wall-Clock Time
3.5 Process Groups and Communicators
3.5.1 Group Management
3.5.2 Communicator Management
3.5.3 Inter-Communicators
3.6 MPI and Multithreading
3.6.1 How to Use Multithreading in MPI Programs
3.6.2 An MPI+OpenMP Example Program
3.6.3 Analysis and Discussion
3.7 Process Topologies
3.7.1 Introduction to Process Topologies
3.7.2 Creating Process Topologies
3.7.3 Communication Functions Related to Process Topologies
3.8 PGAS Programming and Languages
3.9 Job Management Systems and Their Use
3.9.1 Introduction to Job Management Systems
3.9.2 Introduction to Slurm
3.9.3 Running Programs as Jobs in Slurm
3.9.4 Slurm Job Scripts
3.9.5 Running Programs in Other Ways in Slurm
3.9.6 Common Slurm Commands
3.10 Summary
Exercises
Chapter 4 Parallel Programming for Heterogeneous Systems
4.1 Overview of Heterogeneous System Programming
4.2 CUDA Programming for NVIDIA GPUs
4.2.1 CUDA Overview
4.2.2 Hello World Program: The Basic Form of a CUDA Program
4.2.3 Adding Two Integers: CPU-GPU Data Exchange
4.2.4 Vector Sum Program: CUDA Multithreading
4.2.5 CUDA Thread Organization
4.2.6 CUDA Memory Hierarchy and Variable Qualifiers
4.2.7 Function Qualifiers
4.2.8 CUDA Streams
4.2.9 Performance Optimization
4.2.10 CUDA Unified Memory Space
4.2.11 Using Multiple GPUs
4.3 OpenCL Programming
4.3.1 OpenCL Overview
4.3.2 Execution Flow of an OpenCL Program and Related APIs
4.3.3 OpenCL Example Program 1: Vector Sum
4.3.4 OpenCL Execution Model and Thread Organization
4.3.5 OpenCL Memory Hierarchy
4.3.6 OpenCL Example Program 2: Matrix Multiplication
4.4 Athread Programming for Shenwei (申威) Processors
4.4.1 Introduction to Shenwei Processors and Their Programming
4.4.2 Hello World Program: The Basic Form of an Athread Program
4.4.3 Local Storage Space Attributes of Athread Variables
4.4.4 Athread Master-Slave Core Programming Interface
4.4.5 Athread Register Communication
4.4.6 Athread Version of Cannon's Parallel Matrix Multiplication
4.5 OpenACC Programming
4.5.1 OpenACC Overview
4.5.2 OpenACC Syntax
4.5.3 OpenACC Loop Parallelism
4.5.4 OpenACC Programming on Shenwei Processors
4.6 Summary
Exercises
Chapter 5 Performance Optimization of Parallel Programs
5.1 Amdahl's Law
5.2 Main Factors Affecting Performance
5.2.1 Parallel Overhead
5.2.2 Load Balancing
5.2.3 Parallel Granularity
5.2.4 Parallel Partitioning
5.2.5 Dependencies
5.2.6 Locality
5.3 Scalability of Parallel Programs and Performance Optimization Methods
5.3.1 What Is the Scalability of a Parallel Program?
5.3.2 An Important Principle for Ensuring the Scalability of Parallel Programs: Independent Computation Blocks
5.3.3 Impact of Data Partitioning on Performance and Scalability
5.3.4 Other Common Performance Optimization Methods
5.4 The PCAM Parallel Design Method
5.4.1 Partitioning
5.4.2 Communication
5.4.3 Agglomeration
5.4.4 Mapping
5.5 Summary
Exercises
Chapter 6 Typical Parallel Application Algorithms
6.1 Matrix Multiplication
6.1.1 Block-Based Parallel Matrix Multiplication
6.1.2 Improved Block Matrix Multiplication: Cannon's Algorithm
6.1.3 Dedicated Hardware for Matrix Multiplication: Systolic Arrays
6.2 Direct Solution of Systems of Linear Equations
6.2.1 Introduction to Systems of Linear Equations and Their Solution Methods
6.2.2 Back Substitution for Triangular Systems
6.2.3 Gaussian Elimination
6.2.4 LU Decomposition Algorithm
6.2.5 Parallel LU Decomposition: Row-Interleaved Strip Partitioning and Block-Cyclic Distribution
6.3 Iterative Solution of Systems of Linear Equations
6.3.1 Classical Iterative Methods
6.3.2 Conjugate Gradient Method
6.3.3 Iterative Solution Example: Solving Partial Differential Equations
6.3.4 Discussion of Parallelism in Several Iterative Methods
6.3.5 Compressed Data Formats for Sparse Matrices
6.4 Quicksort
6.5 Fast Fourier Transform
6.5.1 Algorithm Background
6.5.2 Algorithm Principles
6.5.3 Converting the Recursive Algorithm to an Iterative Algorithm
6.5.4 Parallel Algorithm
6.6 Basic Linear Algebra Libraries and Software Packages
6.6.1 The BLAS Linear Algebra Library
6.6.2 The LAPACK Linear Algebra Package
6.7 Summary
Exercises
Appendix A English Abbreviations
References