Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency (Paperback)
Kunle Olukotun
- Publisher: Morgan & Claypool
- Publication date: 2007-12-01
- List price: $1,590
- VIP price: 5% off, $1,511
- Language: English
- Pages: 154
- Binding: Paperback
- ISBN: 159829122X
- ISBN-13: 9781598291223
- Related categories: Big Data, Cloud Computing
- Overseas special-order book (must be checked out separately)
Product description
Chip multiprocessors (also called multi-core microprocessors, or CMPs for short) are now the only way to build high-performance microprocessors, for a variety of reasons. Large uniprocessors are no longer scaling in performance, because it is only possible to extract a limited amount of parallelism from a typical instruction stream using conventional superscalar instruction issue techniques. In addition, one cannot simply ratchet up the clock speed on today's processors, or the power dissipation will become prohibitive in all but water-cooled systems. Compounding these problems is the simple fact that with the immense numbers of transistors available on today's microprocessor chips, it is too costly to design and debug ever-larger processors every year or two. CMPs avoid these problems by filling up a processor die with multiple, relatively simple processor cores instead of just one huge core. The exact size of a CMP's cores can vary from very simple pipelines to moderately complex superscalar processors, but once a core has been selected, the CMP's performance can easily scale across silicon process generations simply by stamping down more copies of the hard-to-design, high-speed processor core in each successive chip generation. In addition, parallel code execution, obtained by spreading multiple threads of execution across the various cores, can achieve significantly higher performance than would be possible using only a single core. While parallel threads are already common in many useful workloads, there are still important workloads that are hard to divide into parallel threads. The low inter-processor communication latency between the cores in a CMP helps make a much wider range of applications viable candidates for parallel execution than was possible with conventional, multi-chip multiprocessors; nevertheless, limited parallelism in key applications is the main factor limiting acceptance of CMPs in some types of systems.