Parallel Database Techniques

Mahdi Abdelguerfi, Kam-Fai Wong

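  • Publisher: Wiley
  • Publication date: 1998-08-13
  • Price: $3,590
  • VIP price: $3,411 (5% off)
  • Language: English
  • Pages: 230
  • Binding: Hardcover
  • ISBN: 0818683988
  • ISBN-13: 9780818683985
  • Related category: Databases
  • Overseas special-order book (requires separate checkout)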

Product Description

Description:

The use of parallel processing technology in the next generation of Database Management Systems (DBMSs) makes it possible to meet new and challenging requirements. Database technology in rapidly expanding new application areas brings unique challenges such as increased functionality and efficient handling of very large heterogeneous databases.

Abdelguerfi and Wong present the latest techniques in parallel relational databases, illustrating high-performance achievements in parallel database systems. The text is structured according to the overall architecture of a parallel database system, presenting various techniques that may be adopted in the design of parallel database software and hardware execution environments. These techniques can lead, directly or indirectly, to high-performance parallel database implementations.

The book's main focus follows the authors' engineering model:

  • A survey of parallel query optimization techniques for requests involving multi-way joins
  • A new technique for a join operation that can be adopted in the local optimization stage
  • A framework for recovery in parallel database systems using the ACTA formalism
  • The architectural details of NCR's new Petabyte multimedia database system
  • A description of the Super Database Computer (SDC-II)
  • A case study for a shared-nothing parallel database server that analyzes and compares the effectiveness of five data placement techniques

Table of Contents:

1 Introduction.

1.1 Background.

1.2 Parallel Database Systems.

1.2.1 Computation Model.

1.2.2 Engineering Model.

1.3 About this Manuscript.

Bibliography.

I: Request Manager.

2 Designing an Optimizer for Parallel Relational Systems.

2.1 Introduction.

2.2 Overall Design Issues.

2.2.1 Design a Simple Parallel Execution Model.

2.2.2 The Two-Phase Approach.

2.2.3 Parallelizing is Adding Information!

2.2.4 Two-Phase versus Parallel Approaches.

2.3 Parallelization.

2.3.1 Kinds of Parallelism.

2.3.2 Specifying Parallel Execution.

2.4 Search Space.

2.4.1 Slicing Hash Join Trees.

2.4.2 Search Space Size.

2.4.3 Heuristics.

2.4.4 The Two-Phase Heuristics.

2.5 Cost Model.

2.5.1 Exceptions to the Principle of Optimality.

2.5.2 Resources.

2.5.3 Skew and Size Model.

2.5.4 The Cost Function.

2.6 Search Strategies.

2.6.1 Deterministic Search Strategies.

2.6.2 Randomized Strategies.

2.7 Conclusion.

Bibliography.

3 New Approaches to Parallel Join Utilizing Page Connectivity Information.

3.1 Introduction.

3.2 The Environment and a Motivating Example.

3.3 The Methodology.

3.3.1 Definition of Parameters.

3.3.2 The Balancing Algorithm.

3.3.3 Schedules for Reading Join Components and Data Pages.

3.4 Performance Analysis.

3.4.1 The Evaluation Method.

3.4.2 Evaluation Results.

3.5 Concluding Remarks and Future Work.

Bibliography.

4 A Performance Evaluation Tool for Parallel Database Systems.

4.1 Introduction.

4.2 Performance Evaluation Methods.

4.2.1 Analytical Modeling.

4.2.2 Benchmarks.

4.2.3 Observations.

4.3 The Software Testpilot.

4.3.1 The Experiment Specification.

4.3.2 The Performance Assessment Cycle.

4.3.3 The System Interface.

4.4 The Software Testpilot and Oracle/Ncube.

4.4.1 Database System Performance Assessment.

4.4.2 The Oracle/Ncube Interface.

4.5 Preliminary Results.

4.6 Conclusion.

Bibliography.

5 Load Placement in Distributed High-Performance Database Systems.

5.1 Introduction.

5.2 Investigated System.

5.2.1 System Architecture.

5.2.2 Load Scenarios.

5.2.3 Trace Analysis.

5.2.4 Load Setup.

5.3 Load Placement Strategies Investigated.

5.4 Scheduling Strategies for Transactions.

5.5 Simulation Results.

5.5.1 Influence of Scheduling.

5.5.2 Evaluation of the Load Placement Strategies.

5.5.3 Lessons Learned.

5.5.4 Decision Parameters Used.

5.6 Conclusion and Open Issues.

Bibliography.

II: Parallel Machine Architecture.

6 Modeling Recovery in Client-Server Database Systems.

6.1 Introduction.

6.2 Uniprocessor Recovery and Formal Approach to Modeling Recovery.

6.2.1 Basic Formal Concepts.

6.2.2 Logging Mechanisms.

6.2.3 Runtime Policies for Ensuring Correctness.

6.2.4 Data Structures Maintained for Efficient Recovery.

6.2.5 Restart Recovery--The ARIES Approach.

6.3 LSN Sequencing Techniques for Multinode Systems.

6.4 Recovery in Client-Server Database Systems.

6.4.1 Client-Server EXODUS (ESM-CS).

6.4.2 Client-Server ARIES (ARIES/CSA).

6.4.3 Shared Nothing Clients with Disks (CD).

6.4.4 Summary of Recovery Approaches in Client-Server Architectures.

6.5 Conclusion.

Bibliography.

7 Parallel Strategies for a Petabyte Multimedia Database Computer.

7.1 Introduction.

7.2 Multimedia Data Warehouse, Databases, and Applications.

7.2.1 Three Waves of Multimedia Database Development.

7.2.2 National Medical Practice Knowledge Bank Application.

7.3 Massively Parallel Architecture, Infrastructure, and Technology.

7.3.1 Parallelism.

7.4 Teradata-MM Architecture, Framework, and New Concepts.

7.4.1 Teradata-MM Architecture.

7.4.2 Key New Concepts.

7.4.3 SQL3.

7.4.4 Federated Coordinator.

7.4.5 Teradata Multimedia Object Server.

7.5 Parallel UDF Execution Analysis.

7.5.1 UDF Optimizations.

7.5.2 PRAGMA Facility.

7.5.3 UDF Value Persistence Facility.

7.5.4 Spatial Indices for Content-Based Querying.

7.6 Conclusion.

Bibliography.

8 The MEDUSA Project.

8.1 Introduction.

8.2 Indexing and Data Partitioning.

8.2.1 Standard Systems.

8.2.2 Grid Files.

8.3 Dynamic Load Balancing.

8.3.1 Data Access Frequency.

8.3.2 Data Distribution.

8.3.3 Query Partitioning.

8.4 The MEDUSA Project.

8.4.1 The MEDUSA Architecture.

8.4.2 Software.

8.4.3 Grid File Implementation.

8.4.4 Load Balancing Strategy.

8.5 MEDUSA Performance Results.

8.5.1 Test Configuration.

8.5.2 Transaction Throughput.

8.5.3 Speedup.

8.5.4 Load Balancing Test Results.

8.6 Conclusions.

Bibliography.

III: Partitioned Data Store.

9 System Software of the Super Database Computer SDC-II.

9.1 Introduction.

9.2 Architectural Overview of the SDC-II.

9.3 Design and Organization of the SDC-II System Software.

9.3.1 Parallel Execution Model.

9.3.2 I/O Model and Buffer Management Strategy for Bulk Data Transfer.

9.3.3 Process Model and Efficient Flow Control Mechanism.

9.3.4 Structure of the System Software Components.

9.4 Evaluation of the SDC-II System.

9.4.1 Details of a Sample Query Processing.

9.4.2 Comparison with Commercial Systems.

9.5 Conclusion.

Bibliography.

10 Data Placement in Parallel Database Systems.

10.1 Introduction.

10.2 Overview of Data Placement Strategies.

10.2.1 Declustering and Redistribution.

10.2.2 Placement.

10.3 Effects of Data Placement.

10.3.1 STEADY and TPC-C.

10.3.2 Dependence on Number of Processing Elements.

10.3.3 Dependence on Database Size.

10.4 Conclusions.

Bibliography.

Contributors.
