Learning Pentaho Data Integration 8 CE - Third Edition: An end-to-end guide to exploring, transforming, and integrating your data across multiple sources
Maria Carina Roldan
- Publisher: Packt Publishing
- Publication date: 2017-12-05
- Price: $2,220
- VIP price: 5% off, $2,109
- Language: English
- Pages: 500
- Binding: Paperback
- ISBN: 178829243X
- ISBN-13: 9781788292436
Overseas special-order book (requires separate checkout)
Product Description
Get up and running with the Pentaho Data Integration tool using this hands-on, easy-to-read guide
Key Features
- Manipulate your data by exploring, transforming, validating, and integrating it using Pentaho Data Integration 8 CE
- A comprehensive guide exploring the features of Pentaho Data Integration 8 CE
- Connect to any database engine, explore the databases, and perform all kinds of operations on relational databases
Book Description
Pentaho Data Integration (PDI) is an intuitive, graphical environment packed with drag-and-drop design and powerful Extract-Transform-Load (ETL) capabilities. This book shows and explains the new interactive features of Spoon, the revamped look and feel, and the newest features of the tool, including Transformation and Job Executors and the invaluable Metadata Injection capability.
We begin with the installation of PDI software and then move on to cover all the key PDI concepts. Each chapter introduces new features, enabling you to gradually gain practice with the tool. First, you will learn to do all kinds of data manipulation and work with simple plain files. Then, the book teaches you how to work with relational databases inside PDI. Moreover, you will be given a primer on data warehouse concepts and you will learn how to load data into a data warehouse. Over the course of this book, you will become familiar with PDI's intuitive, graphical, drag-and-drop design environment.
By the end of this book, you will have learned everything you need to know in order to meet your data manipulation requirements. In addition, you will be given best practices and advice for designing and deploying your projects.
What you will learn
- Explore the features and capabilities of Pentaho Data Integration 8 Community Edition
- Install and get started with PDI
- Learn the ins and outs of Spoon, the graphical designer tool
- Learn to get data from all kinds of data sources, such as plain files, Excel spreadsheets, databases, and XML files
- Use Pentaho Data Integration to perform CRUD (create, read, update, and delete) operations on relational databases
- Populate a data mart with Pentaho Data Integration
- Use Pentaho Data Integration to organize files and folders, run daily processes, deal with errors, and more
Who This Book Is For
This book is a must-have for software developers, business intelligence analysts, IT students, or anyone involved or interested in developing ETL solutions. If you plan on using Pentaho Data Integration for any data manipulation task, this book will help you as well. It is also a good starting point for data warehouse designers, architects, or anyone who is responsible for data warehouse projects and needs to load data into them.
Table of Contents
- Getting Started with Pentaho Data Integration
- Getting Started with Transformations
- Creating Basic Task Flows
- Reading and Writing Files
- Manipulating PDI Data and Metadata
- Controlling the Flow of Data
- Validating, Fixing, and Cleansing Data
- Transforming the Data by Coding
- Transforming the Dataset
- Performing Basic Operations with Databases
- Loading Data Marts with PDI
- Creating Portable and Reusable Transformations
- Implementing Metadata Injection
- Creating Advanced Jobs
- Launching Transformations and Jobs from the Command Line
- Best Practices for Designing and Deploying a PDI Project