Mapping Data Flows in Azure Data Factory: Building Scalable ETL Projects in the Microsoft Cloud
Kromer, Mark
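- Publisher: Apress
- Publication date: 2022-08-26
- Price: $2,250
- VIP price: 5% off, $2,138
- Language: English
- Pages: 194
- Binding: Quality Paper - also called trade paper
- ISBN: 1484286111
- ISBN-13: 9781484286111
- Related categories: Microsoft Azure, JVM languages
Overseas special-order book (requires separate checkout)
Product Description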
Build scalable ETL data pipelines in the cloud using Azure Data Factory's Mapping Data Flows. Each chapter of this book addresses different aspects of an end-to-end data pipeline that includes repeatable design patterns based on best practices using ADF's code-free data transformation design tools. The book shows data engineers how to take raw business data at cloud scale and turn that data into business value by organizing and transforming the data for use in data science projects and analytics systems.
The book begins with an introduction to Azure Data Factory, followed by an introduction to its Mapping Data Flows feature set. Subsequent chapters show how to build your first pipeline and corresponding data flow, implement common design patterns, and operationalize your result. By the end of the book, you will be able to apply what you've learned to your complex data integration and ETL projects in Azure. These projects will enable cloud-scale big data analytics and data loading and transformation best practices for data warehouses.
What You Will Learn
- Build scalable ETL jobs in Azure without writing code
- Transform big data for data quality and data modeling requirements
- Understand the different aspects of Azure Data Factory ETL pipelines from datasets and Linked Services to Mapping Data Flows
- Apply best practices for designing and managing complex ETL data pipelines in Azure Data Factory
- Add cloud-based ETL patterns to your set of data engineering skills
- Build repeatable code-free ETL design patterns
Who This Book Is For
Data engineers who are new to building complex data transformation pipelines in the cloud with Azure, and data engineers who need ETL solutions that scale to match swiftly growing volumes of data.
About the Author
Mark Kromer has been in the data analytics product space for over 20 years and is currently a Principal Program Manager for Microsoft's Azure data integration products. Mark often writes and speaks on big data analytics and data analytics and was an engineering architect and product manager for Oracle, Pentaho, AT&T, and Databricks prior to Microsoft Azure.