Azure Data Factory Cookbook - Second Edition: A data engineer's guide to building and managing ETL and ELT pipelines with data integration

Foshin, Dmitry; Chernyshova, Tonya; Anoshin, Dmitry

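  • Publisher: Packt Publishing
  • Publication date: 2024-02-28
  • List price: $2,200
  • VIP price: 5% off, $2,090
  • Language: English
  • Pages: 532
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 1803246596
  • ISBN-13: 9781803246598
  • Related categories: Microsoft Azure
  • Overseas special-order title (must be checked out separately)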

Product Description

Solve real-world data problems and create data-driven workflows for easy data movement and processing at scale with Azure Data Factory


Key Features:

  • Learn how to load and transform data from various sources, both on-premises and on cloud
  • Use Azure Data Factory's visual environment to build and manage hybrid ETL pipelines
  • Discover how to prepare, transform, process, and enrich data to generate key insights


Book Description:

This new edition of the Azure Data Factory Cookbook, fully updated to reflect ADF V2, will help you get up and running by showing you how to create and execute your first job in ADF.


You'll learn how to branch and chain activities, create custom activities, and schedule pipelines, and you'll discover the benefits of cloud data warehousing, Azure Synapse Analytics, and Azure Data Lake Storage Gen2.


With practical recipes, you'll learn how to actively engage with analytical tools from Azure Data Services and leverage your on-premises infrastructure with cloud-native tools to get relevant business insights. As you advance, you'll be able to integrate the most commonly used Azure Services into ADF and understand how Azure services can be useful in designing ETL pipelines. You'll familiarize yourself with the common errors that you may encounter while working with ADF and find out how to use the Azure portal to monitor pipelines. You'll also understand error messages and resolve problems in connectors and data flows with the debugging capabilities of ADF.


Two new chapters covering Azure Data Explorer and key best practices have been added, along with new recipes throughout.


By the end of this book, you'll be able to use ADF as the main ETL and orchestration tool for your data warehouse or data platform projects.


What You Will Learn:

  • Create an orchestration and transformation job in ADF
  • Develop, execute, and monitor data flows using Azure Synapse
  • Create big data pipelines using Databricks and Delta tables
  • Work with big data in Azure Data Lake using Spark Pool
  • Migrate on-premises SSIS jobs to ADF
  • Integrate ADF with commonly used Azure services such as Azure ML, Azure Logic Apps, and Azure Functions
  • Run big data compute jobs within HDInsight and Azure Databricks
  • Copy data from AWS S3 and Google Cloud Storage to Azure Storage using ADF's built-in connectors


Who this book is for:

This book is for ETL developers, data warehouse and ETL architects, software professionals, and anyone else who wants to learn about the common and not-so-common challenges faced while developing traditional and hybrid ETL solutions using Microsoft's Azure Data Factory. You'll also find this book useful if you are looking for recipes to improve or enhance your existing ETL pipelines. Basic knowledge of data warehousing is a prerequisite.
