Building ETL Pipelines with Python: Create and deploy enterprise-ready ETL pipelines by employing modern methods

Pandey, Brij Kishore, Schoof, Emily Ro

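  • Publisher: Packt Publishing
  • Publication date: 2023-09-29
  • List price: $1,710
  • VIP price: $1,625 (5% off)
  • Language: English
  • Pages: 246
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 1804615250
  • ISBN-13: 9781804615256
  • Related category: Python programming language
  • Overseas special-order title (requires separate checkout)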

Product Description

Develop production-ready ETL pipelines by leveraging Python libraries and deploying them for suitable use cases


Key Features:


  • Understand how to set up a Python virtual environment with PyCharm
  • Learn functional and object-oriented approaches to create ETL pipelines
  • Create robust CI/CD processes for ETL pipelines
  • Purchase of the print or Kindle book includes a free PDF eBook


Book Description:


Modern extract, transform, and load (ETL) pipelines for data engineering have favored the Python language for its broad range of uses and a large assortment of tools, applications, and open source components. With its simplicity and extensive library support, Python has emerged as the undisputed choice for data processing.


In this book, you'll walk through the end-to-end process of ETL data pipeline development, starting with an introduction to the fundamentals of data pipelines and establishing a Python development environment to create pipelines. Once you've explored the ETL pipeline design principles and the ETL development process, you'll be equipped to design custom ETL pipelines. Next, you'll get to grips with the steps in the ETL process: extracting valuable data; transforming it through cleaning, manipulation, and data integrity checks; and ultimately loading the processed data into storage systems. You'll also review several ETL modules in Python, comparing their pros and cons when building data pipelines, and leverage cloud tools, such as AWS, to create scalable data pipelines. Lastly, you'll learn about the concept of test-driven development for ETL pipelines to ensure safe deployments.
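
To make the three steps concrete, here is a minimal sketch of an extract, transform, and load flow using only the Python standard library. It is not taken from the book: the sample data, the `orders` table, and the function names `extract`, `transform`, `load`, and `run_pipeline` are all hypothetical, chosen only to illustrate the shape of such a pipeline.

```python
import csv
import io
import sqlite3

# Hypothetical sample input; a real pipeline would read from a file, API, or database.
RAW_CSV = """order_id,customer,amount
1, alice ,10.50
2,bob,
2,bob,7.25
3,carol,not-a-number
4,dave,3.00
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean names, drop rows with invalid amounts, de-duplicate on order_id."""
    cleaned = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except (TypeError, ValueError):
            continue  # integrity check: skip rows without a numeric amount
        order_id = int(row["order_id"])
        # Later rows with the same order_id overwrite earlier ones.
        cleaned[order_id] = (order_id, row["customer"].strip().title(), amount)
    return list(cleaned.values())


def load(records, conn):
    """Load: write cleaned records to a SQLite table and return the table's row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", records)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


def run_pipeline(text, conn):
    """Chain the three stages; each stage only depends on the previous stage's output."""
    return load(transform(extract(text)), conn)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(RAW_CSV, connection))
    connection.close()
```

Keeping each stage as a separate function with plain inputs and outputs is one common way to make a pipeline easy to test stage by stage, which fits the test-driven approach the description mentions.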


By the end of this book, you'll have worked on several hands-on examples to create high-performance ETL pipelines to develop robust, scalable, and resilient environments using Python.


What You Will Learn:


  • Explore the available libraries and tools to create ETL pipelines using Python
  • Write clean and resilient ETL code in Python that can be extended and easily scaled
  • Understand the best practices and design principles for creating ETL pipelines
  • Orchestrate the ETL process and scale the ETL pipeline effectively
  • Discover tools and services available in AWS for ETL pipelines
  • Understand different testing strategies and implement them with the ETL process


Who this book is for:


If you are a data engineer or software professional looking to create enterprise-level ETL pipelines using Python, this book is for you. Fundamental knowledge of Python is a prerequisite.
