Delta Lake: The Definitive Guide: Modern Data Lakehouse Architectures with Data Lakes

Denny Lee, Tristen Wentling, Scott Haines

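  • Publisher: O'Reilly
  • Publication date: 2024-12-10
  • List price: $2,780
  • VIP price: $2,641 (5% off)
  • Language: English
  • Pages: 380
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1098151941
  • ISBN-13: 9781098151942
  • Overseas special-order title (must be checked out separately)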

Product description

Ready to simplify the process of building data lakehouses and data pipelines at scale? In this practical guide, learn how Delta Lake is helping data engineers, data scientists, and data analysts overcome key data reliability challenges with modern data engineering and management techniques.

Authors Denny Lee, Tristen Wentling, Scott Haines, and Prashanth Babu (with contributions from Delta Lake maintainer R. Tyler Croy) share expert insights on all things Delta Lake, including how to run batch and streaming jobs concurrently and accelerate the usability of your data. You'll also uncover how ACID transactions bring reliability to data lakehouses at scale.

This book helps you:

  • Understand key data reliability challenges and how Delta Lake solves them
  • Explain the critical role of Delta transaction logs as a single source of truth
  • Learn the Delta Lake ecosystem with technologies like Apache Flink, Kafka, and Trino
  • Architect data lakehouses with the medallion architecture
  • Optimize Delta Lake performance with features like deletion vectors and liquid clustering
