Databricks Lakehouse Platform Cookbook: 100+ Recipes for Building a Scalable and Secure Databricks Lakehouse

Dennis, Alan L.

  • Publisher: BPB Publications
  • Publication date: 2023-12-18
  • List price: $1,670
  • VIP price: $1,587 (5% off)
  • Language: English
  • Pages: 466
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 9355519567
  • ISBN-13: 9789355519566
  • Related category: JVM languages
  • Overseas special-order book (must be checked out separately)

Product description

The Databricks Lakehouse is a groundbreaking technology that simplifies data storage, processing, and analysis. This cookbook offers a clear, practical guide to building and optimizing your Lakehouse so that you can make data-driven decisions and drive impactful results. It walks you through the entire Lakehouse journey: setting up your environment, connecting to storage, creating Delta tables, building data models, and ingesting and transforming data. We start by discussing how to ingest data into the Bronze layer and then refine it to produce Silver tables. Next, we discuss how to create Gold tables and the data modeling techniques often applied in the Gold layer. You will learn how to use Spark SQL and PySpark for efficient data manipulation, apply Delta Live Tables for real-time data processing, and implement machine learning and data science workflows with MLflow, Feature Store, and AutoML.
