Modern Data Architectures with Python: A practical guide to building and deploying data pipelines, data warehouses, and data lakes with Python

Name: Modern Data Architectures with Python: A practical guide to building and deploying data pipelines, data warehouses, and data lakes with Python
Price: 1919 TWD
Availability: OnlineOnly
Author: Lipp, Brian
ISBN: 1801070490

Publisher: Packt Publishing
Publication date: 2023-09-29
List price: $2,020
VIP price: 5% off, $1,919
Language: English
Pages: 318
Binding: Quality Paper (also called trade paper)
ISBN: 1801070490
ISBN-13: 9781801070492
Related categories: Python, Programming Languages

Imported title ordered from overseas (must be checked out separately)

Product description

Build scalable and reliable data ecosystems using Data Mesh, Databricks Spark, and Kafka

Key Features

- Develop modern data skills used in emerging technologies
- Learn pragmatic design methodologies such as Data Mesh and data lakehouses
- Gain a deeper understanding of data governance
- Purchase of the print or Kindle book includes a free PDF eBook

Book Description

Modern Data Architectures with Python will teach you how to seamlessly incorporate your machine learning and data science work streams into your open data platforms. You’ll learn how to take your data and create open lakehouses that work with any technology using tried-and-true techniques, including the medallion architecture and Delta Lake.

Starting with the fundamentals, this book will help you build pipelines on Databricks, an open data platform, using SQL and Python. You’ll gain an understanding of notebooks and applications written in Python using standard software engineering tools such as Git, pre-commit, Jenkins, and GitHub. Next, you’ll delve into streaming and batch-based data processing using Apache Spark and Confluent Kafka. As you advance, you’ll learn how to deploy your resources using infrastructure as code and how to automate your workflows and code development. Since any data platform’s ability to handle and work with AI and ML is a vital component, you’ll also explore the basics of ML and how to work with modern MLOps tooling. Finally, you’ll get hands-on experience with Apache Spark, one of the key data technologies in today’s market.

By the end of this book, you’ll have amassed a wealth of practical and theoretical knowledge to build, manage, orchestrate, and architect your data ecosystems.

What you will learn

- Understand data patterns, including delta architecture
- Discover how to increase performance with Spark internals
- Find out how to design critical data diagrams
- Explore MLOps with tools such as AutoML and MLflow
- Get to grips with building data products in a data mesh
- Discover data governance and build confidence in your data
- Introduce data visualizations and dashboards into your data practice

Who this book is for

This book is for developers, analytics engineers, and managers looking to further develop a data ecosystem within their organization. While they’re not prerequisites, basic knowledge of Python and prior experience with data will help you to read and follow along with the examples.
