Data Wrangling on AWS: Clean and organize complex data for analysis

Shukla, Navnit; M, Sankar; Palani, Sam

  • Publisher: Packt Publishing
  • Publication date: 2023-07-31
  • Price: $1,800
  • VIP price: $1,710 (5% off)
  • Language: English
  • Pages: 420
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1801810907
  • ISBN-13: 9781801810906
  • Related categories: Amazon Web Services, GAN (generative adversarial networks)
  • Ships immediately (stock = 1)

Product description

Revamp your data landscape and implement highly effective data pipelines in AWS with this hands-on guide

Purchase of the print or Kindle book includes a free PDF eBook

Key Features

  • Execute extract, transform, and load (ETL) tasks on data lakes, data warehouses, and databases
  • Implement effective Pandas data operations with AWS Data Wrangler
  • Integrate pipelines with AWS data services

Book Description

Data wrangling is the process of cleaning, transforming, and organizing raw, messy, or unstructured data into a structured format. It involves processes such as data cleaning, data integration, data transformation, and data enrichment to ensure that the data is accurate, consistent, and suitable for analysis. Data Wrangling on AWS equips you with the knowledge to reap the full potential of AWS data wrangling tools.
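The four processes named above (cleaning, integration, transformation, enrichment) can be sketched in a few lines. This is only an illustrative sketch in plain Python so that it runs without extra dependencies; the book itself works with Pandas and AWS tooling. The sample records, the region lookup, and every function name here are hypothetical and are not taken from the book.

```python
from statistics import mean

# Hypothetical raw records: inconsistent casing, stray whitespace,
# a missing amount, and a duplicated row.
raw_orders = [
    {"order_id": "1", "customer": " alice ", "amount": "120.50"},
    {"order_id": "2", "customer": "BOB", "amount": ""},
    {"order_id": "3", "customer": "Alice", "amount": "80"},
    {"order_id": "3", "customer": "Alice", "amount": "80"},
]

# Hypothetical reference data used for the integration step.
customer_regions = {"alice": "EU", "bob": "US"}


def clean(rows):
    """Data cleaning: normalize names, parse values, drop repeated order IDs."""
    seen, cleaned = set(), []
    for row in rows:
        if row["order_id"] in seen:
            continue
        seen.add(row["order_id"])
        cleaned.append({
            "order_id": int(row["order_id"]),
            "customer": row["customer"].strip().lower(),
            "amount": float(row["amount"]) if row["amount"] else None,
        })
    return cleaned


def integrate(rows, regions):
    """Data integration: join each order with its customer's region."""
    return [{**row, "region": regions.get(row["customer"])} for row in rows]


def transform(rows):
    """Data transformation: fill missing amounts with the mean of known ones."""
    known = [row["amount"] for row in rows if row["amount"] is not None]
    fallback = mean(known)
    return [
        {**row, "amount": fallback if row["amount"] is None else row["amount"]}
        for row in rows
    ]


def enrich(rows, threshold=100.0):
    """Data enrichment: add a derived flag marking large orders."""
    return [{**row, "is_large": row["amount"] >= threshold} for row in rows]


wrangled = enrich(transform(integrate(clean(raw_orders), customer_regions)))
for row in wrangled:
    print(row)
```

Keeping each process in its own function makes it easy to test a step in isolation or to swap one out, for example replacing the mean fill with a different imputation rule.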

First, you’ll be introduced to data wrangling on AWS and the data wrangling services it offers. You’ll learn how to work with AWS Glue DataBrew, AWS Data Wrangler, and Amazon SageMaker. Next, you’ll discover other AWS services such as Amazon S3, Redshift, Athena, and QuickSight. Additionally, you’ll explore advanced topics such as performing Pandas data operations with AWS Data Wrangler, optimizing ML data with Amazon SageMaker, and building a data warehouse with Glue DataBrew, along with security and monitoring.

By the end of this book, you’ll be well-equipped to perform data wrangling using AWS services.

What you will learn

  • Explore how to write simple to complex transformations using AWS Data Wrangler
  • Use abstracted functions to extract and load data from and into AWS datastores
  • Configure AWS Glue DataBrew for data wrangling
  • Develop data pipelines using AWS Data Wrangler
  • Integrate AWS security features into Data Wrangler using identity and access management (IAM)
  • Optimize your data with Amazon SageMaker

Who this book is for

This book is for data engineers, data scientists, and business data analysts looking to explore the capabilities, tools, and services of data wrangling on AWS for their ETL tasks. Basic knowledge of Python and Pandas, along with familiarity with AWS tools such as AWS Glue and Amazon Athena, is required to get the most out of this book.

Table of contents

  1. Introduction to Data Wrangling on AWS
  2. Working with AWS Glue DataBrew
  3. Introducing AWS Data Wrangler
  4. Introducing Amazon SageMaker Data Wrangler
  5. Working with Amazon S3
  6. Working with AWS Glue
  7. Working with Athena
  8. Working with QuickSight
  9. Perform Pandas operations with AWS Data Wrangler
  10. Optimizing ML data with AWS SageMaker Data Wrangler
  11. Security and Monitoring
