Talend for Big Data

Bahaaldine Azarmi

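  • Publisher: Packt Publishing
  • Publication date: 2014-02-21
  • Price: $1,540
  • Member price: $1,463 (5% off)
  • Language: English
  • Pages: 96
  • Binding: Paperback
  • ISBN: 1782169490
  • ISBN-13: 9781782169499
  • Related categories: Big Data
  • Overseas special-order title (requires separate checkout)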

Product Description

If you want to start working on big data projects fast, this is the guide you've been looking for. Delve deep into Talend and discover just how easily you can revolutionize your data handling and presentation.

Overview

  • Write complex processing job code easily with the help of clear, step-by-step instructions
  • Compare, filter, evaluate, and group vast quantities of data using Hadoop Pig
  • Explore and perform HDFS and RDBMS integration with the Sqoop component

In Detail

Talend, a successful Open Source Data Integration Solution, accelerates the adoption of new big data technologies and efficiently integrates them into your existing IT infrastructure. It is able to do this because of its intuitive graphical language, its multiple connectors to the Hadoop ecosystem, and its array of tools for data integration, quality, management, and governance.

This is a concise, pragmatic book that will guide you through designing and implementing big data transfers and performing big data analytics jobs using Hadoop technologies like HDFS, HBase, Hive, Pig, and Sqoop. You will learn how to write complex processing job code and how to leverage the power of Hadoop projects by designing graphical Talend jobs with the business modeler, the metadata repository, and a palette of configurable components.

Starting with understanding how to process a large amount of data using Talend big data components, you will then learn how to write job procedures in HDFS. You will then look at how to use Hadoop projects to process data and how to export the data to your favourite relational database system.

You will learn how to implement Hive ELT jobs, Pig aggregation and filtering jobs, and simple Sqoop jobs using the Talend big data component palette. You will also learn the basics of Twitter sentiment analysis and how to format data with Apache Hive.

Talend for Big Data will enable you to start working on big data projects immediately, from simple processing projects to complex projects using common big data patterns.

What you will learn from this book

  • Know the structure of the Talend Unified Platform
  • Work with Talend HDFS components
  • Implement ELT processing jobs using Talend Hive components
  • Load, filter, aggregate, and store data using Talend Pig components
  • Integrate HDFS with RDBMS using Sqoop components
  • Use the streaming pattern for big data
  • Learn to reuse the partitioning pattern for big data

Approach

This book is written in a concise, easy-to-understand manner and serves as a guide to data analytics and integration with Talend big data processing jobs.

Who this book is written for

If you are a chief information officer, enterprise architect, data architect, data scientist, software developer, software engineer, or a data analyst who is familiar with data processing projects and who wants to use Talend to get your first big data job executed in a reliable, quick, and graphical way, then Talend for Big Data is perfect for you.
