Learning Cloudera Impala

Name: Learning Cloudera Impala
Author: Avkash Chauhan
Publisher: Packt Publishing
Publication date: 2013-12-27
List price: 1,540 TWD
Member price: 1,463 TWD (5% off)
Language: English
Pages: 150
Binding: Paperback
ISBN: 1783281278
ISBN-13: 9781783281275
Availability: OnlineOnly

Overseas special-order title (requires separate checkout)

Product description

Everything you need to know about Cloudera Impala is here – from installation onwards. Your raw data processing in Hadoop takes on new dimensions of speed and volume with this hands-on tutorial.

Overview

Step-by-step guidance to get you started with Impala on your Hadoop cluster
Manipulate your data rapidly by writing proper SQL statements
Explore the concepts of Impala security, administration, and troubleshooting in detail to maintain your Impala cluster

In Detail

If you have always wanted to crunch billions of rows of raw data on Hadoop in a couple of seconds, then Cloudera Impala is the number one choice for you. Cloudera Impala provides fast, interactive SQL queries directly on your Apache Hadoop data stored in HDFS or HBase. In addition to using the same unified storage platform, Impala also uses the same metadata, SQL syntax (Hive SQL), ODBC driver, and user interface (Hue Beeswax) as Apache Hive. This provides a familiar and unified platform for batch-oriented or real-time queries.

In this practical, example-oriented book, you will learn what you need to know about Cloudera Impala to get started on your own project. The book covers Cloudera Impala from installation, administration, and query processing all the way to connectivity with third-party applications. With this book in hand, you will be equipped to work with your data in Hadoop.

As a reader of this book, you will learn about the origin of Impala and the technology behind it that allows it to run on thousands of machines. You will learn how to install, run, manage, and troubleshoot Impala in your own Hadoop cluster using the step-by-step guidance provided in the book. The book covers tenets of data processing such as loading data stored in Hadoop into Impala tables and querying data using Impala SQL statements, all with various code illustrations and a real-world example.

The book is written to get you started with Impala by providing rich information so you can understand what Impala is, what it can do for you, and finally how you can use it to achieve your objective.

What you will learn from this book

Understand the various ways of installing Impala in your Hadoop cluster
Use the Impala shell API to interact with Impala components
Utilize Impala Query Language and built-in functions to play with data
Administer and fine-tune Impala for high availability
Identify and troubleshoot problems in a variety of ways
Get acquainted with various input data formats in Hadoop and how to use them with Impala
Understand how third-party applications can connect with Impala to provide data visualization and various other enhancements

Approach

This book is an easy-to-follow, step-by-step tutorial where each chapter takes your knowledge to the next level. The book covers practical knowledge with tips to implement this knowledge in real-world scenarios. A chapter with a real-life example is included to help you understand the concepts in full.

Who this book is written for

Learning Cloudera Impala is for those who want to take full advantage of their Hadoop cluster by processing extremely large amounts of raw data in Hadoop at real-time speed. Prior knowledge of Hadoop and some exposure to Hive and MapReduce are expected.
