Elasticsearch for Hadoop

Vishal Shukla

Product Description

Integrate Elasticsearch into Hadoop to effectively visualize and analyze your data

About This Book

  • Build production-ready analytics applications by integrating the Hadoop ecosystem with Elasticsearch
  • Learn complex Elasticsearch queries and develop real-time monitoring Kibana dashboards to visualize your data
  • Use Elasticsearch and Kibana to search data in Hadoop easily with this comprehensive, step-by-step guide

Who This Book Is For

This book is aimed at Java developers with basic knowledge of Hadoop. No prior Elasticsearch experience is expected.

What You Will Learn

  • Set up the Elasticsearch-Hadoop environment
  • Import HDFS data into Elasticsearch with MapReduce jobs
  • Perform full-text search and aggregations efficiently using Elasticsearch
  • Visualize data and create interactive dashboards using Kibana
  • Check and detect anomalies in streaming data using Storm and Elasticsearch
  • Inject and classify real-time streaming data into Elasticsearch
  • Get production-ready for Elasticsearch-Hadoop based projects
  • Integrate with Hadoop ecosystem tools such as Pig, Storm, Hive, and Spark

In Detail

The Hadoop ecosystem is a de facto standard for processing terabytes and petabytes of data. Lucene-based Elasticsearch is becoming an industry standard for full-text search and aggregation. Elasticsearch-Hadoop bridges Elasticsearch and the Hadoop ecosystem so that you get the best of both worlds. Combined with Kibana, this stack makes it easy to draw insights quickly from the massive amounts of data in your Hadoop ecosystem.

In this book, you'll learn to use Elasticsearch, Kibana, and Elasticsearch-Hadoop effectively to analyze and understand your HDFS and streaming data.

You begin with an in-depth look at setting up Hadoop, Elasticsearch, Marvel, and Kibana. You then learn to import Hadoop data into Elasticsearch by writing a MapReduce job for a real-world example. This is followed by a comprehensive look at Elasticsearch essentials, such as full-text search analysis, queries, filters, and aggregations, after which you learn to create various visualizations and interactive dashboards using Kibana. The book also covers classifying real-world streaming data and identifying trends in it using Storm and Elasticsearch. You will gain insight into the key concepts of Elasticsearch and Elasticsearch-Hadoop in distributed mode, along with advanced configurations and some common configuration presets you may need for production deployments. You will also get a "go to production" checklist and a high-level view of cluster administration after deployment. Towards the end, you will learn to integrate Elasticsearch with other Hadoop ecosystem tools, such as Pig, Hive, and Spark.

Style and approach

A concise yet comprehensive approach has been adopted with real-time examples to help you grasp the concepts easily.
