Product description
This book is a practical guide to using the Apache Hadoop projects, including MapReduce, HDFS, Apache Hive, Apache HBase, Apache Kafka, Apache Mahout, and Apache Solr. From setting up the environment to running sample applications, each chapter is a practical tutorial on using an Apache Hadoop ecosystem project. While several books on Apache Hadoop are available, most focus on the main projects, MapReduce and HDFS, and none discusses the other Apache Hadoop ecosystem projects or how they all work together as a cohesive big data development platform.
What you'll learn
- How to set up an environment in Linux for Hadoop projects using Cloudera Hadoop Distribution CDH 5
- How to run a MapReduce job
- How to store data with Apache Hive and Apache HBase
- How to index data in HDFS with Apache Solr
- How to develop a Kafka messaging system
- How to develop a Mahout User Recommender System
- How to stream logs to HDFS with Apache Flume
- How to transfer data from a MySQL database to Hive, HDFS, and HBase with Sqoop
- How to create a Hive table over Apache Solr
The primary audience is Apache Hadoop developers. Prerequisite knowledge of Linux and some familiarity with Hadoop are required.