Data Analytics with Hadoop: An Introduction for Data Scientists
Tentative Chinese title: Hadoop 數據分析：數據科學家的入門指南

Name: Data Analytics with Hadoop: An Introduction for Data Scientists
Price: 610 TWD
Availability: InStock
Author: Benjamin Bengfort, Jenny Kim
ISBN: 1491913703


Publisher: O'Reilly
Publication date: 2016-07-12
List price: $1,220
Sale price: $610 (50% off)
Language: English
Pages: 288
Binding: Paperback
ISBN: 1491913703
ISBN-13: 9781491913703
Related categories: Hadoop, Data Science

Ships immediately (stock: 1)

Product Description

Ready to use statistical and machine-learning techniques across large data sets? This practical guide shows you why the Hadoop ecosystem is perfect for the job. Instead of the deployment, operations, or software development usually associated with distributed computing, you’ll focus on the particular analyses you can build, the data warehousing techniques that Hadoop provides, and the higher-order data workflows this framework can produce.

Data scientists and analysts will learn how to apply a wide range of techniques, from writing MapReduce and Spark applications with Python to using advanced modeling and data management with Spark MLlib, Hive, and HBase. You’ll also learn about the analytical processes and data systems available to build and empower data products that can handle, and actually require, huge amounts of data.

- Understand core concepts behind Hadoop and cluster computing
- Use design patterns and parallel analytical algorithms to create distributed data analysis jobs
- Learn about data management, mining, and warehousing in a distributed context using Apache Hive and HBase
- Use Sqoop and Apache Flume to ingest data from relational databases
- Program complex Hadoop and Spark applications with Apache Pig and Spark DataFrames
- Perform machine learning techniques such as classification, clustering, and collaborative filtering with Spark’s MLlib
