Fast Data Processing with Spark, 2/e (Paperback)

Name: Fast Data Processing with Spark, 2/e (Paperback)
Price: 1292 TWD
Availability: OnlineOnly
Author: Krishna Sankar, Holden Karau
ISBN: 178439257X


Publisher: Packt Publishing
Publication date: 2015-03-31
List price: $1,360
VIP price: 5% off, $1,292
Language: English
Pages: 184
Binding: Paperback
ISBN-13: 9781784392574
Related category: Spark

Overseas special-order title (must be checked out separately)


Product Description

Perform real-time analytics using Spark in a fast, distributed, and scalable way

About This Book

Develop a machine learning system with Spark's MLlib and scalable algorithms
Deploy Spark jobs to various clusters such as Mesos, EC2, Chef, YARN, EMR, and so on
This is a step-by-step tutorial that unleashes the power of Spark and its latest features

Who This Book Is For

Fast Data Processing with Spark - Second Edition is for software developers who want to learn how to write distributed programs with Spark. It will help developers facing problems too large to handle on a single computer. No previous experience with distributed programming is necessary. This book assumes knowledge of Java, Scala, or Python.

What You Will Learn

Install and set up Spark on your cluster
Prototype distributed applications with Spark's interactive shell
Learn different ways to interact with Spark's distributed representation of data (RDDs)
Query Spark with a SQL-like query syntax
Effectively test your distributed software
Recognize how Spark works with big data
Implement machine learning systems with highly scalable algorithms

In Detail

Spark is a framework for writing fast, distributed programs. Spark solves problems similar to those Hadoop MapReduce addresses, but with a fast in-memory approach and a clean, functional-style API. With its ability to integrate with Hadoop and its built-in tools for interactive query analysis (Spark SQL), large-scale graph processing and analysis (GraphX), and real-time analysis (Spark Streaming), it can be used interactively to process and query big datasets quickly.
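To give a rough feel for the functional style mentioned above, here is a minimal word-count sketch in plain Python. It is not taken from the book and does not use Spark: the sample `lines` data and the `merge` helper are hypothetical, and the comments only point to the Spark operations (`flatMap`, `map`, `reduceByKey`) that play a similar role on a distributed dataset.

```python
from functools import reduce

# Hypothetical sample input; in Spark this would be a distributed dataset.
lines = ["spark is fast", "spark is distributed"]

# Split every line into words (similar in spirit to flatMap).
words = [word for line in lines for word in line.split()]

# Pair each word with a count of 1 (similar in spirit to map).
pairs = map(lambda word: (word, 1), words)

def merge(acc, pair):
    """Fold one (word, count) pair into the running totals."""
    word, n = pair
    acc[word] = acc.get(word, 0) + n
    return acc

# Sum the counts per word (similar in spirit to reduceByKey).
counts = reduce(merge, pairs, {})
print(counts)
```

Everything here runs on a single machine and in a single process; the point is only the shape of the pipeline, a chain of small transformations over a collection.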

Fast Data Processing with Spark - Second Edition covers how to write distributed programs with Spark. The book guides you through every step required to write effective distributed programs, from setting up your cluster and interactively exploring the API to developing analytics applications and tuning them for your purposes.
