High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark (Paperback)

Name: High Performance Spark: Best Practices for Scaling and Optimizing Apache Spark (Paperback)
Price: 1575 TWD
Availability: InStock
Author: Holden Karau, Rachel Warren
ISBN: 1491943203

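Publisher: O'Reilly
Publication date: 2017-07-11
List price: $1,750
Sale price: 10% off, $1,575
Language: English
Pages: 358
Binding: Paperback
ISBN: 1491943203
ISBN-13: 9781491943205
Related categories: Spark
Related translation: 高性能Spark (Simplified Chinese edition)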

Ships immediately (stock = 1)

Customers who bought this item also bought...

RESTful Web Services Cookbook: Solutions for Improving Scalability and Simplicity (Paperback)
$1,330

Scala in Depth (Paperback)
~~$1,650~~ $1,568

REST API Design Rulebook (Paperback)
~~$1,230~~ $1,169

無瑕的程式碼－敏捷軟體開發技巧守則 + 番外篇－專業程式設計師的生存之道 (雙書合購)
~~$940~~ $700

Spark: Big Data Cluster Computing in Production (Paperback)
$1,568

Spark 學習手冊 (Learning Spark: Lightning-Fast Big Data Analysis)
~~$520~~ $411

Large Scale Machine Learning with Spark
~~$2,050~~ $1,948

Spark in Action
$990

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems (Paperback)
$2,090

Scala for the Impatient, 2/e
$948

Learning Concurrent Programming in Scala - Second Edition
$882

無瑕的程式碼－敏捷完整篇－物件導向原則、設計模式與 C# 實踐 (Agile principles, patterns, and practices in C#)
~~$790~~ $672

無瑕的程式碼－整潔的軟體設計與架構篇 (Clean Architecture: A Craftsman's Guide to Software Structure and Design)
~~$580~~ $452

程序員面試筆試寶典(第3版)
$352

Spark 全棧數據分析
$505

Spark 技術手冊｜輕鬆寫意處理大數據 (Spark: The Definitive Guide｜Big Data Processing Made Simple)
~~$880~~ $695

Mining of Massive Datasets, 3/e (Hardcover)
~~$1,650~~ $1,617

Data Pipelines with Apache Airflow (Paperback)
$1,710

資料科學的建模基礎 : 別急著 coding！你知道模型的陷阱嗎？
~~$599~~ $473

實戰大數據 (Hadoop + Spark + Flink) 從平臺構建到交互式數據分析 (離線/實時)
$505

Learning Domain-Driven Design: Aligning Software Architecture and Business Strategy (Paperback)
$2,185

大數據技術入門 — Hadoop + Spark
$254

圖解 Spark 大數據快速分析實戰
$560

Scala 編程, 5/e (Programming in Scala 5/e)
~~$1,128~~ $1,072

軟體架構：困難部分 (Software Architecture: The Hard Parts: Modern Trade-Off Analyses for Distributed Architectures)
~~$780~~ $616

Product description

Apache Spark is amazing when everything clicks. But if you haven’t seen the performance improvements you expected, or still don’t feel confident enough to use Spark in production, this practical book is for you. Authors Holden Karau and Rachel Warren demonstrate performance optimizations to help your Spark queries run faster and handle larger data sizes, while using fewer resources.

Ideal for software engineers, data engineers, developers, and system administrators working with large-scale data applications, this book describes techniques that can reduce data infrastructure costs and developer hours. Not only will you gain a more comprehensive understanding of Spark, you’ll also learn how to make it sing.

With this book, you’ll explore:

- How Spark SQL’s new interfaces improve performance over Spark’s RDD data structure
- The choice between data joins in Core Spark and Spark SQL
- Techniques for getting the most out of standard RDD transformations
- How to work around performance issues in Spark’s key/value pair paradigm
- Writing high-performance Spark code without Scala or the JVM
- How to test for functionality and performance when applying suggested improvements
- Using Spark MLlib and Spark ML machine learning libraries
- Spark’s Streaming components and external community packages
