Scalable Big Data Architecture: A Practitioner's Guide to Choosing Relevant Big Data Architecture

Bahaaldine Azarmi

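  • Publisher: Apress
  • Publication date: 2015-12-30
  • List price: $2,050
  • Member price: $1,948 (5% off)
  • Language: English
  • Pages: 160
  • Binding: Paperback
  • ISBN: 1484213270
  • ISBN-13: 9781484213278
  • Related categories: JVM languages, Big Data
  • Overseas special-order title (must be checked out separately)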

Product description

This book highlights the different types of data architecture and illustrates the many possibilities hidden behind the term "Big Data", from the use of NoSQL databases to the deployment of stream analytics architectures, machine learning, and governance.

Scalable Big Data Architecture covers real-world, concrete industry use cases that leverage complex distributed applications, which involve web applications, RESTful APIs, and high-throughput access to large amounts of data stored in highly scalable NoSQL data stores such as Couchbase and Elasticsearch. This book demonstrates how data processing can be done at scale, from the use of NoSQL data stores on their own to their combination with a Big Data distribution.

When the data processing is too complex and involves different processing topologies, such as long-running jobs, stream processing, correlation of multiple data sources, and machine learning, it’s often necessary to delegate the load to Hadoop or Spark and use the NoSQL data store to serve the processed data in real time.

This book shows you how to choose a relevant combination of the big data technologies available within the Hadoop ecosystem. It focuses on long-running job processing, architecture, streaming data patterns, log analysis, and real-time analytics. Every pattern is illustrated with practical examples that use open source projects such as Logstash, Spark, and Kafka.

Traditional data infrastructures are built to digest large amounts of data and to render syntheses and analytics from it. This book helps you understand why you should consider using machine learning algorithms early in the project, before you are overwhelmed by the constraints that come with handling the high throughput of Big Data.

Scalable Big Data Architecture is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a Big Data project and which tools to integrate into that pattern.
