Data Algorithms: Recipes for Scaling Up with Hadoop and Spark (Paperback)
Provisional Chinese title: 數據演算法:使用 Hadoop 和 Spark 擴展的配方 (Paperback)
Mahmoud Parsian
- Publisher: O'Reilly
- Publication date: 2015-08-11
- List price: $2,300
- Sale price: 5% off, $2,185
- Language: English
- Pages: 778
- Binding: Paperback
- ISBN: 1491906189
- ISBN-13: 9781491906187
- Related categories: Hadoop, Spark, Algorithms-data-structures
- Related translation: 數據算法:Hadoop/Spark大數據處理技巧 (Simplified Chinese edition)
Ships immediately
Product description
Learn the algorithms and tools you need to build MapReduce applications with Hadoop and Spark for processing gigabyte-, terabyte-, or petabyte-sized datasets on clusters of commodity hardware. With this practical book, author Mahmoud Parsian, head of the big data team at Illumina, takes you step by step through the design of machine-learning algorithms, such as Naive Bayes and Markov Chain, and shows you how to apply them to clinical and biological datasets using MapReduce design patterns.
- Apply MapReduce algorithms to clinical and biological data, such as DNA-Seq and RNA-Seq
- Use the most relevant regression/analytical algorithms for different biological data types
- Apply t-test, joins, top-10, and correlation algorithms using MapReduce/Hadoop and Spark