Data Algorithms: Recipes for Scaling Up with Hadoop and Spark (Paperback)

Mahmoud Parsian

Product description

Learn the algorithms and tools you need to build MapReduce applications with Hadoop and Spark for processing gigabyte-, terabyte-, or petabyte-sized datasets on clusters of commodity hardware. With this practical book, author Mahmoud Parsian, head of the big data team at Illumina, takes you step by step through the design of machine-learning algorithms, such as Naive Bayes and Markov chains, and shows you how to apply them to clinical and biological datasets using MapReduce design patterns.

  • Apply MapReduce algorithms to clinical and biological data, such as DNA-Seq and RNA-Seq
  • Use the most relevant regression and analytical algorithms for different biological data types
  • Apply t-test, joins, top-10, and correlation algorithms using MapReduce/Hadoop and Spark
