Mining of Massive Datasets, 3/e (Hardcover)
Tentative Chinese title: 大規模數據集挖掘(第三版)
Jure Leskovec, Anand Rajaraman, Jeffrey David Ullman
- Publisher: Cambridge
- Publication date: 2020-02-13
- List price: $1,650
- Member price: 2% off, $1,617
- Language: English
- Pages: 565
- Binding: Hardcover - also called cloth, retail trade, or trade
- ISBN: 1108476341
- ISBN-13: 9781108476348
- Related categories: Big Data, Deep Learning, Web Crawlers
- Related translations: 斯坦福數據挖掘教程, 3/e (Mining of Massive Datasets, 3/e) (Simplified Chinese edition)
Ships immediately (stock: 1)
Product description
Written by leading authorities in database and Web technologies, this book is essential reading for students and practitioners alike. The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. This book focuses on practical algorithms that have been used to solve key problems in data mining and can be applied successfully to even the largest datasets. It begins with a discussion of the MapReduce framework, an important tool for parallelizing algorithms automatically. The authors explain the tricks of locality-sensitive hashing and stream-processing algorithms for mining data that arrives too fast for exhaustive processing. Other chapters cover the PageRank idea and related tricks for organizing the Web, the problems of finding frequent itemsets, and clustering. This third edition includes new and extended coverage of decision trees, deep learning, and mining social-network graphs.