Individual and Collective Graph Mining: Principles, Algorithms, and Applications (Synthesis Lectures on Data Mining and Knowledge Discovery)
Danai Koutra, Christos Faloutsos
- Publisher: Morgan & Claypool
- Publication date: 2017-10-26
- Price: $3,530
- VIP price: 5% off, $3,354
- Language: English
- Pages: 206
- Binding: Hardcover
- ISBN: 1681732475
- ISBN-13: 9781681732473
- Related categories: Algorithms-data-structures, Data-mining
Overseas special-order title (requires separate checkout)
Product description
Graphs naturally represent information ranging from links between web pages, to communication in email networks, to connections between neurons in our brains. These graphs often span billions of nodes and interactions between them. Within this deluge of interconnected data, how can we find the most important structures and summarize them? How can we efficiently visualize them? How can we detect anomalies that indicate critical events, such as an attack on a computer system, disease formation in the human brain, or the fall of a company?
This book presents scalable, principled discovery algorithms that combine globality with locality to make sense of one or more graphs. In addition to fast algorithmic methodologies, we also contribute graph-theoretical ideas and models, and real-world applications in two main areas:
• Individual Graph Mining: We show how to interpretably summarize a single graph by identifying its important graph structures. We complement summarization with inference, which leverages information about a few entities (obtained via summarization or other methods) and the network structure to efficiently and effectively learn information about the unknown entities.
• Collective Graph Mining: We extend the idea of individual-graph summarization to time-evolving graphs, and show how to scalably discover temporal patterns. Apart from summarization, we claim that graph similarity is often the underlying problem in a host of applications where multiple graphs occur (e.g., temporal anomaly detection, discovery of behavioral patterns), and we present principled, scalable algorithms for aligning networks and measuring their similarity.
The methods that we present in this book leverage techniques from diverse areas, such as matrix algebra, graph theory, optimization, information theory, machine learning, finance, and social science, to solve real-world problems. We present applications of our exploration algorithms to massive datasets, including a Web graph of 6.6 billion edges, a Twitter graph of 1.8 billion edges, brain graphs with up to 90 million edges, collaboration and peer-to-peer networks, and browser logs, all spanning millions of users and interactions.