Machine Learning for Text
Tentative Chinese title: 文本的機器學習

Aggarwal, Charu C.

  • Publisher: Springer
  • Publication date: 2019-02-01
  • List price: $2,320
  • VIP price: $2,204 (5% off)
  • Language: English
  • Pages: 493
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 3030088073
  • ISBN-13: 9783030088071
  • Related categories: Machine Learning
  • Out of print

Product Description

Text analytics is a field that lies at the interface of information retrieval, machine learning, and natural language processing, and this textbook carefully covers a coherently organized framework drawn from these intersecting topics. The chapters of this textbook are organized into three categories:

- Basic algorithms: Chapters 1 through 7 discuss the classical algorithms for machine learning from text such as preprocessing, similarity computation, topic modeling, matrix factorization, clustering, classification, regression, and ensemble analysis.

- Domain-sensitive mining: Chapters 8 and 9 discuss the learning methods from text when combined with different domains such as multimedia and the Web. The problem of information retrieval and Web search is also discussed in the context of its relationship with ranking and machine learning methods.

- Sequence-centric mining: Chapters 10 through 14 discuss various sequence-centric and natural language applications, such as feature engineering, neural language models, deep learning, text summarization, information extraction, opinion mining, text segmentation, and event detection.

This textbook covers machine learning topics for text in detail. Since the coverage is extensive, multiple courses can be offered from the same book, depending on course level. Even though the presentation is text-centric, Chapters 3 to 7 cover machine learning algorithms that are often used in domains beyond text data. Therefore, the book can be used to offer courses not just in text analytics but also from the broader perspective of machine learning (with text as a backdrop).

This textbook targets graduate students in computer science, as well as researchers, professors, and industrial practitioners working in these related fields. This textbook is accompanied by a solution manual for classroom teaching.

About the Author

Charu C. Aggarwal is a Distinguished Research Staff Member (DRSM) at the IBM T. J. Watson Research Center in Yorktown Heights, New York. He completed his undergraduate degree in Computer Science from the Indian Institute of Technology at Kanpur in 1993 and his Ph.D. from the Massachusetts Institute of Technology in 1996. He has worked extensively in the field of data mining. He has published more than 350 papers in refereed conferences and journals and authored over 80 patents. He is the author or editor of 17 books, including textbooks on data mining, recommender systems, and outlier analysis.

Because of the commercial value of his patents, he has thrice been designated a Master Inventor at IBM. He is a recipient of an IBM Corporate Award (2003) for his work on bio-terrorist threat detection in data streams, a recipient of the IBM Outstanding Innovation Award (2008) for his scientific contributions to privacy technology, and a recipient of two IBM Outstanding Technical Achievement Awards (2009, 2015) for his work on data streams/high-dimensional data. He received the EDBT 2014 Test of Time Award for his work on condensation-based privacy-preserving data mining. He is also a recipient of the IEEE ICDM Research Contributions Award (2015), which is one of the two highest awards for influential research contributions in the field of data mining.

He has served as the general co-chair of the IEEE Big Data Conference (2014) and as the program co-chair of the ACM CIKM Conference (2015), the IEEE ICDM Conference (2015), and the ACM KDD Conference (2016). He served as an associate editor of the IEEE Transactions on Knowledge and Data Engineering from 2004 to 2008. He is an associate editor of the IEEE Transactions on Big Data, an action editor of the Data Mining and Knowledge Discovery Journal, and an associate editor of the Knowledge and Information Systems Journal. He has served as editor-in-chief of the ACM SIGKDD Explorations (2014-2017) and is currently an editor-in-chief of the ACM Transactions on Knowledge Discovery from Data.

He serves on the advisory board of the Lecture Notes on Social Networks, a publication by Springer. He has served as the vice-president of the SIAM Activity Group on Data Mining and is a member of the SIAM industry committee. He is a fellow of the SIAM, ACM, and the IEEE, for "contributions to knowledge discovery and data mining algorithms."
