Text Analysis Pipelines: Towards Ad-hoc Large-Scale Text Mining (Lecture Notes in Computer Science)

Henning Wachsmuth

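  • Publisher: Springer
  • Publication date: 2015-12-04
  • List price: $2,420
  • VIP price: $2,299 (5% off)
  • Language: English
  • Pages: 302
  • Binding: Paperback
  • ISBN: 3319257404
  • ISBN-13: 9783319257402
  • Related categories: Text-mining, Computer-Science
  • Overseas purchase-on-demand title (requires separate checkout)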

Product description

This monograph proposes a comprehensive and fully automatic approach to designing text analysis pipelines for arbitrary information needs that are optimal in terms of run-time efficiency and that robustly mine relevant information from text of any kind. Based on state-of-the-art techniques from machine learning and other areas of artificial intelligence, novel pipeline construction and execution algorithms are developed and implemented in prototypical software. Formal analyses of the algorithms and extensive empirical experiments underline that the proposed approach represents an essential step towards the ad-hoc use of text mining in web search and big data analytics.
Both web search and big data analytics aim to fulfill people's needs for information in an ad-hoc manner. The information sought is often hidden in large amounts of natural language text. Instead of simply returning links to potentially relevant texts, leading search and analytics engines have started to directly mine relevant information from the texts. To this end, they execute text analysis pipelines that may consist of several complex information-extraction and text-classification stages. Due to practical requirements of efficiency and robustness, however, the use of text mining has so far been limited to anticipated information needs that can be fulfilled with rather simple, manually constructed pipelines.


