Collaborative Annotation for Reliable Natural Language Processing: Technical and Sociological Aspects

Karën Fort

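  • Publisher: Wiley
  • Publication date: 2016-06-03
  • List price: $5,970
  • VIP price: $5,672 (5% off)
  • Language: English
  • Pages: 192
  • Binding: Hardcover
  • ISBN: 1848219040
  • ISBN-13: 9781848219045
  • Overseas special-order title (requires separate checkout)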

Product description

This book presents a unique opportunity for constructing a consistent image of collaborative manual annotation for Natural Language Processing (NLP). NLP has witnessed two major evolutions in the past 25 years: firstly, the extraordinary success of machine learning, which is now, for better or for worse, overwhelmingly dominant in the field, and secondly, the multiplication of evaluation campaigns or shared tasks. Both involve manually annotated corpora, for the training and evaluation of the systems.

These corpora have progressively become the hidden pillars of our domain, providing food for our hungry machine learning algorithms and reference for evaluation. Annotation is now the place where linguistics hides in NLP. However, manual annotation has largely been ignored for some time, and it has taken a while even for annotation guidelines to be recognized as essential.

Although some efforts have been made lately to address some of the issues presented by manual annotation, little research has been done on the subject. This book aims to provide some useful insights into it.

Manual corpus annotation is now at the heart of NLP, yet it is still largely unexplored. There is a need for manual annotation engineering (in the sense of a precisely formalized process), and this book aims to provide a first step towards a holistic methodology, with a global view on annotation.

 
