Python for Natural Language Processing: Programming with Numpy, Scikit-Learn, Keras, and Pytorch

Nugues, Pierre M.

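  • Publisher: Springer
  • Publication date: 2024-07-10
  • Price: $2,750
  • VIP price: $2,613 (5% off)
  • Language: English
  • Pages: 520
  • Binding: Hardcover - also called cloth, retail trade, or trade
  • ISBN: 3031575482
  • ISBN-13: 9783031575488
  • Related categories: Deep Learning, Python programming language
  • Overseas special-order book (requires separate checkout)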

Product description

Since the last edition of this book (2014), progress has been astonishing in all areas of Natural Language Processing, with recent achievements in Text Generation that have spurred media interest well beyond traditional academic circles. Text Processing has meanwhile become a mainstream industrial tool that is used, to various extents, by countless companies. A revision of this book was therefore deemed necessary to catch up with the recent breakthroughs, and the author discusses models and architectures that have been instrumental in the recent progress of Natural Language Processing.

As in the first two editions, the intention is to expose the reader to the theories used in Natural Language Processing, and to programming examples that are essential for a deep understanding of the concepts. Although present in the previous two editions, Machine Learning is now even more pervasive, having replaced many of the earlier techniques to process text. Many new techniques build on the availability of text.

Using Python notebooks, the reader will be able to load small corpora, format text, apply the models through executing pieces of code, gradually discover the theoretical parts by possibly modifying the code or the parameters, and traverse theories and concrete problems through a constant interaction between the user and the machine. The data sizes and hardware requirements are kept to a reasonable minimum so that a user can see instantly, or at least quickly, the results of most experiments on most machines.

The book does not assume a deep knowledge of Python. Ch. 2 gives an introduction to the language aimed at Text Processing, which will enable the reader to touch on all the programming concepts. These include NumPy arrays and PyTorch tensors as fundamental structures to represent and process numerical data in Python, as well as Keras for training Neural Networks to classify texts. Covering topics like Word Segmentation and Part-of-Speech and Sequence Annotation, the textbook also gives an in-depth overview of Transformers (for instance, BERT), Self-Attention, and Sequence-to-Sequence Architectures.
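
As a rough illustration of what representing text as numerical data in a NumPy array can look like, here is a minimal sketch. It is not taken from the book: the toy corpus, the whitespace tokenization, and the names `corpus`, `vocab`, `index`, and `counts` are all assumptions made for this example.

```python
import numpy as np

# Hypothetical toy corpus, not taken from the book.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

# Naive whitespace tokenization; real word segmentation is more involved.
tokenized = [doc.split() for doc in corpus]

# Sorted vocabulary so that the column order is deterministic.
vocab = sorted({tok for doc in tokenized for tok in doc})
index = {tok: i for i, tok in enumerate(vocab)}

# Document-term count matrix: one row per document, one column per word.
counts = np.zeros((len(corpus), len(vocab)), dtype=np.int64)
for row, doc in enumerate(tokenized):
    for tok in doc:
        counts[row, index[tok]] += 1

print(vocab)
print(counts)
```

Each row of `counts` is a bag-of-words vector for one document. An array of this kind could in principle be passed to a classifier, although the book's own examples may be organized quite differently.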

About the author

Pierre Nugues is a professor in the Dept. of Computer Science of Lund University. His research is focused on natural language processing for advanced user interfaces and spoken dialogue. This includes the design and the implementation of conversational agents within a multimodal framework and text visualization. He led the team that designed a navigation agent - Ulysse - that enables a user to navigate in a virtual reality environment using language, and the team that designed the CarSim system that generates animated 3D scenes from written texts. He has taught natural-language processing and computational linguistics at the following institutions: ISMRA, Caen, France; University of Nottingham, UK; Staffordshire University, UK; FH Konstanz, Germany; Lund University, Sweden and Ghent University, Belgium.
