-
Publisher:
Packt Publishing
-
Publication date:
2010-11-12
-
Price:
$1,690
-
VIP price:
5% off
$1,606
-
Language:
English
-
Pages:
272
-
Binding:
Paperback
-
ISBN:
1849513600
-
ISBN-13:
9781849513609
-
Related categories:
Python, Programming Languages
Product description
Use Python's NLTK suite of libraries to maximize your Natural Language Processing capabilities.
* Quickly get to grips with Natural Language Processing, including Text Analysis, Text Mining, and beyond
* Learn how machines and crawlers interpret and process natural languages
* Easily work with huge amounts of data and learn how to handle distributed processing
* Part of Packt's Cookbook series: each recipe is a carefully organized sequence of instructions to complete the task as efficiently as possible
In Detail
Natural Language Processing is used everywhere: in search engines, spell checkers, mobile phones, computer games, even your washing machine. Python's Natural Language Toolkit (NLTK) suite of libraries has rapidly emerged as one of the most efficient tools for Natural Language Processing. You want to employ nothing less than the best techniques in Natural Language Processing, and this book is your answer.
Python Text Processing with NLTK 2.0 Cookbook is your handy and illustrative guide, which will walk you through all the Natural Language Processing techniques in a step-by-step manner. It will demystify the advanced features of text analysis and text mining using the comprehensive NLTK suite.
This book cuts the preamble short so that you can dive right into the science of text processing with a practical, hands-on approach.
Start by learning tokenization of text. Get an overview of WordNet and how to use it. Learn the basics as well as advanced features of Stemming and Lemmatization. Discover various ways to replace words with simpler and more common (read: more searched) variants. Create your own corpora and learn to create custom corpus readers for JSON files as well as for data stored in MongoDB. Use and manipulate POS taggers. Transform and normalize parsed chunks to produce a canonical form without changing their meaning. Dig into feature extraction and text classification. Learn how to easily handle huge amounts of data without any loss in efficiency or speed.
This book will teach you all that and beyond, in a hands-on, learn-by-doing manner. Make yourself an expert in using the NLTK for Natural Language Processing with this handy companion.
What you will learn from this book
* Learn Text categorization and Topic identification
* Learn Stemming and Lemmatization and how to go beyond the usual spell checker
* Replace negations with antonyms in your text
* Learn to tokenize text into lists of sentences and words, and gain an insight into WordNet
* Transform and manipulate chunks and trees
* Learn advanced features of corpus readers and create your own custom corpora
* Tag different parts of speech by creating, training, and using a part-of-speech tagger
* Improve accuracy by combining multiple part-of-speech taggers
* Learn how to do partial parsing to extract small chunks of text from a part-of-speech tagged sentence
* Produce an alternative canonical form without changing the meaning by normalizing parsed chunks
* Learn how search engines use Natural Language Processing to process text
* Make your site more discoverable by learning how to automatically replace words with more searched equivalents
* Parse dates, times, and HTML
* Train and manipulate different types of classifiers
Approach
The learn-by-doing approach of this book will enable you to dive right into the heart of text processing from the very first page. Each recipe is carefully designed to fulfill your appetite for Natural Language Processing. Packed with numerous illustrative examples and code samples, it will make the task of using the NLTK for Natural Language Processing easy and straightforward.
Who this book is written for
This book is for Python programmers who want to quickly get to grips with using the NLTK for Natural Language Processing. Familiarity with basic text processing concepts is required. Programmers experienced in the NLTK will also find it useful. Students of linguistics will find it invaluable.
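To give a flavor of one topic the description lists, replacing negations with antonyms, here is a minimal sketch in plain Python. It is not taken from the book: the `ANTONYMS` table and the `replace_negations` function are hypothetical names invented for this illustration, and the hand-written table stands in for a lexical resource such as WordNet.

```python
# Toy antonym table: a hypothetical stand-in for a lexical database such as WordNet.
ANTONYMS = {"happy": "unhappy", "uglify": "beautify", "good": "bad"}


def replace_negations(words, antonyms=ANTONYMS):
    """Collapse each 'not X' pair into the antonym of X when one is known."""
    result = []
    i = 0
    while i < len(words):
        word = words[i]
        nxt = words[i + 1] if i + 1 < len(words) else None
        if word == "not" and nxt in antonyms:
            result.append(antonyms[nxt])
            i += 2  # skip both 'not' and the word it negates
        else:
            result.append(word)  # keep the word unchanged
            i += 1
    return result


print(replace_negations("let's not uglify our code".split()))
```

Words with no known antonym are left untouched, so "not" survives wherever the table has no entry for the word that follows it.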