Mastering Text Mining with R (Paperback)
Ashish Kumar, Avinash Paul
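- Publisher: Packt Publishing
- Publication date: 2016-12-30
- List price: $1,840
- Member price: 5% off, $1,748
- Language: English
- Pages: 258
- Binding: Paperback
- ISBN: 178355181X
- ISBN-13: 9781783551811
- Related categories: Text-mining
- Overseas order item (requires separate checkout)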
Product Description
Master text-taming techniques and build effective text-processing applications with R
About This Book
- This book will help you develop an in-depth understanding of the text mining process with lucid implementation in the R language
- After reading this book, you will be able to enhance your skills in building text-mining apps with R
- All the examples in the book use the latest version of R available at the time of publication, keeping the book up to date
Who This Book Is For
If you are an R programmer, analyst, or data scientist who wants to gain experience in performing text data mining and analytics with R, then this book is for you. Exposure to working with statistical methods and language processing would be helpful.
What You Will Learn
- Get acquainted with highly efficient R packages such as openNLP and RWeka to perform various steps in the text mining process
- Access and manipulate data from different sources such as JSON and HTTP
- Process text using regular expressions
- Get to know the different approaches of tagging texts, such as POS tagging, to get started with text analysis
- Explore different dimensionality reduction techniques, such as Principal Component Analysis (PCA), and understand their implementation in R
- Discover the underlying themes or topics that are present in an unstructured collection of documents, using common topic models such as Latent Dirichlet Allocation (LDA)
- Build a baseline sentence-completion application
- Perform entity extraction and named entity recognition using R
- Get an introduction to various approaches in opinion mining and their implementation in R
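To give a flavor of the regular-expression step listed above, here is a minimal sketch in base R. It is not taken from the book: the two sample sentences and the variable names (`docs`, `clean`, `tokens`, `term_freq`) are made up for illustration, and the cleaning rules are deliberately simplistic.

```r
# Illustrative only: not from the book. Base R, no extra packages required.
docs <- c(
  "Text mining with R is FUN!!",
  "R makes text-mining easy; mining text is fun."
)

# Normalize: lower-case, turn every run of non-letter, non-space
# characters into a space, then squeeze repeated spaces and trim the ends.
clean <- tolower(docs)
clean <- gsub("[^a-z ]+", " ", clean)
clean <- trimws(gsub(" +", " ", clean))

# Tokenize on single spaces and count term frequencies across all documents.
tokens <- unlist(strsplit(clean, " ", fixed = TRUE))
term_freq <- sort(table(tokens), decreasing = TRUE)
print(term_freq)
```

A real pipeline would likely also handle stop words, stemming, and non-ASCII text, which this sketch ignores.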
In Detail
Text mining (also called text data mining or text analytics) is the process of extracting useful, high-quality information from text by identifying patterns and trends through machine learning, statistical pattern learning, and related algorithms and methods. R provides an extensive ecosystem for mining text through its many frameworks and packages.
This book will help you develop a thorough understanding of the steps in the text mining process and gain confidence in applying the concepts to build text-data driven products.
Starting with the basic statistical concepts used in text mining, the book will teach you how to access, cleanse, and process text using the R language, and then how to analyze it. It will equip you with the tools and associated knowledge about different tagging, chunking, and entailment approaches and their usage in natural language processing.
Moving on, the book will teach you different dimensionality reduction techniques and their implementation in R, along with topic modeling, text summarization, and extracting hidden themes from documents and collections. Next, it covers pattern recognition in text data using classification mechanisms, entity recognition, and the development of an ontology learning framework. You will learn the concept of an opinion in a text document and be able to apply various techniques to extract sentiment and opinion from it.
By the end of the book, you will develop a practical application from the concepts learned, and will understand how text mining can be leveraged to analyze the vast amount of data available on social media.