Natural Language Processing Projects: Build Next-Generation NLP Applications Using AI Techniques

Akshay Kulkarni, Adarsha Shivananda, Anoosh Kulkarni

  • Publisher: Apress
  • Publication date: 2021-12-04
  • Price: $1,800
  • VIP price: $1,710 (5% off)
  • Language: English
  • Pages: 336
  • Binding: Quality paper (also called trade paper)
  • ISBN: 1484273850
  • ISBN-13: 9781484273852
  • Related categories: Artificial Intelligence, Text-mining
  • Ships immediately (stock: 1)

Product Description

Chapter 1: Natural Language Processing & Artificial Intelligence Overview
Chapter Goal: This introductory chapter provides a quick refresher on the topics covered in this book. Since the book teaches projects in a specific area of technology, we briefly introduce the key concepts these projects require. We will not work on a specific project; instead, we discuss some important concepts without going into detail. Each topic is covered in depth in its own chapter.
No. of pages: 25
Sub-Topics:
1. Artificial intelligence paradigm
2. NLP and AI life cycle
3. NLP concepts (TF-IDF, word embeddings, and more)
4. Machine learning concepts (supervised learning, classification, unsupervised learning)
5. Deep learning concepts (CNN, RNN, LSTM)

Chapter 2: Product360 - Sentiment, Emotion & Trend Capturing System
Chapter Goal: Sentiment analysis finds the polarity of a sentence and labels it as positive, negative, or neutral. Emotion detection identifies emotions (sadness, anger, happiness, etc.) in sentences. Data extracted from social media such as Twitter and Facebook and from e-commerce websites, then processed and analyzed with different NLP techniques, provides a 360-degree view of a product, which enables better decision making. This chapter introduces sentiment analysis and the various techniques that can be used to analyze text. We will apply sentiment, emotion, and trend analysis to review data from any e-commerce website such as Amazon, Zomato, and IMDb, which contain millions of customer reviews and star ratings. For this task, we will use Python libraries such as VADER and TextBlob.
No. of pages: 30
Sub-Topics:
1. Text mining and various available libraries
2. Data preprocessing
3. Data cleaning tricks, optimized feature engineering
4. EDA
5. Sentiment analysis
6. Emotion and trend analysis

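As a rough illustration of the polarity labeling described in Chapter 2, the sketch below scores a sentence against a tiny word list and maps the score to positive, negative, or neutral. It is not the book's code and not how VADER or TextBlob work internally. The lexicon, the negation rule, the threshold, and the names polarity_score and label are all assumptions made up for this example.

```python
import re

# Hypothetical mini-lexicon for illustration only; real sentiment tools
# ship much larger, empirically validated word lists.
LEXICON = {
    "good": 1.0, "great": 2.0, "love": 2.0, "excellent": 2.0,
    "bad": -1.0, "poor": -1.0, "terrible": -2.0, "hate": -2.0,
}
NEGATIONS = {"not", "no", "never"}


def polarity_score(text: str) -> float:
    """Sum lexicon weights, flipping the sign of a word that follows a negation."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0.0
    for i, token in enumerate(tokens):
        weight = LEXICON.get(token, 0.0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            weight = -weight
        score += weight
    return score


def label(text: str, threshold: float = 0.5) -> str:
    """Map a polarity score to one of three labels (threshold is an assumption)."""
    score = polarity_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


reviews = [
    "Great phone, I love the camera",
    "Terrible battery and poor support",
    "The package arrived on Tuesday",
    "Not good at all",
]
for review in reviews:
    print(f"{label(review):8s} | {review}")
```

A fixed word list like this misses sarcasm, intensifiers, and domain vocabulary, which is one reason to reach for a maintained library in practice.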
Chapter 3: TED Talks Segmentation & Topics Extraction Using Machine Learning
Chapter Goal: Document clustering is an unsupervised learning process for grouping documents. For example, grouping a large number of e-books to build a structure around them saves time when searching for a book. Grouping articles and clustering products are other examples. Once we identify the clusters, it is important to understand their properties. Topic modeling is therefore performed to extract topics from a set of documents and articles, to understand their content through keywords, and to tag the articles or documents with those topics. In this chapter, we will see how to group TED talks based on their descriptions using clustering techniques such as K-Means and hierarchical clustering. Then we will perform topic modeling using Latent Dirichlet Allocation (LDA) to understand what defines each cluster. Important libraries for this problem include Gensim, NLTK, Scikit-learn, and word2vec. We will use over 100k articles from different American publications.
No. of pages: 30
Sub-Topics:
1. Data understanding and pre-processing
2. Computing TF-IDF
3. K-Means and hierarchical clustering
4. Evaluation and visualization
5. Topic modeling using Latent Dirichlet Allocation

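The TF-IDF and K-Means steps listed for Chapter 3 can be sketched in plain Python as follows. This is a minimal illustration and not the book's implementation, which relies on libraries such as Scikit-learn. The four sample descriptions are invented, the smoothed IDF formula is one common variant, and the init_indices parameter is a hypothetical convenience that fixes the starting centroids so the run is deterministic (library implementations typically use random or k-means++ initialization).

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return (vocabulary, list of L2-normalised TF-IDF vectors)."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF (one common variant): ln((1 + N) / (1 + df)) + 1
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [counts[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def kmeans(vectors, init_indices, n_iter=10):
    """Plain Lloyd's algorithm; centroids start at the given document indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    assignments = [0] * len(vectors)
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            dists = [sum((a - b) ** 2 for a, b in zip(v, c)) for c in centroids]
            assignments[i] = dists.index(min(dists))
        # Update step: each centroid becomes the mean of its members.
        for k in range(len(centroids)):
            members = [v for v, a in zip(vectors, assignments) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Invented sample descriptions, for illustration only.
talks = [
    "climate change and renewable energy policy",
    "solar energy and climate solutions",
    "the neuroscience of memory and sleep",
    "how sleep shapes memory in the brain",
]
vocab, vectors = tfidf_vectors(talks)
labels = kmeans(vectors, init_indices=[0, 2])
print(labels)  # → [0, 0, 1, 1]
```

The two energy-related descriptions land in one cluster and the two memory-related ones in the other, because documents that share rarer terms end up closer together in TF-IDF space.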
Chapter 4: Enhancing E-commerce Through Advanced Search Engine and Recommendation System
Chapter Goal: An information retrieval system searches product descriptions based on a search query text and returns the results. Search engines are the most common and best use case of information retrieval models. The concept of information retrieval started from a string or word comp
