Blueprints for Text Analytics Using Python: Machine Learning-Based Solutions for Common Real World (NLP) Applications (Paperback)

Jens Albrecht, Sidharth Ramachandran, Christian Winkler

  • Publisher: O'Reilly
  • Publication date: 2021-01-12
  • List price: $2,700
  • Sale price: $2,565 (5% off)
  • Language: English
  • Pages: 424
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 149207408X
  • ISBN-13: 9781492074083
  • Related categories: Python, Machine Learning, Text mining
  • Ships immediately (fewer than 3 in stock)

Product description

Turning text into valuable information is essential for many businesses looking to gain a competitive advantage. Natural language processing has improved considerably, and practitioners now have many options when approaching a problem. However, it's not always clear which NLP tools or libraries suit a given business use, or which techniques to apply and in what order.

This practical book provides theoretical background and real-world case studies with detailed code examples to help developers and data scientists obtain insight from text online. Authors Jens Albrecht, Sidharth Ramachandran, and Christian Winkler use blueprints for text-related problems that apply state-of-the-art machine learning methods in Python.

If you have a fundamental understanding of statistics and machine learning along with basic programming experience in Python, you're ready to get started. You'll learn how to:

  • Crawl and clean, then explore and visualize, textual data in different formats
  • Preprocess and vectorize text for machine learning
  • Apply methods for classification, topic analysis, summarization, and knowledge extraction
  • Use semantic word embeddings and deep learning approaches for complex problems
  • Work with Python NLP libraries like spaCy, NLTK, and Gensim in combination with scikit-learn, Pandas, and PyTorch

About the authors

Jens Albrecht is a full-time professor in the Computer Science Department at the Nuremberg Institute of Technology. His work focuses on data management and analytics, particularly for text. He holds a doctorate in computer science. Before he rejoined academia in 2012, he worked in industry for over a decade as a consultant and data architect. He is the author of several articles on Big Data management and analysis.

Sidharth Ramachandran currently leads a team of data scientists at GfK, helping to build data products for the consumer goods industry. He has over 10 years of experience in software engineering and data science across the telecom, banking, and marketing industries. Sidharth also co-founded WACAO, a smart personal assistant on WhatsApp that was featured on TechCrunch. He holds an undergraduate engineering degree from IIT Roorkee and an MBA from IIM Kozhikode. Sidharth is passionate about solving real problems through technology and loves to hack on personal projects in his free time.

Christian Winkler is a data scientist and machine learning architect. He holds a PhD in theoretical physics and has worked with large data volumes and artificial intelligence for 20 years, with a particular focus on scalable systems and intelligent algorithms for mass text processing. He is the founder of datanizing GmbH, a conference speaker, and the author of articles on machine learning and text analytics.
