Developing Enterprise Chatbots: Learning Linguistic Structures
Galitsky, Boris
Product Description
A chatbot is expected to be capable of supporting a cohesive and coherent conversation and be knowledgeable, which makes it one of the most complex intelligent systems being designed nowadays. Designers have to learn to combine intuitive, explainable language understanding and reasoning approaches with high-performance statistical and deep learning technologies.
Today, there are two popular paradigms for chatbot construction:
1. Build a bot platform with universal NLP and ML capabilities so that a bot developer for a particular enterprise, not being an expert, can populate it with training data;
2. Accumulate a huge set of training dialogue data, feed it to a deep learning network and expect the trained chatbot to automatically learn "how to chat".
Although these two approaches are reported to imitate some intelligent dialogues, both of them are unsuitable for enterprise chatbots, being unreliable and too brittle.
The latter approach rests on the belief that some learning miracle will happen and a chatbot will start functioning without thorough feature and domain engineering by an expert and without interpretable dialogue management algorithms.
Enterprise high-performance chatbots with extensive domain knowledge require a mix of statistical, inductive and deep machine learning and learning from the web; syntactic, semantic and discourse NLP; ontology-based reasoning; and a state machine to control a dialogue. This book will provide a comprehensive source of algorithms and architectures for building chatbots for various domains based on the recent trends in computational linguistics and machine learning. The foci of this book are applications of discourse analysis in text relevance assessment, dialogue management and content generation, which help to overcome the limitations of platform-based and data-driven approaches.
Supplementary material and code are available at https://github.com/bgalitsky/relevance-based-on-parse-trees
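About the Author
Dr. Boris Galitsky has contributed linguistic and machine learning technologies to Silicon Valley startups over the past 25 years, and has worked at eBay and at Oracle, where he is currently an architect of a digital assistant project. The author of two computer science books, with more than 150 publications and more than 15 patents, he is now researching how discourse analysis can improve search relevance and support dialogue management. In his previous book, Dr. Galitsky presented a foundation of autistic reasoning that sheds light on how chatbots should facilitate conversations. Boris is an Apache OpenNLP committer; he created the OpenNLP.Similarity component, which serves as a basis for chatbot development.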