Mastering Voice Interfaces: Creating Great Voice Apps for Real Users
Thymé-Gobbel, Ann, Jankowski, Charles
- Publisher: Apress
- Publication date: 2021-05-30
- Price: $2,550
- VIP price: 5% off, $2,423
- Language: English
- Pages: 693
- Binding: Quality Paper (also called trade paper)
- ISBN: 1484270045
- ISBN-13: 9781484270042
Overseas special-order title (requires separate checkout)
Product Description
Build great voice apps of any complexity for any domain by learning both the how's and why's of voice development. In this book you'll see how we live in a golden age of voice technology and how advances in automatic speech recognition (ASR), natural language processing (NLP), and related technologies allow people to talk to machines and get reasonable responses. Today, anyone with computer access can build a working voice app. That democratization of the technology is great. But, while it's fairly easy to build a voice app that runs, it's still remarkably difficult to build a great one, one that users trust, that understands their natural ways of speaking and fulfills their needs, and that makes them want to return for more.
We start with an overview of how humans and machines produce and process conversational speech, explaining how they differ from each other and from other modalities. This is the background you need to understand the consequences of each design and implementation choice as we dive into the core principles of voice interface design. We walk you through many design and development techniques, including ones that some view as advanced, but that you can implement today. We use the Google development platform and Python, but our goal is to explain the reasons behind each technique such that you can take what you learn and implement it on any platform.
Readers of Mastering Voice Interfaces will come away with a solid understanding of what makes voice interfaces special, learn the core voice design principles for building great voice apps, and see how to actually implement those principles to create robust apps. We've learned during many years in the voice industry that the most successful solutions are created by those who understand both the human and the technology sides of speech, and that both sides affect design and development. Because we focus on developing task-oriented voice apps for real users in the real world, you'll learn how to take your voice apps from idea through scoping, design, development, rollout, and post-deployment performance improvements, all illustrated with examples from our own voice industry experiences.
What You Will Learn
- Create truly great voice apps that users will love and trust
- See how voice differs from other input and output modalities, and why that matters
- Discover best practices for designing conversational voice-first applications, and the consequences of design and implementation choices
- Implement advanced voice designs, with real-world examples you can use immediately
- Verify that your app is performing well, and know what to change if it isn't
Who This Book Is For
Anyone curious about the real how's and why's of voice interface design and development. In particular, it's aimed at teams of developers, designers, and product owners who need a shared understanding of how to create successful voice interfaces using today's technology. We expect readers to have had some exposure to voice apps, at least as users.
About the Authors
Ann Thymé-Gobbel's career has focused on how people use speech and natural language to communicate with each other and with technology. After completing her PhD in cognitive science and linguistics at UC San Diego, she has held a broad set of voice-related UI/UX design roles in both large corporations and small start-ups, working with diverse teams in product development, client project engagements, and R&D. Her past work includes design, data analysis, and establishing best practices at Nuance, voice design for mobile and in-home devices at Amazon Lab 126, and creating natural language conversations for multimodal healthcare apps at 22otters. Her research has covered automatic language detection, error correction, and discourse structure. She is currently Director of UI/UX Design at Loose Cannon Systems, the team bringing to market Milo, a hands-free wearable communicator. Ann never stops doing research: she collects and analyzes data at every opportunity and enjoys sharing her findings with others, having presented and taught at conferences internationally.
Charles Jankowski has over 30 years' experience in industry and academia developing applications and algorithms for real-world users that incorporate advanced speech recognition, speaker verification, and natural language technologies. He has used state-of-the-art machine learning processes and techniques for data analysis, performance optimization, and algorithm development. Charles has in-depth technical experience with state-of-the-art technologies, effective management of cross-functional teams for all facets of application deployment, and outstanding relationships with clients. Currently, he is Director of NLP at Brain Technologies, creating the Natural iOS application with which you can "Say it and Get it." Previously he was Director of NLP and Robotics at CloudMinds, Director of Speech and Natural Language at 22otters, Senior Speech Scientist at Performance Technology Partners, and Director of Professional Services at Nuance. He has also been an independent consultant. Charles holds S.B., S.M., and Ph.D. degrees from MIT, all in electrical engineering.