Machine Learning for Audio, Image and Video Analysis: Theory and Applications (Advanced Information and Knowledge Processing)
Provisional Chinese title: 音頻、影像與視頻分析的機器學習:理論與應用(高級資訊與知識處理)
Francesco Camastra
- Publisher: Springer
- Publication date: 2016-10-23
- Price: $2,660
- VIP price: 5% off, $2,527
- Language: English
- Pages: 580
- Binding: Paperback
- ISBN: 1447168402
- ISBN-13: 9781447168409
- Related categories:
Machine Learning
Overseas special-order book (requires separate checkout)
Product description
This second edition focuses on audio, image and video data, the three main types of input that machines deal with when interacting with the real world. A set of appendices provides the reader with self-contained introductions to the mathematical background necessary to read the book.
The book is divided into three main parts. The first, "From Perception to Computation", introduces methodologies for representing data in forms suitable for computer processing, with particular attention to audio and images. The second, "Machine Learning", gives an extensive overview of statistical techniques that address three main problems: classification (automatically assigning a data sample to one of the classes in a predefined set), clustering (automatically grouping data samples according to the similarity of their properties) and sequence analysis (automatically mapping a sequence of observations into a sequence of human-understandable symbols). The third, "Applications", shows how the abstract problems defined in the second part underlie technologies capable of performing complex tasks such as the recognition of hand gestures or the transcription of handwritten data.
"Machine Learning for Audio, Image and Video Analysis" is suitable both for students who want to acquire a solid background in machine learning and for practitioners who want to deepen their knowledge of the state of the art. All application chapters are based on publicly available data and free software packages, thus allowing readers to replicate the experiments.