From Unimodal to Multimodal Machine Learning: An Overview

Skrlj, Blaz

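  • Publisher: Springer
  • Publication date: 2024-05-22
  • List price: $1,940
  • VIP price: $1,843 (5% off)
  • Language: English
  • Pages: 70
  • Binding: Quality Paper - also called trade paper
  • ISBN: 3031570154
  • ISBN-13: 9783031570155
  • Related categories: Machine Learning
  • Overseas special-order book (requires separate checkout)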

Product Description

With the increasing amount of various data types, machine learning methods capable of leveraging diverse sources of information have become highly relevant. Deep learning-based approaches have made significant progress in learning from texts and images in recent years. These methods enable simultaneous learning from different types of representations (embeddings). Substantial advancements have also been made in joint learning from different types of spaces. Additionally, other modalities such as sound, physical signals from the environment, and time series-based data have been recently explored. Multimodal machine learning, which involves processing and learning from data across multiple modalities, has opened up new possibilities in a wide range of applications, including speech recognition, natural language processing, and image recognition.

From Unimodal to Multimodal Machine Learning: An Overview gradually introduces the concept of multimodal machine learning, providing readers with the necessary background to understand this type of learning and its implications. Key methods representative of different modalities are described in more detail, aiming to offer an understanding of the peculiarities of various types of data and how multimodal approaches tend to address them (although not yet in some cases). The book examines the implications of multimodal learning in other domains and presents alternative approaches that offer computationally simpler yet still applicable solutions. The final part of the book focuses on intriguing open research problems, making it useful for practitioners who wish to better understand the limitations of existing methods and explore potential research avenues to overcome them.



About the Author

Blaz Skrlj is a postdoctoral researcher and a research assistant at the Jozef Stefan Institute, where he investigates efficient multimodal machine learning and low-resource machine learning. Blaz completed his PhD in Information and Communication Technologies at the Jozef Stefan International Postgraduate School. His work focused on neuro-symbolic machine learning, automated machine learning (AutoML) and representation learning. He has authored or co-authored more than fifty research publications, mainly on machine learning and its applications in biomedicine and bioinformatics.
