Information-Driven Machine Learning: Data Science as an Engineering Discipline

Friedland, Gerald

  • Publisher: Springer
  • Publication date: 2023-12-02
  • List price: $3,380
  • VIP price: 5% off, $3,211
  • Language: English
  • Pages: 267
  • Binding: Hardcover - also called cloth, retail trade, or trade
  • ISBN: 3031394763
  • ISBN-13: 9783031394768
  • Related categories: Machine Learning, Data Science
  • Overseas special-order title (requires separate checkout)

Product description

This groundbreaking book transcends traditional machine learning approaches by introducing information measurement methodologies that revolutionize the field.

Stemming from a UC Berkeley seminar on experimental design for machine learning tasks, these techniques aim to overcome the 'black box' approach of machine learning by reducing conjectures such as magic numbers (hyper-parameters) or model-type bias. Information-based machine learning enables data quality measurements, a priori task complexity estimations, and reproducible design of data science experiments. The benefits include significant size reduction, increased explainability, and enhanced resilience of models, all contributing to advancing the discipline's robustness and credibility.

While bridging the gap between machine learning and disciplines such as physics, information theory, and computer engineering, this textbook maintains an accessible and comprehensive style, making complex topics digestible for a broad readership. Information-Driven Machine Learning explores the synergistic harmony among these disciplines to enhance our understanding of data science modeling. Instead of solely focusing on the "how," this text provides answers to the "why" questions that permeate the field, shedding light on the underlying principles of machine learning processes and their practical implications. By advocating for systematic methodologies grounded in fundamental principles, this book challenges industry practices that have often evolved from ideological or profit-driven motivations. It addresses a range of topics, including deep learning, data drift, and MLOps, using fundamental principles such as entropy, capacity, and high dimensionality.

Ideal for both academia and industry professionals, this textbook serves as a valuable tool for those seeking to deepen their understanding of data science as an engineering discipline. Its thought-provoking content stimulates intellectual curiosity and caters to readers who desire more than just code or ready-made formulas. The text invites readers to explore beyond conventional viewpoints, offering an alternative perspective that promotes a big-picture view for integrating theory with practice. Suitable for upper undergraduate or graduate-level courses, this book can also benefit practicing engineers and scientists in various disciplines by enhancing their understanding of modeling and improving data measurement effectively.


About the author

Gerald Friedland: Listed in the AI2000 Most Influential Scholar list as one of the top-cited research scholars in AI in the last decade, Friedland's contributions to the field of machine learning have been both substantial and enduring since he started working in the field in 2001. His Simple Interactive Object Extraction algorithm has been part of open source image editing and creation tools since 2005 and his cloud-less MOVI Speech Recognition board has been used by makers since 2015. Currently, he is adjunct faculty at the University of California, Berkeley, a Faculty Fellow of the Berkeley Institute of Data Science, and a Principal Scientist in the Sagemaker team at Amazon AWS.

After earning his Ph.D. from Freie Universität Berlin in 2006, Gerald led a team of researchers in speech and multimedia content analysis as the Director of Audio and Multimedia research at the International Computer Science Institute in Berkeley. He then held the role of Principal Data Scientist at Lawrence Livermore National Lab from 2016 to 2019. That year, he co-founded Brainome, Inc., where he harnessed his technical expertise to develop an automatic machine learning tool rooted in the information measurement techniques central to this book. His journey then took him to Amazon AWS in 2022 as a Principal Scientist, AutoML.

Beyond his industry and academic roles, Gerald is a seasoned author. His literature contributions span from the textbooks Multimedia Computing (Cambridge University Press) and Multimodal Location Estimation of Videos and Images (Springer) to a programming book for young children published by Apress.

