Vectorization
Tentative Chinese title: 向量化

Cui, Edward Dongbo

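  • Publisher: Wiley
  • Publication date: 2024-12-24
  • List price: $4,760
  • VIP price: $4,522 (95% of list price)
  • Language: English
  • Pages: 448
  • Binding: Hardcover - also called cloth, retail trade, or trade
  • ISBN: 1394272944
  • ISBN-13: 9781394272945
  • Overseas special-order title (must be checked out separately)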

Product description

Enables readers to develop foundational and advanced vectorization skills for scalable data science and machine learning and address real-world problems

Offering insights across domains such as computer vision and natural language processing, Vectorization covers the fundamental topics of vectorization, including array and tensor operations, data wrangling, and batch processing. The book shows how these principles lead to successful outcomes in machine learning projects: each chapter includes practical case studies and code implementations using NumPy, TensorFlow, and PyTorch that serve as concrete examples of the theory.

Each chapter contains one or both of two types of content: an introduction to or comparison of specific operations in the numerical libraries (presented as tables), and case studies that apply the concepts introduced to a practical problem (presented as code blocks and figures). Readers can approach the material by reading the text, running the code blocks, or examining the figures.

Written by the developer of the first recommendation system on the Peacock streaming platform, Vectorization explores sample topics including:

  • Basic tensor operations and the art of tensor indexing, elucidating how to access individual tensor elements or subsets of them
  • Vectorization in tensor multiplications and common linear algebraic routines, which form the backbone of many machine learning algorithms
  • Masking and padding, concepts which come into play when handling data of non-uniform sizes, and string processing techniques for natural language processing (NLP)
  • Sparse matrices and their data structures and integral operations, and ragged or jagged tensors and the nuances of processing them
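
As a rough illustration of the masking and padding topic listed above, the sketch below pads three sequences of different lengths into one rectangular NumPy array and uses a boolean mask to average only the real values. It is not taken from the book; the toy data and all variable names are assumptions made here for illustration.

```python
import numpy as np

# Hypothetical toy data: three sequences of different lengths.
sequences = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]

lengths = np.array([len(s) for s in sequences])
max_len = lengths.max()

# Padding: copy each sequence into a zero-filled rectangular array.
padded = np.zeros((len(sequences), max_len))
for row, seq in enumerate(sequences):
    padded[row, :len(seq)] = seq

# Masking: True where a position holds real data, False where it is padding.
# Broadcasting compares a (1, max_len) row of positions with (n, 1) lengths.
mask = np.arange(max_len)[None, :] < lengths[:, None]

# Vectorized per-sequence mean that ignores the padded positions.
masked_mean = (padded * mask).sum(axis=1) / lengths
print(masked_mean)
```

Dividing by `lengths` rather than by the padded width is what keeps the zero padding from skewing each average.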

From the essentials of vectorization to the subtleties of advanced data structures, Vectorization is an ideal one-stop resource for both beginners and experienced practitioners, including researchers, data scientists, statisticians, and other professionals in industry, who seek academic success and career advancement.

About the author

Edward DongBo Cui is a Data Science and Machine Learning Engineering Leader who holds a PhD in Neuroscience from Case Western Reserve University, USA. Edward served as Director of Data Science at NBC Universal, where he built the first recommendation system on the new Peacock streaming platform. Previously, he was Lead Data Scientist at Nielsen Global Media. He specializes in ML engineering, research, and MLOps, applying them to drive data-centric decision-making and enhance product innovation.
