Introducing MLOps: How to Scale Machine Learning in the Enterprise

Mark Treveil, Nicolas Omont, Clément Stenac


Product Description

More than half of the analytics and machine learning (ML) models created by organizations today never make it into production. Instead, many of these ML models do nothing more than provide static insights in a slideshow. If they aren't truly operational, these models can't possibly do what you've trained them to do.

This book introduces practical concepts to help data scientists and application engineers operationalize ML models to drive real business change. Through lessons based on numerous projects around the world, six experts in data analytics provide an applied four-step approach--Build, Manage, Deploy and Integrate, and Monitor--for creating ML-infused applications within your organization.

You'll learn how to:

  • Realize the value of data science by reducing friction throughout ML pipelines and workflows
  • Constantly refine ML models through retraining, periodic tuning, and even complete remodeling to ensure long-term accuracy
  • Design the MLOps lifecycle to ensure that people-facing models are unbiased, fair, and explainable
  • Operationalize ML models not only for pipeline deployment but also for external business systems that are more complex and less standardized
  • Put the four-step Build, Manage, Deploy and Integrate, and Monitor approach into action

About the Authors

Mark Treveil has designed products in fields as diverse as telecoms, banking, and online trading. His own startup led a revolution in governance in UK local government, where it still dominates. He is now part of the Dataiku Product Team based in Paris.

Nicolas Omont is VP of operations at Artelys where he is developing mathematical optimization solutions for energy and transport. He previously held the role of Dataiku Product Manager for ML and advanced analytics. He holds a PhD in Computer Science, and he's been working in operations research and statistics for the past 15 years, mainly in the telecommunications and energy utility sectors.

Clément Stenac is a passionate software engineer, CTO, and co-founder at Dataiku. He oversees the design and development of the Dataiku DSS Enterprise AI Platform. Clément was previously head of product development at Exalead, leading the design and implementation of web-scale search engine software. He also has extensive experience with open source software, as a former developer of the VideoLAN (VLC) and Debian projects.

Kenji Lefevre is VP Product at Dataiku. He oversees the product roadmap and the user experience of the Dataiku DSS Enterprise AI Platform. He holds a PhD in pure mathematics from the University of Paris VII, and he directed documentary movies before switching to data science and product management.

Du Phan is a Machine Learning engineer at Dataiku, where he works on democratizing data science. In the past few years, he has dealt with a variety of data problems, from geospatial analysis to deep learning. His work now focuses on different facets and challenges of MLOps.

Joachim Zentici is an Engineering Director at Dataiku. Joachim graduated in applied mathematics from Ecole Centrale Paris. Prior to joining Dataiku in 2014, he was a Research Engineer in computer vision at Siemens Molecular Imaging and INRIA. He has also been a teacher and a lecturer. At Dataiku, Joachim has made multiple contributions, including managing the engineers in charge of the core infrastructure, building the team for the plugins and ecosystem effort, and leading the global technology training program for customer-facing engineers.

Adrien Lavoillotte is an Engineering Director at Dataiku, where he leads the team responsible for machine learning and statistics features in the software. He studied at ECE Paris, a graduate school of engineering, and worked for several startups before joining Dataiku in 2015.

Makoto Miyazaki is a Data Scientist at Dataiku and is responsible for delivering hands-on consulting services using Dataiku DSS for European and Japanese clients. Makoto holds a Bachelor's degree in economics and a Master's degree in data science, and he is also a former financial journalist with a wide range of beats, including nuclear energy and economic recoveries from the tsunami.

Lynn Heidmann received her Bachelor of Arts in Journalism/Mass Communications and Anthropology from the University of Wisconsin-Madison in 2008 and decided to bring her passion for research and writing into the world of tech. She spent seven years in the San Francisco Bay Area writing and running operations with Google and subsequently Niantic before moving to Paris to head content initiatives at Dataiku. In her current role, Lynn follows and writes about technological trends and developments in the world of data and AI.
