Docker for Data Science: Building Scalable and Extensible Data Infrastructure Around the Jupyter Notebook Server

Joshua Cook

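  • Publisher: Apress
  • Publication date: 2017-08-25
  • List price: $2,590
  • VIP price: $2,461 (5% off)
  • Language: English
  • Pages: 257
  • Binding: Paperback
  • ISBN: 1484230116
  • ISBN-13: 9781484230114
  • Related categories: Docker, JVM languages, Data Science
  • Overseas special-order title (requires separate checkout)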

Product description

Learn Docker "infrastructure as code" technology to define a system for performing standard but non-trivial data tasks on medium- to large-scale data sets, using Jupyter as the master controller.

Real-world data sets are often hard to manage. A set may not fit into memory or may require prohibitively long processing. These are significant challenges even for skilled software engineers, and they can render the standard Jupyter system unusable.

As a solution to this problem, Docker for Data Science proposes using Docker. You will learn how to use existing pre-built public images published by the major open-source projects (Python, Jupyter, Postgres) and how to write a Dockerfile that extends these images to suit your specific purposes. Docker Compose is examined, and you will learn how it can be used to build a linked system with Python churning data behind the scenes and Jupyter managing these background tasks. Best practices for using existing images are explored, as is developing your own images to deploy state-of-the-art machine learning and optimization algorithms.

What You'll Learn
  • Master interactive development using the Jupyter platform
  • Run and build Docker containers from scratch and from publicly available open-source images
  • Write infrastructure as code using the docker-compose tool and its docker-compose.yml file type
  • Deploy a multi-service data science application across a cloud-based system

Who This Book Is For

Data scientists, machine learning engineers, artificial intelligence researchers, Kagglers, and software developers
