Hands-On Data Science with the Command Line: Automate everyday data science tasks using command-line tools

Jason Morris, Chris McCubbin, Raymond Page

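  • Publisher: Packt Publishing
  • Publication date: 2019-01-31
  • List price: $1,440
  • VIP price: $1,368 (5% off)
  • Language: English
  • Pages: 124
  • Binding: Paperback
  • ISBN: 1789132983
  • ISBN-13: 9781789132984
  • Related categories: Command Line, Data Science
  • Overseas special-order title (must be checked out separately)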

Product description

Big data processing and analytics at speed and scale using command line tools.

Key Features

  • Perform string processing, numerical computations, and more using CLI tools
  • Understand the essential components of data science development workflow
  • Automate data pipeline scripts and visualization with the command line

Book Description

The command line has existed on UNIX-based operating systems, in the form of the Bash shell, for over three decades. However, few developers know how command-line tools can be OSEMN (pronounced "awesome" and standing for Obtaining, Scrubbing, Exploring, Modeling, and iNterpreting data) for carrying out simple-to-advanced data science tasks at speed.

This book starts with the requisite concepts and installation steps for carrying out data science tasks using the command line. You will learn to create a data pipeline to solve the problem of working with small- to medium-sized files on a single machine. You will understand the power of the command line and learn how to edit files using a text-based editor. You will not only learn how to automate jobs and scripts, but also how to visualize data using the command line.

By the end of this book, you will know how to speed up your data science work and automate tasks using command-line tools.

What you will learn

  • Understand how to set up the command line for data science
  • Use AWK commands to search large datasets quickly
  • Work with files and APIs using the command line
  • Share and collect data with CLI tools
  • Perform visualization with commands and functions
  • Uncover machine-level programming practices with a modern approach to data science
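
As a rough illustration of the AWK item in the list above (not an excerpt from the book), here is a minimal sketch of filtering and summarizing a small CSV. The sample data, the column layout, and the function names `sample_csv` and `eng_summary` are all assumptions made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical sample data (name,dept,salary); not taken from the book.
sample_csv() {
  printf '%s\n' 'name,dept,salary' 'alice,eng,120' 'bob,ops,90' 'carol,eng,135'
}

# Skip the header row, keep rows whose second field is "eng",
# print each matching name, then print the sum of the third field.
eng_summary() {
  awk -F, 'NR > 1 && $2 == "eng" { total += $3; print $1 }
           END { print "total:", total }'
}

sample_csv | eng_summary
```

Because `eng_summary` reads standard input, the same filter could sit at the end of a longer pipeline, for example after `cat` on a real file or after a download step.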

Who this book is for

This book is for data scientists and data analysts who have an understanding of data science but little to no knowledge of the command line. It shows how to perform everyday data science tasks using the power of command-line tools.

Table of Contents

  1. Data Science at the Command Line and Setting It Up
  2. Essential Commands
  3. Obtaining and Working with Data, Detached Processing and Terminal Multiplexers
  4. Bash Functions and Data Visualization
  5. Loops, Functions and String Processing
  6. The Command Line as a Database, Math in Bash, and Bringing It All Together
