Introduction to Data Science: Data Analysis and Prediction Algorithms with R
Irizarry, Rafael A.

Product Description

Introduction to Data Science: Data Analysis and Prediction Algorithms with R introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning. It also helps you develop skills such as R programming, data wrangling, data visualization, predictive algorithm building, file organization with UNIX/Linux shell, version control with Git and GitHub, and reproducible document preparation.

 

This book is a textbook for a first course in data science. No previous knowledge of R is necessary, although some experience with programming may be helpful. The book is divided into six parts: R, data visualization, statistics with R, data wrangling, machine learning, and productivity tools. Each part has several chapters, each meant to be presented as one lecture.

 

The author uses motivating case studies that realistically mimic a data scientist's experience. He starts by asking specific questions and answers them through data analysis, so concepts are learned as a means of answering the questions. The case studies include US murder rates by state, self-reported student heights, trends in world health and economics, the impact of vaccines on infectious disease rates, the financial crisis of 2007-2008, election forecasting, building a baseball team, image processing of hand-written digits, and movie recommendation systems.

 

The statistical concepts used to answer the case study questions are only briefly introduced, so complementing this book with a probability and statistics textbook is highly recommended for an in-depth understanding of these concepts. If you read and understand the chapters and complete the exercises, you will be prepared to learn the more advanced concepts and skills needed to become an expert.

A complete solutions manual is available to registered instructors who require the text for a course.

About the Author

 

Rafael A. Irizarry is a professor of data sciences at the Dana-Farber Cancer Institute, a professor of biostatistics at Harvard, and a fellow of the American Statistical Association. Dr. Irizarry is an applied statistician who, over the last 20 years, has worked in diverse areas, including genomics, sound engineering, and public health. He disseminates solutions to data analysis challenges as open-source software tools that are widely downloaded and used. Prof. Irizarry has also developed and taught several data science courses at Harvard as well as popular online courses.
