Foundations of Statistics for Data Scientists: With R and Python
Provisional Chinese title: 數據科學家統計學基礎:使用 R 和 Python

Alan Agresti, Maria Kateri

Product description

Designed as a textbook for a one- or two-term introduction to mathematical statistics for students training to become data scientists, Foundations of Statistics for Data Scientists: With R and Python is an in-depth presentation of the topics in statistical science with which any data scientist should be familiar, including probability distributions, descriptive and inferential statistical methods, and linear modelling. The book assumes knowledge of basic calculus, so the presentation can focus on 'why it works' as well as 'how to do it.' Compared to traditional mathematical statistics textbooks, however, the book places less emphasis on probability theory and more on using software to implement statistical methods and to conduct simulations that illustrate key concepts. All statistical analyses in the book use R software, with an appendix showing the same analyses in Python.

The book also introduces modern topics that do not normally appear in mathematical statistics texts but are highly relevant for data scientists, such as Bayesian inference, generalized linear models for non-normal responses (e.g., logistic regression and Poisson loglinear models), and regularized model fitting. The nearly 500 exercises are grouped into two categories: 'Data Analysis and Applications' and 'Methods and Concepts.' Appendices introduce R and Python and contain solutions for odd-numbered exercises. The book's website has expanded R, Python, and Matlab appendices and all data sets from the examples and exercises.

Alan Agresti, Distinguished Professor Emeritus at the University of Florida, is the author of seven books, including Categorical Data Analysis (Wiley) and Statistics: The Art and Science of Learning from Data (Pearson), and has presented short courses in 35 countries. His awards include an honorary doctorate from De Montfort University (UK) and the Statistician of the Year award from the American Statistical Association (Chicago chapter). Maria Kateri, Professor of Statistics and Data Science at RWTH Aachen University, authored the monograph Contingency Table Analysis: Methods and Implementation Using R (Birkhäuser/Springer) and a textbook on mathematics for economists (in German). She has long-term experience teaching statistics courses to students of Data Science, Mathematics, Statistics, Computer Science, and Business Administration and Engineering.

The main goal of this textbook is to present foundational statistical methods and theory that are relevant in the field of data science. The authors depart from the typical approaches taken by many conventional mathematical statistics textbooks by placing more emphasis on providing the students with intuitive and practical interpretations of those methods with the aid of R programming codes ... I find its particular strength to be its intuitive presentation of statistical theory and methods without getting bogged down in mathematical details that are perhaps less useful to the practitioners. (Mintaek Lee, Boise State University)

The aspects of this manuscript that I find appealing: 1. The use of real data. 2. The use of R but with the option to use Python. 3. A good mix of theory and practice. 4. The text is well-written with good exercises. 5. The coverage of topics (e.g. Bayesian methods and clustering) that are not usually part of a course in statistics at the level of this book. (Jason M. Graham, University of Scranton)
