Data Science Using Python and R (Wiley Series on Methods and Applications in Data Mining)
Provisional Chinese title: 使用 Python 和 R 的資料科學

Chantal D. Larose, Daniel T. Larose

Product Description

Learn data science by doing data science! 

Data Science Using Python and R will get you plugged into the world’s two most widespread open-source platforms for data science: Python and R.

Data science is hot. Bloomberg called data scientist “the hottest job in America.” Python and R are the top two open-source data science tools in the world. In Data Science Using Python and R, you will learn step-by-step how to produce hands-on solutions to real-world business problems, using state-of-the-art techniques. 

Data Science Using Python and R is written for the general reader with no previous analytics or programming experience. An entire chapter is dedicated to learning the basics of Python and R. Then, each chapter presents step-by-step instructions and walkthroughs for solving data science problems using Python and R.

Those with analytics experience will appreciate having a one-stop shop for learning how to do data science using Python and R. Topics covered include data preparation, exploratory data analysis, preparing to model the data, decision trees, model evaluation, misclassification costs, naïve Bayes classification, neural networks, clustering, regression modeling, dimension reduction, and association rules mining.

Exciting new topics such as random forests and general linear models are also included. The book emphasizes data-driven error costs as a way to enhance profitability, helping readers avoid common pitfalls that may cost a company millions of dollars.

Data Science Using Python and R provides exercises at the end of every chapter, more than 500 in total, so readers have plenty of opportunity to test their newfound data science skills and expertise. In the Hands-on Analysis exercises, readers are challenged to solve interesting business problems using real-world data sets.

About the Authors

CHANTAL D. LAROSE, PHD, is an Assistant Professor of Statistics & Data Science at Eastern Connecticut State University (ECSU). She has co-authored three books on data science and predictive analytics and helped develop data science programs at ECSU and SUNY New Paltz. Her PhD dissertation, Model-Based Clustering of Incomplete Data, tackles the persistent problem of trying to do data science with incomplete data.

DANIEL T. LAROSE, PHD, is a Professor of Data Science and Statistics and Director of the Data Science programs at Central Connecticut State University. He has published many books on data science, data mining, predictive analytics, and statistics. His consulting clients include The Economist magazine, Forbes magazine, the CIT Group, and Microsoft.
