Data Science with Java: Practical Methods for Scientists and Engineers

Michael R. Brzustowicz, PhD

Product Description

Data Science is booming thanks to R and Python, but Java brings the robustness, convenience, and ability to scale critical to today’s data science applications. With this practical book, Java software engineers looking to add data science skills will take a logical journey through the data science pipeline. Author Michael Brzustowicz explains the basic math theory behind each step of the data science process, as well as how to apply these concepts with Java.

You’ll learn the critical roles that data IO, linear algebra, statistics, data operations, learning and prediction, and Hadoop MapReduce play in the process. Throughout this book, you’ll find code examples you can use in your applications.

  • Examine methods for obtaining, cleaning, and arranging data into its purest form
  • Understand the matrix structure that your data should take
  • Learn basic concepts for testing the origin and validity of data
  • Transform your data into stable and usable numerical values
  • Understand supervised and unsupervised learning algorithms, and methods for evaluating their success
  • Get up and running with MapReduce, using customized components suitable for data science algorithms
