Practical Apache Spark: Using the Scala API
Subhashini Chellappan, Dharanitharan Ganesan
Product Description
Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark, such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark, with practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka, with examples. You’ll follow a learn-by-doing approach: learn the concepts, practice the code snippets in Scala, and complete the assignments to get overall exposure.
On completion, you’ll have knowledge of the functional programming aspects of Scala and hands-on expertise in various Spark components. You’ll also become familiar with machine learning algorithms and how they are used in real-time applications.
What You Will Learn
- Discover the functional programming features of Scala
- Understand the complete architecture of Spark and its components
- Integrate Apache Spark with Hive and Kafka
- Process data with traditional SQL queries using Spark SQL, DataFrames, and Datasets
- Work with different machine learning concepts and libraries using Spark's MLlib packages
Who This Book Is For
Developers and professionals who deal with batch and stream data processing.