An Introduction to Apache Spark™ APIs

Deep Dive into RDDs, DataFrames, and Datasets

Apache Spark™ APIs are powerful and easy to use, and they form the foundation of Spark’s vast ecosystem of tools and libraries. The combination of general-purpose APIs and high-performance execution makes Spark a strong platform for both interactive and production applications.

Databricks, founded by the team that originally created Apache Spark, is glad to share this eBook, in which we cover:

  • A deep dive into Spark’s three sets of APIs: RDDs, DataFrames, and Datasets.
  • Key concepts behind the performance and optimization benefits of each API.
  • Tips and best practices for how and when to use each API, illustrated with real-world examples.

Get the eBook to learn why developers love to work with Apache Spark APIs.
