What's New in the Upcoming Apache Spark™ 2.3 Release?

On-Demand Webinar

The upcoming Apache Spark™ 2.3 release marks a big step forward in speed, unification, and API support.

Reynold Xin and Jules Damji from Databricks will walk through how you can benefit from the upcoming improvements:

New DataSource APIs that enable developers to more easily read and write data for Continuous Processing in Structured Streaming.
PySpark support for vectorization, giving Python developers the ability to run native Python code fast.
Improved performance by taking advantage of NVMe SSDs.
Native Kubernetes support, marrying the best of container orchestration and distributed data processing.

Presenters

Reynold Xin
Co-Founder and Chief Architect of Databricks

Reynold oversees Databricks' technical contributions to Apache Spark and Databricks Runtime, initiating efforts such as DataFrames, Project Tungsten, and Spark 2.0. To demonstrate Spark's scalability and performance, he led the efforts in the 2014 Daytona GraySort contest and set the 2014 world record, beating the previous record held by Hadoop with 30X higher per-node efficiency. He was also part of the team that set the 2016 CloudSort record for the most efficient and lowest cost software to sort 100TB of data in the cloud, beating the 2015 record by 3X.

Jules S. Damji
Spark Community Evangelist - Databricks

Jules S. Damji is an Apache Spark Community Evangelist with Databricks. He is a hands-on developer with over 15 years of experience and has worked at leading companies building large-scale distributed systems.
