Apache Spark™ 2.0: Faster, Easier, and Smarter

On-Demand Webinar

In this webcast, Reynold Xin from Databricks speaks about the new 2.0 major release of Apache Spark™.
 
The major themes for Spark 2.0 are:

  • Unified APIs: Emphasis on building up higher-level APIs, including the merging of the DataFrame and Dataset APIs
  • Structured Streaming: Simplifies streaming by building continuous applications on top of DataFrames, which allows streaming, interactive, and batch queries to be unified
  • Tungsten Phase 2: Speed up Apache Spark by 10X

Presenters

  • Reynold Xin

    Co-Founder and Chief Architect of Databricks

    Reynold oversees Databricks' technical contributions to Apache Spark and Databricks Runtime, initiating efforts such as DataFrames, Project Tungsten, and Spark 2.0. To demonstrate Spark's scalability and performance, he led the efforts in the 2014 Daytona GraySort contest and set the 2014 world record, beating the previous record held by Hadoop with 30X higher per-node efficiency. He was also part of the team that set the 2016 CloudSort record for the most efficient and lowest cost software to sort 100TB of data in the cloud, beating the 2015 record by 3X.

  • Jules S. Damji

    Spark Community and Developer Advocate at Databricks

    Jules S. Damji is an Apache Spark Community Advocate with Databricks. He is a hands-on developer with over 15 years of experience and has worked at leading companies building large-scale distributed systems.