Deep Learning on Apache Spark™: Workflows and Best Practices
Deep Learning has shown tremendous success, yet leveraging its power often takes considerable effort. Existing Deep Learning frameworks require writing a lot of code just to work with a model, let alone to do so in a distributed manner.
This webinar is the first in a series in which we survey the state of Deep Learning at scale and introduce Deep Learning Pipelines, a new open-source package for Apache Spark™. This package simplifies Deep Learning in three major ways:
1. It has a simple API that integrates well with enterprise Machine Learning pipelines.
2. It automatically scales out common Deep Learning patterns, thanks to Spark.
3. It enables exposing Deep Learning models through the familiar Spark APIs, such as MLlib and Spark SQL.
In this webinar, we will tackle a complex image classification problem using Deep Learning and Spark. Using Deep Learning Pipelines, we will show:
how to build deep learning models in a few lines of code;
how to scale common tasks like transfer learning and prediction; and
how to publish models in Spark SQL.
Sue Ann Hong
Software Engineer, Databricks
Sue Ann Hong is a software engineer on the Machine Learning team at Databricks, where she contributes to MLlib and the Deep Learning Pipelines library. She earned her Ph.D. at CMU, studying machine learning and distributed optimization, and has worked as a software engineer at Facebook in Ads and Commerce.
Jules S. Damji
Spark Community Evangelist, Databricks
Jules S. Damji is an Apache Spark Community Evangelist with Databricks. He is a hands-on developer with over 15 years of experience and has worked at leading companies building large-scale distributed systems.