Lessons for Large-Scale Machine Learning Deployments on Apache® Spark™

A collection of technical content from Databricks

Apache Spark™ has rapidly emerged as the de facto standard for big data processing across industries and use cases, from providing recommendations based on user behavior to analyzing millions of genomic sequences to accelerate drug innovation and development for personalized medicine.



This eBook, the third in a series, picks up where the second left off on the topic of advanced analytics and jumps straight into practical tips for performance tuning and powerful integrations with other machine learning tools.


Whether you are just getting started with Spark or are already a Spark power user, this eBook will arm you with the knowledge to succeed on your next Spark project, including:

  • Apache Spark integrations with the popular deep learning framework TensorFlow and the Python library scikit-learn.
  • Tools and tips to overcome common roadblocks in developing machine learning algorithms on Apache Spark.
  • A selection of Spark machine learning use cases from ad tech, retail, financial services, and many other industries.

Download the eBook, Lessons for Large-Scale Machine Learning Deployments on Apache Spark, to learn more.

