Databricks

Databricks on AWS Training Series

Databricks provides a Unified Analytics Platform that accelerates innovation by bringing data and machine learning together. Founded by the original creators of Apache Spark™, Databricks unites Data Engineering and Data Science efforts to create automated data pipelines that enable thousands of organizations to succeed at multiple data science use cases. 

You should consider attending these training sessions if you are a data engineer or data scientist interested in learning to use Apache Spark or Databricks.

The three training sessions are:

  • Getting Started with Apache Spark
  • Data Engineering and Streaming Analytics
  • Machine Learning

Databricks Training: Getting Started with Apache Spark™

Are the tools you’re using not running data and analytics workloads as quickly as you need? Are you looking to take advantage of the power of Apache Spark™ to support your full range of data engineering and data science projects, including data management and pipelines, streaming analytics, and machine learning?

In this free 2-hour online training, we’ll teach you how to get started with Apache Spark on Databricks:

  • Introduction to RDDs, DataFrames and Datasets for data transformation
  • Write your first Apache Spark job to load and work with data
  • Analyze your data and visualize your results in a Databricks Notebook
  • Introduction to Parquet and Delta Lake on Amazon S3 for data storage

Databricks Training: Data Engineering and Streaming Analytics

Are you looking for more efficiency with extract, transform and load of massive structured and unstructured data sets? Are the tools you’re using not running workloads as quickly as you need?

In this free two-hour training, we will show you how to use Databricks to build an ETL pipeline from raw ingest to the data warehouse, process streaming data, and create visualizations based on continuously updated aggregate results.

Learn how to:

  • Ingest data in batches using Databricks
  • Transform data using Spark SQL and DataFrames
  • Connect to Amazon Kinesis as a streaming data source
  • Use the DataFrame API to transform streaming data
  • Output the results to a variety of sinks
  • Create dynamic visualizations from real-time analytics on streaming data

Databricks Training: Machine Learning

Do you need to speed up building machine learning models and putting them into production? Databricks helps you develop, train, and tune accurate models faster. Get insights faster by collaborating via shared notebooks between multiple analysts and data scientists. Run cutting-edge machine learning on larger data sets, leveraging the increased speed and scale enabled by MLlib’s algorithms, which are optimized for parallelization.

In this free 2-hour online training, we’ll show you how you can use MLlib and MLflow with Databricks to train your own models, run reproducible experiments, and deploy into production with fewer failing jobs.

You should attend if you are a data engineer or data scientist excited about using Apache Spark™ to streamline your data prep, model training, hyperparameter tuning and deployment to production.