Virtual Hands-on Workshops | 3-Part Series
Build Your Lakehouse
Get data, analytics and AI on one platform
Join this 3-part series of hands-on Databricks workshops, held online so you can attend from the comfort of your own home.
Take a deeper dive into the lakehouse and discover how it can enhance your data science, ML, data engineering, data warehousing and analytics. These hands-on workshops help you succeed with Databricks in less time: a live instructor provides step-by-step guidance as we run queries together in the Databricks platform.
Part 1: Delta and the Lakehouse in Production (2 hours)
A lakehouse architecture is the ideal data architecture for data-driven organizations. It combines the best qualities of data warehouses and data lakes to provide a single solution for all major data workloads and supports use cases from streaming analytics to BI, data science and AI.
In this workshop, we’ll walk through the evolution of data management and the shift to a lakehouse architecture. We’ll discuss how a lakehouse enables data teams to collaborate across the entire data and AI workflow. We’ll also explore Delta Lake, an open, reliable, performant and secure data storage and management layer for your data lakes. Delta Lake provides ACID transactions, scalable metadata handling, and unifies streaming and batch data processing. We’ll also look at checkpoints, error alerting and job retries, along with Delta performance enhancements such as data skipping, caching and Z-ordering.
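To give a flavor of these concepts, here is a minimal PySpark sketch of a Delta table with an ACID write, a SQL query and a Z-ordering optimization. It assumes a Databricks notebook where `spark` is predefined; the table and column names are illustrative, not the workshop's exact code.

```python
# Minimal Delta Lake sketch for a Databricks notebook, where a
# SparkSession named `spark` is already provided. Names are illustrative.

# Write a small batch of events as a Delta table; Delta writes are ACID.
events = spark.createDataFrame(
    [(1, "click", "2024-01-01"), (2, "view", "2024-01-01")],
    ["event_id", "event_type", "event_date"],
)
events.write.format("delta").mode("overwrite").saveAsTable("events")

# Query it back with SQL.
spark.sql("SELECT event_type, COUNT(*) AS n FROM events GROUP BY event_type").show()

# A Databricks performance enhancement mentioned above: compact files
# and co-locate related records with Z-ordering.
spark.sql("OPTIMIZE events ZORDER BY (event_type)")
```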
In the hands-on portion, you’ll learn how to create, manage and query a Delta Lake. You’ll ingest real-time streaming data, refine it and serve it for downstream machine learning and business intelligence use cases. You will also incorporate the best practices described above to ensure production-grade performance, visibility and fault tolerance. This pipeline will serve as a reusable template that you can tailor to meet your specific use cases in the future!
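As a preview of that pipeline, here is a hedged Structured Streaming sketch that reads a stream, refines it and appends it to a Delta table with a checkpoint for fault tolerance. The built-in `rate` source and the paths stand in for a real feed such as Auto Loader or Kafka.

```python
from pyspark.sql import functions as F

# Placeholder source: `rate` emits synthetic rows, standing in for a
# real streaming feed in the workshop.
raw = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

# Refine the raw stream with a derived column.
refined = raw.withColumn("is_even", (F.col("value") % 2 == 0))

# Serve the refined stream as a Delta table. The checkpoint location
# is what makes restarts and job retries fault tolerant.
(refined.writeStream
    .format("delta")
    .option("checkpointLocation", "/tmp/checkpoints/events")
    .outputMode("append")
    .toTable("refined_events"))
```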
Who should attend: Data professionals across data engineering, data science and business analytics, with little to no Databricks experience
Topics to be covered:
Introduction (5 mins)
Lakehouse and Delta Lake Overview (40 mins)
Workshop (65 mins)
Closing Q&A (10 mins)
Part 2: Databricks SQL and Delta (1.5 hours)
The Lakehouse Platform combines the best elements of data lakes and data warehouses — delivering data management and performance typically found in data warehouses with the low-cost, flexible object stores offered by data lakes.
Join this workshop for hands-on guidance on delivering semi-structured data at scale for downstream analytics with Databricks SQL. We'll demonstrate how to architect JSON pipelines, then show you how to explore and visualize semi-structured data in Databricks SQL.
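As a taste of what we'll cover, here is a small sketch of Databricks' semi-structured SQL syntax (the `:` path operator and `::` cast), run from PySpark. The table, JSON fields and values are invented for illustration.

```python
# Create a table with a STRING column holding raw JSON documents.
# On Databricks, tables default to Delta format.
spark.sql("CREATE OR REPLACE TABLE device_events (raw STRING)")

spark.sql("""
    INSERT INTO device_events VALUES
      ('{"device": {"id": 1, "temp": 21.5}, "status": "ok"}'),
      ('{"device": {"id": 2, "temp": 38.2}, "status": "alert"}')
""")

# The colon operator extracts a path from the JSON string, and the
# :: operator casts the extracted value to a typed column.
spark.sql("""
    SELECT raw:device.id::int      AS device_id,
           raw:device.temp::double AS temp,
           raw:status              AS status
    FROM device_events
    WHERE raw:status = 'alert'
""").show()
```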
Who should attend: Data professionals across data engineering, data science and business analytics, with little to no Databricks experience
Topics to be covered:
Introduction (5 mins)
Delta Lake Recap (5 mins)
Spark Streaming (10 mins)
Medallion Architecture using JSON and SQL parsing features (20 mins)
Databricks SQL Warehouse (35 mins)
Connecting SQL Warehouses to external BI tools (5 mins)
Closing Q&A (10 mins)
Part 3: End-to-End ML (1.5 hours)
The future is here; it's just not evenly distributed. While 83% of CEOs say machine learning is a strategic priority, 87% of data science initiatives never make it to production.
The Databricks ML platform simplifies the end-to-end machine learning lifecycle from data preparation to feature development, model training, deployment and monitoring. End-to-end machine learning on Databricks enables you to create new value for your customers using data, analytics and AI.
In this workshop, we'll walk you through an end-to-end ML solution on Databricks. We'll discuss how you can use the collaborative ML development environment, prepare data at scale and leverage AutoML for rapid prototyping of ML models. Then we'll use the training code generated by AutoML's glass box approach to refine your model, track experiment runs and deploy with MLflow Model Serving. Finally, we'll show how to monitor your deployed model for drift over time using Databricks SQL. This workshop provides a reusable template to jumpstart your journey to putting models into production with Databricks.
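For a sense of the experiment-tracking step, here is a minimal MLflow sketch, assuming a Databricks workspace where MLflow tracking is preconfigured. The dataset and model choice are stand-ins, not the workshop's exact code.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Illustrative dataset and model; the workshop uses its own example.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

with mlflow.start_run(run_name="workshop_rf"):
    model = RandomForestClassifier(n_estimators=100).fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))

    # Track the run: parameters and metrics show up in the experiment UI.
    mlflow.log_param("n_estimators", 100)
    mlflow.log_metric("accuracy", acc)

    # Logging the model makes it available for registration in the
    # Model Registry and, on Databricks, for MLflow Model Serving.
    mlflow.sklearn.log_model(model, "model")
```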
Topics to be covered:
Introduction (5 mins)
Databricks Machine Learning Overview (15 mins)
Feature Store Overview (15 mins)
Experiments & AutoML (30 mins)
Model Registry & Serving (10 mins)
Closing Q&A (15 mins)
Watch Now