Building Reliable Data Lakes with Delta Lake | Hands-on Lab
Delivering production data pipelines with Delta Lake
The widespread adoption of Apache Spark™, the first unified analytics engine, has helped data professionals make great strides in data science and machine learning. Yet the upstream data lakes that feed these initiatives still face reliability challenges when building production data pipelines at scale.
Delta Lake is an open source storage layer that brings reliability to data lakes. It provides ACID transactions, scalable metadata handling, and unified streaming and batch data processing. It also offers DML commands to update, delete, and merge data across the data lifecycle, such as for GDPR/CCPA compliance. Delta Lake runs on top of your existing data lake, whether on Azure Data Lake Storage, AWS S3, Hadoop HDFS, or on-premises storage, and is fully compatible with Apache Spark APIs.
Join this hands-on lab to learn how Delta Lake can help you build robust production data pipelines at scale. This event will give you the opportunity to:
- Gain an understanding of the Delta Lake open source project
- Learn how to build highly scalable and reliable data pipelines using Delta Lake
- See Delta Lake in action with a demo and hands-on code walkthrough
- Ask Databricks experts your most challenging data questions
- Network and learn from your data engineering and data science peers
AGENDA AT A GLANCE
08:30 – 09:00 Registration, Breakfast & Networking
09:00 – 09:20 Opening Remarks - Delta Lake Overview
09:20 – 09:30 Delta Lake in Action - Customer Cases
09:30 – 10:00 Delta Lake: Hands-on Walkthrough - Part 1
10:00 – 10:30 Break
10:30 – 11:30 Delta Lake: Hands-on Walkthrough - Part 2
11:30 – 11:45 Productionizing ML with Delta Lake Demo
11:45 – 12:15 Ask the Expert: Bring Your Most Challenging Data Problems
12:15 – 12:30 Wrap Up
Space is limited for this event. Sign up today to reserve your spot!