March 18th, 2020
This is a virtual event

Thank you for your interest in the AWS | Databricks Cloud Data Lake Dev Day Workshop in Menlo Park. Your health and safety are of utmost importance to us. Due to the current COVID-19 (coronavirus) situation, Databricks has decided to move this from an in-person event to a web-based event. We will use Zoom as the virtual meeting environment; a Zoom link will be sent to you upon registration. We look forward to seeing you on March 18th, 2020 at 8:30 AM PT.

AWS | Databricks Cloud Data Lake Dev Day Workshop
Enabling Cloud Data Lakes for Analytics

Every organization wants to leverage the wealth of data accumulated in its data lake for deep analytics insights. However, most organizations struggle to make that data analytics-ready and to automate data pipelines so that new data can be used as the data lake is constantly updated.
 
In this workshop, we’ll cover best practices for using powerful open source technologies to build on and extend your AWS investments and make your data lake analytics-ready. You’ll learn about the advantages of cloud-based data lakes in terms of security and cost. And finally, you’ll learn how data professionals are having a huge impact: lowering costs, shortening time to market, and even revolutionizing industries.
 
Join this half-day workshop to learn how you can leverage your data lake for powerful analytics insights. This free workshop will give you the opportunity to:
 
  • Learn how to build highly scalable and reliable data pipelines for analytics 
  • Make your existing S3 data lake analytics-ready with open-source Delta Lake technology
  • Evaluate options to migrate current on-premises data lakes (Hadoop, etc.) to AWS with Databricks Delta
  • Integrate that data with services such as Amazon SageMaker, Amazon Redshift, AWS Glue, and Amazon Athena, and leverage your AWS security and roles without moving your data out of your account
  • Understand open source technologies like Delta Lake and Apache Spark, which are portable and powerful for any organization and any analytics use case
  • Network and learn from your data professional peers

AGENDA AT A GLANCE
 
9:00-9:45 - Opening Remarks - Enabling Cloud Data Lakes for Analytics
9:45-10:15 - Customer Stories and Use Cases 
10:15-10:45 - Q&A | Break
10:45-11:30 - Architecture and Cluster Configuration Best Practices - Incorporating Roles; Scaling, Launching and Managing Clusters; Managing Pools and Containers
11:30-12:15 - Notebooks - Data Pipelines using Delta Lake, integrated with AWS Glue, Amazon Athena and Amazon Redshift for data delivery to analysts
12:15-12:30 - Q&A 


Space is limited for this event. Sign up today to reserve your spot!

Please fill out the form to confirm your spot