Wednesday, April 29th, 2020
Virtual Workshop
9:00 AM ET

AWS | Databricks Cloud Data Lake Dev Day Workshop
Enabling Cloud Data Lakes for Analytics

Every organization wants to leverage the wealth of data accumulated in its data lake for deep analytics insights. However, most organizations struggle to make that data analytics-ready and to automate data pipelines so that new data can be used as the data lake is constantly updated.

In this virtual workshop, we’ll cover best practices for using powerful open source technologies to build on and extend your AWS investments and make your data lake analytics-ready. You’ll learn about the advantages of cloud-based data lakes in terms of security and cost. And finally, you’ll learn how data professionals are having a huge impact: lowering costs, improving time to market, and even revolutionizing industries.

Join this virtual workshop to learn how you can leverage your data lake for powerful analytics insights. This virtual workshop will give you the opportunity to:

  • Learn how to build highly scalable and reliable data pipelines for analytics 
  • Make your existing S3 data lake analytics-ready with open-source Delta Lake technology
  • Evaluate options to migrate current on-premises data lakes (Hadoop, etc.) to AWS with Databricks Delta
  • Integrate that data with services such as Amazon SageMaker, Amazon Redshift, AWS Glue, and Amazon Athena, while leveraging your AWS security and roles without moving your data out of your account
  • Understand open source technologies like Delta Lake and Apache Spark, which are portable and powerful in any organization and for any analytics use case
  • Network virtually and learn from your data professional peers


AGENDA AT A GLANCE

9:00 am Databricks Keynote
9:45 am Customer Use Case
10:15 am Immuta Presentation
10:30 am Break
10:40 am Architecture - Incorporating Roles; Scaling, Launching and Managing Clusters; Managing Pools and Containers 
11:15 am Notebooks - Data Loading using Scala, Machine Learning Analytics using Python, and Redshift integration for data delivery to analysts 
11:50 am Q&A
12:00 pm Finish


We will use Zoom for a virtual meeting environment. Your Zoom link will be sent to you upon registration. 


We look forward to connecting with you soon!


Event Sponsor: Databricks (Databricks Privacy Policy)
Event Co-Sponsor: Immuta (Immuta Privacy Policy)


Please fill out the form to confirm your spot.