Unified Data Analytics | Workshop
Unifying Data Pipelines, Business Analytics and Machine Learning with Apache Spark™
Every enterprise today wants to accelerate innovation by building AI into its business. However, most companies struggle with preparing large datasets for analytics, managing the proliferation of ML frameworks, and moving models from development to production.
In this workshop, we’ll cover best practices for using powerful open source technologies to simplify and scale your data and ML efforts. We’ll discuss how to leverage Apache Spark™, the de facto data processing and analytics engine in enterprises today, for data preparation, as it unifies data at massive scale across various sources. You’ll learn how to use ML frameworks (e.g., TensorFlow, XGBoost, and scikit-learn) to train models based on different requirements. Finally, you’ll learn how to use MLflow to track experiment runs across multiple users within a reproducible environment and to manage the deployment of models to production.
Join this half-day workshop to learn how unified data analytics can bring data science, business analytics, and engineering together to accelerate your data and ML efforts. This free workshop will give you the opportunity to:
- Learn how to build highly scalable and reliable pipelines for analytics
- Gain deeper insight into Apache Spark, including the latest updates with Delta Lake
- Train a model against data and learn best practices for working with ML frameworks (e.g., XGBoost and scikit-learn)
- Learn how to use MLflow to track experiments, share projects, and deploy models in the cloud and on-prem
- Network and learn from your ML and Apache Spark peers
AGENDA AT A GLANCE
- 8:30-9:00 AM Registration, Breakfast & Networking
- 9:00-9:45 AM Opening Remarks - Unifying Data Science, Business Analytics, and Data Engineering
- 9:45-10:15 AM Use Case Highlight
- 10:15-10:45 AM Networking with Peers
- 10:45-11:30 AM Data Engineering Interactive Demo & Best Practices: Preparing Data for Analytics
- 11:30 AM-12:15 PM Data Science Interactive Demo & Best Practices: Model Training and Machine Learning
- 12:15-12:30 PM Q&A