Accelerate and Scale Joint Genotyping in the Cloud

December 10th, 2019 - 10:00 AM - 11:00 AM PST

Simplify multi-sample variant calling with Databricks, Apache Spark™, Delta Lake and Glow

Many organizations have successfully scaled single-sample variant calling pipelines to support hundreds of thousands of whole genomes. Multi-sample variant calling is the next step toward improving the accuracy of these population-scale studies. However, moving from per-sample gVCFs to a project-level VCF is a challenge: organizations struggle to orchestrate the GATK’s CombineGVCFs and GenotypeGVCFs commands across tens of thousands of gVCFs. The cloud makes this harder still, because the storage layers of the GATK’s joint genotyping stack are difficult to integrate with cloud storage systems.

To simplify this process, the Databricks Unified Data Analytics Platform for Genomics offers a fully managed implementation of the GATK4’s joint genotyping engine that takes less than 5 minutes to configure, improves CPU efficiency by 2x, and scales seamlessly in the cloud.

Join this webinar to learn:

About the opportunities and challenges presented by multi-sample variant calling
How Databricks rearchitected the GATK4’s joint genotyping engine to leverage Apache Spark and Delta Lake, scaling to larger cohorts while retaining high accuracy
How joint genotyping can be set up in less than 5 minutes in the cloud, shown in a live demo on the Databricks Genomics Runtime
How joint genotyping can be combined with Project Glow, open source software jointly developed by Databricks and the Regeneron Genetics Center, to move rapidly into tertiary analytics on genotype data

Speakers:

Frank Austin Nothaft, Healthcare and Life Sciences Technical Director, Databricks
Michael Ortega, Industry and Solutions Marketing, Databricks
