How to Secure Your Cloud Data Lake and Apache Spark™ Pipelines
As organizations push to use their data to make more intelligent decisions, a core requirement is to “democratize data”: in other words, to open up previously siloed and restricted data to broader parts of the organization. However, this model of expanded access raises considerable concern across organizations, particularly in the C-suite, about the risk of a data breach.
Many organizations opt to build their own data lakes and advanced analytics solutions by cobbling together a plethora of data processing tools (Apache Spark, Hive) and AI/ML tools (SparkML, TensorFlow, PyTorch), many of which are open source. Stitching these components together can introduce behaviors that increase security risk. According to Gartner, 80% of organizations will fail to develop a consolidated data security policy across silos.
In this webinar, you will learn about some of the challenges of a DIY platform and how to overcome them, including:
— How to break silos and secure your big data and ML workflows with a unified approach to data security
— How to deploy and operate your analytics securely, and govern data at scale
— How to evaluate your organization's and your partners' security culture
Chief Information Security Officer, Databricks
Security SME, Databricks