We’re excited to share the complete text of O’Reilly’s new Learning Spark, 2nd Edition with the
PASS community for free!
Build reliable data lakes with ACID transactions using Delta Lake and Apache Spark. In Chapter 9, we discuss the limitations of data lakes and how lakehouses are their natural evolution, combining the best elements of data lakes and data warehouses for OLAP workloads.
Simplify working with your big data and easily integrate with external data sources including SQL Server, Azure Cosmos DB, and more!
This ebook also provides a primer that runs from machine learning fundamentals to designing machine learning pipelines (Chapter 10). In Chapter 11, we discuss how to manage, deploy, and scale your machine learning pipelines, from model management with MLflow to distributed hyperparameter tuning.