Navigating Data Management: Warehouses, Lakes and Lakehouses
In today’s dynamic data management landscape, the terminology and concepts related to data storage and processing have become more...
One Billion Row Challenge - view from sidelines
Inverted Indexes: A Step-by-Step Implementation Guide
The Do's and Don'ts of Apache Spark - Best Practices for Efficient Data Processing
Testing Spark StructuredStreaming locally with EmbeddedKafka - part 2, now with objects
Testing Spark Streaming locally with EmbeddedKafka
Evaluating management
Spark: understanding Physical Plans
Database Internals: A very short conspect
Starting up Spark Standalone Cluster with Docker
Testing Spark apps locally with Scalatest