Chashnikov.dev


Navigating Data Management: Warehouses, Lakes and Lakehouses
In today’s dynamic data management landscape, the terminology and concepts related to data storage and processing have become more...
Feb 18, 2024 · 5 min read
30 views
0 comments


One Billion Row Challenge - view from sidelines
In the last couple of days I’ve been hearing and reading about, and poking around, the 1 Billion Row Challenge (1BRC) - a “contest” for Java / JVM...
Jan 27, 2024 · 5 min read
18,345 views
0 comments


Inverted Indexes: A Step-by-Step Implementation Guide
Inverted Indexes: why you need one, and how to implement one in Scala quickly and easily
Jun 12, 2023 · 5 min read
11,389 views
1 comment


The Do's and Don'ts of Apache Spark - Best Practices for Efficient Data Processing
Apache Spark has emerged as one of the most popular big data processing frameworks due to its speed, scalability, and ease of use....
May 29, 2023 · 6 min read
14,139 views
0 comments


Testing Spark StructuredStreaming locally with EmbeddedKafka - part 2, now with objects
This is a continuation of "Testing Spark Streaming locally with EmbeddedKafka". If you're not familiar with EmbeddedKafka - I'd recommend...
Apr 22, 2023 · 5 min read
3,165 views
0 comments


Testing Spark Streaming locally with EmbeddedKafka
It's been a while since my previous article in the Spark/Scala series, where we ran Spark locally using Docker. And even before that we...
Mar 25, 2023 · 5 min read
938 views
1 comment


Evaluating management
With performance review season in full swing, I got to thinking about the role of managers in it, expectations from Individual Contributor's...
Jan 22, 2023 · 2 min read
287 views
0 comments


Spark: understanding Physical Plans
You have some kind of query - maybe it's written using the Dataset API, maybe using Spark SQL. It reads from one or several Hive tables, or...
Dec 15, 2022 · 4 min read
10,600 views
0 comments


Database Internals: A very short conspect
About a year ago I listened to a Software Engineering Radio podcast episode with Alex Petrov, where Alex was discussing his new book,...
Nov 14, 2022 · 14 min read
5,315 views
0 comments


Starting up Spark Standalone Cluster with Docker
In the previous post we created a simple Spark app, and used Scalatest to check that it actually works. Even though we were creating a...
Oct 27, 2022 · 3 min read
4,950 views
0 comments


Testing Spark apps locally with Scalatest
Years ago I wrote a blog post describing the process of building and deploying a simple Spark app. This post is now too old to be of any use...
Oct 22, 2022 · 3 min read
10,779 views
2 comments