kubernetes

Complete guide to deploy Spark on Kubernetes

Kubernetes is rapidly becoming the default orchestration platform for Spark, with an object storage platform used for storage instead of HDFS. In this post we take a deep dive into creating and deploying Spark containers on a Kubernetes cluster.

Read
data-lake

Modern Data Lakes Overview

As data volumes grow to unprecedented levels, new tools and techniques are emerging to handle this growth. One field that has evolved in response is data lakes. In this post we'll look at how data lakes have evolved and how modern data lakes like Iceberg and Delta Lake are solving important problems.

Read
kubernetes

Kubernetes client tools overview

We all know kubectl, but with the wide adoption of Kubernetes over the last few years, several interesting client-side tools have emerged that can improve your daily interaction with a Kubernetes cluster.

Read
kubernetes

Prometheus Self Discovery on Kubernetes

Kubernetes Self Discovery configurations allow Prometheus to retrieve scrape targets automatically as new targets come up. A Prometheus deployment on Kubernetes can then detect and scrape metrics from a newly deployed application without any additional configuration on either the application or Prometheus.

Read
cloud-native

Persist Kafka Messages to MinIO

In this post, we'll see how to use Kafka and MinIO to ingest huge data volumes and store them persistently, so that the data remains available for later analysis and consumption.

Read