storage

Storage: Complete Overview for Developers

Storage is one of the most important components of a technology stack, yet it is often misunderstood because there are so many options. The term is also used as a blanket label for many different underlying systems, which can make it confusing for people new to storage.

Read

Get started with computer science papers

One of the best ways to learn and improve in your field is to build something on your own, either as part of your job or as a weekend project. If you can't do that, the next best thing is to read. In this post, we'll introduce you to some interesting papers on large-scale open source projects.

Read
streaming-data

Streaming Data Tools & Techniques

Streaming data is one of the fastest-growing sectors in big data, largely due to the real-time and near-real-time behavior of modern systems. IoT devices, financial markets, application logs, and clickstreams all generate huge volumes of streaming data. In this post, we'll take a look at the tools and techniques commonly used to handle such streaming data.

Read
kubernetes

Complete guide to deploy Spark on Kubernetes

Kubernetes is rapidly becoming the default orchestration platform for Spark, with an object storage platform used for storage instead of HDFS. In this post, we take a deep dive into creating and deploying Spark containers on a Kubernetes cluster.

Read
data-lake

Modern Data Lakes Overview

As data volumes grow to unprecedented levels, new tools and techniques are emerging to handle this growth. One field that has evolved as a result is data lakes. In this post, we'll look at how data lakes evolved and how modern data lake technologies like Iceberg and Delta Lake are solving important problems.

Read
kubernetes

Kubernetes client tools overview

All of us know kubectl, but with the wide adoption of Kubernetes over the last few years, several interesting client-side tools have emerged that can improve your daily interaction with a Kubernetes cluster.

Read
kubernetes

Prometheus Self Discovery on Kubernetes

Kubernetes Self Discovery configurations let Prometheus retrieve scrape targets automatically as new targets come up. A Prometheus deployment on Kubernetes can then detect and scrape metrics from a newly deployed application without any additional configuration on either the application or Prometheus.

Read
cloud-native

Persist Kafka Messages to MinIO

In this post, we'll see how to use Kafka and MinIO to ingest large data volumes and store them persistently, so the data is available for later analysis and consumption.

Read