Prometheus Self Discovery on Kubernetes


Prometheus offers advanced service discovery mechanisms that let it scrape new targets without any changes to the target or the Prometheus server. This gives you a simple, effective, and loosely coupled way to observe and monitor all the microservices in your environment. When it launched in June 2015, Prometheus supported self discovery only via DNS records and Consul. Today, more than 12 self discovery mechanisms are supported, including Kubernetes, Azure, GCP, and OpenStack, among others.

In this post, I'll explain how to enable automatic service discovery for a Prometheus server running in a Kubernetes cluster.

Prometheus is an open-source systems monitoring and alerting toolkit originally built at SoundCloud. Since its inception in 2012, many companies and organizations have adopted Prometheus, and the project has a very active developer and user community. It is now a standalone open-source project and is maintained independently of any company.

Because this post focuses on Prometheus deployed on Kubernetes, you'll need a Prometheus server running on your Kubernetes cluster to see this in action. You can set one up using one of the approaches below.

Self Discovery Basics

Kubernetes self discovery configurations allow Prometheus to retrieve scrape targets automatically, as and when new targets come up. Scraping is based on Kubernetes service names, so even if the IP addresses change (and they will), Prometheus can keep scraping the targets seamlessly. This means that when a new application is deployed on Kubernetes, Prometheus can automatically detect it and scrape its metrics without any additional configuration on either the application or Prometheus.

Prometheus self discovery is set up in the Prometheus configuration file. Use the --config.file flag when starting Prometheus to specify the configuration file to load.
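As a rough sketch of what such a configuration file might contain, the fragment below enables Kubernetes discovery for pods. The file name, job name, and scrape interval are assumptions chosen for illustration, not values from any particular deployment. It also assumes Prometheus runs inside the cluster; in that case the api_server field can be left out and Prometheus locates the API server and its service account credentials on its own.

```yaml
# prometheus.yml (hypothetical file name)
global:
  scrape_interval: 30s            # assumed value; tune for your environment

scrape_configs:
  - job_name: 'example-pods'      # hypothetical job name
    kubernetes_sd_configs:
      # Ask the Kubernetes API for pods and turn them into scrape targets.
      - role: pod
```

With a file like this in place, Prometheus would be started with --config.file pointing at it, for example --config.file=prometheus.yml.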

Understanding the Configuration fields

Here is a sample Prometheus scrape job configuration:

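- job_name: 'kubernetes-apiservers'
  kubernetes_sd_configs:
  # Discover every pod in the cluster as a potential target.
  - role: pod
  scheme: https
  tls_config:
    ca_file: /var/run/secrets/
  bearer_token_file: /var/run/secrets/
  relabel_configs:
  # Keep only pods whose app_type label value starts with "storage-app".
  - source_labels: [__meta_kubernetes_pod_label_app_type]
    separator: ;
    regex: (storage-app.*)
    replacement: $1
    action: keep
  # Of those, keep only pods whose app_mode label value starts with "sg-".
  - source_labels: [__meta_kubernetes_pod_label_app_mode]
    separator: ;
    regex: (sg-.*)
    replacement: $1
    action: keep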
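The two keep rules in the sample are applied in order, so a pod is scraped only if both of them match. One way to see how source_labels, separator, and regex interact is to fold both checks into a single rule. The fragment below is a sketch that should behave the same way as the two rules above: Prometheus joins the values of the listed labels with the separator and requires the regex to match the whole joined string.

```yaml
relabel_configs:
  # The two label values are concatenated as "<app_type>;<app_mode>"
  # and the regex must match that entire string for the target to be kept.
  - source_labels:
      - __meta_kubernetes_pod_label_app_type
      - __meta_kubernetes_pod_label_app_mode
    separator: ';'
    regex: '(storage-app.*);(sg-.*)'
    action: keep
```

Separate rules are usually easier to read and debug, so the combined form is mostly useful for understanding what the separator field does.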

Prometheus self discovery is based on Kubernetes labels and annotations, which allows a great deal of granularity in choosing the applications to be scraped. It is also important to understand the role field, since it defines the behavior of a scraping job. The role field sets the type of Kubernetes resource you want Prometheus to look for. Currently it can be endpoints, service, pod, node, or ingress. For example, if role is set to pod, Prometheus discovers all pods and exposes their containers as targets, generating one target for each declared container port.
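To illustrate how role changes what a job discovers, here is a sketch with two jobs that differ only in their role. The job names and the namespace are hypothetical, and the namespaces block is optional; it is shown only as one way to narrow discovery to part of the cluster.

```yaml
scrape_configs:
  # One target per cluster node.
  - job_name: 'example-nodes'          # hypothetical job name
    kubernetes_sd_configs:
      - role: node

  # One target per endpoint address and port behind each service,
  # restricted here to a single namespace.
  - job_name: 'example-endpoints'      # hypothetical job name
    kubernetes_sd_configs:
      - role: endpoints
        namespaces:
          names: ['default']           # assumed namespace
```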

Each role has its own set of meta labels that help you control which applications are scraped. We'll cover the meta labels for the pod role below to give you an idea of how this works.

__meta_kubernetes_namespace: The namespace in which the pod resides.
__meta_kubernetes_pod_name: The name of the pod object.
__meta_kubernetes_pod_ip: The IP address of the pod object.
__meta_kubernetes_pod_label_<labelname>: The value of each label on the pod object. Allows selecting pods based on their labels.
__meta_kubernetes_pod_labelpresent_<labelname>: Set to true for each label on the pod object. Allows filtering pods on whether a given label exists.
__meta_kubernetes_pod_annotation_<annotationname>: The value of each annotation on the pod object. Allows selecting pods based on their annotations.
__meta_kubernetes_pod_annotationpresent_<annotationname>: Set to true for each annotation on the pod object. Allows filtering pods on whether a given annotation exists.
__meta_kubernetes_pod_container_init: Set to true if the container is an InitContainer.
__meta_kubernetes_pod_container_name: Name of the container the target address points to.
__meta_kubernetes_pod_container_port_name: Name of the container port. Allows selecting containers with a given port name.
__meta_kubernetes_pod_container_port_number: Number of the container port. Allows selecting containers with a given port number.
__meta_kubernetes_pod_container_port_protocol: Protocol of the container port.
__meta_kubernetes_pod_ready: Set to true or false for the pod's ready state.
__meta_kubernetes_pod_phase: Set to Pending, Running, Succeeded, Failed or Unknown, depending on where the pod is in its lifecycle.
__meta_kubernetes_pod_node_name: The name of the node the pod is scheduled onto.
__meta_kubernetes_pod_host_ip: The current host IP of the pod object.
__meta_kubernetes_pod_uid: The UID of the pod object.
__meta_kubernetes_pod_controller_kind: Object kind of the pod controller.
__meta_kubernetes_pod_controller_name: Name of the pod controller.

As you can see, these meta labels give you a powerful way to configure Prometheus to scrape only the containers and applications that you need.
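As a minimal sketch of how several pod meta labels might be combined, the job below keeps only ready pods, skips init containers, narrows the targets to a named port, and copies the namespace and pod name onto the scraped series. The job name, the port name metrics, and the target label names namespace and pod are assumptions made for this example.

```yaml
scrape_configs:
  - job_name: 'example-ready-pods'     # hypothetical job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only targets that belong to pods reporting ready.
      - source_labels: [__meta_kubernetes_pod_ready]
        regex: 'true'
        action: keep
      # Drop targets that point at init containers.
      - source_labels: [__meta_kubernetes_pod_container_init]
        regex: 'true'
        action: drop
      # Keep only container ports named "metrics" (assumed port name).
      - source_labels: [__meta_kubernetes_pod_container_port_name]
        regex: 'metrics'
        action: keep
      # Meta labels are discarded after relabeling, so copy the ones
      # worth keeping into ordinary target labels.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_pod_name]
        target_label: pod
```

The last two rules rely on the default replace action, which writes the matched source value into target_label.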


I hope this post was useful in understanding the intricacies of how Prometheus self discovery works. Here are some other relevant resources that may be helpful.