Introduction

Grafana Labs® is a company that believes in open source and open standards, and so do I. They have built some excellent tools over the past years, and it is safe to assume they won’t stop here.

Currently, the stack I am familiar with is the LGTMP (Looks Good To Me Pal 😉) stack:

  • Loki™, like Prometheus, but for logs.
  • Grafana®, the open and composable observability and data visualization platform.
  • Tempo®, a high volume, minimal dependency distributed tracing backend.
  • Mimir, the most scalable Prometheus backend.
  • Phlare, a horizontally-scalable, highly-available, multi-tenant continuous profiling aggregation system.

These tools give you the platform you need to gain great insights, but on their own they are not enough. On top of them I also use:

  • Grafana Agent Operator, a Kubernetes operator that makes it easier to deploy Grafana Agent.
  • Kube State Metrics (KSM), a simple service that listens to the Kubernetes API server and generates metrics about the state of the objects.
  • Prometheus Node Exporter, which exposes a wide variety of hardware- and kernel-related metrics.
  • Prometheus Blackbox Exporter, which allows blackbox probing of endpoints over HTTP, HTTPS, DNS, TCP, ICMP and gRPC.

The full code can also be found on my GitHub.

Setup

Prerequisites

The file structure in this post is described in one of my other posts: Using Flux in Kubernetes. Therefore I will not dive into the kustomization and overlay parts.

You could use the posts referenced below as a refresher with a working example.

Helm Repository

Setting up the Grafana® helm repository. This repository contains all the helm charts for all Grafana® products.


apiVersion: source.toolkit.fluxcd.io/v1beta1
kind: HelmRepository
metadata:
  name: grafana
  namespace: flux-system
spec:
  url: https://grafana.github.io/helm-charts
  interval: 10m

Grafana Loki™

For Grafana Loki™ we will use the helm chart hosted by Grafana Labs®.

Setting up the base overlay:


apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: grafana-loki
resources:
- namespace.yaml
- release.yaml


apiVersion: v1
kind: Namespace
metadata:
  name: grafana-loki

For this release there are a few caveats to mention:

  • Loki™ recording rules only work if:
    1. You have at least one rule defined (fake or working).
    2. The container has the correct permissions to write files. Pay special attention to (and the accompanying extraVolumes for):
      • loki.ruler.storage.local.directory
      • loki.ruler.storage.wal.dir
      • loki.ruler.remote_write
  • With multi-tenancy disabled, the default tenant name is fake. For my purposes this was sufficient, but it does mean you need to place your recording rules under ruler.directories.fake, where fake is the tenant name.
  • Having multi-tenancy disabled also requires mimir.structuredConfig.multitenancy_enabled: false in the Mimir helm chart; otherwise the metrics generated by recording rules are not shown in Mimir.

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: loki-distributed
spec:
  chart:
    spec:
      chart: loki-distributed
      interval: 1m
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: flux-system
      version: ">=0.66.4"
  interval: 5m
  values:
    fullnameOverride: loki
    loki:
      structuredConfig:
        ingester:
          chunk_idle_period: 1h
          chunk_target_size: 1536000
          max_chunk_age: 1h
          max_transfer_retries: 0
        limits_config:
          max_global_streams_per_user: 5000
          max_query_length: 0h
          max_query_parallelism: 32
          max_query_series: 11000
          max_streams_per_user: 0
        ruler:
          remote_write:
            client:
              url: http://grafana-mimir-nginx.grafana-mimir.svc.cluster.local/api/v1/push
            enabled: true
          rule_path: /tmp/loki/rules
          storage:
            local:
              directory: /etc/loki/rules
            type: local
          wal:
            dir: /tmp/loki/ruler-wal
        schema_config:
          configs:
          - from: "2022-12-02"
            index:
              period: 24h
              prefix: loki_index_
            object_store: aws
            schema: v11
            store: boltdb-shipper
        server:
          grpc_server_max_concurrent_streams: 1000
          grpc_server_max_recv_msg_size: 104857600
          grpc_server_max_send_msg_size: 104857600
        storage_config:
          filesystem:
            directory: /tmp/loki/data
          boltdb_shipper:
            shared_store: aws
    ruler:
      directories:
        fake:
          rules.yaml: |
            groups:
              - name: should_fire
                rules:
                  - alert: HighPercentageError
                    expr: |
                      sum(rate({app="foo", env="production"} |= "error" [5m])) by (job)
                        /
                      sum(rate({app="foo", env="production"}[5m])) by (job)
                        > 0.05
                    for: 10m
                    labels:
                        severity: page
                    annotations:
                        summary: High request latency
              - name: credentials_leak
                rules: 
                  - alert: http-credentials-leaked
                    annotations: 
                      message: "{{ $labels.job }} is leaking http basic auth credentials."
                    expr: 'sum by (cluster, job, pod) (count_over_time({namespace="prod"} |~ "http(s?)://(\\w+):(\\w+)@" [5m]) > 0)'
                    for: 10m
                    labels: 
                      severity: critical
      enabled: true
      extraVolumeMounts:
      - mountPath: /rules
        name: loki-rules-generated
      extraVolumes:
      - name: loki-rules-generated
        emptyDir: {}
    serviceAccount:
      create: true
      name: grafana-loki
    serviceMonitor:
      enabled: true
      labels:
        instance: primary
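
Since the caveats above concern recording rules, here is a minimal sketch of what one could look like alongside the alerting rules. The group name, selector {app="foo"}, and recorded metric name are hypothetical; it would live under ruler.directories.fake next to rules.yaml:

```yaml
# Hypothetical recording rule for the default "fake" tenant; the selector
# and the recorded metric name are illustrative only.
groups:
  - name: log_rates
    rules:
      - record: namespace:log_lines:rate5m
        expr: sum by (namespace) (rate({app="foo"}[5m]))
```

Once the ruler evaluates this, the resulting series is pushed to Mimir through the remote_write client configured above.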

Grafana®

For Grafana® we will use the helm chart hosted by Grafana Labs®.

Setting up the base overlay:


apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: grafana
resources:
- namespace.yaml
- release.yaml


apiVersion: v1
kind: Namespace
metadata:
  name: grafana

The Grafana® HelmRelease below also contains a list of dashboards related to this LGTMP stack, as well as those for Flux.

A few things I would like to mention about this HelmRelease:

  • The name of each provider in the dashboardProviders.dashboardproviders.yaml array correlates to a key in dashboards.
  • The chart is setup to expose Grafana® pprof data for Grafana Phlare.
  • You should update the host example.com to a domain you own or localhost.

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: grafana
spec:
  chart:
    spec:
      chart: grafana
      interval: 5m
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: flux-system
      version: ">=6.29.2"
  interval: 5m
  values:
    dashboardProviders:
      dashboardproviders.yaml:
        apiVersion: 1
        providers:
        - disableDeletion: false
          editable: false
          folder: flux
          name: flux
          options:
            path: /var/lib/grafana/dashboards/flux
          orgId: 1
          type: file
        - disableDeletion: false
          editable: false
          folder: grafana / Mimir
          name: grafana-mimir
          options:
            path: /var/lib/grafana/dashboards/grafana-mimir
          orgId: 1
          type: file
        - disableDeletion: false
          editable: false
          folder: grafana / Loki
          name: grafana-loki
          options:
            path: /var/lib/grafana/dashboards/grafana-loki
          orgId: 1
          type: file
    dashboards:
      flux:
        cluster:
          datasource: Mimir
          url: https://raw.githubusercontent.com/fluxcd/flux2/main/manifests/monitoring/monitoring-config/dashboards/cluster.json
        control-plane:
          datasource: Mimir
          url: https://raw.githubusercontent.com/fluxcd/flux2/main/manifests/monitoring/monitoring-config/dashboards/control-plane.json
        logs:
          datasource: Loki
          url: https://raw.githubusercontent.com/fluxcd/flux2/main/manifests/monitoring/monitoring-config/dashboards/logs.json
      grafana-loki:
        chunks:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-chunks.json
        deletion:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-deletion.json
        logs:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-logs.json
        operational:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-operational.json
        reads:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-reads.json
        reads-resources:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-reads-resources.json
        recording-rules:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-mixin-recording-rules.json
        retention:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-retention.json
        writes:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-writes.json
        writes-resources:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/loki/main/production/loki-mixin-compiled/dashboards/loki-writes-resources.json
      grafana-mimir:
        alertmanager:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-alertmanager.json
        alertmanager-resources:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-alertmanager-resources.json
        compactor:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-compactor.json
        compactor-resources:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-compactor-resources.json
        config:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-config.json
        object-store:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-object-store.json
        overrides:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-overrides.json
        overview:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-overview.json
        overview-networking:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-overview-networking.json
        overview-resources:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-overview-resources.json
        queries:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-queries.json
        reads:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-reads.json
        reads-networking:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-reads-networking.json
        reads-resources:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-reads-resources.json
        remote-ruler-reads:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-remote-ruler-reads.json
        remote-ruler-reads-resources:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-remote-ruler-reads-resources.json
        rollout-progress:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-rollout-progress.json
        ruler:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-ruler.json
        scaling:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-scaling.json
        slow-queries:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-slow-queries.json
        tenants:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-tenants.json
        top-tenants:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-top-tenants.json
        writes:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-writes.json
        writes-networking:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-writes-networking.json
        writes-resources:
          datasource: Mimir
          url: https://raw.githubusercontent.com/grafana/mimir/main/operations/mimir-mixin-compiled/dashboards/mimir-writes-resources.json
    datasources:
      datasources.yaml:
        apiVersion: 1
        datasources:
        - isDefault: true
          jsonData:
            timeInterval: 15s
          name: Mimir
          type: prometheus
          uid: mimir
          url: http://grafana-mimir-nginx.grafana-mimir.svc.cluster.local/prometheus/
        - isDefault: false
          name: Loki
          type: loki
          uid: loki
          url: http://loki-gateway.grafana-loki.svc.cluster.local/
        - isDefault: false
          jsonData:
            nodeGraph:
              enabled: true
          name: Tempo
          type: tempo
          uid: tempo
          url: http://grafana-tempo-query-frontend.grafana-tempo.svc.cluster.local:3100
        - name: Phlare
          type: phlare
          uid: phlare
          url: http://grafana-phlare-querier.grafana-phlare.svc.cluster.local:4100/
    env:
      GF_DIAGNOSTICS_PROFILING_ADDR: 0.0.0.0
      GF_DIAGNOSTICS_PROFILING_ENABLED: true
      GF_DIAGNOSTICS_PROFILING_PORT: 6060
      GF_FEATURE_TOGGLES_ENABLE: flameGraph
    grafana.ini:
      server:
        root_url: https://grafana.example.com
    ingress:
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt
        ingress.kubernetes.io/ssl-redirect: "true"
      enabled: true
      hosts:
      - grafana.example.com
      ingressClassName: haproxy
      tls:
      - hosts:
        - grafana.example.com
        secretName: grafana.example.com
    podAnnotations:
      phlare.grafana.com/port: "6060"
      phlare.grafana.com/scrape: "true"
    rbac:
      pspEnabled: false
      pspUseAppArmor: false
    sidecar:
      alerts:
        enabled: true
        ignoreAlreadyProcessed: true
        label: grafana-alert
        resource: both
        searchNamespace: ALL
      dashboards:
        enabled: true
        folder: /tmp/dashboards
        folderAnnotation: grafana-dashboard-folder
        label: grafana-dashboard
        labelValue: null
        provider:
          allowUiUpdates: false
          disableDelete: false
          foldersFromFilesStructure: true
          name: sidecarProvider
          orgid: 1
        searchNamespace: ALL
        watchMethod: WATCH
      enableUniqueFilenames: false
      plugins:
        enabled: true
        label: grafana-plugin
        labelValue: null
        resource: both
        searchNamespace: null
        watchMethod: WATCH
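
With the dashboard sidecar enabled above, any ConfigMap carrying the grafana-dashboard label is picked up automatically. A minimal sketch, where the ConfigMap name, folder annotation value, and dashboard JSON are placeholders:

```yaml
# Hypothetical ConfigMap discovered by the Grafana dashboard sidecar.
# The label key matches sidecar.dashboards.label; the annotation key matches
# sidecar.dashboards.folderAnnotation and controls the target folder.
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-dashboard
  labels:
    grafana-dashboard: ""
  annotations:
    grafana-dashboard-folder: /tmp/dashboards/my-folder
data:
  my-dashboard.json: |
    { "title": "My dashboard", "panels": [] }
```

Because searchNamespace is set to ALL, such a ConfigMap can live in any namespace.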

Grafana Tempo®

For Grafana Tempo® we will use the helm chart hosted by Grafana Labs®.

Setting up the base overlay:


apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: grafana-tempo
resources:
- namespace.yaml
- release.yaml

- agent/clusterrole.yaml
- agent/clusterrolebinding.yaml
- agent/deployment.yaml
- agent/service.yaml
- agent/serviceaccount.yaml
- agent/servicemonitor.yaml

configMapGenerator:
- name: grafana-agent-traces

patches:
- path: agent/configmap.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: grafana-tempo


apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: grafana-agent-traces
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  - events
  verbs:
  - get
  - list
  - watch
- nonResourceURLs:
  - /metrics
  verbs:
  - get


apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: grafana-agent-traces
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: grafana-agent-traces
subjects:
- kind: ServiceAccount
  name: grafana-agent-traces
  namespace: grafana-tempo

apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana-agent-traces
spec:
  minReadySeconds: 10
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      name: grafana-agent-traces
  template:
    metadata:
      labels:
        name: grafana-agent-traces
    spec:
      containers:
      - args:
        - -config.file=/etc/agent/agent.yaml
        - -server.http.address=0.0.0.0:3100
        command:
        - /bin/agent
        env:
        - name: HOSTNAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        image: grafana/agent:v0.30.2
        imagePullPolicy: IfNotPresent
        name: grafana-agent-traces
        ports:
        - containerPort: 3100
          name: http-metrics
        - containerPort: 4317
          name: otlp
          protocol: TCP
        volumeMounts:
        - mountPath: /etc/agent
          name: grafana-agent-traces
      serviceAccountName: grafana-agent-traces
      volumes:
      - configMap:
          name: grafana-agent-traces-29468dgd96
        name: grafana-agent-traces

apiVersion: v1
kind: Service
metadata:
  labels:
    name: grafana-agent-traces
  name: grafana-agent-traces
  namespace: grafana-tempo
spec:
  ports:
  - name: grafana-agent-traces-http-metrics
    port: 3100
    targetPort: 3100
  - name: grafana-agent-traces-otlp
    port: 4317
    protocol: TCP
    targetPort: 4317
  selector:
    name: grafana-agent-traces

apiVersion: v1
kind: ServiceAccount
metadata:
  name: grafana-agent-traces

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: grafana-agent-traces
  labels:
    instance: primary
spec:
  endpoints:
  - port: grafana-agent-traces-http-metrics
    relabelings:
    - replacement: grafana-tempo/agent
      sourceLabels:
      - job
      targetLabel: job
    scheme: http
  namespaceSelector:
    matchNames:
    - grafana-tempo
  selector:
    matchLabels:
      name: grafana-agent-traces

apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-agent-traces
data:
  agent.yaml: |
    traces:
      configs:
        - name: tempo
          batch:
            send_batch_size: 1000
            timeout: 5s
          receivers:
            otlp:
              protocols:
                grpc: null
                http: null
          service_graphs:
            enabled: true
          remote_write:
            - endpoint: grafana-tempo-distributor:4317
              insecure: true
          scrape_configs:
            - bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
              job_name: kubernetes-pods
              kubernetes_sd_configs:
                - role: pod
              relabel_configs:
                - action: replace
                  source_labels:
                    - __meta_kubernetes_namespace
                  target_label: namespace
                - action: replace
                  source_labels:
                    - __meta_kubernetes_pod_name
                  target_label: pod
                - action: replace
                  source_labels:
                    - __meta_kubernetes_pod_container_name
                  target_label: container
              tls_config:
                  ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
                  insecure_skip_verify: false
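
To get application traces into this pipeline, point your OpenTelemetry-instrumented workloads at the agent’s OTLP service. A sketch of the container env, using the standard OpenTelemetry SDK environment variables (the service name my-app is a placeholder):

```yaml
# Points the OpenTelemetry SDK at the grafana-agent-traces Service defined
# above (OTLP over gRPC on port 4317).
env:
- name: OTEL_EXPORTER_OTLP_ENDPOINT
  value: http://grafana-agent-traces.grafana-tempo.svc.cluster.local:4317
- name: OTEL_SERVICE_NAME
  value: my-app  # hypothetical service name
```

The agent batches the received spans and forwards them to the Tempo distributor via the remote_write endpoint in the ConfigMap above.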

And now for Grafana Tempo® itself:

The default value of tempo.structuredConfig.query_frontend.search.max_duration is 1h. This release sets it to 12h0m0s so you can actually search further back than just one hour.


apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: grafana-tempo
spec:
  chart:
    spec:
      chart: tempo-distributed
      interval: 15m
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: flux-system
      version: ">=1.2.2"
  interval: 15m
  values:
    compactor:
      config:
        compaction:
          block_retention: 12h
      resources:
        limits:
          memory: 2Gi
        requests:
          memory: 2Gi
    distributor:
      autoscaling:
        enabled: false
      config:
        log_received_spans:
          enabled: true
          filter_by_status_error: true
          include_all_attributes: true
      replicas: 1
    global:
    global_overrides:
      max_bytes_per_tag_values_query: 0
      metrics_generator_processors:
      - service-graphs
      - span-metrics
    ingester:
      autoscaling:
        enabled: false
      replicas: 2
    metaMonitoring:
      grafanaAgent:
        enabled: false
      serviceMonitor:
        enabled: true
        labels:
          instance: primary
    metricsGenerator:
      config:
        storage:
          remote_write:
          - url: http://grafana-mimir-nginx.grafana-mimir.svc.cluster.local/api/v1/push
      enabled: true
    minio:
      enabled: false
    search:
      enabled: true
    serviceAccount:
      name: grafana-tempo
    storage:
      trace:
        backend: local
    tempo:
      structuredConfig:
        query_frontend:
          search:
            max_duration: 12h0m0s
        storage:
          trace:
            block:
              version: vParquet
            pool:
              max_workers: 100
              queue_depth: 10000
    traces:
      otlp:
        grpc:
          enabled: true

Grafana Mimir

For Grafana Mimir® we will use the helm chart hosted by Grafana Labs®.

Setting up the base overlay:


apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: grafana-mimir
resources:
- namespace.yaml
- release.yaml


apiVersion: v1
kind: Namespace
metadata:
  name: grafana-mimir


apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: grafana-mimir
spec:
  chart:
    spec:
      chart: mimir-distributed
      interval: 5m
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: flux-system
      version: ">=4.2.0"
  interval: 5m
  values:
    alertmanager:
      persistentVolume:
        enabled: true
      replicas: 1
      statefulSet:
        enabled: true
    compactor:
      persistentVolume:
        size: 50Gi
    distributor:
      replicas: 1
    ingester:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: target
                operator: In
                values:
                - ingester
            topologyKey: kubernetes.io/hostname
      persistentVolume:
        size: 50Gi
      replicas: 3
      zoneAwareReplication:
        enabled: false
    metaMonitoring:
      serviceMonitor:
        enabled: true
        labels:
          instance: primary
    mimir:
      structuredConfig:
        blocks_storage:
          backend: filesystem
        frontend:
          align_queries_with_step: true
          log_queries_longer_than: 10s
        ingester_client:
          grpc_client_config:
            max_recv_msg_size: 104857600
            max_send_msg_size: 104857600
        limits:
          ingestion_burst_size: 2000000
          ingestion_rate: 200000
          max_global_series_per_user: 0
          out_of_order_time_window: 30m
        multitenancy_enabled: false
        server:
          grpc_server_max_concurrent_streams: 1000
          grpc_server_max_recv_msg_size: 104857600
          grpc_server_max_send_msg_size: 104857600
    minio:
      enabled: false
    overrides_exporter:
      replicas: 1
    querier:
      replicas: 2
    query_frontend:
      replicas: 1
    rbac:
      create: false
    rollout_operator:
      enabled: false
    ruler:
      replicas: 1
    store_gateway:
      persistentVolume:
        size: 50Gi
      replicas: 1
      zoneAwareReplication:
        enabled: false
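
Mimir exposes its Prometheus remote write endpoint through the nginx gateway, the same URL the Loki™ ruler and Tempo® metrics-generator above push to. Any Prometheus-compatible writer can use it; a minimal remote_write sketch:

```yaml
# Prometheus-style remote_write targeting Mimir's push endpoint. With
# multitenancy_enabled: false, no X-Scope-OrgID header is required.
remote_write:
- url: http://grafana-mimir-nginx.grafana-mimir.svc.cluster.local/api/v1/push
```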

Grafana Phlare

For Grafana Phlare we will use the helm chart hosted by Grafana Labs®.

Setting up the base overlay:


apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: grafana-phlare
resources:
- namespace.yaml
- release.yaml


apiVersion: v1
kind: Namespace
metadata:
  name: grafana-phlare

apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: grafana-phlare
spec:
  chart:
    spec:
      chart: phlare
      interval: 5m
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: flux-system
      version: ">=0.1.0"
  interval: 5m
  values:
    minio:
      enabled: false
    phlare:
      components:
        agent:
          kind: Deployment
          replicaCount: 1
          resources:
            limits:
              memory: 512Mi
            requests:
              cpu: 50m
              memory: 128Mi
        distributor:
          kind: Deployment
          replicaCount: 2
          resources:
            limits:
              memory: 1Gi
            requests:
              cpu: 500m
              memory: 256Mi
        ingester:
          kind: StatefulSet
          replicaCount: 3
        querier:
          kind: Deployment
          replicaCount: 2
          resources:
            limits:
              memory: 1Gi
            requests:
              cpu: 100m
              memory: 256Mi
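
The Phlare agent discovers scrape targets through pod annotations, as the Grafana® release above demonstrates for its own pprof endpoint. Profiling another workload would follow the same pattern; here the port 8080 is an assumption for your application:

```yaml
# Hypothetical pod annotations for Phlare's annotation-based discovery,
# mirroring the podAnnotations set on the Grafana deployment above.
metadata:
  annotations:
    phlare.grafana.com/scrape: "true"
    phlare.grafana.com/port: "8080"  # your app's pprof port
```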

Topology

Once everything is deployed, you’ll end up with an architecture that looks something like this:

(Topology diagram: the grafana-loki, grafana, grafana-tempo, grafana-mimir, and grafana-phlare namespaces, each running the Deployments and StatefulSets of its component — distributors, ingesters, queriers, query-frontends, rulers, compactors, store-gateways, and gateways — with Grafana® reading from the Mimir, Loki, Tempo, and Phlare datasources.)

Disclaimer

The Grafana Labs Marks are trademarks of Grafana Labs, and are used with Grafana Labs’ permission. We are not affiliated with, endorsed or sponsored by Grafana Labs or its affiliates.