Manifests

info

This page covers the app repo structure and key manifest configurations for the observability stack.

Observability stack series

  1. Observability stack
  2. Architecture
  3. Manifests - You are here
  4. Flux integration
  5. Operations

App repo structure

The app repo observability/monitoring contains all Kubernetes manifests for the monitoring stack.

observability/monitoring/
├── .sops.yaml                        # SOPS encryption config
├── README.md
└── k8s/prod/
    ├── kustomization.yaml
    ├── 00-secret-slack.enc.yaml      # SOPS-encrypted Slack webhook
    ├── 10-helmrelease-kube-prom.yaml # Prometheus stack
    ├── 20-helmrelease-loki.yaml      # Loki
    ├── 30-helmrelease-alloy.yaml     # Log collector
    ├── 40-uptime-kuma/               # External monitoring
    │   ├── kustomization.yaml
    │   ├── 10-pvc.yaml
    │   ├── 20-deployment.yaml
    │   └── 30-service.yaml
    ├── 50-ingress-grafana.yaml       # Grafana ingress
    └── 51-ingress-uptime-kuma.yaml   # Uptime Kuma ingress

SOPS configuration

The .sops.yaml at the repo root configures encryption for secrets:

---
# .sops.yaml
creation_rules:
  - path_regex: k8s/.*\.enc\.ya?ml$
    encrypted_regex: '^(data|stringData)$'
    age: ['age1your-public-key-here...']
warning

Encrypt the Slack secret before committing. Pushing unencrypted secrets to Git exposes your webhook URL.

sops -e -i k8s/prod/00-secret-slack.enc.yaml
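A quick pre-commit check (a sketch, not part of the repo) can catch an unencrypted file: after `sops -e -i`, the file gains a top-level `sops:` metadata block, so its absence means the secret is still plaintext.

```shell
# Sketch: fail fast if a *.enc.yaml file has no SOPS metadata (i.e. is still plaintext).
is_encrypted() {
  grep -q '^sops:' "$1"
}

# Demo against a stand-in file; in the repo you would pass
# k8s/prod/00-secret-slack.enc.yaml instead.
tmp=$(mktemp)
printf 'apiVersion: v1\nkind: Secret\nsops:\n  age: []\n' > "$tmp"
if is_encrypted "$tmp"; then
  echo "encrypted: safe to commit"
else
  echo "plaintext: run sops -e -i first" >&2
fi
rm -f "$tmp"
```

Wired into a pre-commit hook, this check stops a webhook URL from ever reaching Git history.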

Slack secret

The Alertmanager reads the Slack webhook URL from a Kubernetes secret.

# k8s/prod/00-secret-slack.enc.yaml (before encryption)
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-slack
  namespace: monitoring
type: Opaque
stringData:
  webhook-url: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"

Version pinning

All HelmReleases use pinned versions rather than ranges or latest. This ensures:

  • Reproducible deployments - the same commit always produces the same cluster state
  • Controlled upgrades - changes only happen when you explicitly update the version
  • Rollback safety - you can revert to a known-good version via Git history

When upgrading, test the new version in a dev environment first, then update the version in the manifest and commit.
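The bump itself is a one-line edit. A minimal sketch (the target version "80.7.0" is a stand-in; check the chart's release notes for a real one, and run this against the actual manifest rather than a temp file):

```shell
# Sketch: bump the pinned chart version in place and confirm the change.
f=$(mktemp)   # in the repo this would be k8s/prod/10-helmrelease-kube-prom.yaml
cat > "$f" <<'EOF'
      chart: kube-prometheus-stack
      version: "80.6.0"
EOF
sed -i 's/version: "80\.6\.0"/version: "80.7.0"/' "$f"
grep 'version:' "$f"   # prints the bumped pin
```

Commit the change on its own so a `git revert` of that single commit is a clean rollback.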

kube-prometheus-stack HelmRelease

The main HelmRelease deploys Prometheus, Alertmanager, Grafana, node-exporter, and kube-state-metrics.

# k8s/prod/10-helmrelease-kube-prom.yaml (key sections)
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: kube-prometheus-stack
  namespace: monitoring
spec:
  interval: 1h
  timeout: 15m
  chart:
    spec:
      chart: kube-prometheus-stack
      version: "80.6.0"
      sourceRef:
        kind: HelmRepository
        name: prometheus-community
        namespace: monitoring
  values:
    alertmanager:
      config:
        route:
          receiver: slack
          group_by: [alertname, namespace]
          routes:
            - receiver: "null"
              matchers:
                - alertname = "Watchdog"
            - receiver: slack
              matchers:
                - severity =~ "critical|warning"
        receivers:
          - name: "null"
          - name: slack
            slack_configs:
              - api_url_file: /etc/alertmanager/secrets/alertmanager-slack/webhook-url
                channel: "#alerts"
                send_resolved: true
      alertmanagerSpec:
        secrets:
          - alertmanager-slack
        storage:
          volumeClaimTemplate:
            spec:
              resources:
                requests:
                  storage: 5Gi
    # Disable kube-proxy monitoring (no metrics endpoint)
    kubeProxy:
      enabled: false
    # Disable noisy overcommit alerts
    defaultRules:
      disabled:
        KubeMemoryOvercommit: true
        KubeCPUOvercommit: true
    prometheus:
      prometheusSpec:
        retention: 15d
        storageSpec:
          volumeClaimTemplate:
            spec:
              resources:
                requests:
                  storage: 50Gi
    grafana:
      enabled: true
      persistence:
        enabled: true
        size: 5Gi
      additionalDataSources:
        - name: Loki
          type: loki
          url: http://loki:3100
          access: proxy

Key configurations:

| Setting | Value | Purpose |
|---------|-------|---------|
| `alertmanager.config.routes` | Watchdog to `null` | Silence dead man's switch alert |
| `alertmanager.config.receivers` | Slack webhook from secret | Alert notifications |
| `alertmanagerSpec.secrets` | `alertmanager-slack` | Mount secret into Alertmanager |
| `kubeProxy.enabled` | `false` | Disable (no metrics endpoint available) |
| `defaultRules.disabled` | Overcommit alerts | Disable noisy capacity alerts |
| `prometheus.retention` | `15d` | Metrics retention period |
| `prometheus.storage` | `50Gi` | Metrics storage |
| `grafana.additionalDataSources` | Loki | Pre-configure Loki connection |
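The routing tree is extensible. As a hypothetical example (the receiver name and namespace below are made up), a team-specific route placed before the catch-all `slack` route could steer one namespace's alerts to its own channel:

```yaml
# Hypothetical addition under alertmanager.config.route.routes;
# "slack-payments" would also need a matching entry under receivers.
- receiver: slack-payments
  matchers:
    - namespace = "payments"
```

Alertmanager evaluates routes in order, so more specific matchers must come before broader ones.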

Loki HelmRelease

Loki runs in single-binary mode for simplicity.

# k8s/prod/20-helmrelease-loki.yaml (key sections)
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: loki
  namespace: monitoring
spec:
  chart:
    spec:
      chart: loki
      version: "6.46.0"
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: monitoring
  values:
    deploymentMode: SingleBinary
    loki:
      auth_enabled: false
      commonConfig:
        replication_factor: 1
      storage:
        type: filesystem
    singleBinary:
      replicas: 1
      persistence:
        enabled: true
        size: 20Gi
    backend:
      replicas: 0
    read:
      replicas: 0
    write:
      replicas: 0
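With filesystem storage, Loki keeps logs until the PVC fills unless retention is configured. A hedged values fragment to cap retention at 30 days (the exact keys below are assumptions to verify against the chart's values reference for your chart version):

```yaml
loki:
  limits_config:
    retention_period: 30d
  compactor:
    retention_enabled: true
    delete_request_store: filesystem
```

Retention is enforced by the compactor, so both settings are needed together.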

Grafana Alloy HelmRelease

Alloy collects logs from all pods and pushes to Loki.

# k8s/prod/30-helmrelease-alloy.yaml (key sections)
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: alloy
  namespace: monitoring
spec:
  chart:
    spec:
      chart: alloy
      version: "1.5.1"
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: monitoring
  dependsOn:
    - name: loki
  values:
    alloy:
      configMap:
        content: |
          discovery.kubernetes "pods" {
            role = "pod"
          }

          discovery.relabel "pods" {
            targets = discovery.kubernetes.pods.targets
            rule {
              source_labels = ["__meta_kubernetes_pod_phase"]
              regex         = "Running"
              action        = "keep"
            }
            rule {
              source_labels = ["__meta_kubernetes_namespace"]
              target_label  = "namespace"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_name"]
              target_label  = "pod"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_container_name"]
              target_label  = "container"
            }
          }

          loki.source.kubernetes "pods" {
            targets    = discovery.relabel.pods.output
            forward_to = [loki.write.default.receiver]
          }

          loki.write "default" {
            endpoint {
              url = "http://loki:3100/loki/api/v1/push"
            }
          }
    controller:
      type: daemonset

The dependsOn ensures Loki is ready before Alloy starts pushing logs.
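The relabel block can also drop noisy sources before they reach Loki. A hypothetical extra rule inside `discovery.relabel "pods"` (the namespace choice is illustrative):

```alloy
// Hypothetical: stop collecting logs from kube-system.
rule {
  source_labels = ["__meta_kubernetes_namespace"]
  regex         = "kube-system"
  action        = "drop"
}
```

Dropping at the collector is cheaper than storing and then filtering in queries.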

HelmRelease dependencies

Uptime Kuma deployment

Uptime Kuma is deployed as a standard Kubernetes deployment with a PVC for persistence.

# k8s/prod/40-uptime-kuma/20-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: uptime-kuma
  namespace: monitoring
spec:
  replicas: 1
  strategy:
    type: Recreate
  # selector and pod labels are required for a valid Deployment
  selector:
    matchLabels:
      app: uptime-kuma
  template:
    metadata:
      labels:
        app: uptime-kuma
    spec:
      containers:
        - name: uptime-kuma
          image: louislam/uptime-kuma:1.23.16
          ports:
            - containerPort: 3001
          volumeMounts:
            - name: data
              mountPath: /app/data
          livenessProbe:
            httpGet:
              path: /
              port: 3001
          readinessProbe:
            httpGet:
              path: /
              port: 3001
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: uptime-kuma-data
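The referenced claim lives in `10-pvc.yaml`. A sketch of what it might contain (the size is an assumption, and the cluster's default storage class is assumed):

```yaml
# k8s/prod/40-uptime-kuma/10-pvc.yaml (sketch)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: uptime-kuma-data
  namespace: monitoring
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
```

The `Recreate` strategy above pairs with `ReadWriteOnce`: the old pod must release the volume before the new one can mount it.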

Ingress

Two ingress resources expose Grafana and Uptime Kuma on the local network using the wildcard certificate.

# k8s/prod/50-ingress-grafana.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - grafana.example.local
  rules:
    - host: "grafana.example.local"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kube-prometheus-stack-grafana
                port:
                  number: 80

# k8s/prod/51-ingress-uptime-kuma.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: uptime-kuma-ingress
  namespace: monitoring
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - uptime.example.local
  rules:
    - host: "uptime.example.local"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: uptime-kuma
                port:
                  number: 3001
| Service | Host | Port |
|---------|------|------|
| Grafana | grafana.example.local | 80 |
| Uptime Kuma | uptime.example.local | 3001 |

TLS is handled by the cluster wildcard certificate for *.example.local.
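Note that the `tls` entries above carry no `secretName`, which only works when the wildcard certificate is installed as the ingress controller's default certificate. Otherwise each Ingress must reference the certificate secret explicitly (the secret name below is hypothetical):

```yaml
tls:
  - hosts:
      - grafana.example.local
    secretName: wildcard-example-local-tls
```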

Kustomization

The kustomization ties all resources together:

# k8s/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - 00-secret-slack.enc.yaml
  - 10-helmrelease-kube-prom.yaml
  - 20-helmrelease-loki.yaml
  - 30-helmrelease-alloy.yaml
  - 40-uptime-kuma
  - 50-ingress-grafana.yaml
  - 51-ingress-uptime-kuma.yaml

Note that HelmReleases reference HelmRepositories in the monitoring namespace, not flux-system. The HelmRepositories are created via the Flux config repo.