# Manifests

This page covers the app repo structure and key manifest configurations for the observability stack.
**Observability stack series**

- Observability stack
- Architecture
- Manifests (you are here)
- Flux integration
- Operations
## App repo structure

The app repo `observability/monitoring` contains all Kubernetes manifests for the monitoring stack.
```text
observability/monitoring/
├── .sops.yaml                          # SOPS encryption config
├── README.md
└── k8s/prod/
    ├── kustomization.yaml
    ├── 00-secret-slack.enc.yaml        # SOPS-encrypted Slack webhook
    ├── 10-helmrelease-kube-prom.yaml   # Prometheus stack
    ├── 20-helmrelease-loki.yaml        # Loki
    ├── 30-helmrelease-alloy.yaml       # Log collector
    ├── 40-uptime-kuma/                 # External monitoring
    │   ├── kustomization.yaml
    │   ├── 10-pvc.yaml
    │   ├── 20-deployment.yaml
    │   └── 30-service.yaml
    ├── 50-ingress-grafana.yaml         # Grafana ingress
    └── 51-ingress-uptime-kuma.yaml     # Uptime Kuma ingress
```
## SOPS configuration

The `.sops.yaml` at the repo root configures encryption for secrets:

```yaml
# .sops.yaml
creation_rules:
  - path_regex: k8s/.*\.enc\.ya?ml$
    encrypted_regex: '^(data|stringData)$'
    age: age1your-public-key-here...  # comma-separated age recipients
```

Encrypt the Slack secret in place before committing. Pushing an unencrypted secret to Git exposes your webhook URL.

```sh
sops -e -i k8s/prod/00-secret-slack.enc.yaml
```
## Slack secret

Alertmanager reads the Slack webhook URL from a Kubernetes Secret.

```yaml
# k8s/prod/00-secret-slack.enc.yaml (before encryption)
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-slack
  namespace: monitoring
type: Opaque
stringData:
  webhook-url: "https://hooks.slack.com/services/YOUR/WEBHOOK/URL"
```
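After `sops -e -i`, only the values matched by `encrypted_regex` (here `stringData`) become ciphertext; keys and metadata stay readable, so the file still produces meaningful Git diffs. The committed file looks roughly like this sketch (the trailing `sops` metadata block is abbreviated and its exact fields vary by sops version):

```yaml
# k8s/prod/00-secret-slack.enc.yaml (after encryption, abbreviated)
apiVersion: v1
kind: Secret
metadata:
  name: alertmanager-slack
  namespace: monitoring
type: Opaque
stringData:
  webhook-url: ENC[AES256_GCM,data:...,iv:...,tag:...,type:str]
sops:
  age:
    - recipient: age1your-public-key-here...
      enc: |
        -----BEGIN AGE ENCRYPTED FILE-----
        ...
        -----END AGE ENCRYPTED FILE-----
  encrypted_regex: ^(data|stringData)$
```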
## Version pinning

All HelmReleases use pinned versions rather than ranges or `latest`. This ensures:
- Reproducible deployments - the same commit always produces the same cluster state
- Controlled upgrades - changes only happen when you explicitly update the version
- Rollback safety - you can revert to a known-good version via Git history
When upgrading, test the new version in a dev environment first, then update the version in the manifest and commit.
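An upgrade commit then touches a single line; for example (the target version here is illustrative):

```diff
 # k8s/prod/10-helmrelease-kube-prom.yaml
 chart:
   spec:
     chart: kube-prometheus-stack
-    version: "80.6.0"
+    version: "80.7.0"
```

Flux reconciles the new version on its next interval, and reverting the commit rolls the chart back.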
## kube-prometheus-stack HelmRelease

The main HelmRelease deploys Prometheus, Alertmanager, Grafana, node-exporter, and kube-state-metrics.
```yaml
# k8s/prod/10-helmrelease-kube-prom.yaml (key sections)
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: kube-prometheus-stack
  namespace: monitoring
spec:
  interval: 1h
  timeout: 15m
  chart:
    spec:
      chart: kube-prometheus-stack
      version: "80.6.0"
      sourceRef:
        kind: HelmRepository
        name: prometheus-community
        namespace: monitoring
  values:
    alertmanager:
      config:
        route:
          receiver: slack
          group_by: [alertname, namespace]
          routes:
            - receiver: "null"
              matchers:
                - alertname = "Watchdog"
            - receiver: slack
              matchers:
                - severity =~ "critical|warning"
        receivers:
          - name: "null"
          - name: slack
            slack_configs:
              - api_url_file: /etc/alertmanager/secrets/alertmanager-slack/webhook-url
                channel: "#alerts"
                send_resolved: true
      alertmanagerSpec:
        secrets:
          - alertmanager-slack
        storage:
          volumeClaimTemplate:
            spec:
              resources:
                requests:
                  storage: 5Gi
    # Disable kube-proxy monitoring (no metrics endpoint)
    kubeProxy:
      enabled: false
    # Disable noisy overcommit alerts
    defaultRules:
      disabled:
        KubeMemoryOvercommit: true
        KubeCPUOvercommit: true
    prometheus:
      prometheusSpec:
        retention: 15d
        storageSpec:
          volumeClaimTemplate:
            spec:
              resources:
                requests:
                  storage: 50Gi
    grafana:
      enabled: true
      persistence:
        enabled: true
        size: 5Gi
      additionalDataSources:
        - name: Loki
          type: loki
          url: http://loki:3100
          access: proxy
```
Key configurations:

| Setting | Value | Purpose |
|---|---|---|
| `alertmanager.config.route.routes` | Watchdog to `"null"` | Silence the dead man's switch alert |
| `alertmanager.config.receivers` | Slack webhook from secret | Alert notifications |
| `alertmanager.alertmanagerSpec.secrets` | `alertmanager-slack` | Mount the secret into Alertmanager |
| `kubeProxy.enabled` | `false` | Disable scraping (no metrics endpoint available) |
| `defaultRules.disabled` | Overcommit alerts | Disable noisy capacity alerts |
| `prometheus.prometheusSpec.retention` | `15d` | Metrics retention period |
| `prometheus.prometheusSpec.storageSpec` | `50Gi` | Metrics storage |
| `grafana.additionalDataSources` | Loki | Pre-configure the Loki connection |
## Loki HelmRelease

Loki runs in single-binary mode for simplicity.
```yaml
# k8s/prod/20-helmrelease-loki.yaml (key sections)
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: loki
  namespace: monitoring
spec:
  chart:
    spec:
      chart: loki
      version: "6.46.0"
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: monitoring
  values:
    deploymentMode: SingleBinary
    loki:
      auth_enabled: false
      commonConfig:
        replication_factor: 1
      storage:
        type: filesystem
    singleBinary:
      replicas: 1
      persistence:
        enabled: true
        size: 20Gi
    backend:
      replicas: 0
    read:
      replicas: 0
    write:
      replicas: 0
```
## Grafana Alloy HelmRelease

Alloy collects logs from all pods and pushes them to Loki.
```yaml
# k8s/prod/30-helmrelease-alloy.yaml (key sections)
apiVersion: helm.toolkit.fluxcd.io/v2
kind: HelmRelease
metadata:
  name: alloy
  namespace: monitoring
spec:
  chart:
    spec:
      chart: alloy
      version: "1.5.1"
      sourceRef:
        kind: HelmRepository
        name: grafana
        namespace: monitoring
  dependsOn:
    - name: loki
  values:
    alloy:
      configMap:
        content: |
          discovery.kubernetes "pods" {
            role = "pod"
          }

          discovery.relabel "pods" {
            targets = discovery.kubernetes.pods.targets

            rule {
              source_labels = ["__meta_kubernetes_pod_phase"]
              regex         = "Running"
              action        = "keep"
            }
            rule {
              source_labels = ["__meta_kubernetes_namespace"]
              target_label  = "namespace"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_name"]
              target_label  = "pod"
            }
            rule {
              source_labels = ["__meta_kubernetes_pod_container_name"]
              target_label  = "container"
            }
          }

          loki.source.kubernetes "pods" {
            targets    = discovery.relabel.pods.output
            forward_to = [loki.write.default.receiver]
          }

          loki.write "default" {
            endpoint {
              url = "http://loki:3100/loki/api/v1/push"
            }
          }
    controller:
      type: daemonset
```
The `dependsOn` entry ensures Loki is ready before Alloy starts pushing logs.
*(Diagram: HelmRelease dependencies)*
## Uptime Kuma deployment

Uptime Kuma runs as a standard Kubernetes Deployment with a PVC for persistence.
```yaml
# k8s/prod/40-uptime-kuma/20-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: uptime-kuma
  namespace: monitoring
spec:
  replicas: 1
  strategy:
    type: Recreate  # PVC is RWO; avoid two pods mounting it during rollout
  selector:
    matchLabels:
      app: uptime-kuma
  template:
    metadata:
      labels:
        app: uptime-kuma
    spec:
      containers:
        - name: uptime-kuma
          image: louislam/uptime-kuma:1.23.16
          ports:
            - containerPort: 3001
          volumeMounts:
            - name: data
              mountPath: /app/data
          livenessProbe:
            httpGet:
              path: /
              port: 3001
          readinessProbe:
            httpGet:
              path: /
              port: 3001
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: uptime-kuma-data
```
## Ingress

Two Ingress resources expose Grafana and Uptime Kuma on the local network using the wildcard certificate.
```yaml
# k8s/prod/50-ingress-grafana.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana-ingress
  namespace: monitoring
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - grafana.example.local
  rules:
    - host: "grafana.example.local"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kube-prometheus-stack-grafana
                port:
                  number: 80
```
```yaml
# k8s/prod/51-ingress-uptime-kuma.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: uptime-kuma-ingress
  namespace: monitoring
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - uptime.example.local
  rules:
    - host: "uptime.example.local"
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: uptime-kuma
                port:
                  number: 3001
```
| Service | Host | Port |
|---|---|---|
| Grafana | grafana.example.local | 80 |
| Uptime Kuma | uptime.example.local | 3001 |
TLS is handled by the cluster wildcard certificate for `*.example.local`.
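The `tls` blocks above omit `secretName`, which assumes the wildcard certificate is installed as the ingress controller's default certificate. If the certificate instead lives in a secret in the `monitoring` namespace, reference it explicitly; a sketch, with an illustrative secret name:

```yaml
tls:
  - hosts:
      - grafana.example.local
    secretName: wildcard-example-local-tls  # hypothetical secret holding the wildcard cert
```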
## Kustomization

The kustomization ties all resources together:

```yaml
# k8s/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - 00-secret-slack.enc.yaml
  - 10-helmrelease-kube-prom.yaml
  - 20-helmrelease-loki.yaml
  - 30-helmrelease-alloy.yaml
  - 40-uptime-kuma
  - 50-ingress-grafana.yaml
  - 51-ingress-uptime-kuma.yaml
```
Note that the HelmReleases reference HelmRepositories in the `monitoring` namespace, not `flux-system`. The HelmRepositories are created via the Flux config repo.
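Those HelmRepository objects would look something like this sketch (the interval and file layout are illustrative; the chart URLs are the upstream defaults for the two charts used here):

```yaml
# In the Flux config repo: chart sources in the monitoring namespace
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: prometheus-community
  namespace: monitoring
spec:
  interval: 1h
  url: https://prometheus-community.github.io/helm-charts
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: HelmRepository
metadata:
  name: grafana
  namespace: monitoring
spec:
  interval: 1h
  url: https://grafana.github.io/helm-charts
```

Keeping the sources in the same namespace as the HelmReleases lets the `sourceRef` entries resolve without cross-namespace references.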