
Deployment

This page covers deploying the school platform to Kubernetes with Flux GitOps, including NFS storage, database backup infrastructure, CI/CD pipeline, and operational commands.

School Platform series

  1. School Platform
  2. Architecture
  3. AI features
  4. Calendar integration
  5. Deployment - You are here

Deployment patterns

All patterns mirror the coaching platform for consistency across my cluster:

| Pattern | Implementation |
| --- | --- |
| Image tagging | prod-YYYYMMDD.{pipeline_id} |
| PV naming | {app}-{env}-pv, {app}-{env}-db-pv |
| Secret naming | {app}-app-secret, {app}-db-secret, {app}-registry |
| DB deployment | Deployment with Recreate strategy (not StatefulSet) |
| Secret encryption | SOPS with age |

Two-repo pattern

Two repositories are involved:

  1. App repo (school/platform) - application code and k8s/prod/ manifests
  2. flux-config repo - Flux objects in clusters/my-cluster/school-platform/

NFS storage

The platform uses three NFS paths from the QNAP NAS:

| PV path | Purpose | Created by |
| --- | --- | --- |
| /school | Platform storage (evidence uploads) | NAS admin |
| /school/db | PostgreSQL data | NAS admin |
| /school/backups | Database backups | NAS admin |

The database volume mount uses subPath: postgres to create a subdirectory within the NFS mount.

Database as Deployment

PostgreSQL runs as a Deployment with Recreate strategy rather than a StatefulSet. This is simpler for single-replica databases on NFS:

  • strategy.type: Recreate ensures the old pod stops before the new one starts (required for single-writer persistent storage)
  • securityContext.fsGroup: 999 (postgres group) is required for NFS file permissions
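The two settings above can be checked against the live cluster. The following is a minimal bash sketch, not part of the original setup: the function name `check_db_deployment` is hypothetical, and the namespace and Deployment name are taken from the operational commands at the end of this page.

```shell
#!/usr/bin/env bash
# Sketch: confirm the database Deployment uses Recreate and fsGroup 999.
# check_db_deployment is a hypothetical helper, not part of the platform.
check_db_deployment() {
  local ns="${1:-school}"
  local name="${2:-school-platform-db}"
  local strategy fsgroup

  # Deployment-level rollout strategy
  strategy=$(kubectl get deployment "$name" -n "$ns" \
    -o jsonpath='{.spec.strategy.type}')
  # Pod-level security context, where fsGroup is set
  fsgroup=$(kubectl get deployment "$name" -n "$ns" \
    -o jsonpath='{.spec.template.spec.securityContext.fsGroup}')

  if [ "$strategy" != "Recreate" ]; then
    echo "unexpected strategy: $strategy" >&2
    return 1
  fi
  if [ "$fsgroup" != "999" ]; then
    echo "unexpected fsGroup: $fsgroup" >&2
    return 1
  fi
  echo "ok: strategy=$strategy fsGroup=$fsgroup"
}
```

Calling `check_db_deployment` with no arguments checks `school-platform-db` in the `school` namespace.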

Init container for schema sync

The app deployment includes a schema-sync init container that runs psql commands to ensure required tables and columns exist before the application starts. This replaces running prisma db push in production.

All statements use IF NOT EXISTS or ADD COLUMN IF NOT EXISTS so the init container is idempotent and safe to run on every deployment. The SQL must be kept in sync with the Prisma schema.
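As an illustration of what such an idempotent script might look like, here is a minimal bash sketch. The table, column, and index names are hypothetical placeholders, not the platform's real schema, and `DATABASE_URL` is assumed to be injected from the app secret.

```shell
#!/usr/bin/env bash
# Sketch of an idempotent schema-sync step.
# The evidence table and its columns are hypothetical examples.
schema_sync_sql() {
  cat <<'SQL'
CREATE TABLE IF NOT EXISTS evidence (
  id         TEXT PRIMARY KEY,
  created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
ALTER TABLE evidence ADD COLUMN IF NOT EXISTS file_path TEXT;
CREATE INDEX IF NOT EXISTS evidence_created_at_idx ON evidence (created_at);
SQL
}

run_schema_sync() {
  # ON_ERROR_STOP makes psql exit non-zero on the first failing statement,
  # so the init container fails and the app container does not start.
  schema_sync_sql | psql "$DATABASE_URL" -v ON_ERROR_STOP=1 --quiet
}
```

Because every statement is guarded, running `run_schema_sync` a second time against the same database should leave it unchanged.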

CI/CD pipeline

The GitLab CI pipeline has three stages:

| Stage | Job | When | What |
| --- | --- | --- | --- |
| lint | lint | All pushes | ESLint check |
| test | test | All pushes | Test script |
| build | build:main | Push to main only | Build and push Docker image |

Images are tagged prod-YYYYMMDD.{pipeline_id}. The pipeline is skipped when the commit message contains [skip ci]. Flux ImageUpdateAutomation uses this marker in its commits so that its image tag updates do not trigger another build and create an infinite loop.
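The tag format can be sketched in bash as follows. This is an illustration only: the function names are hypothetical, and it assumes the pipeline ID comes from GitLab's predefined `CI_PIPELINE_ID` variable and that the date is taken in UTC, neither of which is confirmed here.

```shell
#!/usr/bin/env bash
# Sketch: build and validate a prod-YYYYMMDD.{pipeline_id} tag.
# make_image_tag and is_prod_tag are hypothetical helper names.
make_image_tag() {
  # $1: pipeline ID; $2: optional date override (YYYYMMDD)
  local pipeline_id="$1"
  local date_part="${2:-$(date -u +%Y%m%d)}"
  echo "prod-${date_part}.${pipeline_id}"
}

is_prod_tag() {
  # Succeeds only for tags of the form prod-<8 digits>.<digits>
  printf '%s\n' "$1" | grep -Eq '^prod-[0-9]{8}\.[0-9]+$'
}
```

In a CI job this might be used as `IMAGE_TAG=$(make_image_tag "$CI_PIPELINE_ID")`. Because the date comes first and has a fixed width, tags from different days sort chronologically.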

Database backup infrastructure

The platform includes automated daily backups with 30-day retention and Slack notifications.

Backup CronJob

A Kubernetes CronJob runs daily at 3:00 AM AWST (Australia/Perth timezone):

  1. Creates a timestamped SQL dump using pg_dump | gzip
  2. Saves to /backups/school_platform_YYYYMMDD_HHMMSS.sql.gz
  3. Deletes backups older than 30 days
  4. Logs completion with file size

| Resource | Name | Purpose |
| --- | --- | --- |
| PersistentVolume | school-backups-pv-prod | NFS storage for backups |
| PersistentVolumeClaim | school-backups | Mounts /backups in pods |
| CronJob | school-platform-db-backup | Daily backup with cleanup |
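The four backup steps could be scripted roughly as below. This is a bash sketch under stated assumptions rather than the actual CronJob script: the function names, `BACKUP_DIR`, and `DATABASE_URL` are illustrative, and `pipefail` is my addition so that a failing pg_dump is not masked by gzip.

```shell
#!/usr/bin/env bash
# Sketch of the daily backup steps. All function and variable names
# are assumptions made for illustration.
backup_filename() {
  # $1: optional timestamp override (YYYYMMDD_HHMMSS)
  local ts="${1:-$(TZ=Australia/Perth date +%Y%m%d_%H%M%S)}"
  echo "school_platform_${ts}.sql.gz"
}

prune_old_backups() {
  # Delete dumps last modified more than $2 days ago (default 30).
  local dir="$1"
  local days="${2:-30}"
  find "$dir" -maxdepth 1 -type f -name 'school_platform_*.sql.gz' \
    -mtime +"$days" -delete
}

run_backup() (
  # Subshell body, so these options do not leak into the caller.
  set -euo pipefail
  dir="${BACKUP_DIR:-/backups}"
  target="$dir/$(backup_filename)"
  pg_dump "$DATABASE_URL" | gzip > "$target"      # steps 1 and 2
  prune_old_backups "$dir" 30                     # step 3
  size=$(wc -c < "$target" | tr -d ' ')
  echo "backup complete: $target ($size bytes)"   # step 4
)
```

`find -mtime +30` matches on modification time, so a restored or touched file would be kept longer than its name suggests.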

Slack notifications

All backup and restore events are posted to a Slack channel:

| Event | Example |
| --- | --- |
| Backup success | [Cron] School Platform -- Backup completed |
| Backup failure | [Manual] School Platform -- Backup FAILED |
| Restore started | [Restore] School Platform -- Started |
| Restore completed | [Restore] School Platform -- Completed successfully |
| Restore failure | [Restore] School Platform -- FAILED |
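A message in this format could be posted as sketched below. The delivery mechanism is an assumption: this bash example uses a Slack incoming webhook, which accepts a JSON body with a `text` field, and reads the URL from a hypothetical `SLACK_WEBHOOK_URL` variable. The function names are also illustrative.

```shell
#!/usr/bin/env bash
# Sketch: build and post a notification in the format shown above.
# slack_payload, notify_slack, and SLACK_WEBHOOK_URL are assumed names.
slack_payload() {
  # $1: source tag (Cron, Manual, Restore); $2: single-line message
  local text="[$1] School Platform -- $2"
  # Escape backslashes and double quotes for use inside a JSON string.
  text=$(printf '%s' "$text" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
  printf '{"text":"%s"}\n' "$text"
}

notify_slack() {
  # -f makes curl return non-zero on HTTP errors.
  curl -fsS -X POST -H 'Content-Type: application/json' \
    --data "$(slack_payload "$1" "$2")" "$SLACK_WEBHOOK_URL"
}
```

For example, `notify_slack Cron "Backup completed"` would send the first message in the table. The escaping only handles single-line text; multi-line messages would need newline escaping as well.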

Backup verification

Each backup can have a .meta.json sidecar file containing metadata: timestamp, table counts, pg_dump version, file size, and a verified flag. The verification endpoint checks gzip integrity, collects metadata, and checks schema compatibility.

| Status | Meaning |
| --- | --- |
| compatible | Backup migration matches current app |
| needs-migration | Backup is older, migrations will run on restore |
| unknown | No migration info |
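The integrity check and status mapping can be approximated in bash as follows. This sketch is not the platform's verification endpoint: the sidecar name (`<dump>.meta.json`), the JSON field names, and both function names are assumptions, and it covers only gzip integrity and file size, not table counts or the pg_dump version. The status function treats any mismatch as needs-migration, which is a simplification.

```shell
#!/usr/bin/env bash
# Sketch: gzip integrity check with a minimal metadata sidecar.
# File naming and JSON fields are hypothetical.
verify_backup() {
  # $1: path to a .sql.gz dump
  local dump="$1"
  local verified=false
  local size
  [ -f "$dump" ] || { echo "no such file: $dump" >&2; return 2; }
  # gzip -t reads the whole archive and checks its CRC without extracting.
  if gzip -t "$dump" 2>/dev/null; then verified=true; fi
  size=$(wc -c < "$dump" | tr -d ' ')
  printf '{"file":"%s","sizeBytes":%s,"verified":%s}\n' \
    "$(basename "$dump")" "$size" "$verified" > "$dump.meta.json"
  [ "$verified" = true ]
}

compat_status() {
  # $1: last migration recorded for the backup
  # $2: last migration known to the running app
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo unknown
  elif [ "$1" = "$2" ]; then
    echo compatible
  else
    echo needs-migration
  fi
}
```

`verify_backup` returns non-zero for a corrupt archive but still writes the sidecar, so a failed verification is recorded next to the dump.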

Restore process

  1. Creates a safety backup before any restore
  2. Posts Slack notification that restore is starting
  3. Drops and recreates the database
  4. Loads backup via psql
  5. Runs prisma migrate deploy if migrations exist
  6. Posts Slack notification on success or failure
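The restore sequence above might look like this as a bash script. It is a sketch under several assumptions: the database name, the `pre_restore_` file prefix, `MIGRATIONS_DIR`, and all function names are hypothetical; connection settings are expected in the standard `PG*` environment variables; and `notify` simply prints the message as a stand-in for the real Slack call.

```shell
#!/usr/bin/env bash
# Sketch of the six restore steps. Names and paths are assumptions.
notify() { echo "[Restore] School Platform -- $1"; }  # stand-in for Slack

restore_steps() {
  local dump="$1"
  local db="${DB_NAME:-school_platform}"
  # 3. Drop and recreate the database
  dropdb --if-exists "$db" || return 1
  createdb "$db" || return 1
  # 4. Load the dump; pipefail catches a corrupt archive
  ( set -o pipefail
    gunzip -c "$dump" | psql -v ON_ERROR_STOP=1 --quiet -d "$db" ) || return 1
  # 5. Apply migrations only if a migrations directory exists
  if [ -d "${MIGRATIONS_DIR:-prisma/migrations}" ]; then
    npx prisma migrate deploy || return 1
  fi
}

restore_backup() {
  local dump="$1"
  local db="${DB_NAME:-school_platform}"
  local dir="${BACKUP_DIR:-/backups}"
  local safety
  safety="$dir/pre_restore_$(date +%Y%m%d_%H%M%S).sql.gz"
  # 1. Safety backup before anything destructive
  ( set -o pipefail; pg_dump "$db" | gzip > "$safety" ) || return 1
  # 2. Announce the restore
  notify "Started"
  # 6. Report the outcome
  if restore_steps "$dump"; then
    notify "Completed successfully"
  else
    notify "FAILED"
    return 1
  fi
}
```

Explicit `|| return 1` checks are used instead of `set -e`, because bash ignores `set -e` inside a function that is called as an `if` condition. `dropdb` fails while other sessions are connected, so the application would need to be stopped or disconnected first.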

Admin UI

The Admin page provides a backup management table where you can view backup history with timestamps, sizes, and verification status. From there you can create manual backups, verify integrity and compatibility, restore from any backup with a confirmation dialog, and delete old backups.

Figure: Admin page showing calendar connection, import, and backup management sections.

Kubernetes manifests

The k8s/prod/ directory contains 15 manifests:

| File | Purpose |
| --- | --- |
| kustomization.yaml | Kustomize configuration |
| 05-secret-registry.enc.yaml | Registry pull secret (SOPS encrypted) |
| 10-secret-app.enc.yaml | App secrets (SOPS encrypted) |
| 20-secret-db.enc.yaml | Database secrets (SOPS encrypted) |
| 25-pv-platform.yaml | Platform storage PV |
| 27-pv-backups.yaml | Backup storage PV |
| 28-pvc-backups.yaml | Backup storage PVC |
| 30-pvc-platform.yaml | Platform storage PVC |
| 35-pv-db.yaml | Database PV |
| 36-pvc-db.yaml | Database PVC |
| 40-db-statefulset.yaml | PostgreSQL Deployment and Service (a Deployment, despite the filename) |
| 50-app-deployment.yaml | App Deployment, Service, and init container |
| 60-ingress.yaml | Ingress with TLS |
| 70-cronjob-db-backup.yaml | Daily backup CronJob |
| 75-cronjob-calendar-sync.yaml | Calendar sync (not deployed) |

Monitoring

The platform is monitored via the cluster observability stack:

| Layer | Tool | What is monitored |
| --- | --- | --- |
| Pod health | Prometheus and kube-state-metrics | Crashes, restarts, OOM |
| Logs | Alloy to Loki | All stdout collected |
| Node resources | node-exporter | Disk, CPU, memory |
| External availability | Uptime Kuma | HTTP health check every 60 seconds |

Operational commands

# View application logs
kubectl logs -n school deployment/school-platform -f

# View database logs
kubectl logs -n school deployment/school-platform-db

# Restart the application
kubectl rollout restart deployment/school-platform -n school

# Trigger a manual backup (job names must be unique, so add a timestamp suffix)
kubectl create job --from=cronjob/school-platform-db-backup manual-backup-$(date +%s) -n school

# Check Flux status
flux get kustomization school-platform-prod

# Force reconciliation
flux reconcile kustomization school-platform-prod --with-source