Integrations (Euka and Mathspace)
This page covers the nightly scraper that pulls Year 7 lesson progress from Euka and Mathspace into the school platform, automatically scheduling work and capturing evidence for moderator meetings.
Architecture
Components
| Component | Repository | Purpose |
|---|---|---|
| Scraper pod | school/scrapers (separate repo) | Playwright + Node container that navigates Euka and Mathspace |
| Ingest route | school/platform | Receives structured lesson data, writes DB rows |
| State route | school/platform | Provides locked terms and priority candidates to the scraper |
| ScraperSyncWatcher | school/platform | Global client component that fires toast notifications on completion |
Three-phase ingest protocol
Every scraper run produces exactly one ScraperRun row, which is written by up to three ingest phases:
Phase: initial
Fires before any per-lesson Playwright work. Creates the ScraperRun row and runs the full upsert chain (terms, subjects, lessons, activities, schedule items, learning area links, first-seen timestamps). Returns a runId for subsequent calls.
Phase: enrich
Runs after the scraper has fetched intro text and apply-tab scores for completed lessons. Updates the same ScraperRun row with enrichment data (intro text, learning objectives, apply scores, feedback) and generates evidence JSON summaries. Never overwrites base lesson state or timestamps.
Phase: error
Fired when the scraper throws. Marks the ScraperRun as failed with an error message. Short-circuits without any lesson writes.
This separation means a mid-run crash during enrichment never loses the base lesson state captured in the initial phase.
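The protocol above can be sketched as a discriminated union handled by a single ingest function. This is a minimal illustration, not the platform's actual route: the payload fields, the status names (`running`, `enriched`, `failed`), and the in-memory `runs` map standing in for the database are all assumptions, and it assumes the error phase is sent with a `runId` from an earlier initial call.

```typescript
// Hypothetical shapes; the real payload and ScraperRun columns are not shown on this page.
type LessonState = { lessonId: string; status: "not-started" | "in-progress" | "completed" };
type Enrichment = { lessonId: string; introText?: string; applyScore?: number };

type IngestPayload =
  | { phase: "initial"; lessons: LessonState[] }
  | { phase: "enrich"; runId: string; enrichments: Enrichment[] }
  | { phase: "error"; runId: string; message: string };

interface ScraperRun {
  id: string;
  status: "running" | "enriched" | "failed";
  lessons: LessonState[];
  enrichments: Enrichment[];
  error?: string;
}

// In-memory stand-in for the ScraperRun table.
const runs = new Map<string, ScraperRun>();
let nextId = 1;

function mustGet(runId: string): ScraperRun {
  const run = runs.get(runId);
  if (!run) throw new Error(`unknown runId ${runId}`);
  return run;
}

function ingest(payload: IngestPayload): { runId: string } {
  switch (payload.phase) {
    case "initial": {
      // Creates the row and records base lesson state before any enrichment.
      const id = `run-${nextId++}`;
      runs.set(id, { id, status: "running", lessons: payload.lessons, enrichments: [] });
      return { runId: id };
    }
    case "enrich": {
      // Updates the same row; base lesson state is left untouched.
      const run = mustGet(payload.runId);
      run.enrichments = payload.enrichments;
      run.status = "enriched";
      return { runId: run.id };
    }
    case "error": {
      // Short-circuits: marks the run failed without any lesson writes.
      const run = mustGet(payload.runId);
      run.status = "failed";
      run.error = payload.message;
      return { runId: run.id };
    }
  }
}
```

Because the error branch only touches `status` and `error`, the lessons recorded by the initial call survive a crash during enrichment.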
What happens automatically
When a lesson transitions from not-started to in-progress or completed:
| Action | Detail |
|---|---|
| Activity created | Titled "Euka - Subject - Lesson N - Title" |
| Schedule item created | Dated to the Perth calendar day the work was first observed |
| Learning area linked | Mapped from Euka subject to WA curriculum area |
| Evidence captured | JSON summary with apply score, feedback, and lesson metadata (on completion only) |
The platform never overwrites first-seen timestamps: startedSeenAt records when the lesson was first observed as in-progress, and completedSeenAt records when it was first observed as complete. These timestamps are accurate only to the scraper's cadence, but because they are never revised after the fact, the recorded history stays honest.
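The write-once rule and the Perth calendar day can be sketched as two small pure functions. The field names `startedSeenAt` and `completedSeenAt` come from this page; the function names, the `Status` type, and the choice to leave `startedSeenAt` empty when a lesson is first seen already completed are assumptions made for illustration.

```typescript
type Status = "not-started" | "in-progress" | "completed";

interface SeenTimestamps {
  startedSeenAt: Date | null;
  completedSeenAt: Date | null;
}

// Fill a timestamp only while it is still empty, so later runs never rewrite history.
function applyFirstSeen(existing: SeenTimestamps, status: Status, now: Date): SeenTimestamps {
  return {
    startedSeenAt: existing.startedSeenAt ?? (status === "in-progress" ? now : null),
    completedSeenAt: existing.completedSeenAt ?? (status === "completed" ? now : null),
  };
}

// Calendar day (YYYY-MM-DD) in Perth for a given instant, used to date the schedule item.
function perthDay(instant: Date): string {
  const parts = new Intl.DateTimeFormat("en-AU", {
    timeZone: "Australia/Perth",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(instant);
  const get = (type: string): string => parts.find((p) => p.type === type)?.value ?? "";
  return `${get("year")}-${get("month")}-${get("day")}`;
}
```

For example, an observation at 17:30 UTC on 1 March falls on 2 March in Perth (UTC+8), so the schedule item would be dated to the later day.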
Admin UX
Sync Now button
The /admin page has a Sync Now button that creates a one-shot Kubernetes Job using the same image and environment as the nightly CronJob. No kubectl access needed.
Global toast notifications
ScraperSyncWatcher is a background client component mounted in Providers that:
- Polls /api/admin/scrapers/latest every 4 seconds after a sync is triggered
- Fires a sticky Sonner toast (success or error) when the run completes
- Survives navigation between pages
- Works cross-tab via localStorage events
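The decision the watcher makes on each poll can be sketched as a pure function. The response shape of /api/admin/scrapers/latest is not documented here, so `LatestRun`, its status values, and the toast messages are hypothetical; only the 4-second interval comes from this page.

```typescript
// Hypothetical response shape for /api/admin/scrapers/latest.
interface LatestRun {
  id: string;
  status: "running" | "succeeded" | "failed";
  error?: string;
}

type Toast = { kind: "success" | "error"; message: string };

// Polling interval stated on this page.
const POLL_INTERVAL_MS = 4_000;

// Returns the toast to show for this poll result, or null if nothing should fire.
// `alreadyNotified` remembers run ids so a finished run toasts only once.
function toastFor(latest: LatestRun | null, alreadyNotified: Set<string>): Toast | null {
  if (!latest || latest.status === "running") return null; // still in flight
  if (alreadyNotified.has(latest.id)) return null; // already announced
  alreadyNotified.add(latest.id);
  return latest.status === "succeeded"
    ? { kind: "success", message: "Scraper sync complete" }
    : { kind: "error", message: latest.error ?? "Scraper sync failed" };
}
```

A component would call this every `POLL_INTERVAL_MS` milliseconds and pass a non-null result to Sonner's `toast.success` or `toast.error`. Keeping the decision pure makes the once-per-run behaviour easy to test without a browser.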
Auto-refresh
When a scraper run completes, /today, /schedule, and /integrations/euka re-fetch their data automatically so new lessons appear without a manual reload.
Term locking
Terms the family has not purchased can be locked to stop the scraper attempting to fetch them (avoiding 28+ wasted 403 requests per run). Locking is available manually via the admin UI. The scraper also auto-locks a term when every week returns 403, provided at least one other term in the same run succeeds (guarding against auth-expiry blanket-403s).
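The auto-lock rule can be sketched as a function over the HTTP status codes collected per term. The `TermResults` shape and the function name are hypothetical, and treating a term as "succeeded" when at least one of its weeks returned a 2xx status is an assumption.

```typescript
// Hypothetical: HTTP status codes of each week fetch, keyed by term name.
type TermResults = Record<string, number[]>;

function termsToAutoLock(results: TermResults): string[] {
  const entries = Object.entries(results);

  // Guard: if no term succeeded at all, a blanket 403 more likely means
  // expired authentication than unpurchased terms, so lock nothing.
  const anySucceeded = entries.some(([, codes]) => codes.some((c) => c >= 200 && c < 300));
  if (!anySucceeded) return [];

  // Lock only terms where every week returned 403.
  return entries
    .filter(([, codes]) => codes.length > 0 && codes.every((c) => c === 403))
    .map(([term]) => term);
}
```

A term with a mix of 200 and 403 responses is left unlocked, since only a fully forbidden term matches the rule.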
CronJob configuration
| Setting | Value | Rationale |
|---|---|---|
| Schedule | 15 18 * * * (18:15 daily, Australia/Perth) | After homework window, before dinner |
| Concurrency | Forbid | Prevents race conditions on shared scraper state |
| Active deadline | 1200 seconds (20 min) | Hard kill if a Playwright page hangs |
| TTL after finished | 259200 seconds (3 days) | Keeps failed-pod logs available for diagnosis |
| Backoff limit | 2 | Two retries before the Job is abandoned |
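The settings in the table map onto standard Kubernetes batch/v1 CronJob fields. The sketch below writes them as a plain TypeScript object for illustration only. The resource name, container name, and image are placeholders, and it assumes the time zone is set through `spec.timeZone` rather than inside the schedule string.

```typescript
const MINUTE = 60; // seconds
const DAY = 24 * 60 * MINUTE; // seconds

const scraperCronJob = {
  apiVersion: "batch/v1",
  kind: "CronJob",
  metadata: { name: "scraper-nightly" }, // placeholder name
  spec: {
    schedule: "15 18 * * *", // 18:15 every day
    timeZone: "Australia/Perth", // assumption: zone set via spec.timeZone
    concurrencyPolicy: "Forbid", // never run two scrapes at once
    jobTemplate: {
      spec: {
        activeDeadlineSeconds: 20 * MINUTE, // hard kill after 20 minutes
        ttlSecondsAfterFinished: 3 * DAY, // keep finished Jobs and logs for 3 days
        backoffLimit: 2, // two retries before the Job is abandoned
        template: {
          spec: {
            restartPolicy: "Never",
            containers: [
              { name: "scraper", image: "registry.example.invalid/scraper:latest" }, // placeholder image
            ],
          },
        },
      },
    },
  },
};
```

Writing the durations as products of named constants makes the 1200-second and 259200-second values in the table easy to verify.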
Per-lesson resilience
Each completed lesson's apply-tab scrape has a 20-second Promise.race timeout. A single hung Learnosity page cannot stall the entire run. Failed lessons are logged individually and counted against a per-run cap (20 apply fetches per run) to keep total runtime within the 20-minute deadline.
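A per-lesson timeout built on Promise.race, combined with a per-run cap, might look like the sketch below. The 20-second timeout and the cap of 20 come from this page; `fetchApply` is a hypothetical stand-in for the Playwright routine, and applying the cap to attempted lessons rather than to failures only is an assumption.

```typescript
const APPLY_TIMEOUT_MS = 20_000; // 20-second timeout per lesson
const APPLY_FETCH_CAP = 20; // at most 20 apply fetches per run

// Rejects if `work` has not settled within `ms`; the timer is cleared either way.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

async function scrapeApplyTabs<T>(
  lessonIds: string[],
  fetchApply: (lessonId: string) => Promise<T>, // hypothetical per-lesson scrape
  timeoutMs: number = APPLY_TIMEOUT_MS,
  cap: number = APPLY_FETCH_CAP,
): Promise<{ results: Map<string, T>; failed: string[]; skipped: string[] }> {
  const results = new Map<string, T>();
  const failed: string[] = [];
  const skipped = lessonIds.slice(cap); // beyond the cap: left for a later run
  for (const id of lessonIds.slice(0, cap)) {
    try {
      results.set(id, await withTimeout(fetchApply(id), timeoutMs, `apply tab for ${id}`));
    } catch (err) {
      // One hung or broken lesson is logged and the loop moves on.
      console.warn(`lesson ${id} failed:`, err);
      failed.push(id);
    }
  }
  return { results, failed, skipped };
}
```

Under these assumptions the worst case is 20 fetches × 20 s = 400 s, comfortably inside the 1200-second deadline. Promise.race does not cancel the losing promise, so a real scraper would probably also close the hung page.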