Queue Observability Snapshot — 2026-06-07

Queue Depth

State    Count  Notes
done     832    Healthy throughput
active   65     Normal load
blocked  2      Clean; no stale items >7d
queued   4      Low backlog

Throughput (last 24h)

  • 106 tasks completed — strong processing rate
  • No significant throughput decline from prior days

DLQ Signals (failure signatures in done/ last 24h)

Category                 Count  Severity     Notes
API connection failures  9      MEDIUM       Transient infra; within normal range
HTTP 500 errors          0      ✅ CLEAN     No backend instability detected
Timeouts                 8      LOW-MED      Some latency but not spiking
Tool missing (exit 127)  12     ⚠️ ELEVATED  Worker environment gaps; kubectl/binaries not in PATH
Unknown failure tags     0      ✅ CLEAN     Classifier working correctly since May 4 fix
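
The categories above can be thought of as the result of matching task logs against a small set of failure signatures. The following is a minimal sketch of that idea, not the classifier actually in use: the regex patterns, the category keys, and the names `classify_failure` and `tally` are assumptions for illustration. The one fixed convention is exit status 127, which shells use for "command not found".

```python
import re
from collections import Counter

# Hypothetical signatures; the real classifier's patterns are not shown in this report.
# Order matters: the first matching signature wins.
SIGNATURES = [
    ("tool_missing", re.compile(r"exit (?:code |status )?127\b|command not found", re.I)),
    ("http_500", re.compile(r"\bHTTP 500\b|500 Internal Server Error", re.I)),
    ("timeout", re.compile(r"\btimed out\b|\btimeout\b", re.I)),
    ("api_connection", re.compile(r"connection (?:refused|reset|error)", re.I)),
]


def classify_failure(log_text):
    """Return the first matching category, or "unknown" if nothing matches."""
    for category, pattern in SIGNATURES:
        if pattern.search(log_text):
            return category
    return "unknown"


def tally(log_texts):
    """Count failures per category across an iterable of log snippets."""
    return Counter(classify_failure(text) for text in log_texts)
```

A non-zero "unknown" count from a tally like this would be the signal that a new signature is needed.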

Stuck Tasks (>6h idle in active/)

  • 3 stale tasks >72h: cronjob-remediation, cluster-fix-calliope-kanban-bug, 4a346f5d-vt-004-mastodon
  • 7 tasks idle for 1–3 days; likely completed by another run or superseded
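
An idle check like the one above can be approximated from file modification times. This is a rough sketch under stated assumptions: that each task is one entry in an `active/` directory and that its mtime reflects the last activity. The function names `idle_tasks` and `bucket` are hypothetical; the 6h, 24h, and 72h bands mirror the ones used in this report.

```python
import time
from pathlib import Path


def idle_tasks(active_dir, min_idle_hours=6.0, now=None):
    """Return (name, idle_hours) pairs for entries idle longer than the threshold, oldest first."""
    now = time.time() if now is None else now
    results = []
    for entry in Path(active_dir).iterdir():
        # Assumption: mtime is a good proxy for "last touched by a worker".
        idle_hours = (now - entry.stat().st_mtime) / 3600.0
        if idle_hours > min_idle_hours:
            results.append((entry.name, idle_hours))
    return sorted(results, key=lambda item: item[1], reverse=True)


def bucket(idle_hours):
    """Map idle time onto the bands used in this report."""
    if idle_hours > 72:
        return ">72h"
    if idle_hours >= 24:
        return "1-3 days"
    return "6-24h"
```

Passing `now` explicitly keeps the check reproducible when it is rerun against the same snapshot.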

Active Project Cards (queue/)

Card                     Priority           Status
queue-baseline-phase-0   URGENT → RESOLVED  Completed May 4
queue-feeder-cron        HIGH               Operational; no gaps detected
fix-recurring-unknown    HIGH → RESOLVED    Fixed May 6, zero unknown tags today
schema-drift-20260502    MEDIUM             Queued, awaiting processing
project-wiki-lint-daily  LOW                Blocked (coaching.md, 1d old)

Metrics Emitter Status

  • queue-baseline-metrics.jsonl: 450 lines, last modified 4.1h ago
  • File is still growing — emitter operational
  • Post-Phase-0 stub-only tick pattern persists (known limitation)
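
The emitter check above (line count, file age, still growing) could be scripted along these lines. This is a sketch only: the function name `emitter_status`, the 6-hour freshness threshold, and the idea of comparing against a line count saved from the previous snapshot are all assumptions, not how this report was actually produced.

```python
import time
from pathlib import Path


def emitter_status(path, previous_lines=None, max_age_hours=6.0, now=None):
    """Summarise a JSONL metrics file: line count, age in hours, freshness, growth."""
    p = Path(path)
    now = time.time() if now is None else now
    with p.open("rb") as fh:
        lines = sum(1 for _ in fh)  # one JSON record per line
    age_hours = (now - p.stat().st_mtime) / 3600.0
    return {
        "lines": lines,
        "age_hours": round(age_hours, 1),
        # Assumed threshold: treat the emitter as live if it wrote within max_age_hours.
        "fresh": age_hours <= max_age_hours,
        # Growth can only be judged against a count from an earlier snapshot.
        "growing": None if previous_lines is None else lines > previous_lines,
    }
```

For example, `emitter_status("queue-baseline-metrics.jsonl", previous_lines=...)` would take the previous snapshot's count as its second argument; the value is left elided here because this report does not state it.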

Kanban Board: queue-ops

  • 63 tasks tracked across board
  • No duplicate daily session detected for today
  • The earlier 2026-06-07 task came from an overlapping cron run and has been marked complete

Summary Assessment

Metric        Status                Trend
Throughput    ✅ Healthy (106/24h)  Stable
HTTP 500      ✅ Clean (0)          Improved (was 13 on Jun 5)
Tool missing  ⚠️ Elevated (12)      Persistent; PATH fix needed
API failures  MEDIUM (9)            Normal baseline (~8-10/day)
Unknown tags  ✅ Clean (0)          Fixed since May 6
Stuck tasks   LOW (3 stale)         Cleanup recommended

Action Items for pvs

  1. Tool missing (exit 127): 12 occurrences suggest a systematic PATH gap on the workers. Consider adding kubectl and other commonly used CLI tools to the cron worker container image.
  2. Stale active cards: 3 items idle >72h in active/ should be reviewed and either archived or unblocked.
  3. Blocked coaching card: project-wiki-lint-daily-coaching.md has been in blocked/ for 1 day; unblock it if the blocker has been resolved.
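
For action item 1, a worker could fail fast with a clear message instead of hitting exit 127 mid-task. The sketch below uses the standard-library `shutil.which` to check that required binaries resolve on PATH. The `REQUIRED_TOOLS` list and the `missing_tools` name are assumptions; only kubectl is named in this report, so the list would need to be extended with whatever else the cron worker actually calls.

```python
import shutil

# Assumption: kubectl is the only tool named in this report; add others as needed.
REQUIRED_TOOLS = ["kubectl"]


def missing_tools(tools, path=None):
    """Return the tools that cannot be resolved to an executable.

    `path` defaults to the current PATH; pass an explicit search path to test
    what a differently configured worker would see.
    """
    return [tool for tool in tools if shutil.which(tool, path=path) is None]


def preflight(tools=REQUIRED_TOOLS):
    """Raise before any task runs if a required tool is absent from PATH."""
    missing = missing_tools(tools)
    if missing:
        raise RuntimeError("missing tools on PATH: " + ", ".join(missing))
```

Calling `preflight()` at worker start-up would turn scattered exit-127 failures into one explicit error naming the absent binaries.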