Wiki Lint Daily — Project Review (2026-06-22)

What it is

Daily wiki hygiene tool. A Python scanner (lint_scan.py) walks the SilverBullet wiki tree checking for frontmatter validity, required fields, type compliance, and tag taxonomy violations. Runs as part of autopilot Mode D rotation. Goal: keep metadata consistent so the wiki stays navigable.

Current state

  • Script exists at scripts/lint_scan.py — functional, loads tags dynamically from SCHEMA.md
  • Last meaningful scan: Jun 22 (321 pages, 387 issues). Stable since Jun 17 (~369-387 issues across runs)
  • 30+ session logs spanning Jun 5–Jun 22 showing repeated scans with similar results
  • Key past work done: fixed broken wikilinks in concepts/, added frontmatter to 278 files, created missing root index.md, resolved smart-groceries dead session links

Gaps and risks

  • No regression tracking: Jun 22 scan noted “baseline not found on disk” — no persistent baseline file means delta comparison is impossible
  • Issue count has plateaued at roughly 370-390: the same structural issues recur on every scan (missing created/updated, bad types in archived dirs) and nothing fixes them systematically
  • Schema drift risk: VALID_TYPES and REQUIRED_FIELDS are hardcoded in the script. SCHEMA.md changes won’t propagate unless someone manually updates lint_scan.py
  • No remediation automation: Scanner only reports; doesn’t fix. The same missing-fields issues appear every scan because nothing auto-generates the missing created/updated fields
  • Duplicate project dirs: t-002 in todos.json flagged “wiki-lint-daily vs wikilint-daily duplicate dir” — still open, status=doing since Jun 18
  • Tag loading fragile: _load_allowed_tags() extracts backtick-wrapped words from SCHEMA.md using a broad regex. Any formatting change to SCHEMA could break it silently

Recommendation

Build automated remediation on top of the existing scanner. Auto-fix what is mechanical (missing dates → inject today’s date, bad types → map to the closest valid type), then spend human effort only on the structural and schema gaps that need a decision.

Keep the “daily scan” name but split each run into two phases:

  1. Auto-fix: inject missing fields and normalize types; writes nothing when there is nothing to fix
  2. Report-only: flag what requires schema updates or human judgment
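
The two-phase split could look roughly like the sketch below. It assumes pages carry flat `key: value` frontmatter between `---` lines and avoids a YAML dependency for brevity. The contents of VALID_TYPES, the TYPE_MAP table, and the function names split_frontmatter and lint_page are all hypothetical; the real internals of lint_scan.py may differ.

```python
# Hypothetical values; the real VALID_TYPES lives in lint_scan.py.
VALID_TYPES = {"session", "concept", "project"}
TYPE_MAP = {"goal-reflection": "session"}  # illustrative mapping only


def split_frontmatter(text):
    """Return (fields, body) for a page with flat `key: value` frontmatter."""
    lines = text.split("\n")
    if lines[0].strip() != "---":
        return {}, text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            fields = {}
            for line in lines[1:i]:
                key, sep, value = line.partition(":")
                if sep:
                    fields[key.strip()] = value.strip()
            return fields, "\n".join(lines[i + 1:])
    return {}, text  # unterminated frontmatter: treat the page as having none


def lint_page(text, today):
    """Phase 1 fixes what is mechanical; phase 2 reports the rest."""
    fields, body = split_frontmatter(text)
    fixed, report = [], []

    # Phase 1: auto-fix. Nothing is recorded unless a field actually changes.
    for key in ("created", "updated"):
        if not fields.get(key):
            fields[key] = today
            fixed.append(f"added {key}")
    page_type = fields.get("type")
    if page_type in TYPE_MAP:
        fields["type"] = TYPE_MAP[page_type]
        fixed.append(f"type {page_type} -> {fields['type']}")

    # Phase 2: report-only. Whatever is left needs a schema change or a human.
    if "type" not in fields:
        report.append("missing type")
    elif fields["type"] not in VALID_TYPES:
        report.append(f"unknown type {fields['type']}")

    if not fixed:
        return text, fixed, report  # no-op: the page text is returned unchanged
    header = "\n".join(f"{k}: {v}" for k, v in fields.items())
    return f"---\n{header}\n---\n{body}", fixed, report
```

Running lint_page a second time on its own output returns the text unchanged with an empty fix list, which is the property that makes it safe to run on every scan. Note that this sketch rewrites the whole frontmatter block, so nested YAML or comments would need a real YAML parser instead.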

Phased plan

Phase 1 — Stabilise (t-001 to t-004)

  • Write baseline after each scan to data/baseline.json for regression tracking
  • Add auto-fix mode: inject created/updated from filesystem mtime or today’s date when missing
  • Clean up the duplicate dir issue (wikilint-daily vs wiki-lint-daily)
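
A minimal sketch of the first two Phase 1 items follows. The baseline format (a JSON object mapping an issue kind to its count) and all function names are assumptions; only the data/baseline.json path comes from the plan above.

```python
import json
from datetime import date
from pathlib import Path


def load_baseline(path):
    """Return the previous issue counts, or None on the first run."""
    path = Path(path)
    if not path.exists():
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def save_baseline(path, counts):
    """Write via a temp file so a crash cannot leave a truncated baseline."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(counts, indent=2, sort_keys=True), encoding="utf-8")
    tmp.replace(path)


def delta(previous, current):
    """Per-kind change in issue count; kinds with no change are omitted."""
    previous = previous or {}
    kinds = sorted(set(previous) | set(current))
    changes = {k: current.get(k, 0) - previous.get(k, 0) for k in kinds}
    return {k: v for k, v in changes.items() if v != 0}


def fallback_date(page_path):
    """ISO date from the file's mtime, or today's date if the stat fails."""
    try:
        mtime = Path(page_path).stat().st_mtime
    except OSError:
        return date.today().isoformat()
    return date.fromtimestamp(mtime).isoformat()
```

A scan would call load_baseline before linting, report delta(previous, current), then call save_baseline. One caveat on fallback_date: mtime reflects the last write, which a fresh checkout or sync can reset, so it is only an approximation of the true creation date.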

Phase 2 — Schema-aware linting (t-005 to t-007)

  • Parse VALID_TYPES from SCHEMA.md programmatically instead of hardcoding them in the script
  • Add a type-mapping table: known invalid types → their closest valid equivalent (e.g., goal-reflection → session)
  • Tag loading: tighten the regex in _load_allowed_tags() or parse structured sections of SCHEMA.md
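
One way to parse structured sections instead of scanning the whole file for backticks is sketched below. The layout of SCHEMA.md is not described in this review, so the sketch assumes Markdown headings followed by bullet lists of backtick-wrapped names; the heading names and load_section_items are hypothetical.

```python
import re

# One bullet whose first token is a backtick-wrapped lowercase identifier.
ITEM = re.compile(r"^\s*[-*]\s+`([a-z0-9][a-z0-9-]*)`")


def load_section_items(schema_text, heading):
    """Collect backticked bullet items under one heading, and nothing else."""
    items, inside = [], False
    for line in schema_text.splitlines():
        if line.startswith("#"):
            # Any heading ends the previous section and may start the target.
            inside = line.lstrip("#").strip().lower() == heading.lower()
            continue
        if inside:
            match = ITEM.match(line)
            if match:
                items.append(match.group(1))
    if not items:
        # Fail loudly instead of silently linting against an empty set.
        raise ValueError(f"no items found under heading {heading!r}")
    return items
```

Scoping the match to a single section means backticked words in prose elsewhere in SCHEMA.md are not picked up as tags or types, and the ValueError turns a silent break after a formatting change into a visible failure. Lines starting with `#` inside fenced code blocks would be misread as headings, so a real implementation may need to skip fences.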

Phase 3 — Continuous enforcement (t-008+)

  • Wire auto-fix into autopilot cron so scans both detect and repair on each tick
  • Add a weekly summary that shows issue delta (up/down/stable) instead of raw counts
  • Consider archiving old session logs (>30 days) to reduce scan noise
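
The weekly delta summary could be as small as the sketch below. The tolerance of 5 issues and the function names are assumptions, and the sample numbers in the usage note are illustrative rather than taken from the scan logs.

```python
def trend(counts, tolerance=5):
    """Classify a series of issue totals as up, down, or stable.

    Compares the last total with the first; a change within the
    tolerance counts as stable.
    """
    if len(counts) < 2:
        return "stable", 0
    change = counts[-1] - counts[0]
    if change > tolerance:
        return "up", change
    if change < -tolerance:
        return "down", change
    return "stable", change


def weekly_summary(counts, tolerance=5):
    """One-line summary for the weekly report."""
    if not counts:
        return "no scans recorded this week"
    label, change = trend(counts, tolerance)
    return (
        f"issues {label} ({change:+d}): "
        f"{counts[0]} -> {counts[-1]} over {len(counts)} scans"
    )
```

For example, weekly_summary([369, 372, 380, 387]) yields "issues up (+18): 369 -> 387 over 4 scans". Comparing only the endpoints is a deliberate simplification; a noisy week might be better served by comparing averages.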