2026-06-07

Session goal: Run bill scanner, process new email attachments with OCR/text extraction.

Progress log:

09:00 — Ran bill-scanner.py --scan. Found 5 emails, downloaded 3 attachment(s).
09:01 — Attempted --process-attachments. PyMuPDF (fitz) missing from system Python — venv had it but shebang was wrong. Confirmed /opt/data/.venv/bin/python3 loads PyMuPDF v1.27.2.3.
09:05 — Extracted text via PyMuPDF directly (heredoc script) for the 3 new attachments:
- ID 112 — Invoice #33519 from The Lawnfeed Company ($125, due 12 Jun 2026). New vendor.
- ID 111 — Unitywater Bill (455 KB PDF) — duplicate of existing; no new data.
- ID 111 — “What does your water bill pay for?” — marketing flyer, not a bill.
09:08 — Extracted key details (vendor, bill number, amount, due date) from the Lawnfeed invoice and the Unitywater bill via the OCR/text-extraction output.
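
The extraction and field-parsing steps above could be sketched roughly as follows. This is a minimal sketch, not the scanner's actual code: the function names and the three regex patterns are hypothetical, and the patterns assume labels such as "Total" and "Due Date" that may not match the real bill layouts. Only `fitz.open` and `page.get_text()` are PyMuPDF calls.

```python
import re
from datetime import datetime
from decimal import Decimal


def extract_pdf_text(path):
    """Return the text layer of every page in a PDF, joined by newlines."""
    import fitz  # PyMuPDF; imported lazily so the parser below runs without it

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


# Hypothetical label patterns; each vendor's layout will likely need its own.
INVOICE_RE = re.compile(r"invoice\s*#\s*(\d+)", re.IGNORECASE)
AMOUNT_RE = re.compile(r"(?:total|amount due)\D*?\$\s*([\d,]+\.\d{2})", re.IGNORECASE)
DUE_RE = re.compile(r"due(?: date)?\D*?(\d{1,2} [A-Za-z]{3} \d{4})", re.IGNORECASE)


def parse_bill_fields(text):
    """Pull invoice number, amount and due date out of extracted text.

    Fields that cannot be found are returned as None.
    """
    invoice = INVOICE_RE.search(text)
    amount = AMOUNT_RE.search(text)
    due = DUE_RE.search(text)
    return {
        "invoice": invoice.group(1) if invoice else None,
        "amount": Decimal(amount.group(1).replace(",", "")) if amount else None,
        "due": datetime.strptime(due.group(1), "%d %b %Y").date() if due else None,
    }
```

Usage would be something like `parse_bill_fields(extract_pdf_text("attachment.pdf"))`, where the filename is a placeholder. Amounts are kept as `Decimal` so that currency values are not rounded by floating point.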

Outputs:

| Vendor | Bill # | Amount | Due Date | Notes |
| --- | --- | --- | --- | --- |
| The Lawnfeed Co. | INV-33519 | $125.00 (inc GST) | 12 Jun 2026 | New vendor: lawn fertiliser/treatment. Bank: Westpac ANTHONY PECK, BSB 034243, AC# 228865 |
| Unitywater | #7128760918 | $493.71 | 26 Jun 2026 | Duplicate of previously processed bill. Account #100114688 |
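
One possible way to flag duplicates like the Unitywater bill automatically is to key each bill on a normalised (vendor, bill number) pair. The log does not say how the duplicate was actually identified, so the helper names and the normalisation rule below are assumptions for illustration only.

```python
def bill_key(vendor, bill_number):
    """Normalise vendor and bill number into a comparable key.

    Case and surrounding whitespace are ignored for the vendor; punctuation
    such as '#' or '-' is ignored for the bill number.
    """
    cleaned = "".join(ch for ch in bill_number if ch.isalnum()).upper()
    return (vendor.strip().casefold(), cleaned)


def split_new_and_duplicates(bills, seen_keys):
    """Partition (vendor, bill_number) pairs into new bills and duplicates.

    `seen_keys` is a set of keys for previously processed bills; it is
    updated in place so repeats within the same batch are also caught.
    """
    new, duplicates = [], []
    for vendor, bill_number in bills:
        key = bill_key(vendor, bill_number)
        if key in seen_keys:
            duplicates.append((vendor, bill_number))
        else:
            seen_keys.add(key)
            new.append((vendor, bill_number))
    return new, duplicates
```

With the two bills from this session and the Unitywater bill already recorded, the Lawnfeed invoice would come back as new and the Unitywater bill as a duplicate.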

Issues / Questions:

The Lawnfeed Company is a new vendor not seen in prior bills. Use the invoice reference as the payment reference for the bank transfer. Needs pvs verification that this service is expected/authorised.
PyMuPDF is installed only in the venv (/opt/data/.venv/bin/python3), not in system Python. The scanner script's shebang should point at the venv interpreter permanently to avoid future “No module named ‘fitz’” errors.
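
A rough sketch of the shebang fix, plus a preflight check that the running interpreter can see PyMuPDF. The interpreter path comes from this log; the helper names `module_available` and `set_shebang` are hypothetical and not part of bill-scanner.py.

```python
import importlib.util
from pathlib import Path

VENV_PYTHON = "/opt/data/.venv/bin/python3"  # venv interpreter noted in the log


def module_available(name):
    """True if the interpreter running this code can import `name`."""
    return importlib.util.find_spec(name) is not None


def set_shebang(script, interpreter=VENV_PYTHON):
    """Rewrite (or add) the first-line shebang of `script`.

    Returns True if the file was changed, False if it already matched.
    """
    path = Path(script)
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    shebang = f"#!{interpreter}\n"
    if lines and lines[0].startswith("#!"):
        if lines[0] == shebang:
            return False
        lines[0] = shebang  # replace the existing shebang
    else:
        lines.insert(0, shebang)  # no shebang yet, so add one
    path.write_text("".join(lines), encoding="utf-8")
    return True
```

Calling `set_shebang("bill-scanner.py")` once should be enough. A guard such as `if not module_available("fitz"): raise SystemExit("run with the venv interpreter")` at the top of the attachment-processing step would then fail early with a clearer message than the import error.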

Status: done
