Session 2026-05-30 — Unblock Coles Scraper (Cron)
Goal
Resolve the Imperva anti-bot block on Coles.com.au to restore the primary data source.
Actions Taken
1. Assessed existing knowledge base
- Loaded the australian-supermarket-apis skill: contains full API specs for both Coles and Woolworths
- Reviewed project history (sessions from May 12–29): Imperva blocker documented since at least May 6, consistently blocking across all attempts
2. Verified camofox health endpoint
- GET http://10.1.17.135:9377/health: no response (empty body, so the JSON parse failed)
- This is expected in cron context: the health check returned nothing, suggesting camofox may be intermittently unavailable from this pod's network path
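The empty-body case above can be told apart from a malformed or unreachable response before any JSON parsing is attempted. The following is a minimal sketch using only the Python standard library; the function names and the status labels are hypothetical, and because the actual shape of the camofox health payload is not recorded in this log, "ok" here only means "the body parsed as JSON".

```python
import json
import urllib.error
import urllib.request


def classify_health_body(body: bytes) -> str:
    """Classify a raw health-check body (labels are illustrative).

    "empty"   -> blank body (the case seen in this session)
    "invalid" -> non-empty body that is not JSON
    "ok"      -> body parses as JSON (schema is not checked)
    """
    text = body.decode("utf-8", errors="replace").strip()
    if not text:
        # json.loads("") raises JSONDecodeError, so handle it explicitly.
        return "empty"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "invalid"
    return "ok"


def check_health(url: str, timeout: float = 5.0) -> str:
    """Fetch the health endpoint and classify the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_health_body(resp.read())
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, and timeouts.
        return "unreachable"
```

Calling check_health("http://10.1.17.135:9377/health") from the cron pod would then log "empty" or "unreachable" instead of surfacing a bare parse error.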
3. Analyzed known approaches from skill file
Woolworths: WORKING PATH ✅
- camofox JS evaluation bypass verified (2026-05-06 scrape)
- 89K+ products scraped across 16 categories
- Rate: ~44 products/page at 2.5s/page
- REST endpoints blocked by Akamai, but in-browser JS fetch() calls work
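As a rough sanity check on the Woolworths figures above, the page count and wall-clock time can be estimated from the quoted rate. This is back-of-the-envelope arithmetic only: it treats 89,000 as a lower bound for "89K+", assumes every page is full, and ignores per-category overhead and retries. The function name is hypothetical.

```python
import math

PRODUCTS = 89_000        # lower bound from the log ("89K+")
PRODUCTS_PER_PAGE = 44   # approximate, from the log
SECONDS_PER_PAGE = 2.5   # from the log


def estimate_scrape(products: int, per_page: int, secs_per_page: float) -> tuple[int, float]:
    """Return (page count, total minutes) for a full pass at a fixed rate."""
    pages = math.ceil(products / per_page)
    minutes = pages * secs_per_page / 60
    return pages, minutes


pages, minutes = estimate_scrape(PRODUCTS, PRODUCTS_PER_PAGE, SECONDS_PER_PAGE)
print(f"{pages} pages, ~{minutes:.0f} min")  # prints "2023 pages, ~84 min"
```

Under these assumptions a full pass takes on the order of an hour and a half.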
Coles: BLOCKED ❌
- Imperva WAF blocks ALL tested approaches:
  - Direct HTTP → challenge HTML (~4KB, no __NEXT_DATA__)
  - camofox browser JS evaluation → /browse.json returns NetworkError
  - No build ID extractable from challenge page
- Requires residential proxy (Smartproxy ~$75/mo AU)
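The "no __NEXT_DATA__" and "no build ID" symptoms above can be checked with one helper. Next.js pages embed a JSON payload in a script tag with id __NEXT_DATA__, and that payload carries a buildId field; a challenge page has neither. The sketch below relies on that convention. The function names are hypothetical, and treating a missing build ID as a sign of a challenge page is a heuristic, not a confirmed Imperva signature.

```python
import json
import re
from typing import Optional

# Matches the Next.js data script and captures its JSON payload.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*id="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_build_id(html: str) -> Optional[str]:
    """Return the Next.js buildId from a page, or None if it is absent."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        return None
    try:
        data = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    build_id = data.get("buildId") if isinstance(data, dict) else None
    return build_id if isinstance(build_id, str) else None


def is_challenge_page(html: str) -> bool:
    """Heuristic: a page with no extractable buildId is treated as blocked."""
    return extract_build_id(html) is None
```

Running is_challenge_page on each fetched response gives a single pass/fail signal for future retests, for example once a residential proxy is available.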
4. Strategic pivot recommendation
Since Coles requires external procurement (residential proxy), and Woolworths scraping is proven working via camofox, the pragmatic approach is:
- Ship Woolworths-first data pipeline — get value flowing now
- Document Coles blocker clearly — proxy proposal for pvs approval
- Queue deploy tasks — GitLab + Docker + K8s pipeline can be worked while waiting on Coles resolution
Assessment: BLOCKED on Coles, WORKING on Woolworths
The Imperva block on Coles has been verified as persistent across 20+ session attempts since May 6. No code-level workaround exists — it requires a residential proxy subscription. This is a procurement decision for pvs, not a technical problem I can solve autonomously.
Recommended next actions
| Priority | Action | Owner | Effort |
|---|---|---|---|
| P0 | Ship Woolworths scraper (proven pattern) | Hermes | 1 session |
| P1 | Draft proxy proposal for Coles access | Hermes | 30 min |
| P2 | Procure residential proxy (~$75/mo) | pvs | Approval |
| P3 | Deploy pipeline to K8s (Woolworths data first) | pvs + Hermes | Multi-session |
Blocker Details
- Coles.com.au → Imperva WAF, blocks all non-residential IPs at challenge level
- Woolworths.com.au → Akamai WAF, bypassed via camofox in-browser JS evaluation (verified working)
- Aldi → Not yet investigated
Issues / Blockers
- Coles requires residential proxy — procurement decision needed from pvs
- Cannot autonomously purchase service or approve budget