Session 2026-05-29

Goal

Set up the Smart Groceries project structure, then investigate and document the most urgent task: the Coles importer, which is blocked by Imperva anti-bot protection.

Progress

  1. Scaffolded project — Created wiki/projects/smart-groceries/ with index.md and sessions/ directory
  2. Documented blocker — Identified Coles.com.au Imperva challenge as priority 1
    • Scraper receives HTTP 403 responses or CAPTCHA pages
    • Entire pipeline blocked until resolved
  3. Investigated Imperva (brief web search):
    • Imperva's WAF and bot-protection layer screens suspicious traffic with JavaScript challenges and CAPTCHAs
    • Common solutions: headless browsers with proper fingerprinting, rotating residential proxies, or using official APIs
    • Coles does not appear to have a public grocery API
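
Possible starting point for later (not project code yet): whichever approach is chosen, the importer will need to tell a blocked response apart from a real product page. The sketch below checks for the two symptoms noted above, an HTTP 403 status or a challenge/CAPTCHA page. The names FetchResult, CHALLENGE_MARKERS and looks_blocked are hypothetical, and the marker strings are assumptions that should be checked against the responses Coles actually returns.

```python
from dataclasses import dataclass

# Assumed marker strings for Imperva challenge/CAPTCHA pages.
# These are guesses to verify against real blocked responses; a broad
# marker such as "captcha" may also match legitimate pages.
CHALLENGE_MARKERS = ("_incapsula_resource", "incapsula incident id", "captcha")


@dataclass(frozen=True)
class FetchResult:
    """Minimal stand-in for an HTTP response: status code plus body text."""
    status: int
    body: str


def looks_blocked(result: FetchResult) -> bool:
    """Return True if the response shows one of the documented block symptoms."""
    # Symptom 1: the request was rejected outright.
    if result.status == 403:
        return True
    # Symptom 2: a challenge or CAPTCHA page was served, possibly with status 200.
    body = result.body.lower()
    return any(marker in body for marker in CHALLENGE_MARKERS)
```

Keeping the check as a pure function over status and body means it can be tested against saved responses without touching the network, and the marker list can be tightened once real samples are collected.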

Next Steps

  • Explore using Playwright/Selenium with stealth plugins for Coles scraping
  • Check if any third-party data providers (e.g., Grocer.co API) cover Coles pricing
  • Consider reaching out to Coles about developer access

Files Modified

  • wiki/projects/smart-groceries/index.md — created
  • wiki/projects/smart-groceries/sessions/2026-05-29.md — this file

Session complete. Blocker documented. No code written yet — need to determine best approach for bypassing Imperva before writing any scraper code.