Session 2026-05-29
Goal
Set up the Smart Groceries project structure and investigate/document the most urgent task — the Coles importer blocked by Imperva anti-bot protection.
Progress
- Scaffolded project: created wiki/projects/smart-groceries/ with index.md and a sessions/ directory
- Documented blocker: identified the Coles.com.au Imperva challenge as priority 1
  - Scraper returns 403 or CAPTCHA pages
  - Entire pipeline blocked until resolved
- Investigated Imperva (brief web search):
  - Imperva is a WAF and bot-protection service that issues JavaScript challenges to traffic it considers suspicious
  - Common solutions: headless browsers with proper fingerprinting, rotating residential proxies, or using official APIs
  - Coles does not appear to have a public grocery API
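A minimal sketch of how the importer could recognise the blocked state described above (403 or a challenge/CAPTCHA page) rather than parsing it as product data. Nothing here is existing project code: `FetchResult`, `looks_blocked`, and the marker strings in `BLOCK_MARKERS` are assumptions for illustration, and the markers would need checking against a real blocked response from Coles.

```python
from dataclasses import dataclass

# Assumed marker strings, not verified against Coles responses.
# Compared case-insensitively against the response body.
BLOCK_MARKERS = ("_incapsula_resource", "incapsula incident id", "captcha")


@dataclass(frozen=True)
class FetchResult:
    """Hypothetical container for whatever the scraper's HTTP layer returns."""
    status_code: int
    body: str


def looks_blocked(result: FetchResult) -> bool:
    """Return True if the response looks like an anti-bot block, not a real page."""
    # A 403 is treated as a block regardless of body content.
    if result.status_code == 403:
        return True
    # Challenge pages can arrive with status 200, so the body is checked too.
    body = result.body.lower()
    return any(marker in body for marker in BLOCK_MARKERS)
```

Usage would be a guard before parsing, for example `if looks_blocked(result): ...` to log and stop rather than store bad data. The bare "captcha" marker is deliberately broad and could produce false positives on legitimate pages.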
Next Steps
- Explore using Playwright/Selenium with stealth plugins for Coles scraping
- Check if any third-party data providers (e.g., Grocer.co API) cover Coles pricing
- Consider reaching out to Coles about developer access
Files Modified
- wiki/projects/smart-groceries/index.md (created)
- wiki/projects/smart-groceries/sessions/2026-05-29.md (this file)
Session complete. Blocker documented. No code written yet; the best approach for bypassing Imperva needs to be determined before any scraper code is written.