How does the workflow extract product data from screenshots?

The workflow sends a full-page screenshot of each product page to a Google Gemini model, using the Google PaLM API credentials configured in n8n. The model reads the image, finds product titles, prices, brands, and promotions, and returns them as structured data.

What happens if the AI cannot get enough data from screenshots?

If the screenshot data is incomplete, the workflow fetches the page HTML through ScrapingBee and sends that text to the AI as a fallback. The model can then recover details that were missing or unreadable in the image.

How does the workflow save scraped data for further use?

Scraped product information is parsed into JSON and then appended as rows in a ‘Results’ tab in Google Sheets. This makes the data easy to access for reports or analysis.

How can a beginner start using this workflow in n8n?

Download the workflow file from the page, import it into n8n using ‘Import from File,’ add all required API keys and credentials, update Google Sheets IDs if necessary, run a manual test, then activate it for production use.

Vision-Based AI Scraper With Python, Gemini & Google Sheets

What this workflow does

This workflow automatically collects product details from many e-commerce pages. It replaces manual copying, which is slow and prone to mistakes in prices and brand names. The workflow uses AI to read screenshots, with page HTML as a fallback, and extracts product names, prices, brands, and promotions. The result is clean, organized data saved in Google Sheets for easy analysis.

This helps save many hours of work and improves the quality of competitor data.

Who should use this workflow

This workflow suits anyone who needs to track product information across many websites. It is especially helpful for marketing teams, price watchers, and analysts who want fast, accurate data without typing or copying by hand.

You do not need deep technical skills, but you should know the basics of building and running workflows in n8n.

Tools and services used in this workflow

Google Sheets: Stores product URLs and saves scraped product data.

ScrapingBee API: Captures full-page screenshots and raw HTML of product pages.

Google PaLM API (Google Gemini): Analyzes screenshots with vision AI to extract product details.

n8n platform: Runs the automation workflow connecting services and processing data.

Inputs, processing steps, and outputs

Inputs

A list of product URLs from a Google Sheet named “List of URLs”.

API credentials for ScrapingBee and Google PaLM.

Processing Steps

Read product page URLs from Google Sheets.

Send URLs to ScrapingBee to get full-page screenshots.

Use Google Gemini AI model to read screenshots and extract product information.

If the screenshot data is incomplete, fall back to fetching the page HTML via ScrapingBee and parse it with the AI.

Parse AI output into structured JSON with product fields.

Split the JSON array into individual product items.

Append parsed product details into Google Sheets “Results” tab.
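The two ScrapingBee requests in the steps above (screenshot first, HTML as fallback) can be sketched in Python. This is a minimal sketch, not the workflow's actual node configuration: the endpoint and the `api_key`, `url`, and `screenshot_full_page` parameter names reflect my understanding of the ScrapingBee HTML API and should be checked against its documentation, and `build_scrapingbee_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Assumed ScrapingBee HTML API endpoint; verify against the official docs.
SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_scrapingbee_url(api_key: str, page_url: str, screenshot: bool = True) -> str:
    """Build the request URL for one product page.

    screenshot=True asks for a full-page screenshot (the primary path);
    screenshot=False asks for the page HTML (the fallback path).
    """
    params = {"api_key": api_key, "url": page_url}
    if screenshot:
        # Assumed parameter name for a full-page capture.
        params["screenshot_full_page"] = "true"
    # urlencode percent-encodes the target URL so it survives as one parameter.
    return f"{SCRAPINGBEE_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_scrapingbee_url("YOUR_API_KEY", "https://example.com/product/123"))
    print(build_scrapingbee_url("YOUR_API_KEY", "https://example.com/product/123", screenshot=False))
```

In n8n the same request is made by an HTTP Request node, so this helper only illustrates which parameters change between the screenshot call and the HTML call.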

Outputs

Clean, structured product data, including title, price, brand, and promotions, saved in Google Sheets.

Data ready for analysis and reporting with minimal manual work.
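The parsing and splitting steps that produce these rows can be illustrated with a short Python sketch. It assumes the model replies with a JSON array of product objects, possibly surrounded by extra text; the field names in `FIELDS` and the helper names `parse_products` and `to_rows` are hypothetical and should be replaced with whatever your parser and sheet actually use.

```python
import json

# Hypothetical field names; match them to your own sheet columns.
FIELDS = ["title", "price", "brand", "promotion"]


def parse_products(ai_text: str) -> list[dict]:
    """Extract the JSON array from the model's reply and return product dicts."""
    start, end = ai_text.find("["), ai_text.rfind("]")
    if start == -1 or end < start:
        return []  # no array found in the reply
    try:
        data = json.loads(ai_text[start:end + 1])
    except json.JSONDecodeError:
        return []  # reply looked like an array but was not valid JSON
    # Keep only objects; each one becomes a separate item (one sheet row).
    return [item for item in data if isinstance(item, dict)]


def to_rows(products: list[dict], fields: list[str] = FIELDS) -> list[list[str]]:
    """Turn product dicts into rows with a fixed column order."""
    return [[str(p.get(f, "")) for f in fields] for p in products]


if __name__ == "__main__":
    reply = 'Result: [{"title": "Mug", "price": "9.99", "brand": "Acme", "promotion": "10% off"}]'
    for row in to_rows(parse_products(reply)):
        print(row)
```

Missing fields become empty strings, so every row keeps the same number of columns as the sheet.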

Beginner step-by-step: How to use this workflow in n8n for production

Step 1: Download and import the workflow

Download the workflow file using the Download button on this page.

Open the n8n editor where you want to use this workflow.

Use the Import from File feature in n8n to upload this workflow JSON file.

Step 2: Add credentials and configure nodes

Add ScrapingBee API key under credentials in n8n.

Add Google PaLM API credentials for the Google Gemini model.

Provide Google Sheets service account credentials with access to the correct spreadsheet.

Update the Google Sheets node with the correct document ID for the URLs sheet and the results tab if needed.

If required, update any email addresses or channels in notifications or sub-workflows.

Step 3: Test the workflow

Use the Manual Trigger node (When clicking ‘Test workflow’) to run the workflow once.

Check the workflow logs and Google Sheets results to make sure data is fetched and saved as expected.

Step 4: Activate for production use

After confirming the test run works, activate the workflow in n8n.

Optionally replace the manual trigger with a schedule trigger to run daily or weekly.

Monitor executions and errors regularly to ensure consistent data flow.

If you run n8n on your own server, self-hosting gives you more control over execution and reliability.

Edge cases and failure handling

The workflow uses a fallback method if the AI cannot extract data from screenshots. It fetches the HTML version of the page and retries AI parsing there. This helps catch missing or unclear info.
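The fallback decision can be sketched as follows. This is an illustration under stated assumptions, not the workflow's internal logic: "incomplete" is taken to mean that no products were found or that a required field is empty, the required fields in `REQUIRED` are a guess, and the two extractor callables stand in for the screenshot and HTML branches.

```python
from typing import Callable

# Assumption: a product counts as complete when these fields are non-empty.
REQUIRED = ("title", "price")

Extractor = Callable[[str], list[dict]]


def is_complete(products: list[dict]) -> bool:
    """True when at least one product exists and none lacks a required field."""
    return bool(products) and all(
        all(p.get(field) for field in REQUIRED) for p in products
    )


def extract_with_fallback(
    url: str, from_screenshot: Extractor, from_html: Extractor
) -> tuple[list[dict], str]:
    """Try the screenshot path first; use the HTML path only when needed."""
    products = from_screenshot(url)
    if is_complete(products):
        return products, "screenshot"
    # The screenshot result was empty or partial, so parse the HTML instead.
    return from_html(url), "html"


if __name__ == "__main__":
    partial = lambda url: [{"title": "Mug", "price": ""}]
    full = lambda url: [{"title": "Mug", "price": "9.99"}]
    print(extract_with_fallback("https://example.com/p/1", partial, full))
```

Returning the source label alongside the products makes it easy to log how often the fallback is used, which is a quick signal of screenshot quality.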

If API keys are wrong or expired, the workflow will stop. Check credentials regularly.

Google Sheets formatting mistakes like mismatched columns may cause data to save incorrectly. Make sure sheet columns match expected fields exactly.
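A quick way to catch column mismatches before a run is to compare the sheet's header row with the fields the workflow writes. The sketch below assumes you have already read the header row as a list of strings; the names in `EXPECTED_COLUMNS` are hypothetical placeholders for your own field names.

```python
# Hypothetical expected columns; use the field names your workflow outputs.
EXPECTED_COLUMNS = ["title", "price", "brand", "promotion"]


def column_mismatches(header_row: list[str], expected: list[str] = EXPECTED_COLUMNS) -> dict:
    """Report expected columns missing from the sheet and unexpected extras."""
    actual = [cell.strip() for cell in header_row]  # ignore stray whitespace
    return {
        "missing": [col for col in expected if col not in actual],
        "unexpected": [col for col in actual if col not in expected],
    }


if __name__ == "__main__":
    # "Price " has a trailing space and a capital letter; only the space is forgiven.
    print(column_mismatches(["title", "Price ", "brand"]))
```

The comparison is case-sensitive on purpose, since a header such as "Price" does not match a field named "price" when rows are appended by column name.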

Customization ideas

Change fields extracted by updating the JSON schema in the Structured Output Parser. For example, add product ratings or stock availability.
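As an illustration, the Python snippet below builds a JSON Schema with two extra fields and prints it so it can be pasted into the parser. The field names (`title`, `price`, `brand`, `promotion`, `rating`, `in_stock`) are assumptions for this sketch, not the names used in the downloaded workflow.

```python
import json

# Hypothetical schema: the original four fields plus rating and stock status.
PRODUCT_SCHEMA = {
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "price": {"type": "string"},
            "brand": {"type": "string"},
            "promotion": {"type": "string"},
            "rating": {"type": "number"},      # added: average star rating
            "in_stock": {"type": "boolean"},   # added: stock availability
        },
        "required": ["title", "price"],
    },
}

if __name__ == "__main__":
    # Print as JSON text, ready to paste into the parser's schema field.
    print(json.dumps(PRODUCT_SCHEMA, indent=2))
```

Remember to add matching columns to the Results tab, otherwise the new fields have nowhere to land.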

Swap Google Gemini for another LangChain-compatible AI model if you want different AI behavior.

Set up automated triggers to run scraping regularly without manual intervention.

Add filtering nodes to only scrape certain domains or categories based on URL patterns.
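The filtering idea can be sketched in Python as a predicate over each URL. This is one possible rule set, not part of the original workflow: `keep_url`, the allowed-domain list, and the optional path keyword are all hypothetical.

```python
from typing import Optional
from urllib.parse import urlparse


def keep_url(url: str, allowed_domains: list[str], path_keyword: Optional[str] = None) -> bool:
    """Keep a URL when its host is an allowed domain (or a subdomain of one)
    and, if a keyword is given, the keyword appears in the path."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    domain_ok = any(host == d or host.endswith("." + d) for d in allowed_domains)
    path_ok = path_keyword is None or path_keyword in parsed.path
    return domain_ok and path_ok


if __name__ == "__main__":
    urls = [
        "https://www.example.com/shoes/runner-1",
        "https://other.test/shoes/runner-2",
        "https://example.com/bags/tote",
    ]
    print([u for u in urls if keep_url(u, ["example.com"], path_keyword="/shoes/")])
```

Matching on the parsed hostname rather than on the raw string avoids accidentally keeping URLs that merely contain the domain name somewhere in their path or query.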

Capture only relevant screenshot areas for focused AI reading and reduced API usage.

Summary of benefits and results

✓ Saves many hours of manual scraping work
✓ Improves accuracy and consistency of product competitor data
✓ Produces structured, detailed product info for easy analysis
✓ Uses AI vision with screenshot and HTML fallback for better data collection
✓ Integrates with Google Sheets for convenient data storage and reporting

→ Data is ready for pricing strategies, competitive analysis, and market insights faster
→ Automation reduces errors and manual effort in e-commerce data gathering
