What This Automation Does
This workflow pulls fresh data from websites so you don't have to copy it by hand.
It picks the most suitable Bright Data scraping tool for your request and returns the page content as Markdown or HTML.
It then sends the results to a webhook and saves a copy on your machine as a JSON file.
Google Gemini AI guides the process by interpreting your request so the right tools are used.
Running this can save hours of manual work and deliver clean, consistent data quickly.
Who Should Use This Workflow
If you need to check competitor websites regularly and hate copying data by hand,
this workflow keeps that data updated and ready for analysis without requiring coding skills.
It is a good fit for marketers, analysts, or anyone who wants to automate data gathering.
Tools and Services Used
- Bright Data MCP Client API: To access multiple scraping tools.
- Google Gemini (PaLM) API: To interpret your request and build the scraping instructions.
- n8n automation platform: With community nodes for integrating MCP and Google Gemini.
- Webhook receiver service (like webhook.site): To catch the scraped data.
Inputs, Processing, and Outputs
Inputs
- Website URL you want to scrape.
- Format type, either Markdown or HTML.
- Webhook URL that will receive the scraped data (see the example below).
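For reference, the inputs might look something like this once configured. The field names are placeholders, not necessarily the ones the workflow's Set node uses; match them to whatever your imported workflow defines:

```typescript
// Illustrative shape of the values configured in the Set node.
// Field names (url, format, webhookUrl) are placeholders.
interface ScrapeRequest {
  url: string;                   // page to scrape
  format: "markdown" | "html";   // desired output format
  webhookUrl: string;            // endpoint that receives the result
}

const example: ScrapeRequest = {
  url: "https://example.com/pricing",
  format: "markdown",
  webhookUrl: "https://webhook.site/your-unique-id",
};
```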
Processing Steps
- The AI Agent reads the request and picks the most suitable Bright Data scraping tool.
- The Bright Data MCP Client scrapes the webpage as Markdown or HTML.
- Google Gemini interprets your input so the chosen tool is called with the right parameters (a prompt sketch follows this list).
- The data is saved locally as JSON for a permanent record.
- An HTTP Request node sends the data to the webhook as soon as it is ready.
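The prompt the AI Agent works from is built with n8n expressions that pull in the fields set earlier. A minimal sketch, assuming the Set node defines url and format fields; the wording is illustrative, not the exact prompt shipped with the workflow:

```typescript
// Illustrative AI Agent prompt using n8n expression syntax. The {{ ... }}
// placeholders are resolved by n8n at run time; the field names assume the
// Set node defines "url" and "format".
const agentPrompt = `Scrape the page at {{ $json.url }} and return its
content as {{ $json.format }}. Pick the Bright Data tool best suited to
this site.`;
```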
Outputs
- Structured data sent to the webhook URL for real-time use.
- JSON file saved to disk as an archive copy (sample record below).
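A single record delivered to the webhook and written to disk might look roughly like this. The exact keys depend on the Bright Data MCP tool the agent selects, so treat these names as placeholders:

```typescript
// Placeholder shape of one scraped result. Real output keys depend on
// the Bright Data MCP tool the agent picks.
const exampleResult = {
  url: "https://example.com/pricing",
  format: "markdown",
  content: "# Pricing\n\nBasic plan: $10/month ...",
  scrapedAt: "2025-01-15T09:30:00Z",
};
```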
Beginner Step-by-Step: How to Use This Workflow in n8n Production
1. Download and Import the Workflow
- Click the Download button on this page to get the workflow file.
- Inside your n8n editor, click “Import from File” and upload the downloaded file.
2. Configure Credentials and URLs
- Add your MCP Client API Key in the n8n credentials manager.
- Add your Google Gemini API Key similarly.
- Find all nodes requiring these credentials and select the correct keys.
- Update or confirm the Set node fields for the scrape URL, webhook URL, and output format.
3. Test the Workflow
- Run the workflow manually using the Manual Trigger node.
- Check that the data arrives at your webhook and that the JSON file is written correctly (see the webhook test sketch after this list).
4. Activate for Production
- Toggle the workflow active switch in n8n.
- Schedule or connect to triggers as you want to run this automatically.
- Consider self-hosting n8n to run this in production safely and reliably.
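Before wiring up triggers, you can confirm the webhook endpoint itself is reachable. A minimal sketch, assuming Node.js 18+ (built-in fetch) and a placeholder webhook URL:

```typescript
// Sends a small test payload to the webhook so you can confirm it is
// reachable before the workflow posts real data. Assumes Node.js 18+.
const webhookUrl = "https://webhook.site/your-unique-id"; // placeholder

async function pingWebhook(): Promise<void> {
  const res = await fetch(webhookUrl, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ test: true, sentAt: new Date().toISOString() }),
  });
  console.log(`Webhook responded with HTTP ${res.status}`);
}

pingWebhook().catch((err) => console.error("Webhook unreachable:", err));
```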
Customizations
- Add new scrape formats by adding them to the Set node and configuring matching MCP nodes.
- Change the AI Agent prompt if you want to give it more specific instructions about what content to scrape.
- Adjust where the JSON file saves by editing the file path in the ReadWriteFile node.
- Send the scraped data to additional webhooks, or enrich it with extra fields before the HTTP Request node sends it (see the sketch below).
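As an example of the last point, a Code node placed just before the HTTP Request node could add metadata to each item. A minimal sketch, assuming an upstream Set node named "Set" that defines a url field; the added field names are illustrative:

```typescript
// n8n Code node body (Run Once for All Items): enrich each scraped item
// with metadata before the HTTP Request node sends it. Node and field
// names here are assumptions about your specific workflow.
return $input.all().map((item) => ({
  json: {
    ...item.json,
    scrapedAt: new Date().toISOString(),
    sourceUrl: $('Set').first().json.url, // assumes a Set node named "Set"
  },
}));
```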
Troubleshooting
- Authentication fails in MCP Client nodes: check that your MCP API key is correct and saved in n8n.
- AI Agent answers are empty or generic: make sure prompt expressions like {{ $json.url }} resolve correctly and your Gemini API key is set.
- Scraped data not reaching the webhook: verify the webhook URL is correct and that the HTTP Request node is sending the body.
Pre-Production Checklist
- Confirm MCP API credentials are active.
- Verify Google Gemini API key works.
- Confirm the URLs in the Set node are correct.
- Test the webhook URL by sending test data.
- Run the workflow once inside n8n to check data flow.
- Backup existing JSON files before starting.
Deployment Guide
Activate the workflow in the n8n editor by toggling it to active.
You can add schedules or triggers to automate scraping when needed.
Watch execution logs for any errors or delays.
If you need to run many scrapes, consider self-hosting n8n for reliability.
Summary of Results
✓ Saves hours weekly by automating manual scraping.
✓ Reduces errors from manual copying.
✓ Delivers fresh, formatted data ready to use.
✓ Uses AI to pick the right tool per request.
✓ Lets you test easily before a full run.