Enrich Company Data with n8n, OpenAI, and ScrapingBee

This workflow automates enriching company profiles from a Google Sheet by scraping website data with ScrapingBee and analyzing it using OpenAI GPT. It extracts business insights and updates your sheet automatically, saving hours of manual research.
Workflow Identifier: 2335
Nodes in Use: googleSheets, splitInBatches, set, httpRequest, markdown, lmChatOpenAi, outputParserStructured, agent, toolWorkflow, executeWorkflowTrigger, webhook

What This Automation Does

This workflow reads company names and website URLs from a Google Sheet and scrapes each homepage's HTML with ScrapingBee. It then converts the HTML to Markdown, which reduces the amount of text sent to the model. The Markdown is passed to OpenAI's GPT-4o-mini model, which returns structured business information: Business Area, Offer (products and services), Value Proposition, Business Model, and Ideal Customer Profile.

The workflow also checks for missing or mismatched information. Results are written back to Google Sheets automatically. This replaces long manual research with fast, structured enrichment.


Who Should Use This Workflow

This workflow suits people who manage large amounts of company data in Google Sheets. It is especially useful for business analysts, marketers, sales teams, and anyone tired of copying information from websites one by one. It helps non-technical users get clean business details without guessing or reading each site themselves.


Tools and Services Used

  • Google Sheets: Your source list and final storage.
  • ScrapingBee API: Fetches company homepage HTML.
  • OpenAI GPT-4o-mini: AI model that interprets content.
  • n8n: Workflow automation platform running all steps.

Inputs, Processing Steps, and Output

Inputs

  • Google Sheet with at least columns: Company and Website.
  • Valid ScrapingBee API Key.
  • Valid OpenAI API Key with GPT-4o-mini access.
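
Before running the workflow, it can help to confirm that every row has both required values. The following is a minimal Python sketch of that check, not part of the n8n workflow itself. The function name find_invalid_rows and the sample rows are hypothetical and used only for illustration.

```python
# Columns the workflow expects in the sheet (from the Inputs list above).
REQUIRED_COLUMNS = ("Company", "Website")


def find_invalid_rows(rows):
    """Return (index, reason) pairs for rows that lack a required value."""
    problems = []
    for index, row in enumerate(rows):
        for column in REQUIRED_COLUMNS:
            # Treat a missing key, None, or whitespace-only cell as empty.
            value = str(row.get(column) or "").strip()
            if not value:
                problems.append((index, f"missing {column}"))
    return problems


# Hypothetical sample data standing in for rows read from the sheet.
rows = [
    {"Company": "Example Co", "Website": "https://example.com"},
    {"Company": "No Site Ltd", "Website": ""},
]
print(find_invalid_rows(rows))
```

Rows flagged this way can be fixed in the sheet before any ScrapingBee or OpenAI credits are spent on them.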

Processing Steps

  1. Read all company rows from Google Sheet.
  2. Loop over the companies one at a time with SplitInBatches so that no single run is overloaded.
  3. Extract company website URL and set as a scraping target.
  4. Use ScrapingBee to get homepage raw HTML.
  5. Convert the raw HTML to Markdown to simplify the text for the AI.
  6. Feed the Markdown into OpenAI GPT-4o-mini with a prompt that asks for: Business Area, Offer, Value Proposition, Business Model, and Ideal Customer Profile.
  7. Parse AI output into structured JSON format using the LangChain Structured Output Parser.
  8. Write back the parsed info into the correct row in Google Sheet under the columns Business Area, Offer, Value Proposition, Business Model, ICP, and Additional Information.
  9. Detect and report on cases where data is missing or scraping fails for diagnostics.
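
Steps 4 and 5 above can be sketched outside n8n as follows. This is a rough Python illustration under stated assumptions, not the workflow's actual node configuration. It assumes the HTTP Request node calls ScrapingBee's HTML API endpoint with api_key and url query parameters. It also swaps the n8n Markdown node for a simple plain-text extractor built on the standard library, which shows the same idea of stripping markup before the content reaches the model. The names build_scrapingbee_url, TextExtractor, and html_to_text are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Assumed endpoint for ScrapingBee's HTML API.
SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_scrapingbee_url(api_key, target_url):
    """Build the GET URL that asks ScrapingBee to fetch target_url."""
    query = urlencode({"api_key": api_key, "url": target_url})
    return SCRAPINGBEE_ENDPOINT + "?" + query


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html):
    """Reduce raw HTML to newline-separated visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical page content standing in for a scraped homepage.
sample_html = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Acme</h1><p>We sell anvils.</p>"
    "<script>track()</script></body></html>"
)
print(build_scrapingbee_url("YOUR_API_KEY", "https://example.com"))
print(html_to_text(sample_html))
```

The extractor drops styling and script content entirely, which is where much of a homepage's size tends to come from.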

Output

  • Google Sheet updated with clean, structured business details for each company.
  • Logs and error information in the workflow for troubleshooting.
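
To make the shape of an updated row concrete, here is a small Python sketch that turns a model reply into one sheet update. It is illustrative only. It assumes the model returns a JSON object whose keys match the sheet column names listed in step 8, which the source does not confirm. The function name to_sheet_row and the sample reply are hypothetical.

```python
import json

# Column names taken from the workflow description.
REQUIRED = ("Business Area", "Offer", "Value Proposition", "Business Model", "ICP")
NOTES = "Additional Information"


def to_sheet_row(raw_output, row_number):
    """Parse the model's JSON reply and shape it as one sheet update."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}

    row = {"row_number": row_number}
    missing = []
    for column in REQUIRED:
        value = str(data.get(column) or "").strip()
        row[column] = value
        if not value:
            missing.append(column)

    # Record gaps in the notes column so they are visible in the sheet.
    notes = str(data.get(NOTES) or "").strip()
    if missing:
        prefix = notes + " " if notes else ""
        notes = prefix + "Missing from model output: " + ", ".join(missing)
    row[NOTES] = notes
    return row


# Hypothetical model reply.
reply = json.dumps({
    "Business Area": "Industrial tools",
    "Offer": "Anvils",
    "Value Proposition": "Durable",
    "Business Model": "B2B sales",
    "ICP": "Blacksmiths",
})
print(to_sheet_row(reply, 2))
```

Carrying row_number through the result is what lets the update land on the same row the company was read from.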

Beginner Step-by-Step: How to Use This Workflow in n8n

1. Download and Import Workflow

  1. Click the Download button on this page to save the workflow JSON file.
  2. Open the n8n editor where automation workflows are created.
  3. Use the Import from File option in n8n to import the downloaded workflow JSON.

2. Configure Credentials and IDs

  1. Add your Google Sheets API credentials in n8n for accessing spreadsheet data.
  2. Input your ScrapingBee API Key in the HTTP Request node querying the scraper.
  3. Enter your OpenAI API key in the OpenAI Chat Model node.
  4. If needed, update the Google Sheet document ID and sheet name to match your file.
  5. Check any email, folder, channel, or table-specific settings and update as per your environment.

3. Test the Workflow

  1. Run the workflow manually to ensure it reads, scrapes, analyzes, and updates data properly.
  2. Watch execution logs for any errors to fix credential or configuration issues.

4. Activate for Production

  1. When tests are successful, turn on the workflow trigger or run it on a schedule.
  2. If you self-host n8n, consider deploying it on a dedicated server so that runs stay reliable.
  3. Monitor runs regularly and update credentials when necessary.

Edge Cases and Potential Failures

  • Missing or invalid URLs: Scraping will fail if URLs do not exist or are malformed.
  • ScrapingBee returns no data: Check the API key and the URL parameters.
  • OpenAI max tokens error: This can occur when the page content is too large, especially if the HTML is sent without first being converted to Markdown.
  • Row update mismatch: Ensure the Google Sheet has a reliable row_number column to map AI results back.
  • Insufficient data in site content: The AI prompt handles this by adding diagnostic notes to the Additional Information column.
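
Two of these failures can be caught before any API call is made. The Python sketch below is one possible guard, assuming you pre-check data outside the workflow. The function names normalize_website and truncate_for_prompt are hypothetical, and the 12000-character default is an arbitrary illustrative budget, not an OpenAI limit.

```python
from urllib.parse import urlparse


def normalize_website(raw):
    """Return a usable http(s) URL, or None when the cell cannot be scraped."""
    candidate = (raw or "").strip()
    if not candidate:
        return None
    # Sheets often hold bare domains; assume https when no scheme is given.
    if "://" not in candidate:
        candidate = "https://" + candidate
    parsed = urlparse(candidate)
    host = parsed.netloc
    if parsed.scheme not in ("http", "https"):
        return None
    if "." not in host or " " in host:
        return None
    return candidate


def truncate_for_prompt(text, max_chars=12000):
    """Cap the text length so oversized pages do not exceed the model's input limit."""
    if len(text) <= max_chars:
        return text
    return text[:max_chars]


print(normalize_website("example.com"))
print(normalize_website("not a url"))
print(len(truncate_for_prompt("x" * 20000)))
```

A character cap is only a rough stand-in for a token limit, so the budget should be set conservatively.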

Customization Ideas

  • Expand scraping to “About Us” or “Pricing” pages to get deeper info.
  • Add extra validation and skip rows if data looks incomplete or error-prone.
  • Replace Google Sheets output with CRM integration for real-time company enrichment.
  • Modify the AI prompt to support multiple languages and do auto translation before parsing.
  • Add a Webhook node trigger to run enrichment on live lead submissions.
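
As one illustration of the first idea, the extra pages to scrape could be derived from each homepage URL. This Python sketch assumes the pages live at /about and /pricing, which is a guess. Real sites use many different paths, so the list would need adjusting or a link-discovery step. The name candidate_pages is hypothetical.

```python
from urllib.parse import urljoin

# Hypothetical paths; real sites vary ("/about-us", "/plans", and so on).
EXTRA_PATHS = ("/about", "/pricing")


def candidate_pages(homepage):
    """Return the homepage plus guessed URLs for additional pages."""
    pages = [homepage]
    for path in EXTRA_PATHS:
        # A leading slash makes urljoin resolve against the site root.
        pages.append(urljoin(homepage, path))
    return pages


print(candidate_pages("https://example.com"))
```

Each extra page adds a ScrapingBee request and more tokens per company, so the list is best kept short.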

Summary of Benefits

✓ Saves hours or days each week by automating company data enrichment.
✓ Reduces errors caused by manual copy-pasting and guessing.
✓ Produces structured, easy-to-use business details in Google Sheets.
✓ Helps teams make smarter sales and marketing choices faster.
✓ Simple to set up and run inside n8n with API keys and sheet updates.



Frequently Asked Questions

  • Another scraping service can be substituted, but ScrapingBee is recommended for its reliability and compatibility with this workflow’s API calls.
  • Token use depends on the size of each website's content, but converting HTML to Markdown reduces the tokens sent to OpenAI.
  • The workflow uses a row_number column to match and update the exact rows with AI-enriched data.
  • With proper API key security and secure n8n hosting, company data stays private.

Author: Vikash Kumar
Building AI agents, n8n workflows and end-to-end automation for 30+ Brands across India, the US, Europe, Dubai & Australia. 7+ years of Experience saving founders real hours every week - no code required.
