Enrich Pipedrive Organization Data Using GPT-4 and ScrapingBee

This workflow automates enrichment of Pipedrive organization data by scraping the company website and using OpenAI's GPT-4o model to generate detailed notes. It cuts the time spent on manual research and makes CRM information more accurate and consistent.
Workflow Identifier: 1994
NODES in Use: pipedriveTrigger, httpRequest, openAi, pipedrive, markdown, code, slack

1. Opening Problem Statement

Meet Sarah, a sales operations manager at a growing tech startup. Every time a new organization is added to their Pipedrive CRM, Sarah spends upwards of 30 minutes researching the company website, gathering information about their products, target market, and competitors. This manual process not only takes too much valuable time but also leads to inconsistent, incomplete notes in the CRM. Missing or shallow data causes her sales team to lose context in follow-ups and delays closing deals.

Sarah’s challenge? How to automatically enrich new Pipedrive organization records with deep, structured insights extracted directly from company websites. She needs an automation that pulls website content, processes it intelligently, and attaches a meaningful summary as a CRM note — saving hours weekly and improving sales effectiveness.

2. What This Automation Does

This n8n workflow addresses Sarah’s problem by:

  • Triggering whenever a new organization is created in Pipedrive.
  • Automatically retrieving the organization’s website URL from a custom Pipedrive field.
  • Using the ScrapingBee API to scrape the homepage content of the organization’s website.
  • Sending the scraped HTML data to OpenAI’s GPT-4o model with a system prompt to analyze and summarize company info, including products, target market, USPs, and competitors.
  • Creating a rich HTML note within the Pipedrive organization with the AI-generated summary.
  • Converting the note into Slack’s markdown format and sending a notification to a Slack channel to alert the sales team.

By automating this flow, Sarah can eliminate manual research, ensure accurate and standardized CRM notes, and keep her team instantly informed — saving over 10 hours weekly.

3. Prerequisites ⚙️

  • Pipedrive account configured with a custom “website” field on organizations
  • ScrapingBee API key for website content scraping
  • OpenAI account with API access to the GPT-4o model
  • Slack workspace and bot token with permissions to post messages
  • n8n cloud or self-hosted account for workflow automation

🔑 Make sure your Pipedrive API credentials and Slack OAuth tokens are set up in n8n.

4. Step-by-Step Guide

Step 1: Set Up the Pipedrive Trigger for New Organizations

Navigate to Nodes → Add Node → Pipedrive Trigger. Configure it to trigger on action: added and object: organization. This means the workflow runs whenever a new organization is created in Pipedrive.

You should see webhook settings generated in n8n. Connect your Pipedrive API credentials here.

Expectation: When a new organization appears in Pipedrive, it will start this workflow.

Common Mistake: Setting the action to anything other than “added”, or the object to anything other than “organization”, means the trigger will not fire for new organizations.
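
To make the trigger output concrete, here is a minimal sketch of pulling the fields this workflow relies on out of the trigger payload. The `current` wrapper matches the expressions used later in this guide. The helper name `extractOrganization`, the `id` property, and the field key `my_website_field_key` are assumptions for illustration only, so check your own trigger output for the real names.

```javascript
// Hypothetical helper: reads the organization fields used by this workflow.
// `websiteFieldKey` is a placeholder for your custom website field's key.
function extractOrganization(payload, websiteFieldKey) {
  const current = (payload && payload.current) || {};
  const rawWebsite = current[websiteFieldKey];
  const website = typeof rawWebsite === 'string' ? rawWebsite.trim() : '';
  return {
    id: current.id ?? null,          // assumed location of the organization ID
    name: current.name ?? '',
    website: website || null,        // null when the field is empty or missing
  };
}

// Made-up payload for illustration only.
const samplePayload = {
  current: { id: 42, name: 'Acme Inc', my_website_field_key: ' https://acme.example ' },
};
const org = extractOrganization(samplePayload, 'my_website_field_key');
console.log(org);
```

Returning `null` for a missing website makes it easy to stop the workflow early instead of sending an empty URL to the scraper.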

Step 2: Scrape the Organization’s Website Content

Add an HTTP Request node (in this case named “ScrapingBee – Get Organization’s URL content”), set to GET the ScrapingBee API endpoint (https://app.scrapingbee.com/api/v1).

Set Query Parameters:

  • api_key: Your ScrapingBee API key
  • url: An expression that reads the custom website field from the trigger payload, such as {{$json.current.}} with your custom field’s key appended after current.
  • render_js: false (no JavaScript rendering)

You should see the HTML content of the organization’s homepage returned in the response.

Common Mistake: Not referencing the correct custom field ID in Pipedrive for the website URL.
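
For reference, the request that the HTTP Request node sends can be sketched in plain JavaScript. The endpoint and the three query parameters come from this step. The function name `buildScrapingBeeUrl` and the sample values are placeholders.

```javascript
// Builds the GET URL for the ScrapingBee endpoint used in this step.
// URLSearchParams percent-encodes the target URL, which is required when
// one URL is passed as a query parameter of another.
function buildScrapingBeeUrl(apiKey, targetUrl) {
  const endpoint = new URL('https://app.scrapingbee.com/api/v1');
  endpoint.searchParams.set('api_key', apiKey);
  endpoint.searchParams.set('url', targetUrl);
  endpoint.searchParams.set('render_js', 'false'); // no JavaScript rendering
  return endpoint.toString();
}

// Placeholder key and website for illustration.
const requestUrl = buildScrapingBeeUrl('YOUR_API_KEY', 'https://acme.example/');
console.log(requestUrl);
```

When you enter the parameters in the node's Query Parameters section, n8n handles this encoding for you. The sketch is mainly useful for testing the API outside n8n.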

Step 3: Send Website Content to OpenAI GPT-4o

Add an OpenAI node configured to use model gpt-4o. The prompt must include clear instructions to analyze the HTML and extract specific info: company services, target market, unique selling propositions, and competitors. Use the system role message supplied in the workflow.

Connect the scraped content from the HTTP Request node as input.

Outcome: GPT-4o returns a well-formatted HTML summary designed for a Pipedrive note.

Common Mistake: Using a smaller model that truncates input or returns less detailed output.
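
The shape of the request the OpenAI node builds can be sketched as follows. The `model` and `messages` structure follows the OpenAI chat format. The prompt text below is a stand-in (use the system message supplied in the workflow), and `buildChatRequest` and `maxChars` are hypothetical names introduced for this example.

```javascript
// Placeholder prompt: the real system message ships with the workflow.
const SYSTEM_PROMPT =
  'Analyze the company homepage HTML and return an HTML summary covering ' +
  'services, target market, unique selling propositions, and competitors.';

// Builds a chat request body with the scraped HTML as the user message.
// `maxChars` is an assumed safety cap, not a value from the workflow.
function buildChatRequest(html, maxChars = 20000) {
  const content = html.length > maxChars ? html.slice(0, maxChars) : html;
  return {
    model: 'gpt-4o',
    messages: [
      { role: 'system', content: SYSTEM_PROMPT },
      { role: 'user', content },
    ],
  };
}

const chatRequest = buildChatRequest('<h1>Acme</h1><p>We build rockets.</p>');
console.log(JSON.stringify(chatRequest, null, 2));
```

Keeping the instructions in the system message and the page content in the user message keeps the scraped HTML separate from the analysis instructions.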

Step 4: Create a Note in Pipedrive with AI Output

Use the Pipedrive node to create a new note linked to the organization. Map the note content to {{$json.message.content}} from the OpenAI node output and set org_id to the organization ID from the trigger node.

This adds the rich AI-generated summary directly into the organization record.

Common Mistake: Not correctly mapping the organization ID causes notes to attach to the wrong record.
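
A small guard like the one below can catch a bad mapping before the note is created. It is a sketch only: `buildNotePayload` is a hypothetical helper, and reading the organization ID from `current.id` is an assumption, so confirm where the ID sits in your own trigger output.

```javascript
// Hypothetical helper: assembles the note fields used in this step and
// fails loudly when either mapping is missing.
function buildNotePayload(openAiItem, triggerPayload) {
  const content = openAiItem?.message?.content;
  const orgId = triggerPayload?.current?.id; // assumed location of the ID
  if (!content) {
    throw new Error('OpenAI output has no message.content');
  }
  if (orgId === undefined || orgId === null) {
    throw new Error('Trigger payload has no organization ID');
  }
  return { content, org_id: orgId };
}

// Made-up inputs for illustration.
const notePayload = buildNotePayload(
  { message: { content: '<h2>Acme Inc</h2><p>Builds rockets.</p>' } },
  { current: { id: 42, name: 'Acme Inc' } }
);
console.log(notePayload);
```

Throwing an error here stops the run in n8n, which is usually preferable to silently attaching a note to the wrong organization.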

Step 5: Convert HTML to Markdown

Add a Markdown node configured to convert the HTML note content into Markdown format. Input {{$json.content}} from the Pipedrive note creation node.

You will see the HTML transformed into standard Markdown.

Step 6: Transform Markdown to Slack Markdown

Add a Code node and paste this JavaScript code:

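// Markdown text produced by the Markdown node (read from the item's "data" field)
const inputMarkdown = items[0].json.data;

function convertMarkdownToSlackFormat(markdown) {
    let slackFormatted = markdown;

    // Convert level-1 and level-2 headers to Slack bold (*text*)
    slackFormatted = slackFormatted.replace(/^# (.*$)/gim, '*$1*');
    slackFormatted = slackFormatted.replace(/^## (.*$)/gim, '*$1*');

    // Convert unordered list items ("* item") to arrow bullets
    slackFormatted = slackFormatted.replace(/^\* (.*$)/gim, '➡️ $1');

    // Convert the first Markdown table (header row, separator row, data rows)
    // into "*first column*: second column" lines
    const tableRegex = /\n\|.*\|\n\|.*\|\n((\|.*\|\n)+)/;
    const tableMatch = slackFormatted.match(tableRegex);
    if (tableMatch) {
        const table = tableMatch[0];
        // Skip the leading empty string, the header row, and the separator row;
        // drop the trailing empty string left by the final newline
        const rows = table.split('\n').slice(3, -1);
        const formattedRows = rows.map(row => {
            const columns = row.split('|').slice(1, -1).map(col => col.trim());
            return `*${columns[0]}*: ${columns[1]}`;
        }).join('\n');
        // Use a replacer function so "$" in cell text is not treated as a
        // replacement pattern, and keep the newlines around the table
        slackFormatted = slackFormatted.replace(table, () => `\n${formattedRows}\n`);
    }

    return slackFormatted;
}

const slackMarkdown = convertMarkdownToSlackFormat(inputMarkdown);
console.log(slackMarkdown);

// Return data in n8n's item format so the Slack node can read $json.slackFormattedMarkdown
return [{ json: { slackFormattedMarkdown: slackMarkdown } }];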

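This code converts the standard Markdown produced by the Markdown node into Slack-compatible formatting: headers become bold text, list items get arrow bullets, and the first table is flattened into “*label*: value” lines.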

Step 7: Send Notification to Slack

Add a Slack node to send a message to a chosen Slack channel. Configure the message to include the formatted Markdown text from the previous node and dynamically insert the organization name from the trigger node:

=*New Organization {{ $('Pipedrive Trigger - An Organization is created').item.json.current.name }} created on Pipedrive* :


 {{ $json.slackFormattedMarkdown }}

This message keeps your sales team informed with detailed company insights right when a new organization appears in your CRM.

5. Customizations ✏️

  • Use a Different Scraping API: Point the HTTP Request node at your preferred scraping service, or fetch the page directly with a plain HTTP Request. Update the URL and query parameters accordingly.
  • Adjust OpenAI Prompt: Modify the system prompt in the OpenAI node to tailor the summary style or add additional requested data points like industry trends or company leadership.
  • Add More CRM Fields: Extend the workflow to update additional custom fields in Pipedrive using the OpenAI output by parsing and mapping the data.
  • Notification Channel: Change the Slack node’s channel ID to alert different teams or create a conditional path to notify only on specific industries.
  • Model/Cost Optimization: Swap GPT-4o with a smaller model like GPT-3.5 if budget constraints exist, but expect less detailed summaries.

6. Troubleshooting 🔧

Problem: “No website URL found in Pipedrive data”
Cause: The custom website field ID is incorrect or missing.
Solution: Verify the custom field ID in Pipedrive, update it in the HTTP Request node’s expression to match exactly.

Problem: “OpenAI token limit exceeded or input too long”
Cause: Scraped HTML content is very large and GPT-4o has input size limits.
Solution: Limit scraping scope in the HTTP Request node, or pre-process content to reduce size before sending to OpenAI.
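
One way to pre-process the content is a Code node placed between the HTTP Request and OpenAI nodes. The sketch below strips scripts, styles, comments, and tags, then caps the length. `shrinkHtml` and the `maxChars` default are assumptions rather than part of the original workflow, and the regex approach is a rough heuristic, not a full HTML parser. Note that it sends plain text instead of HTML, so the system prompt may need a matching tweak.

```javascript
// Hypothetical helper: reduces scraped HTML to compact visible text.
function shrinkHtml(html, maxChars = 20000) {
  const text = html
    // Drop script, style, and noscript blocks together with their contents
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ')
    // Drop HTML comments
    .replace(/<!--[\s\S]*?-->/g, ' ')
    // Drop the remaining tags but keep their text content
    .replace(/<[^>]+>/g, ' ')
    // Collapse runs of whitespace
    .replace(/\s+/g, ' ')
    .trim();
  return text.slice(0, maxChars);
}

const sampleHtml =
  '<html><head><style>p{color:red}</style><script>var a = 1;</script></head>' +
  '<body><h1>Acme</h1><p>We build   rockets.</p></body></html>';
const shrunk = shrinkHtml(sampleHtml);
console.log(shrunk); // prints "Acme We build rockets."
```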

Problem: “Slack message formatting looks wrong”
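Cause: The conversion code in the Code node handles only headers, unordered lists, and the first table, so other Markdown constructs pass through unchanged.
Solution: Extend the JavaScript in the Code node based on Slack’s formatting documentation, or simplify the HTML that the OpenAI prompt asks for.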
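
As an example of such an adjustment, the sketch below covers two common mismatches: Slack uses single asterisks for bold and `<url|label>` for links. `convertInlineMarkdown` is a hypothetical helper that you could call inside the Code node from Step 6. It does not handle images or nested formatting.

```javascript
// Hypothetical helper: converts inline Markdown that Slack renders differently.
function convertInlineMarkdown(markdown) {
  return markdown
    // [label](https://url) -> <https://url|label>
    .replace(/\[([^\]]+)\]\((https?:\/\/[^\s)]+)\)/g, '<$2|$1>')
    // **bold** -> *bold*
    .replace(/\*\*(.+?)\*\*/g, '*$1*');
}

const inlineSample = 'Visit [Acme](https://acme.example) for **rockets**.';
const inlineResult = convertInlineMarkdown(inlineSample);
console.log(inlineResult); // prints "Visit <https://acme.example|Acme> for *rockets*."
```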

7. Pre-Production Checklist ✅

  • Verify Pipedrive API credentials connect and trigger fires on new organization creation.
  • Test the HTTP Request node with a sample company website URL to confirm HTML scraping works.
  • Check OpenAI node returns expected detailed HTML summary format.
  • Ensure the Pipedrive Note node attaches correctly with proper organization ID mapping.
  • Validate Markdown and Code nodes convert output properly for Slack.
  • Confirm Slack node sends messages to the right channel.
  • Run end-to-end test by creating a dummy organization and monitor logs for errors.

8. Deployment Guide

Once configured and tested, activate the workflow in n8n. Monitor executions via the n8n dashboard for successful runs and failures.

Set up alerting or logging if you need production monitoring. You can self-host n8n for full control or use the n8n cloud platform.

Review costs regularly, especially OpenAI usage: each run sends a full page of HTML to GPT-4o, so token charges grow with the number of new organizations.

9. FAQs

Q: Can I use a free web scraping tool instead of ScrapingBee?
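A: Yes, but make sure it returns reliable, structured HTML and that your use complies with applicable scraping laws. Also check how the service expects the target URL to be passed (query parameter, header, or body) and configure the HTTP Request node to match.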

Q: Does this use many OpenAI tokens?
A: Yes, GPT-4o consumes a large number of tokens especially with detailed HTML input and output. Budget accordingly.

Q: Is my data secure?
A: All connections use secure API tokens and OAuth. Keep credentials safe and use environment variables in n8n.

Q: Can I customize the Slack message format?
A: Absolutely, by modifying the Code node’s JavaScript and Slack node’s message template.

10. Conclusion

By following this guide, you’ve built a powerful automation that enriches your Pipedrive CRM organization records with deep insights pulled directly from company websites using the ScrapingBee API and GPT-4o. You save hours of manual data entry every week and ensure your sales team has richer context for every new lead.

Next, consider automations like updating deals with AI-generated risk scores or syncing enriched data with marketing platforms.

Keep experimenting, and you’ll unlock even more productivity gains with n8n and AI-powered workflows!
