Extract Web Page Entities with Google NLP in n8n Workflow

Struggling to manually extract meaningful entities from web pages? This n8n workflow automates entity extraction using Google’s Natural Language API and returns structured results such as people, organizations, and locations for any URL you provide.
Workflow Identifier: 1594
Nodes in Use: webhook, httpRequest, code, respondToWebhook, stickyNote
Automate entity extraction with Google NLP in n8n

What This Workflow Does

This workflow helps you get named entities from any webpage URL you send it.

You send a URL to a webhook, and the workflow fetches the page HTML.

Then it sends the HTML to Google Natural Language API for entity detection.

You get back a list of entities like people, organizations, and locations found on the page.

This saves you from reading and tagging web content by hand.


Who Should Use This Workflow

If you do content analysis and spend hours extracting company names, people, or places from web articles, this workflow is for you.

It is good for content analysts, marketers, or researchers wanting fast, structured text data.

Anyone needing quick entity info from any webpage without manual copy-paste will find it useful.


Tools and Services Used

  • n8n Workflow Automation Platform: Runs the automation with nodes.
  • Google Cloud Natural Language API: Detects entities in raw HTML.
  • Webhook node: Receives URLs via POST requests.
  • HTTP Request node: Fetches the web page HTML and calls the Google API.
  • Code node: Prepares the API request JSON with trimmed HTML.
  • Respond to Webhook node: Sends the entity data back to the caller.

Inputs, Processing, and Outputs Explained

Input

The workflow takes a POST request with a JSON body that has a “url” field.

Example input:

{
  "url": "https://example.com"
}

Processing

  1. The Webhook node listens for the incoming POST and extracts the URL.
  2. The HTTP Request node fetches the full HTML content of the page from that URL.
  3. The Code node trims the HTML to 100,000 characters if it is longer than that, then builds the JSON request body for the Google NLP API with the HTML as the document content.
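    // Clean and prepare HTML for the API request.
    // The page HTML is expected in the "data" field of the incoming item;
    // fall back to an empty string so a missing field does not throw.
    const html = $input.item.json.data ?? "";
    // Cap the content at 100,000 characters to keep the request small.
    const trimmedHtml = html.length > 100000 ? html.substring(0, 100000) : html;

    return {
      json: {
        apiRequest: {
          document: {
            type: "HTML",
            content: trimmedHtml
          },
          encodingType: "UTF8"
        }
      }
    };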
    
  4. A second HTTP Request node posts this JSON to Google Natural Language API’s analyzeEntities endpoint, sending the API Key as a query parameter.
  5. The Respond to Webhook node sends the Google API’s response back to the original requester.
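
The processing steps above can be sketched outside n8n as plain JavaScript. This is a minimal sketch, not the workflow's actual node configuration: the names trimHtml, buildEntitiesRequest, extractEntities, and MAX_HTML_CHARS are made up for illustration, the endpoint is assumed to be the v1 analyzeEntities URL, and a global fetch (Node 18 or later) is assumed.

```javascript
// Sketch of the workflow's processing steps as plain functions.
// All names here are illustrative; they do not appear in the n8n workflow.

const MAX_HTML_CHARS = 100000; // same cutoff the Code node uses

// Assumed endpoint: v1 of the Cloud Natural Language API.
const ENTITIES_ENDPOINT =
  "https://language.googleapis.com/v1/documents:analyzeEntities";

// Step 3a: cut oversized pages down to the first MAX_HTML_CHARS characters.
function trimHtml(html) {
  return html.length > MAX_HTML_CHARS ? html.substring(0, MAX_HTML_CHARS) : html;
}

// Steps 3b and 4: build the URL and fetch options for the Google call.
// The API key travels as the "key" query parameter.
function buildEntitiesRequest(html, apiKey) {
  const url = `${ENTITIES_ENDPOINT}?key=${encodeURIComponent(apiKey)}`;
  const body = {
    document: { type: "HTML", content: trimHtml(html) },
    encodingType: "UTF8",
  };
  return {
    url,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// Steps 2, 4 and 5 chained: fetch the page, call Google, return the JSON.
async function extractEntities(pageUrl, apiKey) {
  const page = await fetch(pageUrl);
  const html = await page.text();
  const { url, options } = buildEntitiesRequest(html, apiKey);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Google API returned HTTP ${response.status}`);
  }
  return response.json();
}
```

Keeping the request builder separate from the network call makes it possible to inspect the exact URL and body before anything is sent, which helps when debugging API key or payload problems.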

Output

The caller receives JSON containing entity details.

Each entity includes a type such as PERSON, ORGANIZATION, or LOCATION, along with a salience score, metadata, and the text mentions where it appears.
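
To show how that response might be consumed, here is a sketch that lists entities by salience. The sample object is invented for illustration and only mimics the shape of an analyzeEntities response (an entities array whose items carry name, type, salience, metadata, and mentions); summarizeEntities is a hypothetical helper, not part of the workflow.

```javascript
// Invented sample that mimics the shape of an analyzeEntities response.
const sampleResponse = {
  entities: [
    {
      name: "Jane Doe",
      type: "PERSON",
      salience: 0.25,
      metadata: {},
      mentions: [
        { text: { content: "Jane Doe", beginOffset: 40 }, type: "PROPER" },
        { text: { content: "founder", beginOffset: 90 }, type: "COMMON" },
      ],
    },
    {
      name: "Example Corp",
      type: "ORGANIZATION",
      salience: 0.62,
      metadata: {},
      mentions: [
        { text: { content: "Example Corp", beginOffset: 10 }, type: "PROPER" },
      ],
    },
  ],
  language: "en",
};

// Hypothetical helper: one compact record per entity, most salient first.
function summarizeEntities(response) {
  const entities = response.entities ?? [];
  return [...entities]
    .sort((a, b) => b.salience - a.salience)
    .map((entity) => ({
      name: entity.name,
      type: entity.type,
      salience: entity.salience,
      mentionCount: (entity.mentions ?? []).length,
    }));
}

const summary = summarizeEntities(sampleResponse);
console.log(summary);
```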


Beginner Step-by-Step: How to Use This Workflow in n8n Production

Step 1: Import Workflow

  1. Download the workflow file using the Download button on this page.
  2. Open n8n editor and choose “Import from File”.
  3. Select the downloaded workflow JSON file to import it.

Step 2: Configure Credentials and Settings

  1. Open the Google Entities HTTP Request node.
  2. Replace YOUR-GOOGLE-API-KEY in the query parameters with your actual Google Cloud API Key.
  3. If there are any IDs, emails, channel names, or table references in the workflow nodes, update them properly for your setup.

Step 3: Test the Workflow

  1. Send a POST request to the webhook URL with JSON body:
    {
      "url": "https://example.com"
    }
    
  2. Check the response to verify entity extraction works as expected.
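
One way to send that test request from a script is sketched below. It rests on a few assumptions: WEBHOOK_URL is a placeholder (copy the real URL from your Webhook node), buildWebhookRequest and testWorkflow are hypothetical helper names, and a global fetch (Node 18 or later) is available.

```javascript
// Placeholder: replace with the URL shown in your Webhook node.
const WEBHOOK_URL = "https://your-n8n-host.example/webhook/your-path";

// Build the POST request the workflow expects: a JSON body with a "url" field.
function buildWebhookRequest(pageUrl) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url: pageUrl }),
  };
}

// Send the request and return the parsed entity response.
async function testWorkflow(pageUrl) {
  const response = await fetch(WEBHOOK_URL, buildWebhookRequest(pageUrl));
  if (!response.ok) {
    throw new Error(`Webhook returned HTTP ${response.status}`);
  }
  return response.json();
}

// Example call (not run here, because it needs a live workflow):
// testWorkflow("https://example.com").then((data) => console.log(data.entities));
```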

Step 4: Activate Workflow for Production

  1. Save the workflow if you made changes.
  2. Toggle the Active switch to make the workflow live.
  3. Start using the webhook URL in your other apps or automations.
  4. If you want more control over where the workflow runs, consider self-hosting n8n.

Common Problems and How to Fix Them

  • 403 Forbidden or Invalid API Key errors: Check API key correctness and that Google’s Natural Language API is enabled in Google Cloud.
  • Empty responses from webhook: Make sure you send a POST request, not a GET request, and that its JSON body contains a valid “url” field.
  • Google API request size too large: The Code node trims input HTML to 100,000 characters by default; adjust this if needed.
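
A missing or malformed “url” field can be caught early by validating the incoming body before any page is fetched. The sketch below is one possible check; validateUrlInput is a hypothetical helper and uses only the standard URL class. In a Code node, the body would come from the Webhook node's output item, and the exact field path depends on your setup.

```javascript
// Hypothetical helper: check that a webhook body carries a usable "url" field.
// Returns { ok: true, url } or { ok: false, error }.
function validateUrlInput(body) {
  if (!body || typeof body.url !== "string" || body.url.trim() === "") {
    return { ok: false, error: 'Missing "url" field in JSON body' };
  }
  let parsed;
  try {
    parsed = new URL(body.url.trim());
  } catch {
    return { ok: false, error: `Not a valid URL: ${body.url}` };
  }
  // Only web pages make sense here, so reject other schemes.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    return { ok: false, error: `Unsupported protocol: ${parsed.protocol}` };
  }
  return { ok: true, url: parsed.href };
}

console.log(validateUrlInput({ url: "https://example.com" }));
console.log(validateUrlInput({}));
```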

Customization Ideas

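  • Add Sentiment Analysis: The analyzeEntities request does not take a features field. To get per-entity sentiment, point the Google Entities node at the analyzeEntitySentiment endpoint instead, or call annotateText and add features: {extractEntitySentiment: true} to the request JSON in the Code node.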
  • Filter Entities: Add a Code node after Google Entities to keep only certain types like PERSON or ORGANIZATION.
  • Save to Google Sheets: Add a Google Sheets node after entity extraction to log data for reports.
  • Change NLP Features: Modify the Google API endpoint or parameters to include syntax analysis or content classification.
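
The Filter Entities and Save to Google Sheets ideas might look like this in a Code node placed after the Google call. This is a sketch under assumptions: filterEntitiesByType and toSheetRows are hypothetical names, the column layout is an arbitrary choice, the sample data is invented, and the input is assumed to contain the entities array described in the Output section.

```javascript
// Keep only entities whose type is in the allowed list.
function filterEntitiesByType(response, allowedTypes) {
  const allowed = new Set(allowedTypes);
  return (response.entities ?? []).filter((entity) => allowed.has(entity.type));
}

// Flatten entities into simple rows, for example for a Google Sheets node.
// The column names are an arbitrary choice for this sketch.
function toSheetRows(entities, sourceUrl) {
  return entities.map((entity) => ({
    source_url: sourceUrl,
    name: entity.name,
    type: entity.type,
    salience: entity.salience,
  }));
}

// Invented sample data for illustration.
const response = {
  entities: [
    { name: "Example Corp", type: "ORGANIZATION", salience: 0.6 },
    { name: "Berlin", type: "LOCATION", salience: 0.3 },
    { name: "Jane Doe", type: "PERSON", salience: 0.1 },
  ],
};

const kept = filterEntitiesByType(response, ["PERSON", "ORGANIZATION"]);
const rows = toSheetRows(kept, "https://example.com");
console.log(rows);

// Inside an n8n Code node you would return one item per row instead:
// return rows.map((row) => ({ json: row }));
```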

Summary

✓ Automate named entity extraction from any webpage URL with this workflow.

✓ Save time and remove manual copy-paste or tagging errors.

✓ Get structured entity data instantly, including types and salience scores.

✓ Easy to configure for production by importing, adding your API key, testing, and activating.

✓ Expand with sentiment analysis, filtering, or saving results as needed.


Frequently Asked Questions

Check if the Google Cloud Natural Language API is enabled and use the correct API key in the HTTP Request node’s query parameters.
Send a POST request with a JSON body containing a “url” field holding the webpage address to analyze.
Ensure the Code node trims the webpage HTML content to under 100,000 characters or adjust as per Google API limits.
Yes, the workflow imports normally and runs on self-hosted n8n.
