Build a RAG & GenAI App for WordPress with n8n

This workflow automates embedding your WordPress content into a vector database and enables Generative AI-powered chat on your website. Save hours on manual content indexing and enhance user engagement with live AI responses sourced from your site content.
Workflow Identifier: 1160
NODES in Use: Manual Trigger, Wordpress, Merge, Set, Filter, Markdown, Token Splitter, Default Data Loader, Embeddings OpenAI, Supabase Vector Store, Schedule Trigger, Postgres, Switch, Loop Over Items, AI Agent, Respond to Webhook, Supabase

Opening Problem Statement

Meet Sarah, a content manager for a growing WordPress site packed with hundreds of blog posts and pages. Every time she wants to power up her website with AI chat capabilities that provide accurate answers based on her content, she faces a tedious and error-prone process of manually extracting, converting, and indexing her posts. With constant updates and new published content, Sarah often wastes hours ensuring the AI knowledge base is up to date. Any missed content or outdated embeddings leads to irrelevant responses and poor user experience, costing potential leads and lowering visitor retention.

This scenario is exactly what the “RAG & GenAI App With WordPress Content” workflow solves. It fully automates the retrieval of WordPress content, applies advanced text embeddings, stores them in a vector database, and enables a powerful chat interface to interact with site visitors using Generative AI models like OpenAI’s GPT. By eliminating manual workflow steps, Sarah can save hours weekly, ensure fresh and relevant AI responses, and dramatically boost her website’s interactivity and user satisfaction.

What This Automation Does

When activated, this workflow automates the entire lifecycle of creating and maintaining a Generative AI application for your WordPress website content. Here’s what happens:

  • Retrieves all published and unprotected WordPress posts and pages via the WordPress REST API, including items modified since the last run.
  • Converts HTML content into clean Markdown suitable for embedding and AI comprehension.
  • Splits large text content into smaller chunks with a token-based splitter, using overlapping chunks so that context is not lost at chunk boundaries.
  • Generates vector embeddings using OpenAI’s text-embedding-3-small model for each content chunk.
  • Stores embeddings and metadata in a Supabase-backed vector store and Postgres tables with support for efficient similarity searches and chat memory.
  • Enables a chat interface triggered by visitor questions that retrieves relevant documents from your vector store to formulate real-time AI-powered responses via OpenAI GPT-4o-mini.
  • Maintains embedding and workflow execution history to update embeddings incrementally only for new or modified content, avoiding unnecessary processing.

This workflow reduces manual indexing time from hours to minutes and ensures your AI chat always reflects the latest website content, improving accuracy and visitor engagement.

Prerequisites ⚙️

  • n8n automation platform account (cloud or self-hosted) 🔌
  • WordPress website with REST API access and an API credential 🔑
  • OpenAI account with API key for embedding and chat models 🔐
  • Supabase account configured for vector storage, or a PostgreSQL database with the pgvector extension enabled for vector similarity search 📁
  • Basic familiarity with n8n workflow editor and credentials configuration ⏱️

Step-by-Step Guide to Build This Workflow

1. Set Up WordPress API Access

In n8n, create credentials for your WordPress site API to retrieve posts and pages. This requires the site URL plus a username and password; a WordPress application password is the usual choice.

Navigation: Credentials → New Credential → WordPress API

Tip: Ensure your REST API endpoints like /wp-json/wp/v2/posts and /wp-json/wp/v2/pages are accessible.
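To check the endpoints outside n8n, it can help to build the request URLs explicitly. The following is a minimal sketch, not part of the workflow itself: the helper name build_wp_url and the example domain are hypothetical, while status, per_page, and page are standard WordPress REST API query parameters.

```python
from urllib.parse import urlencode


def build_wp_url(site_url: str, resource: str, page: int = 1, per_page: int = 100) -> str:
    """Build a WordPress REST API list URL for 'posts' or 'pages'.

    Hypothetical helper for manual testing; n8n's WordPress node
    builds equivalent requests internally.
    """
    if resource not in ("posts", "pages"):
        raise ValueError("resource must be 'posts' or 'pages'")
    # per_page is capped at 100 by the WordPress REST API.
    query = urlencode({"status": "publish", "per_page": per_page, "page": page})
    return f"{site_url.rstrip('/')}/wp-json/wp/v2/{resource}?{query}"


# Example domain is a placeholder.
print(build_wp_url("https://example.com/", "posts"))
```

Opening the printed URL in a browser or an API client should return a JSON array if the endpoint is reachable.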

2. Trigger Initial Content Retrieval Manually

Use the Manual Trigger node (named “When clicking ‘Test workflow’”) to start the content loading process manually.

After you click the button (labelled “Test workflow” or “Execute Workflow” depending on your n8n version), the workflow pulls all posts and pages using the WordPress nodes.

3. Merge and Filter WordPress Content

Combine posts and pages with the Merge node for unified processing. Then use a Set node to extract relevant fields like title, content, URL, publication and modification dates.

Apply the Filter node “Only published & unprotected content” to exclude drafts and protected content not meant for public AI indexing.
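The filter condition can be sketched in plain Python as follows. This assumes the raw WordPress REST response shape, where each item has a status field and a content.protected flag; the field names after the Set node in the actual workflow may differ, and is_public is a hypothetical name.

```python
def is_public(item: dict) -> bool:
    """Return True for published items whose content is not password protected.

    Assumes the raw WordPress REST shape: item["status"] and
    item["content"]["protected"].
    """
    content = item.get("content", {})
    return item.get("status") == "publish" and not content.get("protected", False)


# Sample items are made up for illustration.
items = [
    {"id": 1, "status": "publish", "content": {"rendered": "<p>Hi</p>", "protected": False}},
    {"id": 2, "status": "draft", "content": {"rendered": "<p>WIP</p>", "protected": False}},
    {"id": 3, "status": "publish", "content": {"rendered": "", "protected": True}},
]
public_items = [item for item in items if is_public(item)]
print([item["id"] for item in public_items])
```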

4. Convert HTML Content to Markdown

Pass the content to the Markdown node to transform HTML into clean markdown text better suited for embedding generation.
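To illustrate what this conversion does, here is a deliberately small sketch built on Python's standard html.parser. It only handles headings, paragraphs, links, bold text, and list items, and it is not the algorithm the n8n Markdown node uses; the class and function names are hypothetical.

```python
from html.parser import HTMLParser

HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")


class MarkdownSketch(HTMLParser):
    """Tiny HTML-to-Markdown converter covering a handful of tags."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.href = ""

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.parts.append("#" * int(tag[1]) + " ")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag == "li":
            self.parts.append("- ")
        elif tag == "a":
            # Remember the link target until the closing tag.
            self.href = dict(attrs).get("href") or ""
            self.parts.append("[")

    def handle_endtag(self, tag):
        if tag in HEADINGS or tag == "p":
            self.parts.append("\n\n")
        elif tag in ("strong", "b"):
            self.parts.append("**")
        elif tag == "li":
            self.parts.append("\n")
        elif tag == "a":
            self.parts.append(f"]({self.href})")
            self.href = ""

    def handle_data(self, data):
        self.parts.append(data)


def html_to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


sample = '<h2>Title</h2><p>See <a href="https://example.com">docs</a> and <strong>bold</strong>.</p>'
print(html_to_markdown(sample))
```

Stripping the markup this way leaves text that is shorter and less noisy to embed than raw HTML.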

5. Split Text With Token Splitter

Feed the markdown content to a Token Splitter node configured to split text into chunks of 300 tokens with 30 token overlaps. This ensures manageable chunk sizes for OpenAI embeddings.
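The sliding-window idea behind the 300/30 setting can be sketched as below. This assumes a pre-tokenized list; whitespace-separated words stand in for real model tokens here, whereas the Token Splitter node counts actual tokenizer tokens. The function name split_tokens is hypothetical.

```python
def split_tokens(tokens: list[str], chunk_size: int = 300, overlap: int = 30) -> list[list[str]]:
    """Split tokens into chunks of chunk_size where neighbours share `overlap` tokens."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # each new chunk starts this many tokens later
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


# Stand-in tokens: 700 numbered placeholders instead of real tokenizer output.
tokens = [str(i) for i in range(700)]
chunks = split_tokens(tokens)
print([len(chunk) for chunk in chunks])
```

With these settings each chunk starts 270 tokens after the previous one, so the final 30 tokens of one chunk reappear at the start of the next.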

6. Load Documents and Generate Vector Embeddings

Use the Default Data Loader node to prepare each chunk with metadata, then generate embeddings with the Embeddings OpenAI node using the “text-embedding-3-small” model.

7. Store Embeddings in Supabase/Postgres

The embeddings are inserted into the Supabase vector store via the Supabase Vector Store node. For PostgreSQL, the workflow creates required tables including a vector-enabled documents table and tracking table n8n_website_embedding_histories.

This step includes SQL commands to enable the pgvector extension and create an efficient similarity search function.

8. Set Up Scheduled Trigger for Incremental Updates

The Schedule Trigger node fires every 30 seconds (configurable), initiating a query to find posts and pages modified since the last workflow execution. These recent changes are filtered, converted, chunked, re-embedded, and stored just like the initial run.
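The "modified since the last execution" check can be sketched as a simple timestamp comparison. This is a client-side illustration only, assuming each item carries the WordPress modified_gmt field and that the last run time is read from the workflow's history table; the workflow may instead filter on the server side, and the function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_gmt(value: str) -> datetime:
    """Parse a WordPress 'modified_gmt' value such as '2024-05-01T12:00:00' as UTC."""
    return datetime.fromisoformat(value).replace(tzinfo=timezone.utc)


def modified_since(items: list[dict], last_run: datetime) -> list[dict]:
    """Keep only items changed after the previous workflow execution."""
    return [item for item in items if parse_gmt(item["modified_gmt"]) > last_run]


# Made-up sample data and last-run timestamp.
last_run = datetime(2024, 5, 1, tzinfo=timezone.utc)
items = [
    {"id": 1, "modified_gmt": "2024-04-30T23:59:59"},
    {"id": 2, "modified_gmt": "2024-05-01T00:00:01"},
]
print([item["id"] for item in modified_since(items, last_run)])
```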

9. Handle New and Existing Documents Differently

The PostgreSQL node checks if documents exist based on the ID metadata. A Switch node routes new documents to insertion and existing ones to deletion plus reinsertion to refresh embeddings.
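The routing decision made by the Switch node can be sketched like this. It assumes you already have the IDs found in the vector store metadata; route_documents and the route names "insert" and "refresh" are hypothetical labels, not names from the workflow.

```python
def route_documents(incoming_ids: list[int], existing_ids: list[int]) -> dict[str, list[int]]:
    """Split incoming document IDs into new ones and ones that need a refresh.

    'insert'  -> not yet in the vector store, embed and insert.
    'refresh' -> already stored, delete old chunks and then reinsert.
    """
    existing = set(existing_ids)
    routes: dict[str, list[int]] = {"insert": [], "refresh": []}
    for doc_id in incoming_ids:
        routes["refresh" if doc_id in existing else "insert"].append(doc_id)
    return routes


# Made-up IDs: 12 is already stored, 40 and 41 are new.
print(route_documents([12, 40, 41], existing_ids=[7, 12]))
```

Deleting before reinserting avoids leaving stale chunks behind when an edited post produces fewer chunks than before.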

10. Enable AI-Powered Chat Interface

The “When chat message received” trigger exposes a webhook that listens for visitor queries. The visitor’s question is embedded, and that embedding is used to retrieve the most relevant chunks from the vector store.

The AI Agent node leverages the GPT-4o-mini model with a tailored system prompt to answer questions, integrating metadata context like URLs and content dates directly in responses.

Finally, the Respond to Webhook node sends the AI-generated reply back to the visitor’s chat interface.
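The retrieval step boils down to ranking stored chunks by similarity to the question's embedding. Below is a minimal sketch using cosine similarity on toy two-dimensional vectors; in the workflow this ranking happens inside the database, and the function names and sample data here are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], chunks: list[dict], k: int = 3) -> list[dict]:
    """Return the k chunks whose embeddings are most similar to the query."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query, chunk["embedding"]),
        reverse=True,
    )
    return ranked[:k]


# Toy 2-D embeddings; real ones have many more dimensions.
chunks = [
    {"text": "pricing page", "embedding": [1.0, 0.0]},
    {"text": "team page", "embedding": [0.0, 1.0]},
    {"text": "pricing FAQ", "embedding": [1.0, 1.0]},
]
best = top_k([1.0, 0.0], chunks, k=2)
print([chunk["text"] for chunk in best])
```

The retrieved chunks, together with their metadata, are what the agent receives as context when composing its answer.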

Customizations ✏️

  • Adjust chunk size and overlap in the Token Splitter node to tune memory and context for embeddings. Smaller chunks reduce token usage but might lose context; larger chunks improve context but use more tokens.
  • Extend metadata passed in the Default Data Loader’s metadataValues to include custom WordPress fields or tags for richer information in your AI responses.
  • Use different OpenAI models by changing the Embeddings and Chat Model nodes’ model parameters to versions that suit your budget or required performance.
  • Increase polling frequency in the Schedule Trigger node to handle more frequent content updates from your site.
  • Secure the chat webhook by configuring authentication or IP whitelisting on the “When chat message received” node.

Troubleshooting 🔧

  • Problem: “OpenAI API rate limit exceeded”
    Cause: Too many requests sent in a short time by embedding or chat nodes.
    Solution: Add delays between nodes or reduce polling frequency in Schedule Trigger, and monitor usage in your OpenAI dashboard.
  • Problem: “WordPress API returns empty or error response”
    Cause: Incorrect API credentials or endpoint restrictions.
    Solution: Verify the WordPress credentials stored in n8n and ensure the site allows REST API access. Test endpoints with tools like Postman.
  • Problem: “Embedding insert failed due to PostgreSQL vector dimension mismatch”
    Cause: Mismatched vector size in pgvector extension and OpenAI embeddings.
    Solution: Confirm OpenAI embedding model output dimension matches PostgreSQL vector column size (e.g., 1536 for text-embedding-3-small).
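For the dimension mismatch case, a quick guard before inserting can surface the problem early. This is a minimal sketch assuming the 1536 size mentioned above; check_dimension is a hypothetical helper, not something the workflow provides.

```python
EXPECTED_DIM = 1536  # column size assumed from the text-embedding-3-small example above


def check_dimension(embedding: list[float], expected: int = EXPECTED_DIM) -> None:
    """Raise ValueError if the embedding length differs from the vector column size."""
    if len(embedding) != expected:
        raise ValueError(
            f"embedding has {len(embedding)} dimensions, but the column expects {expected}"
        )


check_dimension([0.0] * 1536)  # passes silently

try:
    check_dimension([0.0] * 3072)  # e.g. after switching to a larger embedding model
except ValueError as error:
    print(error)
```

If you change the embedding model, the vector column and every stored embedding need to be recreated with the new size.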

Pre-Production Checklist ✅

  • Verify API credentials for WordPress, OpenAI, Supabase/PostgreSQL in n8n under Credentials section.
  • Confirm PostgreSQL has pgvector extension enabled and tables created successfully by running setup queries.
  • Test manual workflow run via the Manual Trigger node to see if posts/pages are retrieved and embeddings generated.
  • Send test chat input to your webhook URL and observe if AI returns relevant answers with context metadata.
  • Backup Postgres database and Supabase data before enabling scheduled triggers.

Deployment Guide

Activate the scheduled trigger node to start periodic incremental embedding updates based on your website’s publishing frequency. Monitor the workflow’s execution logs in n8n to quickly catch any errors during API calls or database inserts.

For hosting, you can use n8n cloud or self-host on your own infrastructure. If you self-host, pick a provider with reliable uptime, since the scheduled trigger and chat webhook both depend on the instance staying online.

Maintain logs for chat queries to audit AI responses and improve prompt engineering over time.

FAQs

  • Q: Can I use a different vector database instead of Supabase?
    A: Yes, you can switch to any vector-capable database supported by LangChain in n8n, but you’ll need to adjust the vector store nodes accordingly.
  • Q: Does embedding consume many OpenAI API credits?
    A: Yes, embeddings and chat completions use your OpenAI quota. Efficient chunking and update scheduling help minimize costs.
  • Q: Is my website content secure during this process?
    A: All API communications happen over HTTPS. You can add authentication for the webhook node to protect chat access.

Conclusion

By following this guide, you’ve created a powerful automated pipeline that fetches WordPress content, converts it, generates OpenAI embeddings, and enables a smart chat interface with contextual, up-to-date answers. This setup saves you countless hours of manual indexing and empowers engagement with precise AI responses sourced from your latest website content.

Next, you might explore adding support for additional languages, integrating more advanced AI prompt engineering, or expanding the chat interface with multimedia support.

With this automation, you’re well-equipped to turn your WordPress site into an AI-powered hub that not only informs but interacts seamlessly with your visitors.
