Build an AI-Powered WhatsApp Chatbot with n8n & OpenAI

This workflow automates AI-driven responses on WhatsApp for texts, voice, images, and PDFs, saving hours in customer interaction handling with instant, smart replies powered by OpenAI and n8n.
whatsAppTrigger
openAi
httpRequest
+9
Workflow Identifier: 1175
NODES in Use: whatsAppTrigger, httpRequest, openAi, lmChatOpenAi, agent, extractFromFile, if, code, set, whatsApp, switch, memoryBufferWindow

Press CTRL+F5 if the workflow didn't load.

Learn how to Build this Workflow with AI:

Visit through Desktop for Best experience

What this workflow does

This workflow helps handle different types of WhatsApp messages automatically using n8n and OpenAI.

It solves the problem of spending many hours manually reading and replying to texts, voice notes, images, and PDFs on WhatsApp.

The outcome is faster, smarter replies that make customer chats easier and more accurate.


Who should use this workflow

This is for anyone using WhatsApp Business API who wants to reply quickly to multi-format messages.

It is good if you get many voice notes, photos, PDFs, or long texts from clients.

Non-technical users can benefit if they use n8n and OpenAI for automation.


Tools and services used

  • WhatsApp Business API: Receives messages and sends replies.
  • n8n: Automates the workflow with nodes to process different inputs.
  • OpenAI GPT-4o-mini model: Creates smart text replies and analyzes images.
  • OpenAI Whisper model: Converts voice messages to text.
  • HTTP Request nodes: Download media like audio, images, and documents.
  • Credential nodes: Manage secure access to WhatsApp API and OpenAI API.

Inputs, processing steps, and outputs

Inputs

  • Incoming WhatsApp messages which may be of different types: text, voice audio, images, PDFs, or other documents.

Processing steps

  • The WhatsApp Trigger node catches every new message.
  • A Switch node sorts messages into text, audio, image, PDF document, or unsupported types.
  • Texts go to the AI node for response generation.
  • Voice messages get media URLs, are downloaded, then transcribed by OpenAI Whisper before AI reply.
  • Images are downloaded and analyzed by OpenAI GPT-4o-mini to create detailed descriptions.
  • PDFs are downloaded, text is extracted, summarized, then used by AI to answer.
  • Unsupported types get a polite notification about allowed formats.
  • At the end, replies are sent back via WhatsApp. Audio replies use a special fix to set correct MIME types so they play properly.

Outputs

  • Clear, relevant text or audio replies sent to users on WhatsApp.

Beginner step-by-step: How to build this in n8n

1. Import the workflow

  1. Download the workflow file using the Download button on this page.
  2. Open the n8n editor and click “Import from File” to load the workflow.

2. Configure credentials

  1. Add WhatsApp API OAuth credentials in n8n Credential Manager.
  2. Add OpenAI API Key with access to GPT-4o-mini and audio/image models.
  3. Update any IDs, emails, or channel info if your setup requires it.

3. Test and activate

  1. Send test messages in different formats to your WhatsApp business number.
  2. Check if replies come correctly: text replies for text messages, transcriptions for voice, descriptions for images, and summaries for PDFs.
  3. When satisfied, toggle the workflow live in n8n to run in production.

You can explore self-host n8n if you want full control on a private server setup.


Handling message types: Input → Process → Output

  • Text: Input text is sent to AI GPT-4o-mini which generates a quick reply text. Output is text sent back on WhatsApp.
  • Voice messages: Input audio media ID fetches URL → download the audio → transcribe with OpenAI Whisper → send the transcript to GPT-4o-mini for reply → output text or audio reply.
  • Images: Image media downloaded → base64 encoded → analyzed by GPT-4o-mini for detailed descriptions → output descriptive text reply.
  • PDF documents: Document URL fetched → file download → text extracted → summarized by AI → output reply with summary or main points.
  • Unsupported: If message type is not one above, output a polite WhatsApp message listing allowed content types.

Edge cases and failures

  • If WhatsApp media URL fails, check OAuth credentials and refresh tokens.
  • If OpenAI API returns an error, verify API keys and monitor the quota usage.
  • If audio replies do not play, the workflow fixes MIME types using a Code node before sending.

Possible customizations

  • Switch OpenAI model in the AI Agent node to change reply style or complexity.
  • Add multi-language support by changing AI prompt to detect and reply in different languages.
  • Tweak audio voice or quality in audio generation node for branding.
  • Extend supported files beyond PDFs by adding checks and extraction for other document types.
  • Customize message templates with branding or extra info in the response Set nodes.

Summary of outcomes

→ Saves hours per day by automating message handling on WhatsApp.

→ Reduces transcription mistakes in voice messages.

→ Creates quick, clear, and helpful replies for texts, images, audio, and documents.

→ Improves customer conversations with richer content and less manual work.


Frequently Asked Questions

This workflow is built specifically for WhatsApp Business API. The AI parts can be adapted, but other chat platforms need different message trigger and send nodes.
Yes, generating replies and analyzing media calls OpenAI APIs. Usage costs depend on message volume and prompt complexity.
Messages are processed in real time and not stored permanently. This limits sensitive data exposure following good security practices.
A Code node fixes the audio MIME type before sending. Make sure this node runs correctly to fix playback issues.

Promoted by BULDRR AI

Related Workflows

Automate Viral UGC Video Creation Using n8n + Degaus (Beginner-Friendly Guide)

Learn how to automate viral UGC video creation using n8n, AI prompts, and Degaus. This beginner-friendly guide shows how to import, configure, and run the workflow without technical complexity.
Form Trigger
Google Sheets
Gmail
+37
Free

AI SEO Blog Writer Automation Workflows in n8n

A complete beginner guide to building an AI SEO blog writer automation using n8n.
AI Agent
Google Sheets
httpRequest
+5
Free

Automate CrowdStrike Alerts with VirusTotal, Jira & Slack

This workflow automates processing of CrowdStrike detections by enriching threat data via VirusTotal, creating Jira tickets for incident tracking, and notifying teams on Slack for quick response. Save hours daily by transforming complex threat data into actionable alerts effortlessly.
scheduleTrigger
httpRequest
jira
+5
Free

Automate Telegram Invoices to Notion with AI Summaries & Reports

Save hours on financial tracking by automating invoice extraction from Telegram photos to Notion using Google Gemini AI. This workflow extracts data, records transactions, and generates detailed spending reports with charts sent on schedule via Telegram.
lmChatGoogleGemini
telegramTrigger
notion
+9
Free

Automate Email Replies with n8n and AI-Powered Summarization

Save hours managing your inbox with this n8n workflow that uses IMAP email triggers, AI summarization, and vector search to draft concise replies requiring minimal review. Automate business email processing efficiently with AI guidance and Gmail integration.
emailReadImap
vectorStoreQdrant
emailSend
+12
Free

Automate Email Campaigns Using n8n with Gmail & Google Sheets

This n8n workflow automates personalized email outreach campaigns by integrating Gmail and Google Sheets, saving hours of manual follow-up work and reducing errors in email sequences. It ensures timely follow-ups based on previous email interactions, optimizing communication efficiency.
googleSheets
gmail
code
+5
Free