Automate WhatsApp AI Chatbot with n8n and Google Gemini

This n8n workflow automates WhatsApp message handling with AI, processing audio, video, image, and text inputs using Google Gemini and LangChain. It transcribes, describes, and summarizes messages to generate intelligent chatbot responses, saving hours of manual support time.
Workflow Identifier: 1344
Nodes in Use: WhatsApp Trigger, WhatsApp, HTTP Request, Split Out, Switch, Set, memoryBufferWindow, Chain LLM, AI Agent, lmChatGoogleGemini, toolWikipedia

What This Workflow Does

This workflow automates WhatsApp message handling using n8n. It processes messages with audio, video, images, or text. The workflow downloads media securely from WhatsApp servers, then uses Google Gemini and LangChain AI to transcribe, describe, or summarize the content. Finally, it sends a smart reply back to the user automatically.

The workflow replaces manual reading and triaging of incoming messages, which saves time and reduces mistakes. The outcome is faster, more accurate replies to customers and better support quality.


Who Should Use This Workflow

Businesses receiving many WhatsApp messages with mixed media can use this workflow. Customer support managers who want to reply faster with AI help will find it useful.

Anyone using WhatsApp Business Cloud API and wanting to add AI transcription, image analysis, or smart replies can benefit.


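Tools and Services Used

  • n8n (workflow engine)
  • WhatsApp Business Cloud API (WhatsApp Trigger and WhatsApp nodes)
  • Google Gemini API (audio and video processing, chat model)
  • LangChain nodes (Chain LLM, AI Agent, memoryBufferWindow)
  • Wikipedia tool (external knowledge for the AI Agent)
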
Inputs, Processing Steps, and Output

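Inputs

Incoming WhatsApp messages containing audio, video, images, or text, received through the WhatsApp Trigger node.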

Processing Steps

  1. Start with WhatsApp Trigger to listen for new messages.
  2. Use Split Out node to separate multiple messages.
  3. Route messages by type with Switch node.
  4. For media messages, get secure download URLs using WhatsApp nodes.
  5. Download audio, video, and image files via HTTP Request nodes with WhatsApp credentials.
  6. Send audio files to Google Gemini API for transcription.
  7. Send video files to Google Gemini API for descriptive text.
  8. Analyze images with a LangChain Chain LLM node backed by the Google Gemini chat model to describe their content.
  9. Summarize plain text messages using LangChain.
  10. Create a structured message object combining text, captions, and sender info.
  11. Keep conversation history in memory with the LangChain memoryBufferWindow node, keyed to the user's phone number.
  12. Generate AI replies with the LangChain AI Agent node, drawing on conversation context and external knowledge from Wikipedia.
  13. Send AI-generated responses back via WhatsApp node to the user.
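
Steps 2 to 4 can be sketched in plain Python to show what the Split Out and Switch nodes do with the data. This is a minimal sketch, not the workflow's actual node configuration: it assumes the trigger output follows the WhatsApp Cloud API webhook shape (a `messages` array whose items carry a `type` field), and the function names and branch labels are hypothetical.

```python
from typing import Any, Optional

# Message types this workflow downloads as media (per the steps above).
MEDIA_TYPES = {"audio", "video", "image"}


def split_out(value: dict[str, Any]) -> list[dict[str, Any]]:
    """Mimic the Split Out node: one item per entry in `messages`."""
    return list(value.get("messages", []))


def route(message: dict[str, Any]) -> str:
    """Mimic the Switch node: choose a branch from the message `type`."""
    msg_type = message.get("type", "")
    if msg_type in MEDIA_TYPES or msg_type == "text":
        return msg_type
    return "unsupported"  # hypothetical fallback branch


def media_id(message: dict[str, Any]) -> Optional[str]:
    """Media messages carry their media ID under a key named after the type."""
    msg_type = message.get("type", "")
    if msg_type not in MEDIA_TYPES:
        return None
    return message.get(msg_type, {}).get("id")


# Sample data with made-up IDs and phone number.
sample_value = {
    "messaging_product": "whatsapp",
    "messages": [
        {"from": "15550001111", "id": "wamid.A", "type": "text",
         "text": {"body": "Hello"}},
        {"from": "15550001111", "id": "wamid.B", "type": "audio",
         "audio": {"id": "MEDIA_ID_1", "mime_type": "audio/ogg"}},
        {"from": "15550001111", "id": "wamid.C", "type": "sticker",
         "sticker": {"id": "MEDIA_ID_2"}},
    ],
}

for item in split_out(sample_value):
    print(route(item), media_id(item))
```

The media ID returned here is what the WhatsApp node exchanges for a secure download URL in step 4.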

Output

A context-aware WhatsApp reply that answers the customer’s question or concern, including summarized or described content from media.
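
For reference, the reply sent in the last step corresponds to a Cloud API text message. The WhatsApp node builds this for you, so the sketch below is only an illustration of the equivalent payload; `build_text_reply` is a hypothetical helper and the phone number is made up.

```python
import json


def build_text_reply(to: str, body: str) -> dict:
    """Build the JSON body for a WhatsApp Cloud API text message."""
    return {
        "messaging_product": "whatsapp",
        "recipient_type": "individual",
        "to": to,                 # the sender's number from the incoming message
        "type": "text",
        "text": {"body": body},   # the AI Agent's output
    }


reply = build_text_reply("15550001111", "Thanks, I have summarized your voice note.")
print(json.dumps(reply, indent=2))
```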


Beginner Step-by-Step: How to Use This Workflow in n8n

1. Import the Workflow

  1. Download the workflow file using the Download button on this page.
  2. In the n8n editor, click the main menu, choose “Import from File,” and select the downloaded file.

2. Configure Credentials and Settings

  1. Add WhatsApp Business Cloud API OAuth credentials in the WhatsApp nodes.
  2. Enter Google Gemini API keys in the HTTP Request nodes used for audio and video processing.
  3. Check if any IDs, emails, or phone numbers need updating to match your WhatsApp setup.
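
To see where the Gemini API key goes, here is a rough sketch of the kind of request an HTTP Request node would send for audio or video. It assumes the Gemini REST `generateContent` endpoint with the media passed inline as base64; the model name, prompt, and helper name are placeholders, and the workflow's actual nodes may be configured differently.

```python
import base64

GEMINI_BASE = "https://generativelanguage.googleapis.com/v1beta/models"


def build_gemini_request(api_key: str, model: str, prompt: str,
                         media: bytes, mime_type: str):
    """Return (url, headers, body) for a generateContent call with inline media."""
    url = f"{GEMINI_BASE}/{model}:generateContent"
    headers = {
        "Content-Type": "application/json",
        "x-goog-api-key": api_key,  # the key entered in the n8n credential
    }
    body = {
        "contents": [{
            "parts": [
                {"text": prompt},
                {"inline_data": {
                    "mime_type": mime_type,
                    "data": base64.b64encode(media).decode("ascii"),
                }},
            ]
        }]
    }
    return url, headers, body


# Placeholder values: the model name and the bytes are assumptions for illustration.
url, headers, body = build_gemini_request(
    api_key="YOUR_API_KEY",
    model="gemini-1.5-flash",
    prompt="Transcribe this audio message.",
    media=b"fake-audio-bytes",
    mime_type="audio/ogg",
)
print(url)
```

Nothing is sent over the network here; the sketch only shows the shape of the request so you can compare it with the node settings.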

3. Test the Workflow

  1. Send sample messages (audio, video, image, text) to your WhatsApp account linked to the workflow.
  2. Watch the workflow execution logs to confirm media downloads and AI processing occur without errors.

4. Activate for Production

  1. Once tests pass, toggle the workflow activation switch in n8n to run continuously.
  2. Make sure your webhook URL is reachable by WhatsApp servers.
  3. If you self-host n8n, verify that your server is running and that the webhook endpoint URL is publicly accessible.

Customization Ideas

  • Replace Google Gemini API calls with other AI services for multimodal input by updating HTTP Request endpoints and credentials.
  • Add more tools to the LangChain AI Agent to retrieve information from custom databases or additional APIs.
  • Support new WhatsApp message types like documents or location by extending the Switch node branches.
  • Change the AI Agent’s reply style by editing the system message prompt—for example, to sound more friendly or formal.
  • Save conversation logs or media to external storage like Google Sheets or databases for future audits.
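
As an example of the third idea, new Switch branches for documents and locations could extract data along these lines. This is a sketch under assumptions: it relies on the Cloud API webhook fields for `document` and `location` messages, and the helper name and returned keys are hypothetical.

```python
from typing import Any, Optional


def extract_extra_type(message: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Handle two message types the workflow does not route by default."""
    msg_type = message.get("type")
    if msg_type == "document":
        doc = message.get("document", {})
        return {
            "branch": "document",
            "media_id": doc.get("id"),  # download it like other media
            "text": doc.get("caption") or doc.get("filename", ""),
        }
    if msg_type == "location":
        loc = message.get("location", {})
        return {
            "branch": "location",
            "media_id": None,  # nothing to download
            "text": f"Location: {loc.get('latitude')}, {loc.get('longitude')}",
        }
    return None  # leave other types to the existing branches


doc_msg = {"type": "document",
           "document": {"id": "MEDIA_ID_9", "filename": "invoice.pdf"}}
loc_msg = {"type": "location",
           "location": {"latitude": 52.52, "longitude": 13.405}}
print(extract_extra_type(doc_msg))
print(extract_extra_type(loc_msg))
```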

Edge Cases and Troubleshooting

Issue: Receiving “401 Unauthorized” from WhatsApp API

This usually means WhatsApp OAuth credentials are wrong or expired. Update and re-authenticate credentials in WhatsApp Trigger and WhatsApp nodes.
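
When debugging, it helps to remember that media retrieval takes two authenticated requests: one to look up the download URL, and one to fetch the file. The sketch below builds both requests without sending them. The Graph API version, token, and URLs are placeholders, and the helper names are hypothetical.

```python
from urllib.request import Request

GRAPH_BASE = "https://graph.facebook.com"


def media_lookup_request(media_id: str, access_token: str,
                         api_version: str = "v20.0") -> Request:
    """Step 1: ask the Graph API for the media's temporary download URL."""
    return Request(
        f"{GRAPH_BASE}/{api_version}/{media_id}",
        headers={"Authorization": f"Bearer {access_token}"},
    )


def media_download_request(download_url: str, access_token: str) -> Request:
    """Step 2: fetch the file; this request needs the bearer token as well."""
    return Request(
        download_url,
        headers={"Authorization": f"Bearer {access_token}"},
    )


lookup = media_lookup_request("MEDIA_ID_1", "TOKEN")
download = media_download_request("https://example.invalid/media-file", "TOKEN")
print(lookup.full_url, lookup.get_header("Authorization"))
```

If only the download step fails, check that the HTTP Request node uses the same WhatsApp credential as the lookup step.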

Issue: AI Agent Replies Are Irrelevant or Empty

This can happen if the message text or context is missing. Check the Set node that compiles the message data. Verify that the session key in the memory buffer matches the user's phone number and that the AI Agent input is complete.
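
The role of the session key is easier to see in a small model of a windowed memory. This is an illustrative sketch, not the memoryBufferWindow node's implementation: the class name, window size, and phone numbers are made up.

```python
from collections import defaultdict, deque


class WindowMemory:
    """Keep only the last `window_size` turns for each session key."""

    def __init__(self, window_size: int = 5) -> None:
        # Each session gets its own bounded queue; old turns drop off the front.
        self._sessions = defaultdict(lambda: deque(maxlen=window_size))

    def add_turn(self, session_key: str, user_text: str, ai_text: str) -> None:
        self._sessions[session_key].append((user_text, ai_text))

    def history(self, session_key: str) -> list:
        # .get avoids creating an empty session for unknown keys.
        return list(self._sessions.get(session_key, ()))


memory = WindowMemory(window_size=2)
memory.add_turn("15550001111", "Hi", "Hello! How can I help?")
memory.add_turn("15550001111", "What is n8n?", "A workflow automation tool.")
memory.add_turn("15550001111", "Thanks", "You're welcome.")

print(memory.history("15550001111"))  # only the two most recent turns remain
print(memory.history("15550002222"))  # a different key has no history
```

If the session key is empty or differs between messages, each message looks like a new conversation, and the AI Agent replies without context.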


Summary of Benefits and Outcome

✓ Saves hours of manual WhatsApp message handling time.

✓ Converts audio, video, and images into text explanations.

✓ Provides quick, accurate AI-generated replies to users.

✓ Builds and uses conversation history for better context.

→ The result is better customer service and faster support communication.



Frequently Asked Questions

  • Credentials: add WhatsApp Business Cloud API OAuth credentials in the WhatsApp Trigger and WhatsApp nodes within n8n before running the workflow.
  • Other AI services: you can replace the HTTP Request nodes that call the Google Gemini API with other AI services that handle audio or video by updating the endpoints and credentials.
  • Irrelevant or empty replies: this usually happens when the message text input or conversation context is missing. Ensure the message data is set correctly and that the memory buffer uses the correct session key.
  • Setup: import the workflow file, add all required API keys and credentials, test with example messages, then activate the workflow in n8n. Make sure the webhook endpoint is publicly reachable.

Author: Ritu Sanjali
