Build AI Voice Chat with Webhook, OpenAI & Google Gemini in n8n

This n8n workflow automates AI voice chat by converting speech to text, maintaining conversation context with memory nodes, generating AI responses using Google Gemini, and returning speech audio via ElevenLabs. It solves slow, disjointed voice interactions by providing seamless, contextual AI voice conversations.
Workflow Identifier: 1341
NODES in Use: memoryManager, stickyNote, aggregate, memoryBufferWindow, lmChatGoogleGemini, respondToWebhook, httpRequest, limit, chainLlm, webhook, openAi
Automate AI voice chat with n8n and OpenAI

What This Automation Does

This n8n workflow turns voice messages into smart, memory-aware AI chat replies.

It fixes the problem where AI forgets chat history.

The result is faster, natural voice conversations that remember past talks.

Here’s how it works: it receives a voice message, transcribes it to text, recalls the chat history, generates a reply with Google Gemini, then speaks the reply back using an ElevenLabs voice.


Tools and Services Used

  • Webhook: Gets voice messages.
  • OpenAI Speech to Text model: Changes voice to text.
  • Get Chat Memory Manager node: Fetches previous chats.
  • Aggregate node: Collects chat history into one message.
  • Google Gemini Chat Model: Gives AI replies with full chat context.
  • Insert Chat Memory Manager node: Saves new chat messages.
  • ElevenLabs HTTP Request node: Changes text replies back to speech.
  • Respond to Webhook node: Sends voice replies out.

Inputs, Processing Steps, and Outputs

Inputs

  • Voice audio files sent to the Webhook node.

Processing Steps

  1. Transcribe the incoming voice message to text using the OpenAI Speech to Text node.
  2. Fetch the previous conversation using the Get Chat Memory Manager node.
  3. Aggregate past chat messages into a single context string with the Aggregate node.
  4. Send the current text and the context to the Google Gemini Chat Model node for an AI reply.
  5. Store both the user text and the AI reply back into chat memory via the Insert Chat Memory Manager node.
  6. Convert the AI reply text into natural voice audio using the ElevenLabs HTTP Request node.
  7. Return the audio reply to the client through the Respond to Webhook node.
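
The seven steps above can be sketched outside n8n as a small Python pipeline. This is only an illustration of the data flow, not how n8n runs the nodes: the class name `VoiceChatPipeline`, the `ChatTurn` type, and the three injected callables are hypothetical stand-ins for the OpenAI, Google Gemini, and ElevenLabs calls.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ChatTurn:
    role: str  # "human" or "ai"
    text: str


@dataclass
class VoiceChatPipeline:
    # Each callable is a hypothetical stub for one external service.
    transcribe: Callable[[bytes], str]   # speech to text (OpenAI in the workflow)
    generate: Callable[[str], str]       # chat model (Google Gemini in the workflow)
    synthesize: Callable[[str], bytes]   # text to speech (ElevenLabs in the workflow)
    history: List[ChatTurn] = field(default_factory=list)

    def aggregate_history(self) -> str:
        # Step 3: collapse past turns into one context string.
        return "\n".join(f"{t.role}: {t.text}" for t in self.history)

    def handle(self, audio: bytes) -> bytes:
        user_text = self.transcribe(audio)                  # step 1
        context = self.aggregate_history()                  # steps 2 and 3
        prompt = f"{context}\nhuman: {user_text}".strip()
        reply = self.generate(prompt)                       # step 4
        self.history.append(ChatTurn("human", user_text))   # step 5
        self.history.append(ChatTurn("ai", reply))
        return self.synthesize(reply)                       # steps 6 and 7
```

Because memory is written only after the model replies, the prompt for each turn contains earlier turns plus the new message, which mirrors the order of the Get and Insert Chat Memory Manager nodes in the list.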

Outputs

  • AI-generated voice audio reply sent back through the webhook.

Who Should Use This Workflow

This workflow suits people who want AI voice chat that remembers earlier talks.

It helps those tired of repeating voice messages or typing replies manually.

Users wanting a simple way to add voice AI with memory to apps will find this workflow useful.

You need API keys for OpenAI, Google Gemini, and ElevenLabs.

Basic knowledge of the n8n editor is helpful but not required.


Beginner Step-by-Step: How to Use This Workflow in n8n Production

Import Workflow

  1. Download the workflow JSON file using the Download button on this page.
  2. In the n8n editor, click “Import from File” and upload the downloaded file.

Configure Credentials and IDs

  1. Add your OpenAI API Key in the OpenAI Speech to Text node credentials.
  2. Set up Google Gemini API credentials in the Google Gemini Chat Model node.
  3. Insert ElevenLabs API Key and Voice ID in the ElevenLabs HTTP Request node headers and URL.
  4. Check webhook URL path and update if needed to fit your application’s endpoint.
  5. Update any session keys or environment variables if used in memory nodes.
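
To see where the ElevenLabs Voice ID and API key end up, here is a minimal Python sketch of the equivalent HTTP request. It assumes the public ElevenLabs text-to-speech endpoint (`/v1/text-to-speech/{voice_id}`) and the `xi-api-key` header; check both against the current ElevenLabs documentation and against the values in the imported HTTP Request node. The function name `build_tts_request` is hypothetical, and the request is only built, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_tts_request(text: str, voice_id: str, api_key: str) -> urllib.request.Request:
    # The Voice ID goes into the URL path (assumed endpoint layout).
    url = "https://api.elevenlabs.io/v1/text-to-speech/" + urllib.parse.quote(voice_id, safe="")
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "xi-api-key": api_key,               # the API key goes into a header (assumed name)
            "Content-Type": "application/json",
        },
    )
```

A typo in either value changes the URL or the header, which is why both places are worth checking when the node returns an error.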

Test the Workflow

  1. Use the webhook URL to send a sample voice message.
  2. Check that the AI returns a voice reply matching the message and remembers past chats.
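
One way to send a sample voice message is a small Python script that posts the audio as multipart form data. This is a sketch under assumptions: the webhook is assumed to accept a file upload in a form field named `voice_message`, and the URL, file names, and function names are placeholders to replace with your own.

```python
import uuid
import urllib.request


def build_voice_upload(webhook_url: str, audio: bytes, filename: str = "sample.mp3",
                       field_name: str = "voice_message") -> urllib.request.Request:
    # Build a single-part multipart/form-data body by hand (stdlib only).
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: audio/mpeg\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=head + audio + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


def send_and_save(req: urllib.request.Request, out_path: str) -> None:
    # Send the request and write the returned audio reply to disk.
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as fh:
        fh.write(resp.read())
```

Calling `send_and_save(build_voice_upload(url, audio_bytes), "reply.mp3")` twice with related questions is a quick check that the second reply reflects the first exchange.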

Activate for Production

  1. Toggle the workflow status to “active” in n8n once testing looks good.
  2. Connect the webhook URL to your app or client to start receiving voice messages live.
  3. Monitor executions for errors or warnings in the n8n UI.

For extra control, consider self-hosting n8n on a server.


Customization Ideas

  • Switch the AI from Google Gemini to the OpenAI Chat Completion node for a different chat style.
  • Change session keys in the Memory Manager nodes to handle multiple users separately.
  • Replace the ElevenLabs HTTP Request with OpenAI’s Generate Audio node to keep speech synthesis inside the OpenAI ecosystem.
  • Modify the webhook path in the Webhook node to fit different app endpoints.
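
The effect of per-user session keys can be illustrated with a small Python sketch of a windowed memory keyed by session. It is a conceptual model only, not the internals of n8n's memory nodes; the class name `SessionWindowMemory` and the `window` parameter are hypothetical.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple


class SessionWindowMemory:
    """Keeps only the most recent `window` messages for each session key."""

    def __init__(self, window: int = 10) -> None:
        self._window = window
        self._store: Dict[str, Deque[Tuple[str, str]]] = defaultdict(
            lambda: deque(maxlen=self._window)
        )

    def insert(self, session_key: str, role: str, text: str) -> None:
        # Older messages fall off automatically once the window is full.
        self._store[session_key].append((role, text))

    def get(self, session_key: str) -> List[Tuple[str, str]]:
        # Unknown sessions return an empty history instead of creating one.
        return list(self._store.get(session_key, ()))
```

With one shared session key, every caller would read and write the same history; a distinct key per user keeps each conversation separate.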

Common Problems and Fixes

  • No transcription from OpenAI?
    Make sure the audio file name sent to the webhook is exactly “voice_message”, the name the OpenAI node expects.
  • ElevenLabs API errors?
    Check Voice ID and API Key in HTTP Request headers and URL carefully for typos.
  • AI answer not matching chat?
    Verify correct session keys and proper aggregation of past chats before sending to AI.

Pre-Production Checklist

  • Test that the webhook is reachable and accepts POST requests.
  • Confirm OpenAI speech-to-text outputs correct transcription.
  • Ensure memory manager nodes read and write conversation history properly.
  • Validate Google Gemini AI responses are relevant and keep context.
  • Test ElevenLabs HTTP request to produce usable audio replies.
  • Run end-to-end voice chat test to verify smooth, correct audio feedback.

Deployment and Scaling

Once active, use the webhook URL inside your voice chat app.

Watch n8n executions for issues.

Protect API keys to avoid interruptions.

Adjust memory session keys to handle many users at once.

Try self-hosting n8n for better control if needed.


Summary

✓ Transcribes voice to text quickly and reliably.

✓ Keeps memory for natural conversations.

✓ Generates smart replies with Google Gemini AI.

✓ Converts text replies back to voice using ElevenLabs.

→ Saves time by automating chat and audio workflows.

→ Gives a real-time, human-like AI voice chat experience.


Frequently Asked Questions

The Google Gemini Chat Model node can be replaced with an OpenAI Chat Completion node to generate the AI responses.
ElevenLabs API calls fail mainly when the Voice ID or API Key in the HTTP Request node is incorrect or missing.
The workflow uses the Get Chat and Insert Chat Memory Manager nodes with a session key to fetch and save conversation history.
The Respond to Webhook node returns the ElevenLabs-generated audio binary as the HTTP response to the calling client.
