Build AI Voice Chat with Webhook, OpenAI & Google Gemini in n8n

This n8n workflow automates AI voice chat by converting speech to text, maintaining conversation context with memory nodes, generating AI responses using Google Gemini, and returning speech audio via ElevenLabs. It solves slow, disjointed voice interactions by providing seamless, contextual AI voice conversations.
Workflow Identifier: 1341
NODES in Use: memoryManager, stickyNote, aggregate, memoryBufferWindow, lmChatGoogleGemini, respondToWebhook, httpRequest, limit, chainLlm, webhook, openAi

What This Automation Does

This n8n workflow turns voice messages into smart, memory-aware AI chat replies.

It fixes the problem where AI forgets chat history.

The result is faster, natural voice conversations that remember past talks.

Here’s how it works: the workflow receives a voice message, transcribes it to text, loads the earlier chat history, generates a reply with Google Gemini, then speaks that reply with an ElevenLabs voice.


Tools and Services Used

  • Webhook: Gets voice messages.
  • OpenAI Speech to Text model: Changes voice to text.
  • Get Chat Memory Manager node: Fetches previous chats.
  • Aggregate node: Collects chat history into one message.
  • Google Gemini Chat Model: Gives AI replies with full chat context.
  • Insert Chat Memory Manager node: Saves new chat messages.
  • ElevenLabs HTTP Request node: Changes text replies back to speech.
  • Respond to Webhook node: Sends voice replies out.

Inputs, Processing Steps, and Outputs

Inputs

  • Voice audio files sent to the Webhook node.

Processing Steps

  1. Transcribe the incoming voice message to text using the OpenAI Speech to Text node.
  2. Fetch the previous conversation using the Get Chat Memory Manager node.
  3. Aggregate past chat messages into a single context string with the Aggregate node.
  4. Send the current text and the context to the Google Gemini Chat Model node for an AI reply.
  5. Store both the user text and the AI reply back into chat memory via the Insert Chat Memory Manager node.
  6. Convert the AI reply text into natural voice audio using the ElevenLabs HTTP Request node.
  7. Return the audio reply to the client through the Respond to Webhook node.
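
The seven steps above can be sketched in plain Python to show the order of operations. This is only an illustration of the data flow, not the workflow's actual implementation: the three callables (`transcribe`, `generate`, `synthesize`) are hypothetical stand-ins for the OpenAI, Google Gemini, and ElevenLabs nodes, and the `human`/`ai` role labels are assumed.

```python
from collections import deque
from typing import Callable, Deque, Tuple

# Hypothetical stand-ins for the external services (nodes in n8n).
Transcribe = Callable[[bytes], str]   # speech to text
Generate = Callable[[str], str]       # chat model
Synthesize = Callable[[str], bytes]   # text to speech

Memory = Deque[Tuple[str, str]]       # (role, text) pairs, oldest first


def handle_voice_message(
    audio: bytes,
    memory: Memory,
    transcribe: Transcribe,
    generate: Generate,
    synthesize: Synthesize,
) -> bytes:
    """Run one voice-chat turn and return the audio reply."""
    # Step 1: transcribe the incoming voice message.
    user_text = transcribe(audio)

    # Steps 2-3: read the stored history and aggregate it into one string.
    context = "\n".join(f"{role}: {text}" for role, text in memory)

    # Step 4: send the context plus the new message to the chat model.
    prompt = f"{context}\nhuman: {user_text}" if context else f"human: {user_text}"
    reply = generate(prompt)

    # Step 5: store both sides of the turn for the next request.
    memory.append(("human", user_text))
    memory.append(("ai", reply))

    # Steps 6-7: convert the reply to audio and hand it back to the caller.
    return synthesize(reply)
```

Passing a `deque(maxlen=...)` as `memory` keeps only the most recent messages, which roughly mirrors what a windowed memory buffer does.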

Outputs

  • AI-generated voice audio reply sent back through the webhook.

Who Should Use This Workflow

This workflow suits people who want AI voice chat that remembers earlier talks.

It helps those tired of repeating voice messages or typing replies manually.

Users wanting a simple way to add voice AI with memory to apps will find this workflow useful.

You need API keys for OpenAI, Google Gemini, and ElevenLabs.

Basic knowledge of the n8n editor is helpful but not required.


Beginner Step-by-Step: How to Use This Workflow in n8n Production

Import Workflow

  1. Download the workflow JSON file using the Download button on this page.
  2. In the n8n editor, click “Import from File” and upload the downloaded file.

Configure Credentials and IDs

  1. Add your OpenAI API Key in the OpenAI Speech to Text node credentials.
  2. Set up Google Gemini API credentials in the Google Gemini Chat Model node.
  3. Insert ElevenLabs API Key and Voice ID in the ElevenLabs HTTP Request node headers and URL.
  4. Check webhook URL path and update if needed to fit your application’s endpoint.
  5. Update any session keys or environment variables if used in memory nodes.
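
To see where the ElevenLabs Voice ID and API Key go, here is a minimal Python sketch of the same request built outside n8n. It assumes the HTTP Request node calls the ElevenLabs text-to-speech endpoint with the Voice ID in the URL path and the key in the `xi-api-key` header; the function name and the JSON body with only a `text` field are illustrative choices, so compare them with the node settings in your imported workflow.

```python
import json
import urllib.request

# Assumed endpoint: the Voice ID is appended as the last path segment.
ELEVENLABS_TTS_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(voice_id: str, api_key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request."""
    if not voice_id or not api_key:
        # A missing Voice ID or API Key is the most common cause of errors.
        raise ValueError("voice_id and api_key are both required")

    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{ELEVENLABS_TTS_BASE}/{voice_id}",
        data=body,
        method="POST",
        headers={
            "xi-api-key": api_key,            # the API Key goes in a header
            "Content-Type": "application/json",
            "Accept": "audio/mpeg",           # ask for MP3 audio back
        },
    )
```

Sending the request with `urllib.request.urlopen(req)` would return the audio bytes. Building it first makes it easy to print the URL and headers and spot a typo.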

Test the Workflow

  1. Use the webhook URL to send a sample voice message.
  2. Check that the AI returns a voice reply matching the message and remembers past chats.
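
One way to send a sample voice message is a small Python script that uploads an audio file as a multipart form. This is a sketch under assumptions: the webhook URL you pass in is your own, the default file name and content type are placeholders, and the form field is named `voice_message` on the assumption that this is the name the troubleshooting section refers to.

```python
import urllib.request
import uuid


def build_voice_upload(
    webhook_url: str,
    audio: bytes,
    field_name: str = "voice_message",   # assumed field name expected by the workflow
    filename: str = "sample.mp3",        # placeholder file name
    content_type: str = "audio/mpeg",    # placeholder content type
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST with one audio file."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")

    return urllib.request.Request(
        url=webhook_url,
        data=head + audio + tail,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

To run a real test, read an audio file into `audio`, build the request with your webhook URL, send it with `urllib.request.urlopen(req)`, and save the response body to a file you can play.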

Activate for Production

  1. Toggle the workflow status to “active” in n8n once testing looks good.
  2. Connect the webhook URL to your app or client to start receiving voice messages live.
  3. Monitor executions for errors or warnings in the n8n UI.

For extra control, consider self-hosting n8n on a server.


Customization Ideas

  • Switch the AI from Google Gemini to an OpenAI Chat Completion node for a different chat style.
  • Change the session keys in the Memory Manager nodes to handle multiple users separately.
  • Replace the ElevenLabs HTTP Request with OpenAI’s Generate Audio node to keep speech synthesis inside the OpenAI ecosystem.
  • Modify the webhook path in the Webhook node to fit different app endpoints.
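
The idea behind per-user session keys can be shown with a small Python model. This is not how n8n stores memory internally; it is a hypothetical `SessionMemory` class, with an assumed window size, that shows why a distinct key per user keeps conversations from mixing.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

Message = Tuple[str, str]  # (role, text)


class SessionMemory:
    """Keep a short, separate chat history for each session key."""

    def __init__(self, window: int = 10) -> None:
        # Each session gets its own buffer that drops the oldest messages.
        self._store: Dict[str, Deque[Message]] = defaultdict(
            lambda: deque(maxlen=window)
        )

    def add(self, session_key: str, role: str, text: str) -> None:
        self._store[session_key].append((role, text))

    def history(self, session_key: str) -> List[Message]:
        # .get avoids creating an empty buffer for an unknown key.
        buffer = self._store.get(session_key)
        return list(buffer) if buffer is not None else []


memory = SessionMemory(window=2)
memory.add("user-a", "human", "My name is Ana.")
memory.add("user-b", "human", "My name is Ben.")
print(memory.history("user-a"))  # only Ana's message, not Ben's
```

If every request used the same key, both users would share one history and the AI could answer one user with the other's context.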

Common Problems and Fixes

  • No transcription from OpenAI?
    Make sure the audio file name is exactly “voice_message” all the way from the Webhook node to the OpenAI node.
  • ElevenLabs API errors?
    Check Voice ID and API Key in HTTP Request headers and URL carefully for typos.
  • AI answer not matching chat?
    Verify correct session keys and proper aggregation of past chats before sending to AI.

Pre-Production Checklist

  • Test that the webhook is reachable and accepts POST requests.
  • Confirm that OpenAI speech-to-text outputs a correct transcription.
  • Ensure the memory manager nodes read and write conversation history properly.
  • Validate that Google Gemini AI responses are relevant and keep context.
  • Test that the ElevenLabs HTTP request produces usable audio replies.
  • Run an end-to-end voice chat test to verify smooth, correct audio feedback.
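
For the audio checks, a quick test is to look at the first bytes of the response instead of trusting the status code alone. The sketch below assumes the reply is MP3 audio; if your ElevenLabs request asks for another format, this check does not apply. The function name is hypothetical.

```python
def looks_like_mp3(data: bytes) -> bool:
    """Return True if the bytes start like an MP3 file."""
    if len(data) < 3:
        return False
    # Many MP3 files begin with an ID3 metadata tag.
    if data[:3] == b"ID3":
        return True
    # Otherwise an MPEG audio frame starts with 11 set bits (frame sync).
    return data[0] == 0xFF and (data[1] & 0xE0) == 0xE0
```

A JSON error message returned in place of audio fails this check, which helps catch a bad Voice ID or API Key early.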

Deployment and Scaling

Once active, use the webhook URL inside your voice chat app.

Watch n8n executions for issues.

Protect API keys to avoid interruptions.

Adjust memory session keys to handle many users at once.

Try self-hosting n8n for better control if needed.


Summary

✓ Transcribes voice to text quickly and reliably.

✓ Keeps memory for natural conversations.

✓ Generates smart replies with Google Gemini AI.

✓ Converts text replies back to voice using ElevenLabs.

→ Saves time by automating chat and audio workflows.

→ Gives a real-time, human-like AI voice chat experience.


Frequently Asked Questions

  • The Google Gemini Chat Model node can be replaced with an OpenAI Chat Completion node for AI responses.
  • The ElevenLabs API call fails mainly when the Voice ID or API Key in the HTTP Request node is incorrect or missing.
  • The workflow uses the Get Chat and Insert Chat Memory Manager nodes with a session key to fetch and save conversation history.
  • The Respond to Webhook node returns the ElevenLabs-generated audio binary as the HTTP response to the calling client.
