Compare LLMs Easily with n8n, OpenAI & Google Sheets

Struggling to select the best language model for your AI project? This n8n workflow sends the same prompt to two LLMs, shows their answers side by side, and logs the outputs to Google Sheets so you can evaluate them later.
Workflow Identifier: 1993
Nodes in Use: chatTrigger, splitInBatches, memoryBufferWindow, memoryManager, stickyNote, lmChatOpenRouter, set, agent, summarize, aggregate, googleSheets, splitOut

What This Workflow Does

This workflow compares answers from two large language models (LLMs) side by side.
It takes a user chat message and sends it to two AI models.
Then it shows both answers one after another in chat and logs them into Google Sheets for review.
This saves time and helps decide which AI fits best for your needs.

The automation keeps chat memory separate per model so answers stay relevant.
It makes comparing multiple AI responses clear and simple.


Who Should Use This Workflow

This is for anyone who tests different AI language models and wants quick comparisons.
Non-technical users can use it with little effort after minimal setup in n8n.

It suits developers, content creators, or AI testers who want neat logs and easy side-by-side AI answers.
Users who want to cut manual copying and copy-and-paste errors will find it useful.


Tools and Services Used

  • n8n Automation Platform: Runs the workflow and connects nodes.
  • OpenRouter API: Allows access to language models like OpenAI GPT-4.1 and Mistral Large.
  • Google Sheets API: Saves data in a spreadsheet for analysis and logging.
  • LangChain Nodes in n8n: Handle chat triggers, AI agents, and memory buffers.
  • Optional Self-Hosted n8n: Self-host n8n to control data privacy and cost.

Beginner Step-by-Step: How to Use This Workflow in n8n for Production

Import the Workflow

  1. Download the workflow file with the Download button on this page.
  2. Open the n8n editor and choose “Import from File”.
  3. Select the downloaded workflow file.

Configure Credentials

  1. Add your OpenRouter API Key in the relevant node settings.
  2. Add Google Sheets Service Account Credentials in the Google Sheets node.
  3. Update any spreadsheet ID, sheet name, or column mappings if needed.
  4. Check model IDs in the “Define Models to Compare” node; change if needed.

Test the Workflow

  1. Trigger test chat messages using the webhook URL from the When chat message received node.
  2. Confirm that answers from both models show up and data appends to Google Sheets.
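
As a rough sketch of step 1, the snippet below builds the HTTP request a test client might send to the chat trigger. The URL is a placeholder, and the payload field names (action, sessionId, chatInput) are assumptions modelled on n8n's chat widget; compare them with the trigger node's actual input in your instance before relying on them.

```python
import json
import urllib.request

# Placeholder: copy the real URL from the "When chat message received" node.
WEBHOOK_URL = "https://your-n8n-host.example/webhook/your-webhook-id/chat"


def build_chat_request(url: str, session_id: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying one chat message."""
    payload = {
        "action": "sendMessage",  # assumed field, mirrors the n8n chat widget
        "sessionId": session_id,  # base session ID; the workflow adds the model ID
        "chatInput": message,     # the prompt both models will answer
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    WEBHOOK_URL, "test-session-1", "Explain recursion in one sentence."
)
# To actually send it: urllib.request.urlopen(request, timeout=60)
```

Keeping the request construction separate from sending makes it easy to inspect the payload first, then check the Google Sheet for the new row after the call.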

Activate for Production

  1. Turn on the workflow toggle in the n8n editor.
  2. Connect your chat interface to the webhook URL.
  3. Monitor the first few runs for any errors.

Explanation of Inputs, Processing, and Outputs

Inputs

  • The workflow listens for a user chat message via the When chat message received trigger node.
  • The user message enters as input text along with a session ID.
  • An array of two model IDs is defined to specify which LLMs to compare.

Processing Steps

  • The models list is split so each item represents one AI model call.
  • For each model, variables store the model name, a unique session ID (combining the base session ID and the model ID), and the original chat message.
  • A Simple Memory node keeps chat history separated per model session.
  • The AI Agent node sends the message to the chosen model using the OpenRouter API and retrieves its response.
  • The Set node formats each result with the model name and answer for chat display and logging.
  • Responses from both models are batched and aggregated to combine inputs, answers, and session data.
  • The combined data is appended as a new row in Google Sheets for record-keeping.
  • The final concatenated answers appear back in chat for immediate visual comparison.
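
The processing steps above can be sketched in plain Python. This is an illustration of the data flow only, not the workflow's actual node code: the function names, the underscore separator in the session ID, and the chat layout are assumptions, and the model answers are faked rather than fetched from OpenRouter.

```python
from dataclasses import dataclass


@dataclass
class ModelRun:
    model: str       # model ID, e.g. "openai/gpt-4.1"
    session_id: str  # per-model memory key
    chat_input: str  # original user message


def plan_runs(base_session_id: str, chat_input: str, models: list[str]) -> list[ModelRun]:
    """Split the model list into one run per model, each with its own session key."""
    return [
        ModelRun(model=m, session_id=f"{base_session_id}_{m}", chat_input=chat_input)
        for m in models
    ]


def format_answer(model: str, answer: str) -> str:
    """Label one answer with its model name for chat display."""
    return f"{model}:\n{answer}"


def combine_for_chat(answers: dict[str, str]) -> str:
    """Concatenate the labelled answers, one after another."""
    return "\n\n".join(format_answer(m, a) for m, a in answers.items())


runs = plan_runs("abc123", "What is n8n?", ["openai/gpt-4.1", "mistralai/mistral-large"])
# Stand-in for the AI Agent step: a real run would call each model here.
fake_answers = {run.model: f"(answer from {run.model})" for run in runs}
chat_output = combine_for_chat(fake_answers)
```

Because each run carries its own session ID, the memory step can keep the two conversations apart even though both start from the same user message.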

Outputs

  • The user sees both model answers together in the chat interface.
  • Google Sheets contains logs of input prompts, model outputs, and session context for evaluation.
  • Sessions retain their memory separately per model for context-aware conversations next time.
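
To make the logged output concrete, here is a minimal sketch of how one spreadsheet row could be assembled from a single comparison. The column names are hypothetical; use whatever headers your sheet and the Google Sheets node mapping actually define.

```python
def build_sheet_row(session_id: str, chat_input: str, answers: dict[str, str]) -> dict[str, str]:
    """Flatten one comparison into a single row: the prompt plus one column pair per model."""
    row = {"session_id": session_id, "user_input": chat_input}  # hypothetical headers
    for position, (model, answer) in enumerate(answers.items(), start=1):
        row[f"model_{position}"] = model
        row[f"model_{position}_answer"] = answer
    return row


row = build_sheet_row(
    "abc123",
    "What is n8n?",
    {"openai/gpt-4.1": "Answer A", "mistralai/mistral-large": "Answer B"},
)
```

One row per user message keeps both answers next to the prompt that produced them, which is what makes later review in the sheet straightforward.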

Edge Cases and Troubleshooting

Issue: No Data Appended to Google Sheets

The most common cause is wrong credentials or sheet setup.
Verify the Google Sheets node credentials are correct.

Check the spreadsheet ID and sheet name, and ensure the column headers match the mapping exactly.
Incorrect or missing mappings block appending data.

Issue: Memory Context Not Persisting Between Messages

The session ID variable might not be set properly.
Confirm the Set node’s expression for sessionId combines the base session ID and the model name the same way on every run.

All memory nodes must use this exact session key to track conversations separately.
If the keys do not match, the memory node finds no history and the chat appears to reset every time.
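
The effect of a mismatched key can be shown with a small stand-in for a windowed memory store. This sketch is an assumption-laden model of the behaviour, not n8n's Simple Memory implementation; the window size and the underscore separator are hypothetical.

```python
from collections import defaultdict, deque

WINDOW_SIZE = 5  # hypothetical: number of recent messages kept per session key

# One bounded history per session key; an unknown key starts with an empty history.
memory = defaultdict(lambda: deque(maxlen=WINDOW_SIZE))


def session_key(base_session_id: str, model: str) -> str:
    """Build the per-model key the same way everywhere it is needed."""
    return f"{base_session_id}_{model}"


def remember(key: str, role: str, text: str) -> None:
    """Append one message to the history stored under this key."""
    memory[key].append((role, text))


good_key = session_key("abc123", "openai/gpt-4.1")
remember(good_key, "user", "My name is Ada.")
remember(good_key, "assistant", "Nice to meet you, Ada.")

# Same session and model, but built with a different separator elsewhere:
bad_key = "abc123-openai/gpt-4.1"

print(len(memory[good_key]))  # 2: the history is found
print(len(memory[bad_key]))   # 0: looks like the chat was reset
```

Building the key in one place and reusing that value in every memory node avoids this class of mismatch.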

Issue: Model Responses Inconsistent or Empty

Check that the OpenRouter API key is valid and within its usage limits.
Also verify that the model IDs are correct, for example “openai/gpt-4.1” or “mistralai/mistral-large”.

If responses are still empty, add a basic system prompt in the AI Agent node to guide the model better.
This reduces confusion in replies.
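
A quick shape check can catch typos in model IDs before a request is sent. The sketch below assumes IDs follow a lowercase "provider/model-name" pattern like the two examples above; it checks form only and cannot tell whether OpenRouter actually serves a given model.

```python
import re

# Assumed shape: lowercase "provider/model-name", as in "openai/gpt-4.1".
MODEL_ID_PATTERN = re.compile(r"[a-z0-9][a-z0-9._-]*/[a-z0-9][a-z0-9._:-]*")


def malformed_model_ids(model_ids: list[str]) -> list[str]:
    """Return the IDs that do not look like "provider/model-name"."""
    return [m for m in model_ids if not MODEL_ID_PATTERN.fullmatch(m)]


suspect = malformed_model_ids(
    ["openai/gpt-4.1", "mistralai/mistral-large", "gpt-4.1", " openai/gpt-4.1"]
)
```

Here the last two entries are flagged: one lacks the provider prefix and the other has a stray leading space, both easy mistakes when editing the model list by hand.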


Ideas for Customizing the Workflow

  • Add more models by expanding the list in the “Define Models to Compare” node.
  • Include an AI evaluator that scores or rates answers automatically, then save scores in Google Sheets.
  • Customize system prompts and tools in the AI Agent node to fit tasks like customer support or writing help.
  • Switch memory from Simple Memory to Redis or Postgres-based nodes to save longer chat context when needed.
  • Modify Google Sheets columns to add rating dropdowns or new evaluation metrics such as creativity or accuracy.

How to Handle This At Scale

This workflow works well for low to medium chat volume.
If testing many inputs quickly, use batching and queues to avoid API rate limits or slowdowns.

Also monitor OpenRouter API costs carefully, since every query is sent to two models and usage roughly doubles.
Consider self-hosting n8n to manage costs and data privacy better under heavy use.
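
As a hedged illustration of batching, the sketch below sends prompts in small groups with a pause between groups and counts the resulting model calls. The batch size, the pause length, and the send callable are all hypothetical; suitable values depend on your OpenRouter rate limits.

```python
import time
from typing import Callable, Iterator

MODELS_PER_PROMPT = 2  # each prompt goes to both models being compared


def batches(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(
    prompts: list[str],
    send: Callable[[str], None],
    batch_size: int = 5,
    pause_seconds: float = 1.0,
) -> int:
    """Send prompts batch by batch, pausing between batches; return model calls made."""
    calls = 0
    for index, batch in enumerate(batches(prompts, batch_size)):
        if index > 0:
            time.sleep(pause_seconds)  # spread the load between batches
        for prompt in batch:
            send(prompt)  # hypothetical: e.g. POST the prompt to the chat webhook
            calls += MODELS_PER_PROMPT
    return calls


sent: list[str] = []
model_calls = run_in_batches(
    [f"prompt {i}" for i in range(12)], sent.append, batch_size=5, pause_seconds=0
)
print(model_calls)  # 24: twelve prompts, two models each
```

Counting model calls rather than prompts keeps the cost estimate honest, since billing follows the number of requests that reach the models.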


Summary and Result

→ This workflow lets users easily compare two AI language model answers side by side in chat.
→ It keeps conversation context separate per model for accurate replies.
→ Results get saved automatically into Google Sheets for clean record keeping.
→ The user saves time and gets clear data to choose the best AI model.
✓ Saves hours of manual copying and checking.
✓ Enables transparent side-by-side AI response comparison.
✓ Provides structured chat logs for stakeholders or teams.


Frequently Asked Questions

Comparing more than two models is possible, but the workflow needs updating: add more model IDs, change the loop logic, and adjust Google Sheets to store the extra answers.
Each user input sends requests to two models, so API calls and token usage roughly double. Monitor quotas closely.
Chat memory stays inside the n8n environment, keyed per session, unless configured otherwise. For extra control, use a self-hosted n8n setup.
The workflow is designed for moderate load. Large-scale use requires batching, queue management, or more advanced hosting such as self-hosted n8n.
