Compare LLMs Easily with n8n, OpenAI & Google Sheets

Struggling to select the best language model for your AI project? This n8n workflow compares two LLMs side by side and logs their outputs to Google Sheets, where you can review and rate them.
Workflow Identifier: 1993
Nodes in Use: chatTrigger, splitInBatches, memoryBufferWindow, memoryManager, stickyNote, lmChatOpenRouter, set, agent, summarize, aggregate, googleSheets, splitOut


What This Workflow Does

This workflow compares answers from two large language models (LLMs) side by side.
It takes a user chat message and sends it to two AI models.
Then it shows both answers one after another in chat and logs them into Google Sheets for review.
This saves time and helps you decide which model best fits your needs.

The automation keeps chat memory separate per model so answers stay relevant.
It makes comparing multiple AI responses clear and simple.


Who Should Use This Workflow

This is for anyone who tests different AI language models and wants quick comparisons.
Non-technical users can use it with little effort after minimal setup in n8n.

It suits developers, content creators, or AI testers who want neat logs and easy side-by-side AI answers.
Users who want to cut manual copying and copy-paste errors will find it useful.


Tools and Services Used

  • n8n Automation Platform: Runs the workflow and connects nodes.
  • OpenRouter API: Allows access to language models like OpenAI GPT-4.1 and Mistral Large.
  • Google Sheets API: Saves data in a spreadsheet for analysis and logging.
  • LangChain Nodes in n8n: Handle chat triggers, AI agents, and memory buffers.
  • Optional Self-Hosted n8n: Self-host n8n to control data privacy and cost.

Beginner Step-by-Step: How to Use This Workflow in n8n for Production

Import the Workflow

  1. Download the workflow file with the Download button on this page.
  2. Open the n8n editor and choose “Import from File”.
  3. Select the downloaded workflow file.

Configure Credentials

  1. Add your OpenRouter API Key in the relevant node settings.
  2. Add Google Sheets Service Account Credentials in the Google Sheets node.
  3. Update any spreadsheet ID, sheet name, or column mappings if needed.
  4. Check the model IDs in the “Define Models to Compare” node; change them if needed.

Test the Workflow

  1. Send test chat messages to the webhook URL shown in the “When chat message received” node.
  2. Confirm that answers from both models show up and data appends to Google Sheets.
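
A test message can also be prepared from a script. The sketch below only builds the HTTP request and does not send it. The payload keys (`action`, `sessionId`, `chatInput`) are an assumption based on the defaults used by n8n's chat widget, and the URL is a placeholder, so check both against your own trigger node before relying on them.

```python
import json
from urllib import request


def build_chat_request(webhook_url: str, message: str, session_id: str) -> request.Request:
    """Build (but do not send) a POST request carrying one chat message."""
    # Assumed payload shape; adjust the keys if your chat trigger expects others.
    payload = {
        "action": "sendMessage",
        "sessionId": session_id,
        "chatInput": message,
    }
    return request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL: replace it with the one shown in your trigger node.
req = build_chat_request(
    "https://n8n.example.com/REPLACE-WITH-YOUR-WEBHOOK-PATH",
    "Explain webhooks in one sentence.",
    "test-session-1",
)
# To actually send it: request.urlopen(req)
```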

Activate for Production

  1. Turn on the workflow toggle in the n8n editor.
  2. Connect your chat interface to the webhook URL.
  3. Monitor the first few runs for any errors.

Explanation of Inputs, Processing, and Outputs

Inputs

  • The workflow listens for a user chat message via the “When chat message received” trigger node.
  • The user message enters as input text along with a session ID.
  • An array of two model IDs is defined to specify which LLMs to compare.
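
The three inputs listed above can be pictured as one small record. This is an illustrative sketch in Python, not the workflow's actual node configuration: the class and field names are hypothetical, and the two model IDs are the examples named in the troubleshooting section.

```python
from dataclasses import dataclass, field


@dataclass
class ComparisonInput:
    """One incoming chat turn plus the models it should be sent to."""
    chat_input: str   # the user's message text
    session_id: str   # base session ID supplied with the message
    # Assumed default list; the real list lives in "Define Models to Compare".
    models: list[str] = field(
        default_factory=lambda: ["openai/gpt-4.1", "mistralai/mistral-large"]
    )


example = ComparisonInput(
    chat_input="Explain webhooks in one sentence.",
    session_id="abc123",
)
```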

Processing Steps

  • The models list is split so each item represents one AI model call.
  • For each split model, variables store the model name, a unique session ID (combining the base ID and the model ID), and the original chat message.
  • A Simple Memory node keeps chat history separated per model session.
  • The AI Agent node sends the message to the chosen model using the OpenRouter API and retrieves its response.
  • The Set node formats each result with the model name and answer for chat display and logging.
  • Responses from both models are batched and aggregated to combine inputs, answers, and session data.
  • The combined data appends as a new row in Google Sheets for record-keeping.
  • The final concatenated answers appear back in chat for immediate visual comparison.
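
The processing steps above can be sketched end to end in plain Python. This is a simplified model of the flow, not the n8n implementation: the model call is passed in as a function so the example runs without an API key, and the underscore separator and column names (`model_1`, `model_1_answer`, and so on) are assumptions.

```python
from typing import Callable


def per_model_session_id(base_session_id: str, model_id: str) -> str:
    # Assumed format: base ID and model ID joined by an underscore.
    return f"{base_session_id}_{model_id}"


def compare(
    chat_input: str,
    base_session_id: str,
    models: list[str],
    ask_model: Callable[[str, str, str], str],
) -> tuple[dict, str]:
    """Send one message to every model; return (sheet row, chat text)."""
    answers = []
    for model_id in models:  # one iteration per split item
        session_id = per_model_session_id(base_session_id, model_id)
        answer = ask_model(model_id, session_id, chat_input)
        answers.append({"model": model_id, "answer": answer})

    # Aggregate everything into a single row for the spreadsheet.
    row = {"session_id": base_session_id, "user_input": chat_input}
    for position, item in enumerate(answers, start=1):
        row[f"model_{position}"] = item["model"]
        row[f"model_{position}_answer"] = item["answer"]

    # Concatenate the answers for display in chat.
    chat_output = "\n\n".join(f"{a['model']}:\n{a['answer']}" for a in answers)
    return row, chat_output


def fake_model(model_id: str, session_id: str, message: str) -> str:
    # Stand-in for the real OpenRouter call.
    return f"[{model_id}] echo: {message}"


row, chat_output = compare(
    "Hi", "s1", ["openai/gpt-4.1", "mistralai/mistral-large"], fake_model
)
```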

Outputs

  • The user sees both models' answers together in the chat interface.
  • Google Sheets contains logs of input prompts, model outputs, and session context for evaluation.
  • Each model's session keeps its own memory, so follow-up messages stay context-aware.

Edge Cases and Troubleshooting

Issue: No Data Appended to Google Sheets

The most common cause is wrong credentials or sheet setup.
Verify the Google Sheets node credentials are correct.

Check the spreadsheet ID and sheet name, and ensure the column headers match the mapping exactly.
Incorrect or missing mappings block appending data.
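
A quick way to spot a mapping mismatch is to compare the sheet's header row with the keys of the row being appended. The helper below is a hypothetical sketch that works on plain lists and dicts; it does not call the Google Sheets API, and the column names are made up for illustration.

```python
def find_mapping_problems(sheet_headers: list[str], row: dict[str, str]) -> dict[str, list[str]]:
    """Report keys with no matching column and columns with no matching key."""
    headers = set(sheet_headers)
    keys = set(row)
    return {
        "missing_in_sheet": sorted(keys - headers),  # data with nowhere to go
        "unmapped_columns": sorted(headers - keys),  # columns that stay empty
    }


# The comparison is case-sensitive, so "Model_1_answer" does not match
# "model_1_answer".
problems = find_mapping_problems(
    ["session_id", "user_input", "Model_1_answer"],
    {"session_id": "s1", "user_input": "Hi", "model_1_answer": "Hello"},
)
```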

Issue: Memory Context Not Kept Between Messages

The session ID variable might not be set properly.
Confirm the Set node’s expression for sessionId combines the base session ID and the model ID consistently.

All memory nodes must use this exact session key to track conversations separately.
If keys mismatch, chat history resets every time.
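
The effect of a mismatched key is easy to reproduce with a toy memory store. This sketch only imitates a windowed chat memory keyed by session; the class is hypothetical and the key format is an assumption.

```python
from collections import defaultdict, deque


class WindowMemory:
    """Keeps the last few messages for each session key."""

    def __init__(self, window: int = 5):
        self._store = defaultdict(lambda: deque(maxlen=window))

    def add(self, session_key: str, message: str) -> None:
        self._store[session_key].append(message)

    def history(self, session_key: str) -> list[str]:
        return list(self._store[session_key])


memory = WindowMemory()
memory.add("s1_openai/gpt-4.1", "user: Hi")

# Reading with the same key returns the stored history.
consistent = memory.history("s1_openai/gpt-4.1")
# A different separator produces a different key, so the history looks empty.
mismatched = memory.history("s1-openai/gpt-4.1")
```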

Issue: Model Responses Inconsistent or Empty

Check that the API key is valid and within its usage limits on OpenRouter.
Also verify that the model IDs are correct, such as “openai/gpt-4.1” or “mistralai/mistral-large”.
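
Simple typos in a model ID can be caught before any request is sent. The check below is a rough sketch that assumes IDs follow the "provider/model" shape of the two examples above; it does not confirm that a model actually exists on OpenRouter.

```python
def looks_like_model_id(model_id: str) -> bool:
    """Loose shape check: exactly one slash, no empty part, no whitespace."""
    parts = model_id.split("/")
    has_two_parts = len(parts) == 2 and all(parts)
    has_whitespace = any(ch.isspace() for ch in model_id)
    return has_two_parts and not has_whitespace


candidates = ["openai/gpt-4.1", "mistralai/mistral-large", "gpt-4.1", "openai/gpt 4.1"]
suspicious = [m for m in candidates if not looks_like_model_id(m)]
```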

If responses are still empty, add a basic system prompt in the AI Agent node to guide the model better.
This reduces confusion in replies.


Ideas for Customizing the Workflow

  • Add more models by expanding the list in the “Define Models to Compare” node.
  • Include an AI evaluator that scores or rates answers automatically, then save scores in Google Sheets.
  • Customize system prompts and tools in the AI Agent node to fit tasks like customer support or writing help.
  • Switch memory from Simple Memory to Redis or Postgres-based nodes to save longer chat context when needed.
  • Modify Google Sheets columns to add rating dropdowns or new evaluation metrics such as creativity or accuracy.

How to Handle This At Scale

This workflow works well for low to medium chat volume.
If you test many inputs quickly, use batching and queues to avoid API rate limits or slowdowns.
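
One simple form of batching is to process prompts in fixed-size groups with a pause between groups. This is a generic sketch, not an n8n feature: the batch size and pause length are placeholder values to tune against your own rate limits, and `handle` stands in for whatever sends a prompt through the workflow.

```python
import time
from typing import Callable, Iterator


def batches(items: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def run_in_batches(
    prompts: list[str],
    handle: Callable[[str], str],
    batch_size: int = 5,
    pause_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Process prompts batch by batch, pausing between batches."""
    results: list[str] = []
    for index, batch in enumerate(batches(prompts, batch_size)):
        if index > 0:
            sleep(pause_seconds)  # pause before every batch except the first
        results.extend(handle(prompt) for prompt in batch)
    return results


# Demo with a recorder in place of a real sleep, so it runs instantly.
pauses: list[float] = []
results = run_in_batches(
    list("abcdefg"), str.upper, batch_size=3, pause_seconds=0.5, sleep=pauses.append
)
```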

Also monitor OpenRouter API costs carefully, since every query is sent to two models and the number of calls doubles.
Consider self-hosting n8n to manage costs and data privacy better under heavy use.
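
The doubling is plain multiplication, which makes a rough usage estimate easy to script. The figures below are made-up inputs for illustration, not measurements, and the function ignores per-model price differences.

```python
def estimate_usage(prompts: int, avg_tokens_per_call: int, models: int = 2) -> dict[str, int]:
    """Estimate total calls and tokens when every prompt goes to every model."""
    calls = prompts * models
    return {"calls": calls, "tokens": calls * avg_tokens_per_call}


# Hypothetical load: 500 prompts at about 800 tokens per call.
usage = estimate_usage(prompts=500, avg_tokens_per_call=800)
```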


Summary and Result

→ This workflow lets users easily compare two AI language model answers side by side in chat.
→ It keeps conversation context separate per model for accurate replies.
→ Results get saved automatically into Google Sheets for clean record keeping.
→ The user saves time and gets clear data to choose the best AI model.
✓ Saves hours of manual copying and checking.
✓ Enables transparent side-by-side AI response comparison.
✓ Provides structured chat logs for stakeholders or teams.


Frequently Asked Questions

Comparing more than two models is possible, but the workflow needs updating: add the extra model IDs, change the loop logic, and adjust Google Sheets to store the extra answers.
API usage does increase: each user input sends requests to two models, so calls and token usage roughly double. Monitor quotas closely.
Chat memory stays inside the n8n environment, keyed by session, unless configured otherwise. For extra control, use a self-hosted n8n setup.
The workflow is designed for moderate load. Large-scale use requires batching, queue management, or more advanced hosting such as self-hosted n8n.
