What This Automation Does
This workflow listens for chat messages, then chooses the best local Ollama model to answer each question. It saves time by automatically picking models specialized for text, coding, or vision tasks. All AI processing stays on your own machine, keeping your data private.
When a chat input arrives, the workflow analyzes it using clear classification rules and picks a specialized Ollama LLM. Conversation memory nodes retain chat history for smooth multi-turn exchanges. The chosen model then answers with the right specialization for the task, improving response quality.
This removes the manual step of switching models and keeps data from ever leaving your computer. You get fast, accurate AI responses tailored to each request.
Who Should Use This Workflow
This workflow suits people who run local Ollama AI models and want to use several model types without juggling them manually. It works well for developers and AI enthusiasts who care about privacy and want the right model applied to each question automatically.
Non-technical users with some n8n experience can also benefit by setting this up and saving hours of manual model switching. Anyone needing code answers, text explanations, or image understanding from local AI will find it useful.
Tools and Services Used
- n8n automation platform: Hosts and runs the workflow.
- Ollama local API: Provides large language models for text, code, and vision.
- LangChain community nodes: Include chat trigger and AI agent components.
- Router and Agent Chat Memory nodes: Store conversation history for context.
Beginner step-by-step: How to Build This in n8n
Importing the Workflow
- Click the Download button on this page to get the workflow JSON file.
- Open the n8n editor where you work on automation flows.
- Use the menu option Import from File to load the downloaded workflow into n8n.
Configuring the Workflow
- Go to each node that needs credentials, like the Ollama API nodes, and add your API key info.
- Update IDs, emails, or folder names if you extend the workflow with external channels or storage nodes.
- Check the system and user prompt fields. Copy and paste the exact prompts or expressions as written.
- Example for the dynamic model selection expression:

```
={{ $('LLM Router').item.json.output.parseJson().llm }}
```

  This expression reads the router's JSON output and selects the model name the router chose.
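To see what that expression does outside n8n, here is a minimal Python sketch of the same parsing step. The `{"llm": ...}` schema and the model name are assumptions for illustration; your router's output format may differ.

```python
import json

def pick_model(router_output: str) -> str:
    """Mirror of the n8n expression: parse the router's JSON reply
    and return the model name stored under the 'llm' key."""
    return json.loads(router_output)["llm"]

# Hypothetical router reply for illustration.
chosen = pick_model('{"llm": "codellama"}')  # → "codellama"
```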
Testing and Activating
- Send a test prompt through the webhook URL or chat input connected to the When chat message received node.
- Watch the active workflow executions or logs to see if the router picks models correctly and answers return.
- If tests pass, activate the workflow in n8n by switching it on.
- Optionally, learn about self-hosting n8n to run this workflow on your own server.
How the Workflow Works: Inputs, Processing, Outputs
Inputs
- User chat messages arrive via the When chat message received trigger node.
- Prompts include text needing answers or commands.
Processing Steps
- The LLM Router analyzes prompt text with rules and a decision tree to pick the best Ollama model for text, code, or vision tasks.
- Router Chat Memory keeps context so routing decisions remember past messages.
- Chosen Ollama model nodes receive the prompt to generate a reply, running fully on local API without cloud calls.
- The AI Agent with dynamic LLM connects to the selected Ollama model, producing answers based on conversation context.
- Agent Chat Memory stores multi-turn conversation history for smooth dialogue flow.
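The routing step above can be sketched as a simple rule-based classifier. This is an illustrative Python sketch, not the router's actual logic: the keyword list and model names (`llava`, `codellama`, `llama3`) are assumptions, so substitute whatever models you have pulled locally.

```python
# Minimal sketch of rule-based routing: classify a prompt into
# vision, code, or general text, and return a matching model name.
CODE_KEYWORDS = ("code", "function", "debug", "python", "javascript", "compile")

def route(prompt: str, has_image: bool = False) -> str:
    if has_image:
        return "llava"       # vision-capable model (assumed name)
    lowered = prompt.lower()
    if any(keyword in lowered for keyword in CODE_KEYWORDS):
        return "codellama"   # code-focused model (assumed name)
    return "llama3"          # general text model (assumed name)
```

In the actual workflow the LLM Router makes this decision with a system prompt and a decision tree rather than hard-coded keywords, which handles ambiguous prompts more gracefully.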
Outputs
- User sees a relevant and correct AI response chosen from the specialized models.
- Conversation stays coherent thanks to memory nodes holding context.
- All processing stays local; no user data leaves the machine.
Customizations
- Add more Ollama models by editing the system prompt inside the LLM Router node. Describe the new models and add them to the decision logic.
- Change how the router chooses models by updating the classification rules or decision tree for different tasks.
- Adjust memory sizes in Router Chat Memory and Agent Chat Memory to keep longer or shorter chat histories.
- Add image preprocessing steps like OCR or metadata extraction before routing if you handle images.
- Update the system message in the AI Agent node to change tone, style, or add extra instructions for replies.
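The memory-size adjustment above boils down to a sliding window over the conversation. As a rough sketch (the actual n8n memory nodes manage this internally; the message shape here is an assumption):

```python
# Keep only the most recent `window` messages as conversation context,
# so older turns fall out of the history sent to the model.
def trim_history(messages: list[dict], window: int = 10) -> list[dict]:
    return messages[-window:]
```

A larger window gives the model more context for long conversations at the cost of longer prompts and slower local inference.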
Troubleshooting
- LLM Router not selecting the correct model:
  Check the system prompt and classification rules for missing details or syntax errors. Test with example prompts that should match each model.
- Ollama API communication failed:
  Make sure Ollama is running locally at http://127.0.0.1:11434 and that the API credentials in n8n are correct.
- Memory nodes not saving chat context:
  Ensure the memory nodes are connected properly and that the sessionId from the chat trigger is used.
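To check Ollama connectivity from the same machine, you can query its `/api/tags` endpoint, which lists installed models. A small sketch (the endpoint and its `{"models": [{"name": ...}]}` shape come from Ollama's local API; the model names in the comment are examples):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"

def installed_models(tags_json: str) -> list[str]:
    """Extract model names from the JSON body of GET /api/tags."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]

def check_ollama() -> list[str]:
    """Query the local Ollama server for its installed models,
    e.g. ['llama3:latest', 'codellama:latest']."""
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=5) as resp:
        return installed_models(resp.read().decode())
```

If `check_ollama()` raises a connection error, Ollama is not running or not listening on the default port.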
Pre-Production Checklist
- Confirm the Ollama models are installed locally using `ollama pull <model>`.
- Test the API connection from the n8n credential settings.
- Make sure the webhook URL from When chat message received node is reachable if testing outside n8n.
- Send multi-turn chats to verify memory nodes keep context.
- Check that prompts in different categories select the right models and answers are accurate.
- Backup the workflow before turning it on.
Deployment Guide
Once setup is complete and all tests pass, turn on the workflow inside the n8n editor to start listening for chat messages.
Watch execution logs for any errors. Because all AI runs locally, the system does not depend on internet or external services.
Summary
✓ Automatically picks the right local Ollama model for each user chat prompt.
✓ Saves time and avoids manual model switching.
✓ Keeps all chat data strictly on local machine for privacy.
✓ Maintains chat context to support multi-turn dialogue.
✓ Easy to configure and extend inside n8n.
✓ Ideal for developers and AI users wanting precise, private local AI help.
