What This Automation Does
This workflow listens for chat messages, then chooses the best local Ollama model to answer each question. It saves time by automatically picking models specialized for text, coding, or vision tasks. All AI processing stays on your own machine, keeping your data private.
When a chat input arrives, the workflow analyzes it using clear classification rules and picks a specialized Ollama LLM. Conversation memory nodes retain chat history for smooth multi-turn exchanges. The chosen model then answers with the right specialization for the task, improving response quality.
This removes the manual step of switching models and keeps data from ever leaving your computer. You get fast, accurate AI responses tailored to each request.
Who Should Use This Workflow
This workflow suits people who run local Ollama AI models and want to use several model types without juggling them manually. It works well for developers and AI enthusiasts who care about privacy and want the right model applied to each question automatically.
Non-technical users with some n8n experience can also benefit by setting this up and saving hours of manual model switching. Anyone needing code answers, text explanations, or image understanding from local AI will find it useful.
Tools and Services Used
- n8n automation platform: Hosts and runs the workflow.
- Ollama local API: Provides large language models for text, code, and vision.
- LangChain community nodes: Include chat trigger and AI agent components.
- Router and Agent Chat Memory nodes: Store conversation history for context.
Beginner step-by-step: How to Build This in n8n
Importing the Workflow
- Click the Download button on this page to get the workflow JSON file.
- Open the n8n editor where you work on automation flows.
- Use the menu option Import from File to load the downloaded workflow into n8n.
Configuring the Workflow
- Go to each node that needs credentials, like the Ollama API nodes, and add your API key info.
- Update IDs, emails, or folder names if you extend the workflow with external channels or storage nodes.
- Check the system and user prompt fields. Copy and paste the exact prompts or expressions as written.
- Example for the dynamic model selection expression:

```
={{ $('LLM Router').item.json.output.parseJson().llm }}
```

  This expression reads the router's JSON output and selects the model name the router chose.
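To see what that expression does outside n8n, here is a minimal Python sketch of the same parsing step. The `{"llm": ...}` schema and the model name are assumptions for illustration; your router's output format may differ.

```python
import json

def pick_model(router_output: str) -> str:
    """Mirror of the n8n expression: parse the router's JSON reply
    and return the model name stored under the 'llm' key."""
    return json.loads(router_output)["llm"]

# Hypothetical router reply for illustration.
chosen = pick_model('{"llm": "codellama"}')  # → "codellama"
```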
Testing and Activating
- Send a test prompt through the webhook URL or chat input connected to the When chat message received node.
- Watch the active workflow executions or logs to see if the router picks models correctly and answers return.
- If tests pass, activate the workflow in n8n by switching it on.
- Optionally, learn about self-hosting n8n to run this workflow on your own server.
How the Workflow Works: Inputs, Processing, Outputs
Inputs
- User chat messages arrive via the When chat message received trigger node.
- Prompts include text needing answers or commands.
Processing Steps
- The LLM Router analyzes prompt text with rules and a decision tree to pick the best Ollama model for text, code, or vision tasks.
- Router Chat Memory keeps context so routing decisions remember past messages.
- Chosen Ollama model nodes receive the prompt to generate a reply, running fully on local API without cloud calls.
- The AI Agent with dynamic LLM connects to the selected Ollama model, producing answers based on conversation context.
- Agent Chat Memory stores multi-turn conversation history for smooth dialogue flow.
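The routing step above can be sketched as a simple rule-based classifier. This is an illustrative Python sketch, not the router's actual logic: the keyword list and model names (`llava`, `codellama`, `llama3`) are assumptions, so substitute whatever models you have pulled locally.

```python
# Minimal sketch of rule-based routing: classify a prompt into
# vision, code, or general text, and return a matching model name.
CODE_KEYWORDS = ("code", "function", "debug", "python", "javascript", "compile")

def route(prompt: str, has_image: bool = False) -> str:
    if has_image:
        return "llava"       # vision-capable model (assumed name)
    lowered = prompt.lower()
    if any(keyword in lowered for keyword in CODE_KEYWORDS):
        return "codellama"   # code-focused model (assumed name)
    return "llama3"          # general text model (assumed name)
```

In the actual workflow the LLM Router makes this decision with a system prompt and a decision tree rather than hard-coded keywords, which handles ambiguous prompts more gracefully.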
Outputs
- User sees a relevant and correct AI response chosen from the specialized models.
- Conversation stays coherent thanks to memory nodes holding context.
- All processing stays local; no user data leaves the machine.
Customizations
- Add more Ollama models by editing the system prompt inside the LLM Router node. Describe the new models and add them to the decision logic.
- Change how the router chooses models by updating the classification rules or decision tree for different tasks.
- Adjust memory sizes in Router Chat Memory and Agent Chat Memory to keep longer or shorter chat histories.
- Add image preprocessing steps like OCR or metadata extraction before routing if you handle images.
- Update the system message in the AI Agent node to change tone, style, or add extra instructions for replies.
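The memory-size adjustment above boils down to a sliding window over the conversation. As a rough sketch (the actual n8n memory nodes manage this internally; the message shape here is an assumption):

```python
# Keep only the most recent `window` messages as conversation context,
# so older turns fall out of the history sent to the model.
def trim_history(messages: list[dict], window: int = 10) -> list[dict]:
    return messages[-window:]
```

A larger window gives the model more context for long conversations at the cost of longer prompts and slower local inference.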
Troubleshooting
- LLM Router not selecting the correct model:
  Check the system prompt and classification rules for missing details or syntax errors. Test with example prompts that should match each model.
- Ollama API communication failed:
  Make sure Ollama is running locally at http://127.0.0.1:11434 and that the API credentials in n8n are correct.
- Memory nodes not saving chat context:
  Ensure the memory nodes are connected properly and that the sessionId from the chat trigger is used.
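To check Ollama connectivity from the same machine, you can query its `/api/tags` endpoint, which lists installed models. A small sketch (the endpoint and its `{"models": [{"name": ...}]}` shape come from Ollama's local API; the model names in the comment are examples):

```python
import json
import urllib.request

OLLAMA_URL = "http://127.0.0.1:11434"

def installed_models(tags_json: str) -> list[str]:
    """Extract model names from the JSON body of GET /api/tags."""
    return [m["name"] for m in json.loads(tags_json).get("models", [])]

def check_ollama() -> list[str]:
    """Query the local Ollama server for its installed models,
    e.g. ['llama3:latest', 'codellama:latest']."""
    with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=5) as resp:
        return installed_models(resp.read().decode())
```

If `check_ollama()` raises a connection error, Ollama is not running or not listening on the default port.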
Pre-Production Checklist
- Confirm the Ollama models are installed locally using `ollama pull <model>`.
- Test the API connection from the n8n credential settings.
- Make sure the webhook URL from When chat message received node is reachable if testing outside n8n.
- Send multi-turn chats to verify memory nodes keep context.
- Check that prompts in different categories select the right models and answers are accurate.
- Backup the workflow before turning it on.
Deployment Guide
Once setup is complete and all tests pass, turn on the workflow inside the n8n editor to start listening for chat messages.
Watch execution logs for any errors. Because all AI runs locally, the system does not depend on internet or external services.
Summary
✓ Automatically picks the right local Ollama model for each user chat prompt.
✓ Saves time and avoids manual model switching.
✓ Keeps all chat data strictly on local machine for privacy.
✓ Maintains chat context to support multi-turn dialogue.
✓ Easy to configure and extend inside n8n.
✓ Ideal for developers and AI users wanting precise, private local AI help.
