1. Opening Problem Statement
Meet Sarah, a data scientist managing several local Large Language Models (LLMs) hosted through LM Studio on her network. She needs to test multiple models systematically for clarity, conciseness, and readability—crucial to choosing the best performing model for her organization’s chatbot. However, manually querying each model, capturing response metrics, and analyzing readability scores is time-consuming and error-prone, taking hours per set of tests and often resulting in inconsistent comparisons.
Sarah’s team was spending roughly 10 hours a week on these manual tests, and subtle differences between models were easy to miss. There had to be a better way.
2. What This Automation Does
This n8n workflow revolutionizes how Sarah tests multiple local LLMs with LM Studio by automating the entire process end-to-end. When a chat message is received:
- Automatically retrieves the current list of loaded LLM models from the LM Studio server.
- Sends the incoming prompt to each individual model separately to generate responses.
- Captures start and end timestamps to measure response time accurately.
- Analyzes each model’s response for key metrics including readability score (Flesch-Kincaid), word count, sentence count, average word length, and average sentence length.
- Saves the prompt, model ID, response, time metrics, and analysis results into a Google Sheet for side-by-side comparison and historic tracking.
- Provides configurable AI parameters such as temperature and presence penalty to fine-tune the output quality and diversity.
Using this workflow, Sarah cuts down testing time by over 80%, eliminates human error, and collects quantifiable data for objective model evaluation.
3. Prerequisites ⚙️
- LM Studio installed and running on your local server (https://lmstudio.ai/) to host multiple local LLM models.
- n8n account or self-hosted instance to execute this workflow.
- Google Sheets account enabled with an OAuth2 credential connected to n8n for result logging.
- Network access configured to allow HTTP requests to your LM Studio server IP and port.
- An OpenAI API credential configured in n8n; the model-running node requires one for compatibility, even though requests are sent to your local LM Studio server.
4. Step-by-Step Guide
Step 1: Set Up LM Studio with Your Desired Models
Download and install LM Studio from https://lmstudio.ai/. Load and configure the LLM models you want to test. Make sure the server is up and responsive.
Tip: Refer to the LM Studio basics documentation for detailed setup instructions.
Step 2: Update the Base URL in the HTTP Request Node
In the node named Get Models, update the URL to point to your LM Studio server’s IP and port, e.g., http://192.168.1.179:1234/v1/models. This URL fetches the list of currently loaded models at runtime.
Outcome: When triggered, this node returns the live model IDs, so the workflow always runs against the models that are actually loaded.
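If you want to sanity-check this endpoint outside n8n, here is a minimal Node.js sketch (assuming the example address above; swap in your own IP and port). It requires Node.js 18+ for the built-in fetch:

// List the models currently loaded in LM Studio via its OpenAI-compatible API.
(async () => {
  const res = await fetch("http://192.168.1.179:1234/v1/models");
  const { data } = await res.json();
  console.log(data.map(model => model.id)); // these IDs are what the workflow iterates over
})();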
Step 3: Configure the Chat Trigger Node
Locate the When chat message received node. This is your workflow’s starting webhook trigger that listens for a chat input. The webhook ID is automatically generated; you can use this URL to send test chat messages for simulation.
Step 4: Extract Model IDs to Run Separately
Next, the Extract Model IDs to Run Separately node splits the returned model list so the workflow can query each model individually with the input prompt.
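As a rough illustration of what this split produces, here is a Code-node-style sketch that turns the /v1/models response into one item per model. The field name model_id is illustrative, and the actual node in the template may be configured differently:

// Emit one n8n item per model ID returned by the Get Models node.
const models = $input.first().json.data;
return models.map(model => ({ json: { model_id: model.id } }));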
Step 5: Capture Start Time
Use the Capture Start Time dateTime node to record when the prompt is sent to each model. This marks the beginning of processing time measurement.
Step 6: Add a System Prompt for Controlled Output
The Add System Prompt set node inserts a guiding system message to ensure that all LLM responses are concise and readable at a 5th-grade reading level. Edit this prompt to customize model behavior.
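For example, the guiding message might read something like: "You are a helpful assistant. Answer in plain language a 5th grader can understand, using short sentences and no jargon." The exact wording in the template may differ; treat this as a starting point.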
Step 7: Run Model with Dynamic Inputs
Configure the Run Model with Dynamic Inputs AI node to send the chat prompt and system prompt to each extracted model ID. The node points at the LM Studio server URL and exposes parameters such as temperature and presence penalty for controlling how varied the outputs are.
Note: You need valid OpenAI API credentials integrated in n8n for this node.
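In effect, the node issues a standard OpenAI-style chat-completion request to your LM Studio server. The Node.js sketch below shows the shape of such a call; the parameter values and prompt text are illustrative, not the template’s defaults:

(async () => {
  const res = await fetch("http://192.168.1.179:1234/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "<model id from the previous step>",
      messages: [
        { role: "system", content: "Answer at a 5th-grade reading level, concisely." },
        { role: "user", content: "Your test prompt here" }
      ],
      temperature: 0.7,      // higher values produce more varied output
      presence_penalty: 0.5  // discourages repeating the same topics
    })
  });
  const data = await res.json();
  console.log(data.choices[0].message.content);
})();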
Step 8: Capture End Time
Immediately after receiving responses, the Capture End Time node records the timestamp to mark completion of LLM processing.
Step 9: Calculate Time Difference
The Get timeDifference dateTime node calculates the elapsed time between the start and end timestamps, giving the LLM processing duration.
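If you ever prefer to compute this in a Code node instead of the dateTime node, the arithmetic is a simple subtraction of the two timestamps (the field names here are assumed for illustration):

// Subtract the captured start time from the end time to get seconds elapsed.
const startTime = new Date($json.startTime); // from Step 5
const endTime = new Date($json.endTime);     // from Step 8
return [{ json: { ...$json, elapsedSeconds: (endTime - startTime) / 1000 } }];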
Step 10: Prepare Data for Analysis
The Prepare Data for Analysis set node consolidates prompt, model, response, and time metadata into a unified format ready for metric calculations.
Step 11: Analyze LLM Response Metrics
This critical Analyze LLM Response Metrics code node runs JavaScript to evaluate each model response:
// Key functions include word and sentence count, average sentence length,
// average word length, and Flesch-Kincaid readability score calculation.
// Example snippet:
function countWords(text) {
  return text.trim().split(/\s+/).length;
}
// The complete code performs all of these analyses and returns the results for each response.
This helps determine readability and linguistic complexity of outputs empirically.
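For reference, the sketch below shows one way to implement these metrics end to end, including a simple Flesch-Kincaid grade-level estimate. The template’s actual code may differ in details such as the syllable heuristic, and the response field name is assumed:

function countWords(text) {
  return text.trim().split(/\s+/).filter(Boolean).length;
}

function countSentences(text) {
  return (text.match(/[.!?]+/g) || []).length || 1;
}

function countSyllables(text) {
  // Crude heuristic: count vowel groups per word.
  return text.toLowerCase().split(/\s+/).reduce((sum, word) => {
    const groups = word.match(/[aeiouy]+/g);
    return sum + (groups ? groups.length : 1);
  }, 0);
}

function analyze(text) {
  const words = countWords(text);
  if (words === 0) {
    return { wordCount: 0, sentenceCount: 0, averageWordLength: 0, averageSentenceLength: 0, fleschKincaidGrade: 0 };
  }
  const sentences = countSentences(text);
  const syllables = countSyllables(text);
  const characters = text.replace(/\s+/g, "").length;
  return {
    wordCount: words,
    sentenceCount: sentences,
    averageWordLength: characters / words,
    averageSentenceLength: words / sentences,
    // Flesch-Kincaid grade level: 0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59
    fleschKincaidGrade: 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
  };
}

// In an n8n Code node, apply the analysis to every incoming model response.
return $input.all().map(item => ({
  json: { ...item.json, ...analyze(item.json.response || "") }
}));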
Step 12: Save Results to Google Sheets
Finally, the Save Results to Google Sheets node appends all analyzed data including timestamps, model identifiers, responses, and readability metrics into a pre-configured Google Sheet.
Optional: You can remove this node to manually review workflow outputs inside n8n instead.
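For illustration, each appended row ends up carrying fields along these lines; the column names and values below are placeholders, so match them to the header row of your sheet:

{
  "timestamp": "2025-01-01T12:00:00.000Z",
  "prompt": "<the chat prompt>",
  "model_id": "<model id>",
  "response": "<model response>",
  "elapsed_seconds": 4.2,
  "word_count": 87,
  "sentence_count": 6,
  "average_word_length": 4.6,
  "average_sentence_length": 14.5,
  "flesch_kincaid_grade": 5.1
}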
5. Customizations ✏️
- In the Add System Prompt node, modify the system_prompt text to test model responses for tone, style, or complexity matching your project needs.
- Adjust the temperature, presence penalty, and top_p parameters inside the Run Model with Dynamic Inputs node to control the randomness and focus of model outputs, refining testing scenarios.
- Swap out the Save Results to Google Sheets node for other storage options like databases or CSV file nodes if you prefer different logging mechanisms.
- Extend the Analyze LLM Response Metrics code node to include sentiment analysis or keyword density for richer insights (a small keyword-density sketch follows this list).
- Include additional triggers or input sources besides the chat webhook to test bulk prompts or scripted tests by feeding data programmatically.
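As one concrete example of extending the analysis, here is a small keyword-density helper you could paste into the Analyze LLM Response Metrics code; the keyword list is a placeholder:

// Fraction of words in the response that match a list of keywords you care about.
function keywordDensity(text, keywords) {
  const words = text.toLowerCase().split(/\s+/).filter(Boolean);
  if (words.length === 0) return 0;
  const hits = words.filter(word => keywords.includes(word)).length;
  return hits / words.length;
}
// Example: keywordDensity(response, ["refund", "shipping", "warranty"]);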
6. Troubleshooting 🔧
Problem: “HTTP Request to LM Studio API fails or times out”
Cause: Incorrect LM Studio IP address, server down, or network blocking requests.
Solution: Verify the LM Studio server is online, then update the Base URL in the Get Models node to the correct IP and port. Test connectivity with curl or Postman from the same machine that runs n8n.
Problem: “Google Sheets node appending fails or no data saved”
Cause: OAuth credentials expired or sheet ID mismatched.
Solution: Re-authenticate Google Sheets credentials in n8n. Confirm Sheet document ID and worksheet gid match what’s configured in the node.
Problem: “Model responses not matching system prompt guidance”
Cause: System prompt not correctly set or input misformatted.
Solution: Double-check the text in the Add System Prompt node and confirm it is passed through to the AI node. Also clear any prior chats to reset the model’s state before retesting.
7. Pre-Production Checklist ✅
- Confirm LM Studio models are live and accessible on the set Base URL.
- Test sending sample chat messages to the webhook URL connected to the When chat message received node.
- Verify Google Sheet headers exactly match the columns defined in the Save Results to Google Sheets node for proper data mapping.
- Run the workflow with a known prompt, check that timestamps, analysis metrics, and responses are properly generated and logged.
- Backup important workflow configurations and Google Sheets data before production deployment.
8. Deployment Guide
Activate the workflow in n8n after verifying all configurations. Monitor initial runs through execution logs for any errors or unexpected results. Schedule the workflow to run in response to chat messages or integrate into your larger chatbot testing framework. Maintain LM Studio server uptime and update model URLs as needed. Consider logging workflow executions externally for auditing and troubleshooting.
9. FAQs
Q: Can I use this workflow with cloud-hosted LLMs like OpenAI?
A: Yes, but you will need to adjust the Get Models node and model running node to interact with the respective cloud APIs and endpoints. This workflow is optimized for local LM Studio instances.
Q: Does this consume OpenAI API credits?
A: Not in the default setup: the model-running node uses the OpenAI-compatible interface but points at your local LM Studio server, so no OpenAI credits are consumed. Credits only come into play if you repoint the node at OpenAI’s hosted API.
Q: Is my data secure during this testing?
A: Since this runs on local infrastructure with LM Studio, your data stays within your network. Google Sheets storage depends on your account privacy settings.
Q: How scalable is this workflow for many models?
A: The workflow dynamically retrieves models and queries each in sequence, so scalability depends on LM Studio performance and network throughput. For many models, consider batching or parallel executions with rate limiting.
10. Conclusion
By following this guide, you’ve automated testing of multiple local LLMs through LM Studio using n8n, saving countless hours and gaining precise readability and linguistic insights into model outputs. You now have a robust system to compare LLMs efficiently, objectively, and with historic context.
Next steps could include integrating sentiment analysis into your metrics, automating bulk prompt testing, or connecting your preferred storage system for further analytics. Start building smarter, clearer, and faster LLM evaluations today!