Opening Problem Statement
Meet Lisa, a knowledge manager at a rapidly growing tech firm. Every week, she receives dozens of technical PDF documents that need to be accessible to her company’s AI chatbot so it can answer intricate questions accurately. The manual process of extracting, embedding, and uploading this data is cumbersome: it takes hours each week, is prone to errors, and delays responses to critical queries from the team. Lisa’s team loses valuable time sifting through documents, which hurts project timelines and decision-making.
Lisa needs an automated, reliable solution that can watch a folder, process incoming PDFs into a searchable vector database, and enable an AI-powered chat agent to retrieve accurate answers swiftly based on those documents. This is exactly where this n8n workflow with Milvus and Cohere steps in.
What This Automation Does
This n8n workflow creates a powerful Retrieval-Augmented Generation (RAG) AI agent by connecting various tools seamlessly. Here’s what happens when the automation runs:
- Watches a specific Google Drive folder for newly added PDF files.
- Downloads new PDFs automatically as soon as they appear in the folder.
- Extracts text content from downloaded PDFs using the “Extract from File” node.
- Processes and splits the extracted text into manageable chunks for efficient embedding.
- Generates high-quality embeddings of text chunks using the Cohere Embeddings model “embed-multilingual-v3.0”.
- Inserts these embeddings into the Milvus vector database to support fast, scalable vector search.
- Enables a chat-triggered RAG agent that retrieves the most relevant documents from Milvus and crafts intelligent AI responses powered by OpenAI’s GPT-4o model.
By using this workflow, Lisa’s team can save multiple hours weekly by automating data ingestion and enabling rapid, knowledge-driven AI responses that are always up to date based on new documents.
Prerequisites ⚙️
- n8n account (cloud or self-hosted) 🔌
- Google Drive account with a folder ready for monitoring 📁
- Cohere API account for embeddings generation 🔑
- Milvus vector database account via Zilliz platform 🔐
- OpenAI account with access to GPT-4o model 🔑
- Basic knowledge of n8n interface navigation ⏱️
Optional: You can self-host n8n using providers like Hostinger for better control: https://buldrr.com/hostinger
Step-by-Step Guide to Set Up Your RAG AI Agent
1. Start with the “Watch New Files” Node to Detect New PDFs
Navigate to the node panel and add the Google Drive Trigger node named “Watch New Files.” Set it to trigger on the “fileCreated” event within the specific folder you want to monitor. For example, link your Google Drive folder by selecting it under “folderToWatch.” This node polls every minute for new PDF files you add.
You should see the node monitoring your chosen folder and ready to trigger when a new document arrives. A common issue here is forgetting to set the correct folder ID, which results in no triggers firing.
2. Download the Newly Added PDF Using “Download New” Node
Connect the main output of “Watch New Files” to a Google Drive node called “Download New.” Configure this node to use the file ID from the trigger to download the actual file content.
After configuring, test by adding a PDF to your Google Drive folder and verify that it downloads successfully. An error often arises if OAuth credentials are not set properly in the Google Drive node.
3. Extract Text Content from the Downloaded PDF
Add the Extract from File node to process the downloaded PDF. Configure it for PDF operation to extract textual content.
After connecting the output of “Download New” here, the node will parse the PDF and return its text. Check for parsing errors if the extraction returns empty results, sometimes due to encrypted or malformed PDFs.
4. Use “Default Data Loader” to Prepare Documents for Embedding
Add the Default Data Loader node. This node structures your extracted text into documents suitable for embedding.
Connect “Extract from File” to this node and make sure it receives the extracted text in the expected JSON format. The output will be a structured document array.
5. Split Documents into Manageable Text Chunks with “Set Chunks”
Insert the Text Splitter: Recursive Character Text Splitter node called “Set Chunks.” Configure it with a chunk size of 700 characters and overlap of 60 to maintain context.
This node helps keep embeddings meaningful and efficient by breaking large texts into smaller parts. After running, you should see clear chunks ready for embedding. A common pitfall is setting chunk size too large, causing slower indexing and retrieval.
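To see why chunk size and overlap matter, here is a simplified pure-Python sketch of the splitting behavior using the same 700-character size and 60-character overlap. (The actual Recursive Character Text Splitter also prefers natural boundaries such as paragraphs and sentences; this sliding-window version is an approximation for illustration only.)

```python
def split_text(text: str, chunk_size: int = 700, overlap: int = 60) -> list[str]:
    """Split text into overlapping chunks (simplified sliding window)."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    step = chunk_size - overlap  # advance by 640 chars per chunk
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

chunks = split_text("abcdefghij" * 200)  # a 2000-character sample text
# Each chunk is at most 700 characters, and consecutive chunks
# share 60 characters so context is preserved across boundaries.
```

The overlap means a sentence cut at a chunk boundary still appears whole in the neighboring chunk, which keeps retrieval results coherent.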
6. Generate Embeddings with “Embeddings Cohere”
Add the Embeddings Cohere node. Set the model to “embed-multilingual-v3.0” for handling multiple languages accurately.
Make sure your Cohere API credentials are attached. This node converts each chunk of text into vector embeddings compatible with Milvus.
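Under the hood, the node calls Cohere’s embed endpoint for each batch of chunks. As a rough illustration of what gets sent (the field names follow Cohere’s public REST API, but treat the exact shape as an assumption since n8n manages the request for you):

```python
import json

def build_embed_request(chunks: list[str]) -> dict:
    """Assemble a request body like the one the Embeddings Cohere node sends."""
    return {
        "model": "embed-multilingual-v3.0",
        "texts": chunks,
        # v3 embedding models require an input_type: documents being indexed
        # use "search_document"; chat questions use "search_query".
        "input_type": "search_document",
    }

body = build_embed_request(["chunk one", "chunk two"])
print(json.dumps(body, indent=2))
```

The `input_type` distinction matters: indexing and querying with mismatched types can noticeably degrade retrieval quality.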
7. Insert Embeddings into Milvus
Add the Insert into Milvus node. Configure it to insert mode, specify your Milvus collection name, and attach Milvus API credentials.
This step stores your vectorized data in the Milvus vector store for fast semantic search. If you get connection errors, verify your Milvus credentials and collection name are correct.
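Conceptually, each inserted record pairs a text chunk with its embedding so the original text can be returned alongside search hits. A minimal sketch of the rows the insert step writes (the field names here are illustrative; your actual Milvus collection schema defines them):

```python
def build_rows(chunks: list[str], vectors: list[list[float]]) -> list[dict]:
    """Pair each text chunk with its embedding vector for insertion."""
    if len(chunks) != len(vectors):
        raise ValueError("every chunk needs exactly one embedding")
    return [
        {"id": i, "text": chunk, "vector": vec}
        for i, (chunk, vec) in enumerate(zip(chunks, vectors))
    ]

rows = build_rows(["chunk one", "chunk two"], [[0.1, 0.2], [0.3, 0.4]])
# Each row carries the chunk text plus its vector, so a similarity hit
# can be mapped straight back to readable source text.
```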
8. Set Up the Chat Trigger with “When Chat Message Received”
Add the Chat Trigger node, which listens for incoming user questions or chat messages. It triggers the RAG AI agent with the input text.
9. Configure the “RAG Agent” Node to Handle Queries
Add the RAG Agent node, connecting it to the chat trigger. This agent integrates:
- Retrieve from Milvus node set to “retrieve-as-tool” mode to fetch relevant documents.
- Memory Buffer Window node named “Memory” to maintain conversation context.
- OpenAI GPT-4o Chat node “OpenAI 4o” to generate AI responses.
Ensure all nodes are connected properly so data flows between them. This setup enables your AI to answer with contextually relevant, up-to-date information from your own documents.
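To make the retrieval step concrete, here is a simplified pure-Python sketch of what “retrieve-as-tool” does conceptually: score stored chunks against the question’s embedding by cosine similarity, take the top matches, and fold them into the prompt sent to the model. (Milvus does this at scale with approximate nearest-neighbor indexes; the two-dimensional embeddings and prompt wording below are toy values for illustration.)

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec: list[float], store: list[dict], top_k: int = 2) -> list[str]:
    """Return the top_k chunks most similar to the query embedding."""
    ranked = sorted(store, key=lambda row: cosine(query_vec, row["vector"]), reverse=True)
    return [row["text"] for row in ranked[:top_k]]

def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Fold retrieved chunks into the prompt the chat model receives."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

store = [
    {"text": "Milvus is a vector database.", "vector": [0.9, 0.1]},
    {"text": "Bananas are yellow.", "vector": [0.0, 1.0]},
    {"text": "Cohere produces embeddings.", "vector": [0.8, 0.3]},
]
context = retrieve([1.0, 0.2], store)
prompt = build_prompt("What is Milvus?", context)
```

The irrelevant chunk never reaches the model, which is what keeps answers grounded in your documents rather than the model’s general knowledge.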
10. Test Your RAG AI Agent
Trigger the chat by sending a question. The “When Chat Message Received” node activates the agent, which retrieves data from Milvus, uses memory for context, and replies via GPT-4o. You should receive coherent, document-based answers within seconds.
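The “Memory” node the agent relies on behaves like a sliding window over the conversation: only the most recent exchanges are kept and replayed to the model on each turn. A minimal sketch of that behavior (the window size of 3 here is illustrative; the node lets you configure it):

```python
from collections import deque

class WindowMemory:
    """Keep only the last `window` messages of the conversation."""

    def __init__(self, window: int = 3):
        self.messages = deque(maxlen=window)  # old messages drop off automatically

    def add(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})

    def history(self) -> list[dict]:
        return list(self.messages)

memory = WindowMemory(window=3)
for i in range(5):
    memory.add("user", f"message {i}")
# Only the last three messages survive, bounding prompt size and cost.
```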
Customizations ✏️
- Change Embedding Model: In the “Embeddings Cohere” node, switch the model to any other supported Cohere embedding model to suit your language or domain needs.
- Chunk Size Adjustment: Modify the “Set Chunks” node’s chunk size and overlap if you want faster indexing or more in-depth context per chunk.
- Switch Vector DB: Although this workflow uses Milvus, you can replace it with Supabase or Pinecone by swapping out the “Insert” and “Retrieve” nodes and updating credentials.
- Expand Memory Size: Increase the window size in the “Memory” node to hold longer conversation history if needed.
- Folder to Monitor: Change the Google Drive folder ID in the “Watch New Files” node to track different or multiple folders.
Troubleshooting 🔧
- Problem: “No new files detected by the Watch New Files node.”
Cause: Incorrect folder ID or permission issues with Google Drive API.
Solution: Double-check the folder ID in the trigger node and ensure your Google Drive credentials have correct access rights.
- Problem: “PDF extraction returns empty or error.”
Cause: PDF might be encrypted, corrupted, or not supported.
Solution: Test with standard, unencrypted PDFs and verify the “Extract from File” node configuration. Re-saving or converting a problem PDF with a standard tool can sometimes fix extraction issues.
- Problem: “Milvus insertion fails or times out.”
Cause: API credential mismatch or incorrect collection name.
Solution: Verify Milvus credentials and collection parameters. Check Zilliz cloud status if hosted externally.
Pre-Production Checklist ✅
- Verify all API credentials for Google Drive, Cohere, Milvus, and OpenAI are correctly entered in n8n.
- Test the “Watch New Files” node by adding a sample PDF and confirm it triggers download and extraction.
- Check that the “Extract from File” node correctly extracts text from PDFs.
- Confirm embeddings are created and inserted into Milvus by monitoring logs or using vector database tools.
- Test chat interaction to ensure the RAG Agent returns relevant answers based on stored documents.
Deployment Guide
Activate your workflow inside n8n so the “Watch New Files” trigger starts listening for new PDFs (make sure n8n is running). For cloud users, this runs continuously; for self-hosted instances, ensure your server uptime is stable.
Monitor executions within n8n’s execution panel to track errors or failed runs. Logs also help optimize query accuracy and address API quota limits.
FAQs
- Can I replace Milvus with another vector database? Yes. Supabase or Pinecone can be used by swapping vector store nodes.
- Does this workflow consume many API credits? It depends on document size and query frequency; Cohere and OpenAI charge per usage.
- Is my data secure? Your documents pass through the third-party APIs and hosted services you configure (Google, Cohere, Zilliz, OpenAI); always use secure credentials and appropriate environment controls.
- Can this handle a large number of PDFs? Yes, Milvus scales well for large datasets and high-volume search.
Conclusion
By following this detailed tutorial, you’ve built a tailored RAG AI agent using Milvus vector store and Cohere embeddings, fully automated with n8n. Lisa’s team can now instantly retrieve and answer questions based on their ever-growing document repository, saving numerous hours per week and drastically reducing human error.
Next steps include integrating your new AI agent with Slack or Microsoft Teams for team collaboration, adding more data sources like databases or emails, or experimenting with other LLM models for varied AI response styles.
Keep innovating with n8n and your AI stack!