What this workflow does
This workflow automates fetching new files from Supabase Storage, reading their content, generating AI-searchable embeddings, and updating a vector database. It avoids repeated work by skipping files that have already been processed. New PDF and text files are split into small chunks and turned into vectors for fast AI-powered search. The workflow also lets you chat with an AI bot to ask questions about your files.
The main goal is to save time on manual file handling and to provide quick answers from stored documents.
Who should use this workflow
Anyone who stores many PDF or text files in Supabase and wants automatic, hands-off extraction and indexing.
It fits teams that need a fast way to find information in their documents through AI chat, without reprocessing duplicate files.
Tools and services used
- Supabase Storage: Holds original files and metadata tables.
- Supabase Vector Store: Stores vector embeddings for semantic search.
- OpenAI API: Generates vector embeddings and powers the AI chatbot.
- n8n Automation Platform: Runs workflow automation, connects all steps.
How the workflow works (Inputs → Processing → Outputs)
Inputs
The workflow starts by fetching all processed-file records from the Supabase table, so it knows which files have already been handled.
It queries the Supabase Storage API to list all current files in the target bucket, excluding placeholders.
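The input comparison can be sketched in Python. This is a hedged illustration, not the workflow's actual code: the listing shape (`dicts` with a `"name"` key) is an assumption, and `.emptyFolderPlaceholder` is the placeholder object Supabase Storage creates for empty folders.

```python
def select_new_files(storage_listing, processed_names):
    """Return files from a Supabase Storage listing that have not been
    processed yet, skipping folder placeholders.

    storage_listing: list of dicts from the Storage list API, each with
    at least a "name" key (an assumption for this sketch).
    processed_names: set of file names already recorded in the files table.
    """
    return [
        f for f in storage_listing
        if f["name"] != ".emptyFolderPlaceholder"
        and f["name"] not in processed_names
    ]
```

For example, a listing of `a.pdf`, a placeholder, and an already-processed `b.txt` yields only `a.pdf` for further processing.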
Processing Steps
- Use an aggregation step to collect the existing file records into one object for easy comparison.
- Process files one by one using a batch node to avoid overloading downstream services.
- Check each file’s name and metadata to skip already processed or placeholder files.
- Download new files securely using authenticated HTTP requests.
- Decide file type with a switch: if text, use raw content; if PDF, extract text with a PDF extractor node.
- Merge extracted or raw text content back to the main workflow.
- Use a recursive splitter to break big texts into smaller chunks with overlap (to keep context).
- Load these chunks into structured documents, adding metadata like file ID.
- Generate vector embeddings from the chunks via OpenAI’s embedding model.
- Update the Supabase files table with new file records to prevent duplicates.
- Insert new vector embeddings into the Supabase vector store for fast AI retrieval.
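The splitting and document-loading steps above can be approximated with a simple sliding-window splitter. This is a simplified stand-in for n8n's recursive character splitter; the chunk size, overlap, and metadata shape are illustrative assumptions, not the workflow's exact settings.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks so neighbouring chunks share
    context at their boundaries."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def to_documents(text, file_id, chunk_size=500, overlap=50):
    """Wrap each chunk as a document carrying the source file's ID as
    metadata, ready for embedding and insertion into the vector store."""
    return [
        {"content": chunk, "metadata": {"file_id": file_id}}
        for chunk in chunk_text(text, chunk_size, overlap)
    ]
```

With the defaults, a 1,200-character file produces three chunks, and the last 50 characters of each chunk repeat at the start of the next one, preserving context across chunk boundaries.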
Outputs
Fresh vector data stored in Supabase to enable quick semantic search.
New records in the files table that track which files were processed.
Support for an AI chatbot that answers questions in real time using vector search on the document content.
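At query time, the chatbot's retrieval step boils down to ranking stored chunks by vector similarity. In the real workflow pgvector does this inside Postgres; the minimal sketch below reproduces the idea in plain Python, with an illustrative store shape that is not the actual table schema.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_embedding, store, k=3):
    """Return the k stored documents most similar to the query embedding.
    `store` is a list of {"content": ..., "embedding": [...]} dicts
    (illustrative shape only)."""
    ranked = sorted(
        store,
        key=lambda d: cosine_similarity(query_embedding, d["embedding"]),
        reverse=True,
    )
    return ranked[:k]
```

The chatbot embeds the user's question with the same OpenAI embedding model used at indexing time, retrieves the top-k chunks, and feeds them to the language model as context for the answer.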
Beginner step-by-step: How to use this workflow in n8n
Import the workflow
- Download the workflow file from this page using the Download button.
- Open the n8n editor where you want to run the workflow.
- Use the menu to select “Import from File” and pick the downloaded workflow.
Configure credentials and details
- Add your Supabase API Key and project reference into the Supabase credential settings.
- Insert your OpenAI API Key in the OpenAI credential node.
- Review and if needed, update table names, storage bucket IDs, or URLs in relevant HTTP Request or Supabase nodes.
Test and activate
- Run the flow manually by clicking the Manual Trigger node named “When clicking ‘Test workflow’”.
- Check outputs for errors and that files get processed correctly.
- When ready, activate the workflow with the switch at the top-right in n8n.
- Set up a time-based trigger if you want the workflow to check for new files on a schedule.
If you self-host n8n, see the self-host n8n guide for deployment tips.
Edge cases and failure points to watch
- Make sure the Supabase API key has permission to list and download files.
- Double-check file URLs and authentication setup in HTTP Request nodes to avoid 401 errors.
- PDF extraction can fail if input files are corrupt or binary data is missing.
- The conditions that check for existing files must match exactly, or duplicates will slip through.
- Keep OpenAI keys valid to avoid failures in vector generation.
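On the 401 errors mentioned above: Supabase Storage requests need both an `apikey` header and a bearer `Authorization` header; omitting either is a common cause. A hedged sketch using only the standard library (the URL pattern follows the public Storage download endpoint; the project URL and key are placeholders you must replace):

```python
from urllib.request import Request, urlopen

def storage_headers(key):
    """Both headers are required on Supabase Storage requests;
    missing either one typically produces a 401 response."""
    return {"apikey": key, "Authorization": f"Bearer {key}"}

def download_object(base_url, bucket, path, key, timeout=30):
    """Download a file's raw bytes from Supabase Storage.

    base_url: e.g. "https://YOUR-PROJECT.supabase.co" (placeholder).
    key: a Supabase key with storage read access (placeholder).
    """
    # GET /storage/v1/object/{bucket}/{path} returns the file contents.
    req = Request(
        f"{base_url}/storage/v1/object/{bucket}/{path}",
        headers=storage_headers(key),
    )
    with urlopen(req, timeout=timeout) as resp:
        return resp.read()
```

If a request still returns 401, verify that the key used in the n8n credential matches the one tested here and that the bucket's access policy permits reads with that key.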
Customization ideas
- Change chunk size or overlap in the text splitter node to fit your documents’ average size.
- Add new cases to the switch node for more file types like DOCX or CSV with proper extractors.
- Add metadata such as author or upload date in the document loader for richer searches.
- Change chatbot prompts to match your company language or use case.
- Switch from API-key authentication to OAuth in HTTP Request nodes for better security if needed.
Summary of results
✓ Save hours weekly by automating file fetching and processing.
✓ Avoid duplicate work by tracking processed files.
✓ Create searchable vectors for instant AI-powered document lookup.
✓ Use an AI chatbot able to answer questions based on uploaded files.
