Opening Problem Statement
Meet Lisa, a finance manager at a growing startup. Every week, Lisa spends hours manually extracting data from piles of receipts and invoices stored in Google Drive and entering them into spreadsheets for accounting. This repetitive task often leads to errors and lost time, eating away at the hours she could spend on strategic financial planning. With dozens of files arriving weekly, manual processing is inefficient, error-prone, and frustrating.
Lisa’s problem is common among businesses that receive many invoices and receipts digitally but lack an automated method to extract and organize the data. Without automation, Lisa wastes roughly 5 hours weekly on this process, delaying financial reporting and increasing the risk of costly mistakes.
What This Automation Does
This n8n workflow solves Lisa’s problem by automatically recognizing and extracting key information from receipt and invoice files stored in a specific Google Drive folder. Here’s what happens when this workflow runs:
- Monitors a dedicated Google Drive folder for newly uploaded receipt or invoice files (PDF, PNG, JPG formats).
- Automatically downloads any new or unprocessed files.
- Sends the files to the OakPDF Receipt and Invoice OCR API via an HTTP Request node for optical character recognition (OCR).
- Parses the OCR API’s JSON response to extract structured data such as company name, country, invoice lines, total amount, and currency.
- Appends or updates this data into a Google Sheets document for easy viewing and further processing.
- Ensures no duplicate processing by checking Google Sheets for previously handled files.
By automating these steps, Lisa can save approximately 5 hours weekly, reduce data entry errors, and quickly access structured invoice data organized in Google Sheets.
Prerequisites ⚙️
- n8n account (cloud or self-hosted) ⚙️
- Google Drive account with a folder to monitor 📁
- Google Sheets account with a sheet configured to store invoice data 📊
- OakPDF Receipt and Invoice OCR API key via RapidAPI 🔑 Get your API key here
- Configured Google credentials for Drive and Sheets in n8n 🔐
Step-by-Step Guide
1. Start with Manual Trigger
In n8n, your workflow begins with the Manual Trigger node named “When clicking ‘Test workflow'”. This lets you run the workflow on demand for testing and initial runs.
Navigate to the node settings and add it to your workflow canvas. No parameters need changing here.
You should see this trigger in the center of your canvas ready to initiate the flow.
2. Monitor Google Drive Folder for New Files
Next, add the Google Drive Trigger node named “On new file in Google Drive” configured to monitor a specific folder ID where receipts and invoices will be uploaded.
Set it to watch for the “fileCreated” event and poll the folder every minute.
Ensure your Google Drive OAuth credentials are connected.
This node automatically triggers the workflow when a new file appears, so you don’t have to manually start the process.
3. Load Existing Files from Google Drive Folder
Add a Google Drive node called “Load files from Google Drive folder” that lists all files from the same folder. This node helps identify all documents present for processing.
Configure the folderId to the specific monitored Google Drive folder. Connect your Google Drive OAuth credentials.
This node will return all relevant file metadata needed for further steps.
4. Retrieve Already Processed Rows from Google Sheets
Next, the Google Sheets node “Get already processed rows from Sheets” retrieves the current invoice data stored in your sheet.
Configure it with your Google Sheets document ID and sheet name (default gid=0 or Sheet1).
This retrieval lets the workflow know which files have already been processed, avoiding duplicates.
5. Filter Out Already Processed Files
Use a Merge node named “Filter processed files” in “keepNonMatches” mode to compare the list of files found in Google Drive against those recorded in Google Sheets.
Set the matching field as the file unique ID. This node outputs only new files that need OCR processing.
6. Download Files for OCR Processing
Connect to a Google Drive node “Download file for OCR” with operation set to download based on the file ID passed from the filtered list.
This node fetches the actual document content (PDF, PNG, JPG) for OCR.
7. Send Files to OCR API
Add the HTTP Request node “OCR recognize” that sends the downloaded file to the OakPDF Receipt and Invoice OCR API.
Configure it as a POST request with multipart/form-data content type.
Attach your RapidAPI HTTP header credentials with your API key.
Include the setting { "documentType": "invoice" } in the form data body parameter.
This node returns the OCR JSON with extracted structured data.
8. Parse OCR JSON Response
Use a Code node named “Unserialize response JSON” with this JavaScript code:
// Loop over input items and add the parsed data to each
for (const item of $input.all()) {
item.json.parsedData = JSON.parse(item.json.result.data);
}
return $input.all();
This step transforms the raw OCR JSON text into a usable JavaScript object for later steps.
9. Save OCR Data into Google Sheets
Finally, the Google Sheets node “Save OCR result into Sheets” appends or updates the extracted receipt data into your spreadsheet.
The node is configured with defined column mappings for fields like file ID, company name, country code, total amount, currency, filename, and line items.
Make sure to connect your Google Sheets credentials and specify the document ID and sheet name.
This completes the automation from file arrival to structured data in your spreadsheet.
Customizations ✏️
- Change OCR document type: In the “OCR recognize” HTTP Request node, change the
documentTypein the body from “invoice” to “receipt” or others supported by the OCR API to adapt to different document formats. - Monitor multiple folders: Add additional Google Drive Trigger nodes for other folders to process receipts from diverse sources by duplicating the workflow trigger and merging results.
- Enhance data extraction: Customize the Code node to extract more detailed fields from the OCR JSON response if your documents have specific data requirements.
- Use webhook trigger: Replace the manual trigger with a Webhook node to automate from external systems pushing files into Google Drive and triggering processing.
Troubleshooting 🔧
Problem: “HTTP Request node returns 401 Unauthorized.”
Cause: Invalid or missing API key for the OakPDF OCR service.
Solution: Go to your HTTP Request node credentials, verify your RapidAPI key is correctly set and not expired. Test with a valid key.
Problem: “No new files are being processed despite uploading files to Google Drive folder.”
Cause: Google Drive Trigger node not properly linked or folder ID incorrect.
Solution: Confirm the folder ID in the trigger matches exactly the folder you upload to. Ensure polling interval is active.
Problem: “Google Sheets updates fail with data mismatch errors.”
Cause: Incorrect column mapping or sheet ID.
Solution: Double-check Google Sheets configuration in the “Save OCR result into Sheets” node. Confirm sheet ID, sheet name, and column mappings are correct.
Pre-Production Checklist ✅
- Verify Google Drive folder ID is correct and accessible by n8n credentials.
- Confirm RapidAPI OCR key is valid and attached to HTTP Request credentials.
- Test manual trigger to ensure full workflow runs end-to-end.
- Check Google Sheets document is shared with your n8n Google Sheets OAuth account and has correct columns.
- Verify that no duplicate entries appear in the sheet after processing.
Deployment Guide
Activate your workflow by enabling it in n8n. The Google Drive Trigger will continuously monitor the specified folder for new files, ensuring automation runs automatically as files upload.
Use the manual trigger button for initial testing or troubleshooting. Monitor workflow execution logs within n8n to identify any errors or performance issues.
FAQs
Q: Can I use a different OCR service instead of OakPDF?
A: Yes, but you would need to adjust the HTTP Request node’s URL, headers, and body to match the alternative API’s requirements.
Q: Does this workflow consume a lot of API credits?
A: Each OCR request counts as one API call on RapidAPI. Monitor your usage on the provider’s dashboard.
Q: Is my data secure during processing?
A: Your files are sent to the OCR provider via HTTPS. Always use trusted API providers and secure credentials in n8n.
Q: What file formats are supported?
A: This workflow supports PDF, PNG, and JPG files as input to the OCR API.
Conclusion
By building this n8n workflow, you automated the tedious task of receipt and invoice data extraction from files stored in Google Drive. You saved valuable time, reduced errors, and created a reliable system to keep your financial documents organized in Google Sheets.
Lisa can now process dozens of documents in minutes instead of hours. Next, consider extending this automation to send summary reports via email or integrating with accounting software for end-to-end workflow automation.
Feel confident to customize this workflow further depending on your business needs and document types!