Opening Problem Statement
Tom, a freelance software engineer, spends excessive time every month sifting through his Gmail inbox to find receipts and invoices for his business expenses. Manually downloading, sorting, and uploading these PDFs to his Google Drive is tedious and error-prone, often leading to missed documents critical for tax filing and accounting.
This process costs Tom hours every month and causes frustration when files are misplaced or overlooked. Without automation, time that could be spent on billable work is lost. Tom needs a reliable way to automatically identify the relevant PDFs among his email attachments, classify them correctly, and upload them to organized folders in Google Drive, optionally emailing them to his accountant.
What This Automation Does
This n8n workflow, triggered via a webhook, performs the following when activated:
- Fetches all Gmail emails with attachments within a given date range.
- Filters only PDF attachments from those emails for processing.
- Reads text content from each PDF and checks if the document matches a user-defined type (e.g., “receipt or invoice”) using OpenAI’s GPT-4.1 Mini model.
- Creates a dated folder in Google Drive corresponding to the specified date range.
- Uploads all matching PDFs to the created Google Drive folder.
- Optionally aggregates and emails the matched PDFs to an accountant or specified recipient if enabled in the webhook.
Overall, this workflow can save Tom several hours of tedious manual file sorting, improve document management accuracy, and streamline his monthly accounting process.
Prerequisites ⚙️
- n8n account (Cloud or self-hosted) to run workflows.
- Gmail account 📧 with OAuth access to retrieve emails and download attachments.
- Google Drive account 📁 with OAuth access to create folders and upload files.
- OpenAI account 🔑 configured with API access for document classification.
- Basic familiarity with n8n interface for workflow creation and editing.
- Optional: Self-hosting your n8n instance for full control (see Hostinger guide).
Step-by-Step Guide
Step 1: Configure the Webhook Trigger
Navigate to the Webhook node (path: /cded3af3-31df-47c2-a826-ff84eb4a41df). This webhook waits for a POST request containing JSON with startDate, endDate, and optionally sendEmail.
You should see the webhook ready for incoming requests. This starts the automation.
Common Mistake: Omitting the JSON payload or using an incorrect date format prevents the workflow from running as intended.
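To make the expected payload concrete, here is a minimal sketch of a helper that builds the trigger request. The helper name buildTriggerRequest, the host name in the usage comment, and the YYYY-MM-DD date format are assumptions for illustration; check your own workflow for the format it actually expects.

```javascript
// Hypothetical helper: builds the POST request that triggers the workflow.
// Assumption: dates are YYYY-MM-DD strings; adjust if your workflow differs.
const DATE_PATTERN = /^\d{4}-\d{2}-\d{2}$/;

function buildTriggerRequest(webhookUrl, startDate, endDate, sendEmail = false) {
  for (const [name, value] of [["startDate", startDate], ["endDate", endDate]]) {
    if (!DATE_PATTERN.test(value)) {
      throw new Error(`${name} must look like YYYY-MM-DD, got "${value}"`);
    }
  }
  return {
    url: webhookUrl,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ startDate, endDate, sendEmail }),
    },
  };
}

// Example usage (the host name is a placeholder):
// const { url, options } = buildTriggerRequest(
//   "https://n8n.example.com/webhook/cded3af3-31df-47c2-a826-ff84eb4a41df",
//   "2024-01-01", "2024-01-31", true);
// await fetch(url, options);
```

Validating the dates before sending catches the most common payload error on the client side instead of inside the workflow.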
Step 2: Create a Date-Named Folder in Google Drive
From the webhook, the Create folder Google Drive node constructs a folder named like “invoices_YYYY-MM-DD” based on the startDate. It uses the Drive’s root folder as the parent.
After execution, you should see the folder appear in your Google Drive.
Common Mistake: Not setting the correct Drive or root folder ID can cause upload failures.
Step 3: Set Configuration Parameters
The Configure node sets key variables used throughout, such as token size limits, the “Match on” phrase (defaults to “receipt or invoice that can be considered a software engineering business cost”), Google Drive upload folder link, recipient emails, and send email boolean.
Verify these fields match your use case.
Common Mistake: Not updating “Google Drive folder to upload matched PDFs” with your actual folder ID will cause later uploads to land in the wrong folder.
Step 4: Fetch Emails with Attachments from Gmail
The Get emails with attachments Gmail node queries all emails received between startDate and endDate with attachments.
You should see emails retrieved with associated attachments.
Common Mistake: Not granting full Gmail OAuth scopes disables attachment downloads.
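As a rough illustration of the date-range query, the sketch below builds an equivalent Gmail search string from the two webhook dates. The function name buildGmailQuery is hypothetical, and the actual node may use its own date filter fields rather than a raw search string; the has:attachment, after:, and before: operators are standard Gmail search operators.

```javascript
// Hypothetical helper: turns YYYY-MM-DD dates into a Gmail search string.
// Assumption: the workflow's dates arrive as YYYY-MM-DD strings.
function buildGmailQuery(startDate, endDate) {
  // Gmail's documented date style for after:/before: uses slashes.
  const toGmailDate = (isoDate) => isoDate.split("-").join("/");
  return `has:attachment after:${toGmailDate(startDate)} before:${toGmailDate(endDate)}`;
}

// Example usage:
// buildGmailQuery("2024-01-01", "2024-01-31");
```

Pasting the resulting string into the Gmail search box is a quick way to confirm that the date range contains the emails you expect before running the workflow.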
Step 5: (Optional) Filter Emails Further
Use the Optional filter for emails node to exclude emails based on conditions such as an empty recipient field.
This cleans your email list before processing.
Common Mistake: Misconfiguring filter conditions could omit valid emails.
Step 6: Iterate Over Each Email Attachment
The Iterate over email attachments Code node loops through every attachment binary data, creating individual items for downstream nodes.
You should see each attachment as a separate workflow item.
Common Mistake: Errors in the JavaScript snippet or incorrect binary property keys break the iteration.
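The iteration can be sketched as a pure function, shown here outside n8n so it can be tested on its own. This is not the workflow's actual snippet: the function name splitAttachments and the sourceBinaryKey field are hypothetical, and the sketch assumes each attachment is re-keyed under a binary property named data.

```javascript
// Sketch: emit one workflow item per attachment found on each email item.
// Assumption: each output item carries its attachment under binary.data.
function splitAttachments(items) {
  const results = [];
  for (const item of items) {
    // Emails without attachments have no binary object, so default to {}.
    for (const [key, attachment] of Object.entries(item.binary ?? {})) {
      results.push({
        json: { ...item.json, sourceBinaryKey: key }, // hypothetical field
        binary: { data: attachment },
      });
    }
  }
  return results;
}

// Inside an n8n Code node this could be called as:
// return splitAttachments($input.all());
```

Keeping every attachment under the same key means downstream nodes can refer to a single binary property regardless of how many attachments an email had.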
Step 7: Check If the Attachment is a PDF
In the Is attachment a PDF? If node, conditions check if binary data has file extension “pdf” to proceed only with PDF files.
Non-PDFs route to the Not a PDF noop node and are ignored.
Common Mistake: Case sensitivity or file extension errors can misclassify PDFs.
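One way to avoid the case-sensitivity problem is a check that lowercases the extension first. The sketch below is a more tolerant variant of the condition described above, not the If node's actual configuration; the function name isPdf is hypothetical, and it assumes the binary entry exposes fileExtension and fileName properties.

```javascript
// Sketch: case-insensitive PDF check on a single binary entry.
function isPdf(binaryEntry) {
  const extension = (binaryEntry?.fileExtension ?? "").toLowerCase();
  const fileName = (binaryEntry?.fileName ?? "").toLowerCase();
  // Fall back to the file name when the extension property is missing.
  return extension === "pdf" || fileName.endsWith(".pdf");
}

// Example usage:
// isPdf({ fileName: "Invoice-0042.PDF", fileExtension: "PDF" });
```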
Step 8: Extract Text from PDF Attachments
Pass PDFs into the Read PDF email attachments node, which extracts the text content from the PDF binary.
You will see the text content as output.
Common Mistake: Extremely large PDFs may fail to parse; the node is set to continue on error, so the rest of the workflow keeps running.
Step 9: Check Text Length vs Token Size Limit
The Is text within token limit? If node compares the text length divided by 4 against max token size minus reply token size.
If true, the item goes to the OpenAI node; if false, classification is skipped.
Common Mistake: Not adjusting token limits correctly might omit valid documents or exceed OpenAI token quotas.
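The token check described above amounts to a small piece of arithmetic: roughly four characters per token, compared against the space left after reserving room for the reply. A minimal sketch follows; the function name isWithinTokenLimit is hypothetical, the numbers in the usage comment are illustrative rather than the workflow's defaults, and the use of "less than or equal" is an assumption.

```javascript
// Sketch: estimate tokens as characters / 4 and compare with the budget
// left once the reply tokens are reserved.
function isWithinTokenLimit(text, maxTokenSize, replyTokenSize) {
  const estimatedTokens = text.length / 4;
  return estimatedTokens <= maxTokenSize - replyTokenSize;
}

// Example usage with illustrative limits:
// isWithinTokenLimit(pdfText, 4000, 300);
```

With those illustrative limits, a 4,000-character PDF is estimated at 1,000 tokens and passes, while a 16,000-character PDF is estimated at 4,000 tokens and is skipped because only 3,700 tokens are available for the prompt.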
Step 10: Use OpenAI for Document Classification
The OpenAI Langchain node sends the PDF text and filename to GPT-4.1 Mini with a prompt asking if the document matches the “Match on” phrase from the Configure node.
Example prompt sent:
=Does this PDF file look like a receipt or invoice that can be considered a software engineering business cost? Return "true" if yes, "false" if no. Only reply with lowercase letters "true" or "false".
This is the PDF filename:
{{ $binary.data.fileName }}
This is the PDF text content:
{{ $json.text }}
The OpenAI response is expected as “true” or “false”.
Common Mistake: API key not set or prompt formatting issues cause API errors.
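To show how the prompt is assembled from its three inputs, here is a sketch that builds the same text in plain JavaScript. The function name buildClassificationPrompt is hypothetical; in the workflow itself the values come from n8n expressions, not from a function call.

```javascript
// Sketch: assemble the classification prompt from the "Match on" phrase,
// the PDF file name, and the extracted PDF text.
function buildClassificationPrompt(matchOn, fileName, text) {
  return [
    `Does this PDF file look like a ${matchOn}? Return "true" if yes, "false" if no. ` +
      `Only reply with lowercase letters "true" or "false".`,
    "",
    "This is the PDF filename:",
    fileName,
    "",
    "This is the PDF text content:",
    text,
  ].join("\n");
}

// Example usage:
// buildClassificationPrompt(
//   "receipt or invoice that can be considered a software engineering business cost",
//   "invoice-0042.pdf",
//   "Invoice #0042 ... Total due: 120.00 EUR");
```

Printing the assembled prompt for one sample PDF is a cheap way to spot formatting problems before any API call is made.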
Step 11: Merge OpenAI Response with Original Attachment Data
The Merge node combines the classification result with the binary PDF data so we can decide what to upload.
You should see a combined data item.
Common Mistake: Incorrect merge configuration or clash handling can cause data loss.
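The effect of the merge can be sketched as pairing items by position. This is only an illustration under assumptions: it presumes both inputs arrive in the same order with the same count, the function name mergeByPosition and the classification field are hypothetical, and the real Merge node settings may differ.

```javascript
// Sketch: pair each classification result with the attachment at the same index.
// On a field-name clash, the classification value overwrites the attachment value.
function mergeByPosition(classifications, attachments) {
  const count = Math.min(classifications.length, attachments.length);
  const merged = [];
  for (let i = 0; i < count; i++) {
    merged.push({
      json: { ...attachments[i].json, ...classifications[i].json },
      binary: attachments[i].binary, // keep the original PDF for upload
    });
  }
  return merged;
}

// Example usage:
// mergeByPosition(
//   [{ json: { classification: "true" } }],
//   [{ json: { subject: "Your invoice" }, binary: { data: { fileName: "a.pdf" } } }]);
```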
Step 12: Filter Only Matching Documents
The Is matched If node filters only those where OpenAI returned “true”.
Only these PDFs proceed to upload and optionally email.
Common Mistake: Case sensitivity on match response can cause missed uploads.
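A defensive comparison can guard against stray whitespace or capitalization in the model's reply. The sketch below is a tolerant variant of the check, not the If node's actual condition, and the function name isMatched is hypothetical.

```javascript
// Sketch: treat the reply as a match only if it is "true" after
// trimming whitespace and lowercasing.
function isMatched(responseText) {
  return String(responseText ?? "").trim().toLowerCase() === "true";
}

// Example usage:
// isMatched(" True\n");
```

Anything other than a clean "true", including an empty or missing reply, is treated as a non-match, so unclear responses never trigger an upload.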
Step 13: Upload Matched PDFs to Google Drive Folder
The Upload file to folder Google Drive node uploads the original PDF binary to the folder created earlier.
Files are named by their original filename.
Common Mistake: Not linking the parent folder ID properly results in root drive uploads or errors.
Step 14: Conditionally Send Email with Matched PDFs
The Send email with invoices? If node checks the boolean to send emails.
If true, the Aggregate attachments code node consolidates all matched PDFs into one item.
The Send to my accountant Gmail node sends an email with all PDFs attached.
The email subject dynamically includes the reporting date range.
Common Mistake: Email sending fails because of missing credentials, a wrong recipient address, or attachment size limits.
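The aggregation step can be pictured as collapsing many single-attachment items into one item that carries every PDF. The following is a sketch under assumptions, not the workflow's actual code: the function name aggregateAttachments, the attachment_N key naming, and the attachmentCount field are hypothetical, and each input item is assumed to hold its PDF under binary.data.

```javascript
// Sketch: combine all matched items into a single item with one
// binary property per PDF, so one email can attach them all.
function aggregateAttachments(items) {
  const binary = {};
  items.forEach((item, index) => {
    binary[`attachment_${index}`] = item.binary.data;
  });
  return [{ json: { attachmentCount: items.length }, binary }];
}

// Inside an n8n Code node this could be called as:
// return aggregateAttachments($input.all());
```

Producing exactly one output item means the email node runs once and sends a single message instead of one message per PDF.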
Customizations ✏️
- Change document type detection phrase
In the Configure Set node, update the Match on field to classify different document types such as “contract” or “project proposal”. This lets you repurpose the workflow for other business documents.
- Switch email recipient for matched documents
Modify the sendInvoicesTo field in Configure to change who receives the final PDF batch email.
- Adjust token size limits
In Configure, change maxTokenSize and replyTokenSize to optimize OpenAI costs and processing of longer or shorter PDFs.
- Disable email sending
Send the webhook parameter sendEmail=false to skip emailing matched PDFs, uploading only to Drive.
- Add additional email filters
Enhance the Optional filter for emails node with more conditions like sender address, subject keywords, or label checks for more precise email selection.
Troubleshooting 🔧
Problem: “OpenAI API key not found or invalid.”
Cause: Incorrect or missing OpenAI API credentials.
Solution: Go to OpenAI node → Credentials section → Make sure your API key is entered correctly and has permissions.
Problem: “Gmail node returns no emails or no attachments.”
Cause: OAuth scopes missing or email filters too restrictive.
Solution: Check Gmail node OAuth credential scopes, confirm there are emails with attachments in the date range, and adjust query filters in Get emails with attachments node.
Problem: “Files uploaded to wrong Google Drive folder or root folder.”
Cause: Incorrect parent folder ID linked in Upload file to folder node.
Solution: Verify that the folder ID output by the Create folder node is assigned as the parent folder parameter.
Pre-Production Checklist ✅
- Verify Google Drive OAuth credentials and test folder creation manually.
- Confirm OpenAI API key validity by testing classification with sample PDF text.
- Send test webhook request with date range JSON payload to trigger workflow.
- Ensure Gmail account has emails with PDF attachments in the test date range.
- Check the token size limit parameters in Configure.
- Test sending email option on and off to confirm functionality.
Deployment Guide
Activate the workflow in n8n after completing setup and credentials configuration.
Use the webhook URL to trigger the workflow from any external system via a POST request that includes startDate, endDate, and optionally the sendEmail boolean.
Monitor workflow execution history in n8n to verify successful runs and troubleshoot errors.
Consider setting up alerts or logs if available in your n8n environment for workflow failure monitoring over time.
FAQs
Q: Can I use an email provider other than Gmail?
A: This workflow is specific to Gmail due to the Gmail node used, but can be adapted to other providers with API access.
Q: Does this workflow consume a lot of OpenAI API tokens?
A: It depends on the size of PDFs and number processed. The token size settings help manage costs.
Q: Is my data secure during processing?
A: Data stays within your n8n environment and connected services. Use secure credentials and encrypted connections.
Q: Can this handle hundreds of emails and attachments?
A: Yes, as long as your n8n plan and API limits support the workload. Consider batch processing for very large sets.
Conclusion
By setting up this n8n workflow, Tom has automated the time-consuming, error-prone process of identifying, sorting, and uploading invoice and receipt PDFs from Gmail. He now has organized, date-stamped folders on Google Drive capturing all relevant business expense documents automatically.
This saves him several hours every billing cycle, reduces human error, and provides an easy audit trail for accounting or tax professionals.
Next, Tom could extend this automation by adding Slack notifications for new document uploads, improving text extraction with OCR, or connecting the workflow to accounting software like QuickBooks for seamless bookkeeping.