Automate Gmail PDF Invoice Sorting & Upload to Google Drive

Save hours by automating the classification and upload of PDF invoices and receipts from Gmail to Google Drive using n8n and OpenAI. This workflow intelligently identifies relevant documents and organizes them into dated folders with optional email forwarding, reducing manual work and errors.
Workflow Identifier: 2323
NODES in Use: openAi, set, code, noOp, if, merge, googleDrive, webhook, respondToWebhook, gmail, readPDF, filter, stickyNote

Opening Problem Statement

Tom, a freelance software engineer, spends excessive time every month sifting through his Gmail inbox to find receipts and invoices for his business expenses. Manually downloading, sorting, and uploading these PDFs to his Google Drive is tedious and error-prone, often leading to missed documents critical for tax filing and accounting.

This process wastes Tom hours each week and causes frustration when files are misplaced or overlooked. Without automation, valuable time that could be spent on billable work is lost. Tom needs a reliable way to automatically identify those relevant PDFs in his email attachments, classify them correctly, and upload them to his Google Drive in organized folders, optionally sending them to his accountant by email.

What This Automation Does

This n8n workflow, triggered via a webhook, performs the following when activated:

  • Fetches all Gmail emails with attachments within a given date range.
  • Filters only PDF attachments from those emails for processing.
  • Reads text content from each PDF and checks if the document matches a user-defined type (e.g., “receipt or invoice”) using OpenAI’s GPT-4.1 Mini model.
  • Creates a dated folder in Google Drive corresponding to the specified date range.
  • Uploads all matching PDFs to the created Google Drive folder.
  • Optionally aggregates and emails the matched PDFs to an accountant or specified recipient if enabled in the webhook.

Overall, this workflow can save Tom several hours of tedious manual file sorting, improve document management accuracy, and streamline his monthly accounting process.

Prerequisites ⚙️

  • n8n account (Cloud or self-hosted) to run workflows.
  • Gmail account 📧 with OAuth access to retrieve emails and download attachments.
  • Google Drive account 📁 with OAuth access to create folders and upload files.
  • OpenAI account 🔑 configured with API access for document classification.
  • Basic familiarity with n8n interface for workflow creation and editing.
  • Optional: Self-hosting your n8n instance for full control (see Hostinger guide).

Step-by-Step Guide

Step 1: Configure the Webhook Trigger
Open the Webhook node (path: /cded3af3-31df-47c2-a826-ff84eb4a41df). This webhook waits for a POST request containing JSON with startDate, endDate, and optionally sendEmail.
You should see the webhook ready for incoming requests. This request starts the automation.
Common Mistake: Sending an invalid JSON payload or a date in the wrong format can keep the workflow from running as expected.
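
As a rough illustration of the request described above, here is a minimal Python sketch that builds the POST request. The host name, the /webhook/ URL prefix, and the ISO YYYY-MM-DD date format are assumptions, not details taken from the workflow; adjust them to your own n8n instance.

```python
import json
import urllib.request
from datetime import date

# Hypothetical host; replace with the URL shown in your own Webhook node.
N8N_BASE_URL = "https://n8n.example.com"
# Assumption: the production webhook URL is the host plus /webhook/ plus the node path.
WEBHOOK_PATH = "/webhook/cded3af3-31df-47c2-a826-ff84eb4a41df"


def build_trigger_request(start_date, end_date, send_email=False):
    """Build (but do not send) the POST request that starts the workflow."""
    # Fail early on malformed dates (assumed format: YYYY-MM-DD).
    date.fromisoformat(start_date)
    date.fromisoformat(end_date)
    payload = {
        "startDate": start_date,
        "endDate": end_date,
        "sendEmail": send_email,
    }
    return urllib.request.Request(
        N8N_BASE_URL + WEBHOOK_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_trigger_request("2024-01-01", "2024-01-31", send_email=True)
# To actually trigger the workflow: urllib.request.urlopen(request)
```

Validating the dates before sending means a typo surfaces as a local ValueError instead of a confusing failure inside the workflow.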

Step 2: Create a Date-Named Folder in Google Drive
From the webhook, the Create folder Google Drive node constructs a folder named like “invoices_YYYY-MM-DD” based on the startDate. It uses the Drive’s root folder as the parent.
After execution, you should see the folder appear in your Google Drive.
Common Mistake: Not setting the correct Drive or root folder ID in the node can cause upload failures.
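
The naming rule can be sketched in a few lines of Python. The function name is hypothetical and the YYYY-MM-DD input format is an assumption; the actual node builds the name with an n8n expression that is not shown here.

```python
from datetime import date


def folder_name_for(start_date):
    """Return a folder name of the form invoices_YYYY-MM-DD."""
    # Parsing first rejects malformed dates (assumed format: YYYY-MM-DD).
    parsed = date.fromisoformat(start_date)
    return f"invoices_{parsed.isoformat()}"


name = folder_name_for("2024-01-01")
```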

Step 3: Set Configuration Parameters
The Configure node sets key variables used throughout, such as token size limits, the “Match on” phrase (defaults to “receipt or invoice that can be considered a software engineering business cost”), Google Drive upload folder link, recipient emails, and send email boolean.
Verify these fields match your use case.
Common Mistake: Not updating “Google Drive folder to upload matched PDFs” with your actual folder ID will send later uploads to the wrong folder.

Step 4: Fetch Emails with Attachments from Gmail
The Get emails with attachments Gmail node queries all emails received between startDate and endDate with attachments.
You should see emails retrieved with associated attachments.
Common Mistake: Not granting full Gmail OAuth scopes disables attachment downloads.
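
For reference, the same selection can be expressed with Gmail's standard search operators (has:attachment, after:, before:). The sketch below only builds the query string; how the workflow's Gmail node is actually configured is not shown in this guide, so treat the mapping as an assumption.

```python
from datetime import date


def gmail_query(start_date, end_date):
    """Build a Gmail search string for emails with attachments in a date range."""
    start = date.fromisoformat(start_date)  # assumed input format: YYYY-MM-DD
    end = date.fromisoformat(end_date)
    if end < start:
        raise ValueError("endDate must not be earlier than startDate")
    # Gmail search expects dates written as YYYY/MM/DD.
    return f"has:attachment after:{start:%Y/%m/%d} before:{end:%Y/%m/%d}"


query = gmail_query("2024-01-01", "2024-01-31")
```

Pasting the resulting string into the Gmail search box is a quick way to check that the date range actually contains emails with attachments.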

Step 5: (Optional) Filter Emails Further
Use the Optional filter for emails node to exclude emails based on conditions such as an empty recipient field.
This cleans your email list before processing.
Common Mistake: Misconfiguring filter conditions could omit valid emails.

Step 6: Iterate Over Each Email Attachment
The Iterate over email attachments Code node loops through each attachment's binary data and creates an individual item per attachment for downstream nodes.
You should see each attachment as a separate workflow item.
Common Mistake: Errors in the JavaScript snippet or incorrect binary property keys break the iteration.
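
The actual Code node is written in JavaScript and is not reproduced in this guide. The Python sketch below only illustrates the idea, modeling each item as a dictionary with json and binary parts. The attachment_0 and attachment_1 key names are assumptions; the output key data matches the $binary.data reference used later in the prompt.

```python
def split_attachments(items):
    """Turn each email item with N attachments into N single-attachment items."""
    results = []
    for item in items:
        for attachment in item.get("binary", {}).values():
            results.append({
                # Copy the email metadata so every new item keeps its context.
                "json": dict(item.get("json", {})),
                # Downstream nodes read the file from the "data" key.
                "binary": {"data": attachment},
            })
    return results


# Hypothetical sample email with two attachments.
emails = [{
    "json": {"subject": "Your January invoice"},
    "binary": {
        "attachment_0": {"fileName": "invoice.pdf", "fileExtension": "pdf"},
        "attachment_1": {"fileName": "logo.png", "fileExtension": "png"},
    },
}]
attachment_items = split_attachments(emails)
```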

Step 7: Check If the Attachment is a PDF
The Is attachment a PDF? If node checks whether the binary data has the file extension “pdf” so that only PDF files proceed.
Non-PDFs route to the Not a PDF noOp node and are ignored.
Common Mistake: Case sensitivity or file extension errors can misclassify PDFs.
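
One way to avoid the case-sensitivity pitfall is to normalize the extension before comparing. This is a sketch of that check in Python, not the If node's actual condition; the fallback to the file name is an added assumption for attachments that lack an extension field.

```python
def is_pdf(attachment):
    """Return True when the attachment looks like a PDF, ignoring letter case."""
    extension = str(attachment.get("fileExtension") or "").strip().lower()
    if extension:
        return extension == "pdf"
    # Assumption: fall back to the file name when no extension is recorded.
    file_name = str(attachment.get("fileName") or "").strip().lower()
    return file_name.endswith(".pdf")


checks = [
    is_pdf({"fileName": "Invoice.PDF", "fileExtension": "PDF"}),
    is_pdf({"fileName": "logo.png", "fileExtension": "png"}),
    is_pdf({"fileName": "receipt.Pdf"}),
]
```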

Step 8: Extract Text from PDF Attachments
Pass the PDFs into the Read PDF email attachments node, which extracts the text content from the PDF binary.
You will see text content as output.
Common Mistake: Extremely large PDFs might error, but this is handled by continuing on error.

Step 9: Check Text Length vs Token Size Limit
The Is text within token limit? If node compares the text length divided by 4 against the max token size minus the reply token size.
If the text fits, the item goes to the OpenAI node; otherwise classification is skipped.
Common Mistake: Not adjusting token limits correctly might omit valid documents or exceed OpenAI token quotas.
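
The check is simple arithmetic: roughly four characters per token, compared against the budget left after reserving room for the reply. A minimal sketch follows; the limit values are hypothetical and the choice of "less than or equal" is an assumption, since the node's exact operator is not stated.

```python
def within_token_limit(text, max_token_size, reply_token_size):
    """Estimate tokens as len(text) / 4 and compare against the prompt budget."""
    estimated_tokens = len(text) / 4
    prompt_budget = max_token_size - reply_token_size
    return estimated_tokens <= prompt_budget  # assumed operator: <=


# Hypothetical limits: 4000 tokens in total, 50 reserved for the reply.
fits = within_token_limit("x" * 15800, max_token_size=4000, reply_token_size=50)
too_long = within_token_limit("x" * 15804, max_token_size=4000, reply_token_size=50)
```

With these example limits the prompt budget is 3950 tokens, so 15,800 characters (about 3950 tokens) still fits while 15,804 characters (about 3951 tokens) does not.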

Step 10: Use OpenAI for Document Classification
The OpenAI Langchain node sends the PDF text and filename to GPT-4.1 Mini with a prompt asking if the document matches the “Match on” phrase from the Configure node.
Example prompt sent:

=Does this PDF file look like a receipt or invoice that can be considered a software engineering business cost? Return "true" if yes, "false" if no. Only reply with lowercase letters "true" or "false".

This is the PDF filename:
{{ $binary.data.fileName }}

This is the PDF text content:
{{ $json.text }}

The OpenAI response is expected as “true” or “false”.
Common Mistake: API key not set or prompt formatting issues cause API errors.
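
To see how the “Match on” phrase, the filename, and the extracted text combine, here is a Python sketch that assembles the same prompt text. In the workflow this is done with n8n expressions; the function and parameter names below are hypothetical.

```python
PROMPT_TEMPLATE = (
    'Does this PDF file look like a {match_on}? Return "true" if yes, '
    '"false" if no. Only reply with lowercase letters "true" or "false".\n\n'
    "This is the PDF filename:\n{file_name}\n\n"
    "This is the PDF text content:\n{text}"
)


def build_prompt(match_on, file_name, text):
    """Fill the classification prompt with the configured phrase and PDF data."""
    return PROMPT_TEMPLATE.format(match_on=match_on, file_name=file_name, text=text)


prompt = build_prompt(
    match_on="receipt or invoice that can be considered a software engineering business cost",
    file_name="invoice.pdf",
    text="Invoice #1001 ... Total due: 49.00 USD",  # hypothetical extracted text
)
```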

Step 11: Merge OpenAI Response with Original Attachment Data
The Merge node combines the classification result with the binary PDF data so we can decide what to upload.
You should see a combined data item.
Common Mistake: Incorrect merge configuration or clash handling can cause data loss.

Step 12: Filter Only Matching Documents
The Is matched If node filters only those where OpenAI returned “true”.
Only these PDFs proceed to upload and optionally email.
Common Mistake: A case-sensitive comparison can miss uploads when the model replies with a variant such as “True”.
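
If you want the match check to tolerate small variations in the model's reply, you can normalize it before comparing. This is a sketch of one possible approach, not the If node's actual condition.

```python
def is_match(reply):
    """Return True only when the model's reply normalizes to "true"."""
    # Remove surrounding whitespace, quotes, and a trailing period, then lowercase.
    cleaned = (reply or "").strip().strip("\"'.").lower()
    return cleaned == "true"


results = [is_match("true"), is_match(" True.\n"), is_match('"true"'), is_match("false")]
```

Anything that does not normalize to exactly "true" is treated as no match, which keeps ambiguous replies out of the upload folder.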

Step 13: Upload Matched PDFs to Google Drive Folder
The Upload file to folder Google Drive node uploads the original PDF binary to the folder created earlier.
Files are named by their original filename.
Common Mistake: Not linking the parent folder ID properly results in uploads to the Drive root or in errors.

Step 14: Conditionally Send Email with Matched PDFs
The Send email with invoices? If node checks the boolean that controls whether emails are sent.
If true, the Aggregate attachments Code node consolidates all matched PDFs into one item.
The Send to my accountant Gmail node then sends an email with all PDFs attached.
The email subject dynamically includes the reporting date range.
Common Mistake: Missing credentials, a wrong recipient, or attachment size limits can cause email sending errors.
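
The aggregation step is the reverse of the earlier split: many single-attachment items become one item carrying every PDF. The Python sketch below illustrates the idea only; the real Code node is JavaScript, and the attachment_N key names and the attachmentCount field are assumptions.

```python
def aggregate_attachments(items):
    """Collect the "data" binary of every matched item into a single item."""
    binary = {}
    for index, item in enumerate(items):
        # Give each PDF its own key so none overwrites another.
        binary[f"attachment_{index}"] = item["binary"]["data"]
    if not binary:
        return []  # nothing matched, so there is nothing to email
    return [{"json": {"attachmentCount": len(binary)}, "binary": binary}]


# Hypothetical matched items.
matched = [
    {"json": {}, "binary": {"data": {"fileName": "invoice.pdf"}}},
    {"json": {}, "binary": {"data": {"fileName": "receipt.pdf"}}},
]
email_items = aggregate_attachments(matched)
```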

Customizations ✏️

  • Change document type detection phrase

    In the Configure Set node, update the Match on field to classify different document types such as “contract” or “project proposal”. This lets you repurpose the workflow for other business documents.
  • Switch email recipient for matched documents

    Modify the sendInvoicesTo field in Configure to change who receives the final PDF batch email.
  • Adjust token size limits

    In Configure, change maxTokenSize and replyTokenSize to optimize OpenAI costs and processing of longer or shorter PDFs.
  • Disable email sending

    Send the webhook parameter sendEmail=false to skip emailing matched PDFs, uploading only to Drive.
  • Add additional email filters

    Enhance Optional filter for emails node with more conditions like sender address, subject keywords, or label checks for more precise email selection.

Troubleshooting 🔧

Problem: “OpenAI API key not found or invalid.”
Cause: Incorrect or missing OpenAI API credentials.
Solution: Go to OpenAI node → Credentials section → Make sure your API key is entered correctly and has permissions.

Problem: “Gmail node returns no emails or no attachments.”
Cause: OAuth scopes missing or email filters too restrictive.
Solution: Check the Gmail node's OAuth credential scopes, confirm there are emails with attachments in the date range, and adjust the query filters in the Get emails with attachments node.

Problem: “Files uploaded to wrong Google Drive folder or root folder.”
Cause: Incorrect parent folder ID linked in Upload file to folder node.
Solution: Verify that the folder ID output by the Create folder node is correctly assigned as the parent folder parameter.

Pre-Production Checklist ✅

  • Verify Google Drive OAuth credentials and test folder creation manually.
  • Confirm OpenAI API key validity by testing classification with sample PDF text.
  • Send test webhook request with date range JSON payload to trigger workflow.
  • Ensure Gmail account has emails with PDF attachments in the test date range.
  • Check the token size limit parameters in Configure.
  • Test sending email option on and off to confirm functionality.

Deployment Guide

Activate the workflow in n8n after completing setup and credentials configuration.

Use the webhook URL to trigger the workflow from any external system via a POST request including startDate, endDate, and optionally sendEmail boolean.

Monitor workflow execution history in n8n to verify successful runs and troubleshoot errors.

Consider setting up alerts or logs if available in your n8n environment for workflow failure monitoring over time.

FAQs

Q: Can I use an email provider other than Gmail?
A: This workflow is specific to Gmail due to the Gmail node used, but can be adapted to other providers with API access.

Q: Does this workflow consume a lot of OpenAI API tokens?
A: It depends on the size of PDFs and number processed. The token size settings help manage costs.

Q: Is my data secure during processing?
A: Data stays within your n8n environment and connected services. Use secure credentials and encrypted connections.

Q: Can this handle hundreds of emails and attachments?
A: Yes, as long as your n8n plan and API limits support the workload. Consider batch processing for very large sets.

Conclusion

By setting up this n8n workflow, Tom has automated the time-consuming, error-prone process of identifying, sorting, and uploading invoice and receipt PDFs from Gmail. He now has organized, date-stamped folders on Google Drive capturing all relevant business expense documents automatically.

This saves him hours every billing cycle, reduces human errors, and provides easy audit trails for accounting or tax professionals.

Next, Tom could extend this automation by adding Slack notifications for new document uploads, integrating OCR text extraction improvements, or connecting the workflow to accounting software like QuickBooks for seamless bookkeeping.
