Extract VAT Data from PDFs with Claude 3.5 & Gemini in n8n

Struggling to extract VAT numbers from lengthy PDF invoices manually? This n8n workflow automates extracting VAT data directly from PDFs using AI models Claude 3.5 Sonnet and Google Gemini 2.0 Flash, saving hours and providing side-by-side result comparison with easy prompt customization.
googleDrive
httpRequest
manualTrigger
+3
Workflow Identifier: 1549
NODES in Use: Manual Trigger, Google Drive, Extract from File, HTTP Request, Set, Sticky Note
Extract VAT data from PDFs with n8n and Claude

Press CTRL+F5 if the workflow didn't load.

Learn how to Build this Workflow with AI:

What this workflow does

This workflow extracts VAT numbers from PDF invoices automatically in n8n.
It avoids manual copy-pasting and no extra OCR steps are needed.
You get fast, AI-extracted data ready to compare from two models: Claude 3.5 Sonnet and Google Gemini 2.0 Flash.

It downloads PDF files from Google Drive, converts them to base64 strings, then sends each to AI APIs with a clear prompt for VAT extraction.
You will receive results for easy side-by-side comparison to pick the best AI model for your workflow.


Who should use this workflow

This workflow fits financial officers or accountants who must handle many PDF invoices monthly.
It helps reduce time wasted on manual VAT extraction and lowers errors.
Users want to automate trusted extraction without juggling multiple software or manual OCR.
Non-technical users who know basic n8n can run and customize it easily.


Tools and services used

  • Google Drive: Stores and shares PDF invoices.
  • Anthropic Claude API: Extracts VAT using Claude 3.5 Sonnet model.
  • Google Gemini API: Extracts VAT using Gemini 2.0 Flash model.
  • n8n automation platform: Builds workflow and manages API calls.

Inputs, Processing steps, and Output

Inputs

  • PDF invoice files stored in Google Drive.
  • User-defined text prompt specifying what data to extract, for example, “Extract the VAT numbers for each country.”
  • API keys for Claude 3.5 Sonnet and Google Gemini 2.0 Flash models.

Processing steps

  • Download the PDF file from Google Drive using the provided file ID.
  • Convert the PDF binary data into a base64 string inside n8n.
  • Send the base64 PDF along with the extraction prompt to Claude 3.5 Sonnet API.
  • Send the same PDF and prompt to Gemini 2.0 Flash API.
  • Receive structured text or JSON responses from both AI models.

Output

  • Two sets of extracted VAT number data, ready for side-by-side comparison.
  • Latencies and API cost data can be compared if tracked.
  • Data can be parsed further or stored in databases or sheets for records.

Beginner step-by-step: How to use this workflow in n8n production

Download and Import

  1. Download the workflow file by clicking the Download button on this page.
  2. Open your n8n editor where the automation is to run.
  3. Use the “Import from File” function in n8n to load the downloaded workflow.

Configure essentials

  1. Add Google Drive credentials with access to your PDF folder.
  2. Enter your Anthropic API Key for the Claude 3.5 Sonnet node under HTTP credentials.
  3. Enter your Google PaLM API Key for Gemini 2.0 Flash node also under HTTP credentials.
  4. Replace example Google Drive file ID with the actual invoice PDF file ID you want to process.
  5. Update the prompt text in the Set node called “Define Prompt” if you want to extract different data.

Test and activate

  1. Run the workflow once manually by clicking “Execute Node” or “Test Workflow” in the Manual Trigger node.
  2. Check outputs of Claude and Gemini nodes to see if VAT numbers are properly extracted.
  3. When working well, activate the workflow by switching it to “Active” so it can be triggered as needed.
  4. Optionally connect to other nodes for saving or notifications.

Self-host users can refer to self-host n8n options to run on their own servers.


Customizations ideas

  • Change the prompt in the Set node to extract dates, names, totals, or other invoice details.
  • Disable either Claude or Gemini API call nodes to save costs or focus testing.
  • Add configuration to Gemini API JSON body to request JSON responses if needed for easier parsing.
  • Add error handling nodes to catch failed API calls or rate limits gracefully.
  • Connect outputs to Google Sheets or databases for automatic record keeping.

Common edge cases and failures

  • 401 Unauthorized error on HTTP requests means API keys are wrong or expired.
    Fix by updating your credentials in the nodes.
  • No data extracted or blank AI responses can come from wrong base64 encoding or unclear prompts.
    Check PDF conversion node and make the prompt clearer.
  • Google Drive PDF download fails usually caused by bad file ID or missing permissions.
    Verify file ID and Google Drive access rights.

Summary

→ Saves time by automating VAT extraction from PDFs using AI.
→ No manual OCR needed, PDF base64 sent directly to AI models.
→ Gets extraction results from Claude 3.5 Sonnet and Google Gemini 2.0 Flash for comparison.
→ Input PDFs come from Google Drive, all inside one n8n workflow.
→ Beginner users can import, configure, test, and activate with simple steps.


Extract VAT data from PDFs with n8n and Claude

Visit through Desktop to Interact with the Workflow.

Frequently Asked Questions

The Google Drive node downloads the PDF file using the file ID specified. The connected Google account must have access to that file.
Both Claude 3.5 Sonnet and Google Gemini 2.0 Flash expect the PDF data as a base64 encoded string inside their request JSON.
Yes. The user can update the prompt text in the “Define Prompt” node to extract any desired information from the invoices.
The user can add error handling nodes in n8n to catch API failures or rate limits and create conditional flows for retry or notifications.

Promoted by BULDRR AI

Related Workflows

Automate Twist Channel Creation and Messaging with n8n

This workflow automates creating and updating a channel in Twist and sending a personalized message to specific users. It eliminates manual setup errors and saves time managing Twist communications.

Automate Ideogram Image Generation with Google Sheets & Gmail

This workflow automates graphic design image generation via Ideogram AI, storing image data in Google Sheets and Google Drive, with email alerts via Gmail. It saves designers hours by automating image creation, remixing, review, and record-keeping.

Automate IT Support with Slack and OpenAI in n8n

Streamline IT support by automating Slack message handling using n8n and OpenAI. This workflow handles Slack DMs, filters bots, queries a Confluence knowledge base, and delivers AI-generated responses, improving support efficiency and response time.

Automate Crypto Analysis with CoinMarketCap & n8n AI Agent

Discover how this unique n8n workflow leverages CoinMarketCap’s multi-agent AI to deliver precise, real-time cryptocurrency insights directly via Telegram. Manage crypto data analysis efficiently with automated multi-source API integration.

Automate Gumroad to Beehiiv Subscriber Sync with n8n

Learn how to automatically add new Gumroad sales customers as Beehiiv newsletter subscribers using n8n automation. This workflow saves time by syncing sales data to Google Sheets CRM and notifying your Telegram channel instantly.

Generate On-Brand Blog Articles Using n8n and OpenAI

This workflow automates the creation of on-brand blog articles by analyzing existing company content using n8n and OpenAI. It extracts article structures and brand voice to produce consistent draft articles, saving significant content creation time.
1:1 Free Strategy Session
Your competitors are already automating. Are you still paying for it manually?

Do you want to adopt AI Automation?

Every hour your team does repetitive work, you're burning real money.
While you wait, faster businesses are cutting costs and moving quicker.
AI and automations aren't the future anymore — they're the present.

Book a live 1-on-1 session where we show you exactly which of your daily tasks can be automated — and what it’s costing you not to.