Automate PII Removal from Google Drive CSVs with n8n & OpenAI

Struggling with manually finding and deleting sensitive PII from CSV files stored in Google Drive? This n8n workflow detects new CSV uploads, uses OpenAI to pinpoint PII columns, removes them, and saves sanitized files back to your drive automatically, saving hours and securing your data.
googleDriveTrigger
openAi
code
+6
Learn how to Build this Workflow with AI:
Workflow Identifier: 2266
NODES in Use: Google Drive Trigger, Google Drive, Extract from File, OpenAI, Merge, Upload to Drive, Split Out, Code, Sticky Note

Press CTRL+F5 if the workflow didn't load.

Visit through Desktop for Best experience

1. Opening Problem Statement

Meet Sarah, a compliance officer at a mid-sized marketing firm. Every day, Sarah receives multiple CSV reports containing customer data uploaded to a shared Google Drive folder. Her task? Identify and delete columns with sensitive personally identifiable information (PII) before forwarding these sanitized reports to external partners. This manual process takes her nearly 3 hours daily and is prone to human error, risking leaks of sensitive data that could lead to compliance violations and hefty fines.

Imagine the tediousness of opening each CSV, scanning dozens of columns, deleting name, email, phone, or other PII data, and saving the file back — all repeatedly. Meanwhile, sensitive data risks accidental exposure. Sarah’s struggle is not unique; many teams handling files with PII face similar challenges daily.

2. What This Automation Does

This n8n workflow automates Sarah’s painful manual process with precision and speed, triggered immediately when a new CSV file appears in a specified Google Drive folder. Here’s exactly what happens:

  • Detects new CSV files uploaded to your Google Drive folder and triggers the workflow instantly.
  • Downloads the CSV file from Google Drive for processing.
  • Extracts and sends tabular headers to an OpenAI model to identify which columns contain PII.
  • Removes those PII columns from the CSV data using a custom Code node.
  • Generates a new sanitized CSV file with PII removed, appending “_PII_removed” to the filename.
  • Uploads the cleaned file back into a dedicated Google Drive folder for processed files.

By automating this, Sarah can save 3+ hours of tedious work daily, reduce errors, and guarantee compliance without lifting a finger.

3. Prerequisites ⚙️

  • n8n account (cloud or self-hosted) 🔌
  • Google Drive account with API access enabled 📁🔑
  • OpenAI account with API keys configured 🔐
  • Basic familiarity with CSV files and folder structure in Google Drive 📊

For those interested in self-hosting n8n, Hostinger offers an easy solution: host n8n yourself with Hostinger.

4. Step-by-Step Guide to Build This Workflow

Step 1: Set up Google Drive Trigger to Watch Folder

In n8n, add the Google Drive Trigger node.
Navigate to the node settings and authenticate your Google Drive account.
Set the trigger event to “fileCreated”.
Choose “Trigger On Specific Folder” and select or enter the folder ID of your Drive folder to monitor.
Set polling to every minute for near real-time processing.
You should see your folder’s ID and name verified.
Common mistake: Forgetting to set the correct folder ID or permissions may prevent triggering.

Step 2: Download the Uploaded File from Google Drive

Add a Google Drive node and set the operation to download.
Map the fileId to the ID received from the trigger node ({{ $json.id }}).
Set the binaryPropertyName to “data” for file content.
Ensure the same Google Drive credentials are used.
This node fetches the actual CSV file for processing.

Step 3: Extract Tabular Data from the File

Add the Extract from File node.
Configure it to extract plain CSV content.
This node outputs the parsed tabular data and headers that will be analyzed.
No special input required besides the downloaded file representation.

Step 4: Use OpenAI to Identify PII Columns

Insert the OpenAI (LangChain) node.
Configure it with your OpenAI credentials.
Set the model to “gpt-4o-mini” for a balance of performance and speed.
In the messages section, provide a system instruction asking to analyze the provided tabular headers and identify only the PII-containing columns.
Use expressions referencing the CSV headers extracted:

{"role":"system","content":"Analyze the provided tabular data and identify the columns that contain personally identifiable information (PII). Return only the column names that contain PII, separated by commas."}

and

{"content":"=Here is some tabular data with column headers and example rows. Headers: {{Object.keys($json)}} Example Row 1: {{Object.values($json)}}"}

This instructs OpenAI to return a simple comma-separated list of PII columns.

Expected output is JSON with a text field listing the columns.

Step 5: Merge OpenAI Output with Filename and Data

Add two Split Out nodes to extract the filename and the OpenAI response separately.
Then merge these with the main data stream using a Merge node configured for 3 inputs.
This prepares consolidated data for PII removal.

Step 6: Remove the PII Columns via Code Node

Add a Code node named “Remove PII columns”.
Use this JavaScript code:

// Input: All items from the previous node
const input = $input.all();

// Step 1: Extract the PII column names from the first item
const firstItem = input[0];
if (!firstItem.json.data || !firstItem.json.data) {
 throw new Error("PII column names are missing in the input data.");
}
const piiColumns = firstItem.json.data.split(',').map(col => col.trim());

// Step 2: Remove the first two items and process the rest
let rows = input.slice(2).map(item => item.json);

if (rows.length === 0) {
 throw new Error("No rows to convert to CSV.");
}

// Step 3: Remove PII columns from each row
const sanitizedRows = rows.map(row => {
 const sanitizedRow = { ...row };
 piiColumns.forEach(column => delete sanitizedRow[column]);
 return sanitizedRow;
});

// Step 4: Extract headers from sanitized rows
const headers = Object.keys(sanitizedRows[0]);

// Step 5: Convert rows to CSV
const csvRows = [
 headers.join(','),
 ...sanitizedRows.map(row => headers.map(header => String(row[header] || '').replace(/,/g, '')).join(','))
];

const csvContent = csvRows.join('n');

const originalFileName = input[1].json.originalFilename;
const fileExtension = originalFileName.split('.').pop();
const baseName = originalFileName.replace(`.${fileExtension}`, '');
const newFileName = `${baseName}_PII_removed.${fileExtension}`;

return [ { json: { fileName: newFileName, content: csvContent } } ];

This code removes sensitive columns and prepares a new CSV content string and filename.

Step 7: Upload Sanitized File Back to Google Drive

Finish by adding another Google Drive node.
Set operation to “createFromText”.
Map the new filename and content from the Code node output.
Specify the destination folder ID where sanitized files should be saved.
Authenticate with your Google Drive account.
Run a test upload; the file should appear in your target folder.

5. Customizations ✏️

  • Change Folder to Monitor: In the Google Drive Trigger node, update the folder ID to any folder you want watched.
  • Adjust OpenAI Model: Swap the OpenAI model from “gpt-4o-mini” to a larger or cheaper model as per your API limits.
  • File Type Filters: Add conditional logic before processing to restrict only CSV files by checking the file extension in the Google Drive Trigger output.
  • Additional Sanitization: Enhance the Code node to mask rather than remove PII columns if needed, by replacing values instead of deleting fields.
  • Notification Integration: Add a notification node (Slack, email) after upload to alert your team when files are sanitized.

6. Troubleshooting 🔧

  • Problem: “PII column names are missing in the input data.”
    Cause: The OpenAI node did not return expected PII columns.
    Solution: Check if OpenAI prompt inputs match your CSV column headers and ensure the OpenAI API key is valid and has usage quota.
  • Problem: “No rows to convert to CSV.”
    Cause: The CSV file might be empty or the Extract from File node failed.
    Solution: Verify the CSV file is properly formatted and test the Extract from File node separately.
  • Problem: Google Drive upload fails.
    Cause: Incorrect folder ID or insufficient permissions.
    Solution: Double-check folder IDs and that your Google Drive API credentials have write access.

7. Pre-Production Checklist ✅

  • Confirm Google Drive folder ID to watch is correct and accessible.
  • Test OpenAI node independently with sample headers to verify PII detection.
  • Upload a test CSV via Google Drive to trigger the workflow end-to-end.
  • Ensure sanitized file appears in the specified output Drive folder with proper naming.
  • Backup original CSV files regularly before running automated processes.

8. Deployment Guide

After composing this workflow in n8n, activate the workflow toggle to enable it.
Ensure your n8n instance remains online—cloud instances usually handle this automatically.
Monitor initial runs for errors in the n8n executions window.
Adjust logging or add notification nodes if error alerts are needed.
This workflow scales with file frequency but monitor API quota usage with OpenAI and Google Drive.

9. FAQs

  • Can I use other AI models instead of OpenAI?
    Yes, any AI capable of analyzing tabular data for PII can be integrated by replacing the OpenAI node and adjusting the prompt.
  • Does this workflow consume many API credits?
    It makes one OpenAI call per file; costs depend on your file volume and model used.
  • Is my data safe?
    All file processing happens within your n8n instance securely; ensure your API keys remain secure and use encrypted credentials.
  • Can this handle hundreds of files daily?
    Depending on your API limits and n8n hosting, it can scale well but may require quota monitoring.

10. Conclusion

By following this detailed tutorial, you automated the detection and removal of PII from CSV files uploaded to Google Drive, freeing Sarah (and you) from hours of manual, error-prone work daily.

This workflow enhances data compliance, reduces risk, and expedites file processing, potentially saving multiple workdays each month.

Next, consider automating email notifications for processed files, integrating Slack alerts, or expanding to handle other file formats like Excel.

Get started today, and secure your data processing with smart automation!

Related Workflows

Automate Viral UGC Video Creation Using n8n + Degaus (Beginner-Friendly Guide)

Learn how to automate viral UGC video creation using n8n, AI prompts, and Degaus. This beginner-friendly guide shows how to import, configure, and run the workflow without technical complexity.
Form Trigger
Google Sheets
Gmail
+37
Free

AI SEO Blog Writer Automation in n8n (Beginner Guide)

A complete beginner guide to building an AI-powered SEO blog writer automation using n8n.
AI Agent
Google Sheets
httpRequest
+5
Free

Automate CrowdStrike Alerts with VirusTotal, Jira & Slack

This workflow automates processing of CrowdStrike detections by enriching threat data via VirusTotal, creating Jira tickets for incident tracking, and notifying teams on Slack for quick response. Save hours daily by transforming complex threat data into actionable alerts effortlessly.
scheduleTrigger
httpRequest
jira
+5
Free

Automate Telegram Invoices to Notion with AI Summaries & Reports

Save hours on financial tracking by automating invoice extraction from Telegram photos to Notion using Google Gemini AI. This workflow extracts data, records transactions, and generates detailed spending reports with charts sent on schedule via Telegram.
lmChatGoogleGemini
telegramTrigger
notion
+9
Free

Automate Email Replies with n8n and AI-Powered Summarization

Save hours managing your inbox with this n8n workflow that uses IMAP email triggers, AI summarization, and vector search to draft concise replies requiring minimal review. Automate business email processing efficiently with AI guidance and Gmail integration.
emailReadImap
vectorStoreQdrant
emailSend
+12
Free

Automate Email Campaigns Using n8n with Gmail & Google Sheets

This n8n workflow automates personalized email outreach campaigns by integrating Gmail and Google Sheets, saving hours of manual follow-up work and reducing errors in email sequences. It ensures timely follow-ups based on previous email interactions, optimizing communication efficiency.
googleSheets
gmail
code
+5
Free