Goal (What we’re building)
A system where:
Main n8n workflow fails → Error workflow triggers → Claude Code auto-fixes → workflow re-runs → you get notified
So instead of “the workflow went red”
you get: “fixed + deployed + done”
What Claude can fix vs can’t fix (clear reality)
✅ Claude CAN auto-fix
- wrong data type (array vs item)
- missing/null values
- JSON parse errors
- broken expressions
- Code node bugs (JS)
- bad schema in structured output
- missing Split Out / Split In Batches logic
- wrong mapping / wrong field names
❌ Claude CAN’T auto-fix (human needed)
- expired credentials / OAuth refresh
- wrong API key
- external API down
- permission denied
- paid quota exhausted
For these, Claude will still detect the cause and tell you exactly what to do.
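A rough way to make this split mechanical is a pattern check on the error message before Claude touches anything. This is only a sketch: the pattern list and the name `needs_human` are assumptions for illustration, not something n8n or Claude Code provides, and you should tune the patterns to the messages your own nodes produce.

```python
import re

# Hypothetical patterns for problems a code patch cannot solve.
HUMAN_ONLY_PATTERNS = [
    r"\b401\b", r"\b403\b", r"unauthorized", r"forbidden",
    r"permission denied", r"invalid api key", r"token (has )?expired",
    r"quota", r"ECONNREFUSED", r"ENOTFOUND", r"\b50[234]\b",
]

def needs_human(error_message: str) -> bool:
    """Return True when the message looks like an auth, quota, or outage problem."""
    return any(
        re.search(pattern, error_message, re.IGNORECASE)
        for pattern in HUMAN_ONLY_PATTERNS
    )
```

Anything that matches goes straight to the “human action required” path; everything else is a candidate for auto-fix.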
Architecture (simple mental model)
1) Your “Main Workflow”
The production workflow that runs daily (orders, leads, emails, scraping, etc.)
2) n8n “Error Workflow”
A separate workflow that runs only when Main Workflow fails
3) Tunnel (bridge from cloud → local)
Claude Code runs locally on your laptop/server, while your n8n might be hosted online and can’t reach your localhost directly.
So we expose a local endpoint using:
- ngrok / Cloudflare tunnel / localtunnel (any one)
4) Claude Code + MCP n8n server
Claude Code can:
- read the broken workflow
- understand the failed node
- patch code / expressions
- update the workflow via API
- save it back
5) Notification (ClickUp/Slack/Email)
So you always know:
- fixed automatically or
- “human action required”
Step-by-step: Build Self-Healing n8n + Claude Code
STEP 0 — Prerequisites checklist
You need:
✅ n8n instance (cloud or self-hosted)
✅ Claude Code installed locally
✅ Node.js installed
✅ A tunnel tool (ngrok recommended)
✅ n8n API access (an API key created in your n8n settings)
STEP 1 — Create your Main workflow normally
Example: “Order Processing Pipeline”
Just make sure:
- It’s real production logic
- It can fail (because real life)
STEP 2 — Enable “Error Workflow” feature in n8n
In n8n, open your Main Workflow → Settings:
Look for the setting named:
Error Workflow
Set it to:
✅ your new workflow (we’ll create it next)
This means:
Any failed production (automatic) run of the main workflow triggers the error workflow. Manual test runs in the editor do not trigger it.
STEP 3 — Create the Error Workflow (the brain of self-healing)
Create a new workflow named:
“Self-Heal Handler”
First node:
✅ Error Trigger
This gives you data like:
- workflow name / workflow ID
- failed node name
- error message
- execution ID
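- stack trace (when available)
- input data (usually not in the trigger payload itself; fetch it through the n8n API using the execution ID if you need it)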
STEP 4 — Format the error payload cleanly
Add a node after Error Trigger:
✅ Set / Code node (your choice)
Build a clean JSON payload like:
- workflowId
- workflowName
- failedNodeName
- errorMessage
- executionId
- timestamp
- optional: node parameters / last output
Keep it minimal.
Claude doesn’t need garbage.
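Here is a minimal sketch of that mapping in Python. It assumes the Error Trigger item has `execution` and `workflow` objects shaped like the example in the n8n docs (`execution.id`, `execution.error.message`, `execution.lastNodeExecuted`, `workflow.id`, `workflow.name`); check the real output of your Error Trigger before relying on it. The function name `build_payload` is made up for illustration.

```python
from datetime import datetime, timezone

def build_payload(error_item: dict) -> dict:
    """Reduce one Error Trigger item to the minimal fields listed above."""
    execution = error_item.get("execution", {})
    workflow = error_item.get("workflow", {})
    error = execution.get("error", {})
    return {
        "workflowId": workflow.get("id"),
        "workflowName": workflow.get("name"),
        "failedNodeName": execution.get("lastNodeExecuted"),
        "errorMessage": error.get("message"),
        "executionId": execution.get("id"),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

Missing fields come out as `None` instead of crashing, so the handler itself never becomes the thing that fails.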
STEP 5 — Send the error to your local “Claude Fixer” endpoint
Now add:
✅ HTTP Request node
→ POST to your tunnel URL (example):
https://<your-tunnel>.ngrok-free.app/fix
Body: JSON payload from Step 4
This is the “n8n → Claude” handoff.
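To test the endpoint without waiting for a real failure, you can build the same POST by hand. A sketch using only the Python standard library; the `X-Fixer-Secret` header is a hypothetical shared-secret check you would have to implement on the receiving side yourself, since a tunnel URL is reachable by anyone who knows it.

```python
import json
import urllib.request

def build_fix_request(tunnel_url: str, payload: dict, secret: str) -> urllib.request.Request:
    """Build the same POST that the HTTP Request node sends to /fix."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=tunnel_url.rstrip("/") + "/fix",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Fixer-Secret": secret,  # hypothetical header, not an n8n feature
        },
    )
```

Send it with `urllib.request.urlopen(request, timeout=60)` once your local service is running.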
STEP 6 — Create the local Claude Fixer service (small API)
On your local machine/server:
Create a small Node/Python service that:
- receives POST /fix
- saves payload into a file like error.json
- triggers Claude Code with a command like:
  - “Open this workflow, patch it, save it back”
- returns response to n8n:
  - fixed / not fixed
  - what changed
  - what to do if human action needed
This is basically a “Claude runner”.
Important:
Claude Code needs a consistent prompt every time.
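A minimal sketch of such a runner in Python, standard library only. Assumptions to check on your machine: the Claude Code CLI is on PATH and `claude -p "<prompt>"` runs one non-interactive prompt and prints the reply. The port, the prompt text, and the FIXED/HUMAN reply convention are invented for this example.

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

PROMPT = (
    "Read the error report at {path}. Open that n8n workflow, patch it, "
    "save it back. Start your reply with FIXED or HUMAN, then summarize."
)

def run_claude(prompt: str) -> str:
    # Assumption: `claude -p` runs a single non-interactive prompt
    # and prints the reply to stdout.
    done = subprocess.run(
        ["claude", "-p", prompt], capture_output=True, text=True, timeout=600
    )
    return done.stdout.strip()

def handle_fix(payload: dict, workdir: Path, runner=run_claude) -> dict:
    """Save the payload to error.json, run the fixer, shape the reply for n8n."""
    error_file = workdir / "error.json"
    error_file.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    summary = runner(PROMPT.format(path=error_file))
    return {"fixed": summary.upper().startswith("FIXED"), "summary": summary}

class FixHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/fix":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(handle_fix(payload, Path.cwd())).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port: int = 8787) -> None:
    """Listen on localhost only; the tunnel forwards public traffic here."""
    HTTPServer(("127.0.0.1", port), FixHandler).serve_forever()
```

Call `serve()` to start it, then point your tunnel at the same port. Because the prompt is a fixed template, Claude gets the same instructions on every failure; `runner` is injectable so you can test the flow without calling Claude at all.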
STEP 7 — Connect Claude Code to n8n using MCP
This is what makes Claude able to ACTUALLY edit workflows.
What you need:
- n8n MCP server installed (or whichever connector you’re using)
- Claude Code configured with MCP tools
Claude should be able to do:
- fetch workflow JSON by ID
- update workflow JSON
- activate workflow (optional)
- test workflow (optional)
Without MCP, Claude will only “suggest fixes” like a blog post.
With MCP, Claude will apply fixes.
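Under the hood, “fetch workflow” and “update workflow” usually come down to two calls against the n8n public REST API. The sketch below assumes your MCP server wraps `GET` and `PUT` on `/api/v1/workflows/{id}` with an `X-N8N-API-KEY` header; tool names and behavior differ between MCP servers, and `workflow_request` is just an illustrative helper.

```python
import json
import urllib.request

def workflow_request(base_url, api_key, workflow_id, updated=None):
    """Build a GET (fetch) or PUT (update) request for one workflow."""
    url = f"{base_url.rstrip('/')}/api/v1/workflows/{workflow_id}"
    headers = {"X-N8N-API-KEY": api_key, "Accept": "application/json"}
    if updated is None:
        return urllib.request.Request(url, headers=headers, method="GET")
    headers["Content-Type"] = "application/json"
    data = json.dumps(updated).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")
```

Running the GET by hand once is a quick way to confirm your API key works before you blame the MCP setup.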
STEP 8 — Give Claude a strict “Fixing Playbook” prompt
This is the most important part.
Your Claude Code system prompt should enforce:
Fixing rules
- Identify root cause (don’t guess)
- Apply the smallest safe change
- Prefer fixing upstream node (better than patching later)
- Add guards:
  - empty array handling
  - always output data
  - fallback defaults
- If auth/credential issue → STOP and request human action
- After fix → save workflow → re-run last failed execution if possible
- Return summary + diff of changes
You want Claude behaving like a production engineer, not “helpful chatbot”.
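One way to keep the playbook identical on every run is to generate the prompt from a fixed rule list plus the error payload. A sketch; the wording of the rules follows the list above, and `build_prompt` is a made-up helper name.

```python
import json

FIXING_RULES = [
    "Identify the root cause from the error and the workflow JSON. Do not guess.",
    "Apply the smallest safe change.",
    "Prefer fixing the upstream node over patching downstream.",
    "Add guards: empty-array handling, always output data, fallback defaults.",
    "If the cause is auth or credentials, STOP and request human action.",
    "After the fix, save the workflow and re-run the failed execution if possible.",
    "Return a summary and a diff of the changes.",
]

def build_prompt(payload: dict) -> str:
    """Combine the fixed rules with one error report into a single prompt."""
    rules = "\n".join(f"{n}. {rule}" for n, rule in enumerate(FIXING_RULES, start=1))
    report = json.dumps(payload, indent=2, sort_keys=True)
    return (
        "You are fixing a failed n8n workflow.\n\n"
        f"Rules:\n{rules}\n\n"
        f"Error report:\n{report}\n"
    )
```

Keeping the rules in one list means a change to the playbook is a one-line edit you can review and version.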
STEP 9 — Auto re-run the workflow after fix
Back inside the Error Workflow:
After the HTTP Request returns “fixed”:
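Trigger the main workflow again using one of:
✅ Execute Workflow node (the main workflow needs a matching sub-workflow trigger)
or
✅ Webhook trigger on the main workflow
or
✅ n8n API call, if your n8n version exposes a way to run or retry an execution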
This makes it self-heal + self-resume.
STEP 10 — Notifications (must-have)
Add a final node:
If fixed:
Send message:
✅ “Fixed automatically”
Include:
- workflow name
- node fixed
- error reason
- what changed
If not fixed:
Send message:
⚠️ “User action required”
Include:
- exactly what to update (credential, key, permissions)
Use:
- Slack / Email / ClickUp / Telegram / ntfy, whichever you check daily.
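Both messages can come from one small formatter. A sketch, assuming the payload uses the field names from Step 4 and the fixer reply has hypothetical `fixed` and `summary` fields; adapt the keys to whatever your service really returns.

```python
def build_notification(payload: dict, result: dict) -> str:
    """Format one chat message from the error payload and the fixer's reply."""
    name = payload.get("workflowName", "unknown workflow")
    node = payload.get("failedNodeName", "unknown node")
    reason = payload.get("errorMessage", "no error message")
    if result.get("fixed"):
        lines = [
            "✅ Fixed automatically",
            f"Workflow: {name}",
            f"Node fixed: {node}",
            f"Error reason: {reason}",
            f"What changed: {result.get('summary', 'no summary returned')}",
        ]
    else:
        lines = [
            "⚠️ User action required",
            f"Workflow: {name}",
            f"Node: {node}",
            f"Error reason: {reason}",
            f"What to update: {result.get('summary', 'see the execution log')}",
        ]
    return "\n".join(lines)
```

The returned string is plain text, so it can be dropped into any chat or email node as the message body.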
Practical examples of fixes Claude should do
Example 1: Array vs item mismatch
Error: “Expected item but received array”
Fix options:
- add “Split Out”
- OR better: change Code node to return items properly
Claude should pick the cleanest fix.
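The “return items properly” option usually means normalizing whatever the node produced into n8n’s item shape: a list of objects that each carry their data under a `json` key. A sketch in Python; `to_items` is an illustrative helper, and the same idea applies if your Code node is written in JavaScript.

```python
def to_items(data):
    """Wrap raw data in n8n's item shape: a list of {"json": {...}} objects."""
    if data is None:
        return []  # guard: nothing in, nothing out, no crash
    records = data if isinstance(data, list) else [data]
    items = []
    for record in records:
        if isinstance(record, dict) and "json" in record:
            items.append(record)  # already looks like an item
        elif isinstance(record, dict):
            items.append({"json": record})
        else:
            items.append({"json": {"value": record}})  # wrap bare values
    return items
```

With this in place, the node behaves the same whether upstream hands it one object, a list, or nothing at all.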
Example 2: Structured output parser broken JSON
Error: schema invalid / missing commas
Fix: correct schema JSON
Example 3: User input breaks JSON body
Error: invalid JSON because the user’s text contains unescaped quotes
Fix: build the body with a JSON serializer so strings are escaped, not with string concatenation
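The difference is easy to see side by side. In this sketch the `comment` field and both function names are invented for illustration: the first function glues user text into a JSON string by hand, the second lets the serializer do the escaping.

```python
import json

def unsafe_body(comment: str) -> str:
    # Breaks as soon as the comment contains a double quote or a newline.
    return '{"comment": "' + comment + '"}'

def safe_body(comment: str) -> str:
    # json.dumps escapes quotes, backslashes, and control characters.
    return json.dumps({"comment": comment})
```

In n8n terms, the fix is the same idea: pass an object to the HTTP Request node and let it serialize, instead of typing JSON around an expression.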
Example 4: Rate limit
Error: 429
Claude can:
- add a Wait node between calls
- retry with exponential backoff
- throttle requests with Split In Batches
(advanced but possible)
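Retry with backoff is the piece most worth understanding, so here is a sketch of the logic. `RateLimited` is a hypothetical exception standing in for “the API answered 429”; the delays double on each failed attempt (1s, 2s, 4s with the defaults).

```python
import time

class RateLimited(Exception):
    """Hypothetical error raised by the caller's function on HTTP 429."""

def retry_with_backoff(call, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Run call(); on RateLimited wait base_delay * 2**attempt, then retry."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the error surface
            sleep(base_delay * 2 ** attempt)
```

Inside n8n the same effect usually comes from node-level retry settings or a Wait node in a loop; the code only shows what “backoff” means.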
Best practices (so you don’t build a fragile “self-heal”)
1) Keep a “Dev workflow” copy
Don’t let Claude patch production without control.
Best flow:
Fix in Dev → promote to Prod.
2) Log every patch
Write patch summaries to:
- Google Sheets
- Supabase
- Notion
- GitHub commits
3) Add safety limits
Example:
- max 2 auto-fixes per hour
- if repeated failures → stop and alert
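A limit like “max 2 auto-fixes per hour” can be enforced in the local fixer before Claude is called at all. A sketch of a sliding-window limiter; counting per workflow ID is an assumption made here, and `FixLimiter` is a made-up name.

```python
import time
from collections import deque

class FixLimiter:
    """Allow at most max_fixes auto-fixes per workflow within window_seconds."""

    def __init__(self, max_fixes=2, window_seconds=3600, clock=time.monotonic):
        self.max_fixes = max_fixes
        self.window = window_seconds
        self.clock = clock
        self.history = {}  # workflow ID -> timestamps of recent fixes

    def allow(self, workflow_id: str) -> bool:
        now = self.clock()
        stamps = self.history.setdefault(workflow_id, deque())
        # Forget fixes that have fallen out of the time window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.max_fixes:
            return False  # over the limit: stop and alert a human
        stamps.append(now)
        return True
```

When `allow()` returns False, skip the auto-fix and send the “human action required” notification instead, so a workflow that keeps breaking can’t trigger an endless patch loop.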
4) Add “Human approval mode”
For enterprise clients:
Claude prepares fix → you approve → then it deploys.
Quick checklist (what your final system should do)
✅ Workflow fails
✅ Error Trigger fires
✅ Payload sent to Claude
✅ Claude finds root cause
✅ Claude patches workflow in n8n
✅ Workflow re-runs
✅ You get message: fixed OR action required
✅ All changes logged

