Automating Funding News Extraction & Research with an n8n Workflow

This n8n workflow automates the extraction of detailed startup funding news from TechCrunch and VentureBeat and adds AI-driven company research, saving analysts hours of manual work. It filters, parses, enriches, and stores the data in Airtable for easy tracking.
Workflow Identifier: 1914
NODES in Use: Manual Trigger, HTTP Request, XML, Split Out, Filter, HTML, Merge, chainLlm, lmChatAnthropic, outputParserStructured, outputParserAutofixing, Airtable, lmChatOpenRouter, informationExtractor, Execute Workflow, Execute Workflow Trigger, Set, Sticky Note

What this workflow does

This workflow collects startup funding news from TechCrunch and VentureBeat sitemaps.
It saves analysts time by automating the extraction and research of funding details.
It outputs clean, structured funding data stored in Airtable.

The workflow fetches XML sitemaps, finds articles with funding keywords, scrapes article content, and uses AI to pull structured company data.
It also enriches data with company research and merges info into a final record.


Who should use this workflow

This workflow is great for market analysts tracking startup funding news.
Anyone who wants to save hours of manual reading and data entry benefits from it.

Users without deep programming skills can run it inside n8n with minimal setup.
Business intelligence teams and reporting specialists also gain from structured, up-to-date funding records.


Tools and services used

  • n8n Automation Platform: Runs the workflow on n8n Cloud or on a self-hosted n8n instance.
  • HTTP Request Nodes: Fetch sitemaps and article HTML pages.
  • XML Parse Nodes: Convert sitemap XML to JSON.
  • Split Out Nodes: Separate individual article URLs.
  • Filter Nodes: Select only articles mentioning funding keywords.
  • HTML Parser Nodes: Extract article titles and body text.
  • LangChain AI Nodes: Use AI models like Anthropic Claude 3.5 and Perplexity LLaMA to extract structured company data and perform research.
  • Airtable: Stores the final structured datasets for easy access.
  • API Keys: Needed for Airtable, OpenRouter, Anthropic, and Perplexity.

How this workflow works: Inputs, Process, and Output

Inputs

  • Latest sitemap XML URLs from TechCrunch and VentureBeat.
  • Article URLs found in sitemaps.
  • A funding keyword filter such as “raise”.
  • API Keys for AI models and Airtable.

Processing Steps

  1. Fetch sitemap XMLs using HTTP Request nodes.
  2. Convert XML to JSON to list article URLs via XML Parse.
  3. Split JSON lists into single article URLs.
  4. Filter URLs and titles containing funding keywords.
  5. Download full article HTML pages.
  6. Extract clean text for titles and article body using HTML Parser.
  7. Merge articles from both sources into a single stream.
  8. Use AI (LangChain nodes) to parse the unstructured text into detailed JSON data with company name, funding round, amount, investors, valuation, and URLs.
  9. Run an AI-based data cleaner that auto-fixes JSON output.
  10. Query AI to find company websites to enrich profiles.
  11. Prepare final JSON data combining all extracted and researched fields.
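
Steps 1 to 4 can be sketched outside n8n in a few lines of Python. This is only an illustration of the logic, not the workflow's own implementation: the sample sitemap, the example.com URLs, and the function names are assumptions, and inside n8n the HTTP Request, XML, Split Out, and Filter nodes do this work.

```python
import xml.etree.ElementTree as ET

# Namespace used by sitemap files that follow the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample standing in for a fetched sitemap response.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/2024/01/05/acme-raises-20m-series-a/</loc></url>
  <url><loc>https://example.com/2024/01/05/new-phone-review/</loc></url>
</urlset>"""


def extract_urls(sitemap_xml: str) -> list[str]:
    """Parse the sitemap and return one URL per <url><loc> entry."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


def filter_by_keyword(urls: list[str], keyword: str = "raise") -> list[str]:
    """Keep only URLs whose slug contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [url for url in urls if needle in url.lower()]


if __name__ == "__main__":
    for url in filter_by_keyword(extract_urls(SAMPLE_SITEMAP)):
        print(url)
```

Filtering on the URL slug before downloading any article keeps the number of page fetches and AI calls low, which is the same reason the workflow filters early.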

Output

The data is stored as records in Airtable.
Users get a clean, structured table with funding rounds, investors, amounts, companies, and detailed research ready to use.
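
As a rough sketch of how a final record could be assembled before it is written to Airtable, the snippet below merges extracted and researched fields into one payload. The field names (Company, Round, Website) and the rule that non-empty article values override research values are assumptions made for illustration. The `records`/`fields` wrapper follows the shape Airtable's REST API uses for record creation; in n8n the Airtable node builds this request for you.

```python
import json


def build_airtable_payload(extracted: dict, research: dict) -> dict:
    """Merge article data with research data into one Airtable-style record.

    Assumption: values extracted from the article win over research values,
    unless the extracted value is empty.
    """
    fields = dict(research)
    for key, value in extracted.items():
        if value not in (None, "", []):
            fields[key] = value
    return {"records": [{"fields": fields}]}


if __name__ == "__main__":
    # Hypothetical inputs; real field names depend on your Airtable table.
    extracted = {"Company": "Acme", "Round": "Series A", "Website": ""}
    research = {"Website": "https://acme.example", "Round": "Seed"}
    print(json.dumps(build_airtable_payload(extracted, research), indent=2))
```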


Beginner step-by-step: How to use this workflow in n8n

Importing the Workflow

  1. Click the Download button provided on the workflow page to save the workflow file.
  2. Open n8n editor where you want to run the workflow.
  3. Use the option “Import from File” in n8n and select the downloaded workflow file.

Configuring API Keys and IDs

  1. Add your Airtable credential (API key or personal access token) in the Airtable node.
  2. Add API keys for the OpenRouter, Anthropic, and Perplexity models in the respective LangChain AI nodes.
  3. Update the Airtable Base IDs and Table Names to match your account.

Testing and Activation

  1. Trigger the workflow manually once to test data flow and API responses.
  2. Check intermediate node outputs to confirm data is correct.
  3. Fix any errors like missing fields or API issues.
  4. For production, replace the Manual Trigger with a Schedule Trigger (or another trigger) and activate the workflow.

This method lets beginners run the whole sequence without building it from scratch.
The steps are import, configure, test, and activate, all inside the n8n editor.


Customization ideas

  • Change funding keyword filters to include terms like “closed Series” or “funded”.
  • Add other tech news sources by fetching additional sitemaps with the same parsing flow.
  • Extend the AI extraction prompt to pull CEO name, employee count, or product launches.
  • Replace Airtable with tools like Google Sheets or Salesforce for data storage.
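
To illustrate the first idea, here is a minimal sketch of a broader keyword filter written in Python. The term list and function names are assumptions, not part of the workflow; in n8n the same effect would come from adding conditions to the Filter node. Word boundaries are used so that “raise” does not also match unrelated words such as “praise”.

```python
import re

# Hypothetical term list; adjust to the wording your sources use.
FUNDING_TERMS = ["raise", "raises", "raised", "funded", "closed series"]


def build_pattern(terms: list[str]) -> re.Pattern:
    """Compile one case-insensitive pattern that matches any whole term.

    Spaces inside a term also match hyphens, because URL slugs
    usually write "closed series" as "closed-series".
    """
    alternatives = []
    for term in terms:
        words = [re.escape(word) for word in term.split()]
        alternatives.append(r"[\s-]+".join(words))
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


FUNDING_PATTERN = build_pattern(FUNDING_TERMS)


def mentions_funding(text: str) -> bool:
    """Return True when a title or URL contains any funding term."""
    return FUNDING_PATTERN.search(text) is not None


if __name__ == "__main__":
    print(mentions_funding("https://example.com/acme-raises-20m"))
    print(mentions_funding("https://example.com/praise-for-new-phone"))
```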

Troubleshooting common issues

  • Problem: No articles pass the funding keyword filter.
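    Cause: The keyword's case or spelling does not match the article URLs or titles.
    Fix: Try a broader keyword or check the Filter node conditions, including case sensitivity.
  • Problem: AI nodes output malformed JSON.
    Cause: The model sometimes returns invalid or incomplete JSON.
    Fix: Ensure the Auto-fixing Output Parser node is connected and active.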
  • Problem: Airtable node record creation fails.
    Cause: Wrong API key or base/table info.
    Fix: Confirm all credentials and table IDs.
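
When debugging malformed AI output, it can help to check a raw response by hand. The sketch below is a simpler, local technique and not how the Auto-fixing Output Parser works: it only trims text around the first JSON object and checks for required fields. The field names are hypothetical; use the ones from your own output schema.

```python
import json

# Hypothetical required fields; match these to your structured output schema.
REQUIRED_FIELDS = ("company_name", "funding_round", "amount")


def parse_model_json(raw: str) -> dict:
    """Extract the outermost JSON object from a model reply and validate it.

    Raises ValueError when no object is found, the JSON is invalid,
    or a required field is missing.
    """
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    data = json.loads(raw[start:end + 1])  # JSONDecodeError is a ValueError
    missing = [name for name in REQUIRED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


if __name__ == "__main__":
    reply = (
        "Here is the result:\n"
        '{"company_name": "Acme", "funding_round": "Series A", "amount": "$20M"}\n'
        "Hope this helps."
    )
    print(parse_model_json(reply))
```

A reply that fails this check is a good candidate to inspect in the node's execution output before changing the prompt.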

Pre-production checklist

  • Verify all API Keys for Airtable, OpenRouter, Anthropic, and Perplexity.
  • Manually open sitemap URLs in a browser to confirm access.
  • Run the workflow manually and review the node outputs.
  • Confirm CSS selectors in HTML parser nodes still capture titles and article body.
  • Test the deep research sub-workflow's inputs and outputs separately.
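
The selector check can be approximated with a small script that reports when a page no longer yields a title or body. This is a sketch under assumptions: the `h1` and `p` tags stand in for whatever CSS selectors your HTML nodes use, and the 20-character minimum is an arbitrary threshold.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collect text inside <h1> (title) and <p> (body) elements."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.body_parts = []
        self._current = None  # tag currently being captured, if any

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p"):
            self._current = tag

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "h1":
            self.title_parts.append(data)
        elif self._current == "p":
            self.body_parts.append(data)


def check_extraction(html: str, min_body_chars: int = 20) -> list[str]:
    """Return a list of problems; an empty list means the page looks fine."""
    parser = ArticleExtractor()
    parser.feed(html)
    title = " ".join(" ".join(parser.title_parts).split())
    body = " ".join(" ".join(parser.body_parts).split())
    problems = []
    if not title:
        problems.append("title is empty")
    if len(body) < min_body_chars:
        problems.append("body is shorter than expected")
    return problems


if __name__ == "__main__":
    page = "<h1>Acme raises $20M</h1><p>Acme has raised $20M in a Series A round.</p>"
    print(check_extraction(page))
```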

Deploying to production

  • Add a Schedule Trigger in n8n to run the workflow regularly, for example daily.
  • Monitor execution logs and data records in Airtable.
  • Back up Airtable data or connect with reporting dashboards.

Summary of benefits and output

✓ Saves hours or days of manual news reading and data entry.

✓ Filters and extracts only relevant funding news automatically.

✓ Uses AI to create clean, structured data on companies and funding.

✓ Enriches data with extra company research for smarter reporting.

→ Final output is organized funding data stored in Airtable, ready to use in reports or dashboards.


Frequently Asked Questions

Other AI models can be used: the LangChain nodes support multiple models. Update the API keys and adjust the prompts accordingly.
API consumption depends on how often the workflow runs and on the number of articles processed. Filtering early reduces AI calls.
Failures usually come from invalid API keys or wrong Airtable base and table selections. Double-check credentials and IDs.
To get started, import the workflow file via “Import from File” in the n8n editor. Then add API keys and update IDs if needed. Test once and activate for production.
