Automate Company Story Generation with LinkedIn, Bright Data & Google Gemini

Discover how to automatically extract company data from LinkedIn using Bright Data’s API and generate compelling company stories with Google Gemini AI. Save hours on research and storytelling with this unique n8n automation workflow.
Workflow Identifier: 2257
NODES in Use: ManualTrigger, Google Gemini Chat Model, Default Data Loader, Recursive Character Text Splitter, If, Set Snapshot Id, HTTP Request, Set LinkedIn URL, LinkedIn Data Extractor, Concise Summary Generator, Webhook Notifier for Data Extractor, Webhook Notifier for Summary Generator, Wait



Opening Problem Statement

Meet Sarah, a content marketer at a busy HR tech firm. Every day, she needs to create engaging, accurate stories about companies on LinkedIn to support her team's recruitment marketing. However, manually scraping LinkedIn data, filtering out errors, and crafting concise company stories takes hours and often produces inconsistent results. Wasted time accumulates, opportunities slip by, and the marketing team's output suffers.

This is a specific problem: LinkedIn data changes frequently, scraping it involves legal and technical hurdles, and turning raw data into engaging narratives demands AI capabilities. Sarah needs a tailored solution that automates the entire workflow, from data extraction to intelligent summary generation, without manual intervention.

What This Automation Does

This unique n8n workflow tackles Sarah’s challenge by integrating Bright Data’s web scraping API and Google Gemini AI models to craft company stories effortlessly. When run, this workflow:

  • Triggers a LinkedIn company data scrape using Bright Data’s snapshot API based on a configured company URL.
  • Monitors the scraping progress and waits for completion automatically, reducing manual polling.
  • Downloads the scraped JSON snapshot of company data once it is ready, so processing always works from fresh data.
  • Uses n8n’s LangChain nodes to intelligently extract structured information from the raw LinkedIn JSON data.
  • Employs Google Gemini’s advanced AI models to convert extracted data into a comprehensive company story.
  • Generates a concise summary from the detailed story using advanced summarization chains, improving readability.
  • Automatically sends the detailed story and summary to configured webhook endpoints for downstream applications or notifications.

By automating these steps, the workflow saves Sarah and her team several hours each week, eliminates scraping errors through automated checks, and ensures company stories are consistently high-quality and ready for use.

Prerequisites ⚙️

  • n8n account (cloud or self-hosted) 🔌
  • Bright Data API account with access to datasets API for LinkedIn scraping 🔑
  • Google PaLM (Google Gemini) API credentials for access to Gemini chat models 🔑
  • Webhook URL for receiving story and summary notifications (e.g., webhook.site) 📡
  • Basic knowledge of LinkedIn company URLs to customize the scraper input 🌐

Step-by-Step Guide

1. Start with the Manual Trigger Node

Navigate in n8n Editor to the ManualTrigger node labeled “When clicking ‘Test workflow’”. This node manually starts the workflow for testing and development. No parameters are needed here—simply click “Execute Workflow” in the editor to begin the process.

Expected outcome: Workflow starts and passes control to the next node to set the LinkedIn URL.

Common mistake: Forgetting to click ‘Execute Workflow’ means nothing starts.

2. Set the LinkedIn Company URL

Next, the Set LinkedIn URL node assigns the target LinkedIn company page URL to scrape. Navigate: Click the node named Set LinkedIn URL. Enter a field called url with a value like https://il.linkedin.com/company/bright-data.

This URL directly affects what company data is pulled from LinkedIn.

Expected outcome: The URL is stored in the workflow’s JSON payload for subsequent requests.

Common mistake: Using an incorrect or private LinkedIn URL causes failed scraping.

3. Trigger LinkedIn Data Scraping with Bright Data

The Perform LinkedIn Web Request node sends a POST request to Bright Data’s dataset trigger endpoint to start scraping.

  • URL: https://api.brightdata.com/datasets/v3/trigger
  • Method: POST
  • Body: JSON array including the LinkedIn URL field from previous step.
  • Query parameters: dataset_id=gd_l1vikfnt1wgvvqz95w (specific Bright Data dataset for LinkedIn company data) and include_errors=true.
  • Authentication: Header Auth with your Bright Data API key.

Expected outcome: A snapshot ID is returned indicating the scraping job has started.

Common mistake: Incorrect dataset ID or invalid credentials causing HTTP 401/403 errors.
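Outside n8n, the same trigger call can be sketched in Python. The endpoint, dataset ID, and query parameters come from the node configuration above; the API key is a placeholder, and the `snapshot_id` response field is how Bright Data typically identifies the job:

```python
import json
import urllib.parse
import urllib.request

BRIGHTDATA_API_KEY = "YOUR_BRIGHT_DATA_API_KEY"  # placeholder: use your real key
DATASET_ID = "gd_l1vikfnt1wgvvqz95w"  # LinkedIn company dataset from the workflow

def build_trigger_request(company_url: str) -> dict:
    """Assemble the endpoint, query string, body, and headers for the trigger call."""
    params = urllib.parse.urlencode(
        {"dataset_id": DATASET_ID, "include_errors": "true"}
    )
    return {
        "url": f"https://api.brightdata.com/datasets/v3/trigger?{params}",
        "body": [{"url": company_url}],  # Bright Data expects a JSON array of inputs
        "headers": {
            "Authorization": f"Bearer {BRIGHTDATA_API_KEY}",
            "Content-Type": "application/json",
        },
    }

def trigger_scrape(company_url: str) -> str:
    """POST the trigger; the response JSON carries the snapshot ID."""
    req = build_trigger_request(company_url)
    http_req = urllib.request.Request(
        req["url"],
        data=json.dumps(req["body"]).encode(),
        headers=req["headers"],
        method="POST",
    )
    with urllib.request.urlopen(http_req) as resp:
        return json.load(resp)["snapshot_id"]
```

A wrong dataset ID or key shows up here exactly as in n8n: the server answers 401/403 and `urlopen` raises an `HTTPError`.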

4. Store the Snapshot ID

The Set Snapshot Id node captures the snapshot ID from the previous response and assigns it as snapshot_id for future API calls.

Expected outcome: Snapshot ID is stored in workflow context for polling.

Common mistake: Failing to map the snapshot ID properly causes all subsequent steps to fail.

5. Poll the Scraping Job Status

The Check Snapshot Status node performs GET requests on Bright Data’s progress API endpoint https://api.brightdata.com/datasets/v3/progress/{{ $json.snapshot_id }}.

If the status is not ready, the workflow loops into the Wait for 30 seconds node to pause execution before rechecking.

Expected outcome: Automatic wait/retry ensures the workflow only proceeds after data is ready.

Common mistake: Not configuring the condition to detect the “ready” status leads to an infinite loop or premature requests.
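The If/Wait pair boils down to a standard polling pattern. A minimal sketch, with the status checker passed in as a callable (in practice it would GET the progress endpoint shown above):

```python
import time

def poll_until_ready(get_status, interval=30, max_attempts=40):
    """Call get_status() until it returns 'ready', pausing between checks.

    get_status: a callable returning the current snapshot status string
    (e.g. 'running', 'ready', 'failed') from Bright Data's progress endpoint.
    """
    for _ in range(max_attempts):
        status = get_status()
        if status == "ready":
            return True
        if status == "failed":
            raise RuntimeError("Bright Data reported the snapshot as failed")
        time.sleep(interval)  # mirrors the 'Wait for 30 seconds' node
    raise TimeoutError("snapshot never reached 'ready'")
```

Capping the attempts, as `max_attempts` does here, is the code-level equivalent of avoiding the infinite-loop mistake above.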

6. Download the Finished Snapshot

Once the snapshot is marked as ready, the Download Snapshot HTTP Request node downloads the scraped data in JSON format for processing.

Expected outcome: Full LinkedIn company profile JSON is fetched for extraction.

Common mistake: Missing authorization headers results in failed data fetch.
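The download itself is a single authenticated GET. A sketch, assuming Bright Data's snapshot endpoint (`/datasets/v3/snapshot/{id}?format=json`) and the same bearer-token header as the earlier calls:

```python
import json
import urllib.request

def snapshot_url(snapshot_id: str) -> str:
    """Build the download URL for a finished snapshot."""
    return f"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json"

def download_snapshot(snapshot_id: str, api_key: str) -> list:
    """Fetch the scraped LinkedIn company records as parsed JSON."""
    req = urllib.request.Request(
        snapshot_url(snapshot_id),
        headers={"Authorization": f"Bearer {api_key}"},  # omit this and the fetch fails
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```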

7. Extract Structured Company Info

The LinkedIn Data Extractor LangChain Information Extractor node receives the raw JSON and is instructed—with a system prompt—to formulate a detailed company story incorporating all metadata.

Expected outcome: Structured, human-readable company story as output.

Common mistake: Poor system prompt detail or incorrect input data disrupts extraction.

8. Generate a Concise Summary

The Concise Summary Generator LangChain Summarization Chain node takes the detailed story output and condenses it into a short summary.

Expected outcome: A brief, readable summary is created for quick consumption.

Common mistake: Failure to properly map input/output breaks chain flow.

9. Notify via Webhook

The workflow ends with two Webhook Notifier HTTP Request nodes sending the story and summary payloads to configured endpoints like webhook.site for live monitoring or integration.

Expected outcome: External systems or users instantly receive the generated content.

Common mistake: Forgetting to configure your webhook URL means the notifications are silently lost.
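Each notifier node simply POSTs a JSON payload to your endpoint. A sketch, where the field names `company_story` and `summary` are illustrative choices rather than something the workflow mandates:

```python
import json
import urllib.request

def build_notification(story: str, summary: str) -> dict:
    """Bundle the generated texts into one JSON payload for the webhook."""
    return {"company_story": story, "summary": summary}

def notify_webhook(webhook_url: str, story: str, summary: str) -> int:
    """POST the payload to the configured endpoint; returns the HTTP status code."""
    req = urllib.request.Request(
        webhook_url,
        data=json.dumps(build_notification(story, summary)).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Pointing `notify_webhook` at a webhook.site URL lets you watch payloads arrive live, just as the workflow intends.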

Customizations ✏️

  • Change LinkedIn URL dynamically: In the Set LinkedIn URL node, replace the hardcoded URL with an incoming webhook or environment variable to automate different companies.
  • Adjust wait time: In the Wait for 30 seconds node, modify the delay duration to suit dataset size or API rate limits.
  • Enhance story tone: Modify systemPromptTemplate in the LinkedIn Data Extractor node to make stories more formal, casual, or creative.
  • Send notifications to Slack or email: Replace webhook notifiers with Slack or Gmail nodes to directly inform team members.
  • Use different Google Gemini models: Experiment with other Gemini model names in the Google Gemini Chat Model nodes to leverage various AI capabilities.

Troubleshooting 🔧

Problem: “401 Unauthorized or 403 Forbidden Errors from Bright Data API”

Cause: Invalid or expired API key or incorrect header authentication setup.

Solution: Re-check Bright Data API credentials in n8n under HTTP Request node authentication. Ensure header keys are correct and active.

Problem: “Snapshot status never changes to ‘ready’”

Cause: Dataset processing delay or wrong snapshot ID mapping.

Solution: Verify correct snapshot ID mapping in Set Snapshot Id. Increase wait time in wait node. Check Bright Data API status online.

Problem: “AI model returns incomplete or irrelevant story”

Cause: Poor prompt design or incomplete input data.

Solution: Refine system prompt in LinkedIn Data Extractor. Confirm JSON input is complete and correctly mapped.

Pre-Production Checklist ✅

  • Confirm Bright Data and Google Gemini API credentials are valid and active.
  • Test that the LinkedIn company URL is publicly accessible and correct.
  • Ensure snapshot ID is extracted and passed properly.
  • Verify correct conditions in If nodes for status and error handling.
  • Test the entire workflow manually and watch logs for errors.
  • Backup n8n workflow and credentials securely.

Deployment Guide

Once fully tested, activate the workflow in n8n’s editor by enabling it from the top right toggle. Schedule runs via cron if automatic periodic refreshes are needed.

Monitor workflow executions and errors via the n8n dashboard to catch any issues early. Logs will help understand runtime behavior.

FAQs

Can I use other scraping services instead of Bright Data?

Yes, but you would need to adjust the HTTP Request nodes and API credentials accordingly; each service's polling and snapshot-handling conventions may differ.

Does this workflow consume many API credits?

It depends on Bright Data and Google Gemini usage plans. Frequent scraping and AI calls add up, so optimize running frequency.

Is my data safe using this automation?

n8n secures data in transit with HTTPS, and your API keys stay private in node credentials. Still, safeguard webhook URLs and credentials carefully.

Can this workflow scale for hundreds of companies?

Yes, with a proper queuing mechanism and sufficient API quota, you can queue URLs for batch processing by adapting the Set LinkedIn URL node to accept incoming data dynamically.
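One way to sketch that batching: group company URLs into chunks, where each chunk becomes one Bright Data trigger call (the batch size of 10 is an arbitrary assumption, not a documented limit):

```python
def batch_targets(urls, batch_size=10):
    """Yield lists of {'url': ...} inputs, one list per Bright Data trigger call."""
    for i in range(0, len(urls), batch_size):
        yield [{"url": u} for u in urls[i:i + batch_size]]
```

Feeding each yielded list as the request body keeps trigger calls, and therefore snapshot polling, bounded per batch.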

Conclusion

With this detailed n8n workflow, Sarah can now automatically extract LinkedIn company data, generate engaging stories using Google Gemini AI, and produce succinct summaries with zero manual effort. This saves her hours each week, reduces errors, and boosts her HR content marketing significantly.

Next steps could include automating personalized job postings or combining with social media schedulers to broadcast stories automatically.

By mastering this automation, you unlock powerful storytelling with AI integrated deeply into modern data extraction workflows.

Promoted by BULDRR AI
