1. Opening Problem Statement
Meet Sarah, a content manager at a tech startup. Every month, her team receives detailed PDF reports from industry analysts, packed with valuable insights. Sarah needs to repurpose these PDFs into engaging blog posts for her company’s website, which is hosted on Ghost CMS. Manually extracting the text, structuring the content, and formatting each post is time-consuming and error-prone. The results are often inconsistent in quality, which causes delays and costs her company potential web traffic and leads.
What if Sarah could automate this entire content conversion flow – from PDF upload to blog post creation and publishing – saving her hours, reducing errors, and ensuring consistent quality? That’s exactly what this n8n workflow achieves.
2. What This Automation Does
This workflow lets you upload a PDF file through a simple web form in n8n, then automatically converts it into a structured blog post and publishes it to Ghost CMS. When triggered, it:
- Extracts text content from uploaded PDF files swiftly and accurately.
- Uses GPT-4o-mini (an OpenAI language model) to analyze and transform the PDF text into a compelling, SEO-friendly blog post with an introduction, multiple chapters with subheadings, and a conclusion.
- Parses the AI output into distinct title and content fields to cleanly separate blog metadata from body content.
- Checks that the title and content are properly generated before proceeding, ensuring quality control.
- Publishes the completed blog post draft directly to a Ghost CMS website using the Ghost Admin API.
- Handles errors gracefully by halting publishing if essential fields are missing, preventing broken content from going live.
By automating this workflow, Sarah can save 3-5+ hours per PDF transformation cycle and eliminate repetitive manual formatting errors.
3. Prerequisites ⚙️
- n8n Account – for creating and running workflows.
- Ghost CMS Admin Account – with Admin API access to publish posts.
- OpenAI Account – to use the GPT-4o-mini model for content generation.
- Basic knowledge of PDF files and blog publication.
Optional: You can self-host your n8n instance for maximum control and privacy. The Hostinger self-hosting guide covers the setup.
4. Step-by-Step Guide
Step 1: Set up Upload PDF Trigger via FormTrigger Node
In n8n, create a FormTrigger node named “Upload PDF.” Configure it with the path /pdf so it listens for file uploads at that URL endpoint.
Customize the form fields to accept a single PDF file:
– Field Label: “Upload PDF File”
– Field Type: file
– Accept File Types: .pdf
– Required: true
Once deployed, you’ll have a URL to upload PDFs directly to your workflow.
Common mistake: Forgetting to set the accepted file type to .pdf may allow unsupported files.
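If you want a second line of defense beyond the form's accepted file types, a small check in a Code node placed after the trigger can confirm that the upload really is a PDF. The sketch below is a plain function that could be pasted into such a node; the helper name `isPdfUpload` is hypothetical, and it assumes the upload arrives under the binary property `Upload_PDF_File` with the usual `mimeType` and `fileName` fields.

```javascript
// Hypothetical helper: returns true when the item carries a PDF under the given binary property.
// "Upload_PDF_File" is the property name used in this guide; adjust it if your form field differs.
function isPdfUpload(item, property = 'Upload_PDF_File') {
  const file = item && item.binary ? item.binary[property] : undefined;
  if (!file) return false;

  // Accept the file when the MIME type says PDF, or fall back to the file extension.
  const mimeOk = file.mimeType === 'application/pdf';
  const nameOk = typeof file.fileName === 'string' && file.fileName.toLowerCase().endsWith('.pdf');
  return mimeOk || nameOk;
}
```

Items that fail the check could be routed to the same "false" branch used later for validation.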
Step 2: Extract Text from Uploaded PDF with ExtractFromFile Node
Next, add an ExtractFromFile node called “Extract Text.” Set its operation to extract from PDF, and set the binary property to the name under which the FormTrigger stores the uploaded file (e.g., Upload_PDF_File).
This node extracts the full text content, ready for further processing.
Tip: Ensure the file property name matches exactly what the FormTrigger outputs.
Step 3: Use GPT-4o-mini AI Model to Analyze and Generate Blog Content
Add the OpenAI chat model node from n8n’s LangChain nodes and name it “gpt-4o-mini.” Configure it with your OpenAI credentials and set the model to “gpt-4o-mini-2024-07-18.”
This node does not process the PDF text on its own. Attach it as the language model of the agent created in Step 4, which receives the raw text from “Extract Text.”
Step 4: Generate Structured Blog Post Using LangChain Agent Node
Next, add the LangChain Agent node named “Create Structured Blog Post.” This node acts as a conversational agent configured to:
- Take the raw PDF text and generate a blog post in JSON format with two fields: title and content.
- Ensure the result follows strict formatting instructions: SEO-friendly title under 10 words, HTML tags for paragraphs, multiple chapters with subheadings, and proper spacing.
- Use the system message to enforce detailed formatting and content requirements.
Constraining the output this way produces consistent, ready-to-publish blog text.
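To make the formatting rules above concrete, here is a minimal sketch of a checker for the agent's JSON result. It is not part of the original workflow: the function name `checkBlogPost` is hypothetical, and treating `<h2>` through `<h6>` as "subheadings" is an assumption about what the system message asks for.

```javascript
// Hypothetical checker: returns a list of rule violations for a { title, content } object.
function checkBlogPost(post) {
  const problems = [];
  const title = typeof post?.title === 'string' ? post.title.trim() : '';
  const content = typeof post?.content === 'string' ? post.content.trim() : '';

  if (!title) {
    problems.push('title is missing');
  } else if (title.split(/\s+/).length >= 10) {
    // The guide asks for a title under 10 words.
    problems.push('title has 10 or more words');
  }

  if (!content) {
    problems.push('content is missing');
  } else {
    if (!/<p[\s>]/i.test(content)) problems.push('content has no <p> paragraphs');
    // Assumption: subheadings are emitted as <h2>..<h6> elements.
    if (!/<h[2-6][\s>]/i.test(content)) problems.push('content has no subheadings');
  }
  return problems;
}
```

An empty array means the post passes all checks; anything else can be logged or used to stop the run before publishing.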
Step 5: Parse and Separate Title & Content with Code Node
Insert a Code node named “Separate Title & Content” that extracts the title and content from the JSON output. It removes any redundant H1 tags from the content to avoid duplication on Ghost CMS and validates non-empty fields.
// Runs in the Code node (mode: Run Once for All Items).
try {
  const input = $input.all();
  if (!input || !input.length) throw new Error('No input data received');

  // The agent's parsed result is expected under json.output.output.
  const firstItem = input[0];
  const output = firstItem?.json?.output?.output;
  if (!output) throw new Error('Invalid input structure: missing required properties');
  if (!output.title) throw new Error('Missing title in output');
  if (!output.content) throw new Error('Missing content in output');

  const title = output.title;
  // Remove the first <h1>...</h1> element so the title is not duplicated in Ghost.
  const content = output.content.replace(/<h1[^>]*>.*?<\/h1>/s, '').trim();
  if (!content) throw new Error('Content is empty after processing');

  return [{ json: { title, content } }];
} catch (error) {
  // Empty title and content send this item down the "false" branch of the If node.
  return [{ json: { error: true, message: error.message, title: '', content: '', timestamp: new Date().toISOString() } }];
}
Note: This node is critical to splitting the blog metadata cleanly before publishing.
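The H1 removal described in this step can be tried outside n8n. The sketch below is one way to do it as a standalone function; `stripH1` is a hypothetical name, and unlike the node code it removes every H1 element rather than only the first.

```javascript
// Hypothetical helper: removes every <h1>...</h1> element (with any attributes,
// spanning line breaks) from an HTML string and trims the result.
function stripH1(html) {
  return html.replace(/<h1\b[^>]*>[\s\S]*?<\/h1>/gi, '').trim();
}

// Example: the heading is dropped and the body is kept.
const cleaned = stripH1('<h1 class="t">Title</h1>\n<p>Body</p>');
```

Removing all H1 elements is a design choice: Ghost renders the post title as the page heading, so any H1 left in the body would compete with it.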
Step 6: Validate Non-Empty Title and Content with If Node
Add an If node that checks if both title and content are not empty strings. This validation prevents broken or incomplete posts from being published.
Configure two string conditions using the “notEmpty” operator, one checking {{$json.title}} and the other {{$json.content}}.
If the condition passes, the workflow proceeds to publish; otherwise, it stops with a No Operation node.
Step 7: Publish the Blog Post to Ghost CMS via Ghost Node
Connect the “true” branch from the If node to the Ghost node named “Post to Ghost.” Configure it for the “create” operation with source = 'adminApi'. Bind the post title and content dynamically from the previous Code node output:
- Title: {{$json.title}}
- Content: {{$json.content}}
Ensure you have set up Ghost Admin API credentials in n8n.
Successful execution will create a new blog draft in your Ghost CMS ready for review or immediate publication.
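The Ghost node builds the API request for you, but it can help to see roughly what is sent. The sketch below shows the request body shape the Ghost Admin API expects when a post is created from HTML: posts are wrapped in a `posts` array, and the request to the posts endpoint needs the `source=html` query parameter for the `html` field to be used. The function name `buildGhostPostBody` is hypothetical, and the exact fields the n8n node sends are an assumption.

```javascript
// Hypothetical helper: builds the JSON body for creating a post from HTML
// via the Ghost Admin API (used together with ?source=html on the request).
function buildGhostPostBody(title, html, status = 'draft') {
  const allowed = ['draft', 'published', 'scheduled'];
  if (!allowed.includes(status)) {
    throw new Error(`Unsupported status: ${status}`);
  }
  return { posts: [{ title, html, status }] };
}

// Example: a draft post, matching the behavior described in this step.
const body = buildGhostPostBody('My Title', '<p>Hello</p>');
```

Keeping the default at `draft` mirrors the review-before-publishing flow described above.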
Step 8: Handle Failure Gracefully with No Operation Node
If title or content validation fails, the “false” branch routes to a NoOp node named “Do Nothing.” This prevents the workflow from producing or publishing incomplete content.
Tip: Customize this to alert you by email or Slack if desired.
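As a rough illustration of the alerting idea, the following sketch turns a failed item into a message payload of the `{ text: ... }` form that Slack incoming webhooks accept. It assumes the item carries the `message` and `timestamp` fields that the Code node in Step 5 returns on error; the function name `buildFailureAlert` and the default workflow label are hypothetical.

```javascript
// Hypothetical helper: builds a Slack-style message payload for a failed run.
function buildFailureAlert(json, workflowName = 'PDF to Ghost') {
  const reason = json?.message || 'title or content was empty';
  const when = json?.timestamp || new Date().toISOString();
  return { text: `${workflowName}: post not published (${reason}) at ${when}` };
}
```

The returned object could be sent from an HTTP Request or Slack node placed on the "false" branch in place of “Do Nothing.”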
5. Customizations ✏️
- Adjust Blog Post Length: Modify the system message in “Create Structured Blog Post” to request shorter or longer sections depending on your audience.
- Add Social Media Snippets: Use additional Code or AI nodes after content generation to produce tweet-sized promotional blurbs with the same AI.
- Auto-Tagging: Insert a keyword extraction step after the AI output and add tags dynamically to Ghost posts.
- Multi-language Support: Add a translation node after “Separate Title & Content” to translate blog posts into different languages.
- Upload More File Types: Modify the FormTrigger node to accept other document formats like DOCX and update extraction accordingly.
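For the auto-tagging idea, a very simple starting point is word-frequency counting on the generated HTML. The sketch below is a naive illustration, not a real keyword extractor: the function name `extractTags`, the stop-word list, and the three-letter minimum are all assumptions, and how the resulting strings are passed to the Ghost node depends on how that node expects tags.

```javascript
// Small, illustrative stop-word list; extend it for real use.
const STOP_WORDS = new Set([
  'the', 'and', 'for', 'that', 'with', 'this', 'from', 'are', 'was', 'have',
  'has', 'not', 'but', 'you', 'your', 'can', 'will', 'its', 'our', 'their',
  'into', 'more', 'than', 'about',
]);

// Hypothetical helper: returns the most frequent words in an HTML string as tag candidates.
function extractTags(html, maxTags = 5) {
  // Strip tags, then count lowercase words of three or more letters.
  const text = html.replace(/<[^>]+>/g, ' ').toLowerCase();
  const counts = new Map();
  for (const word of text.match(/[a-z][a-z-]{2,}/g) || []) {
    if (STOP_WORDS.has(word)) continue;
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  // Sort by frequency, breaking ties alphabetically so the result is stable.
  return [...counts.entries()]
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, maxTags)
    .map(([word]) => word);
}
```

An AI node could replace this function when tags need to reflect topics rather than raw word counts.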
6. Troubleshooting 🔧
Problem: “Missing title in output” error from Code node
Cause: GPT prompt didn’t generate a “title” field or JSON structure was malformed.
Solution: Check the system message in “Create Structured Blog Post” node. Ensure it explicitly demands the title field and correct JSON formatting. Test with a simple PDF sample.
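When debugging malformed output, it can help to parse the model's raw reply defensively. The sketch below assumes the reply is available as a string, which may not be the case if a structured output parser already handles it, and strips a Markdown code fence before parsing. The function name `parseModelJson` is hypothetical.

```javascript
// Hypothetical helper: parses a model reply that should be a JSON object
// with string "title" and "content" fields, tolerating a surrounding code fence.
function parseModelJson(raw) {
  const cleaned = String(raw)
    .trim()
    .replace(/^`{3}(?:json)?\s*/i, '') // leading fence, optionally labeled "json"
    .replace(/\s*`{3}$/, '');          // trailing fence
  const parsed = JSON.parse(cleaned);  // throws SyntaxError on malformed JSON
  if (!parsed || typeof parsed.title !== 'string' || typeof parsed.content !== 'string') {
    throw new Error('Model output must contain string "title" and "content" fields');
  }
  return parsed;
}
```

The thrown error message tells you whether the problem is invalid JSON or a missing field, which narrows down what to change in the system message.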
Problem: Blog post not appearing in Ghost CMS after publishing
Cause: Ghost API credentials misconfigured or insufficient permissions.
Solution: Revisit your Ghost Admin API credential setup in n8n, confirm Admin API key has post creation rights, and test the connection.
Problem: PDF extraction yields empty or incomplete text
Cause: The input PDF is a scanned image (no selectable text) or is password-protected.
Solution: Use OCR tools before uploading or try test PDFs known to contain selectable text.
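A guard placed after “Extract Text” can catch this case before any OpenAI credits are spent. The sketch below assumes the extracted text is available in the item's `json.text` field; the function name `hasUsableText` and the 200-character threshold are hypothetical choices.

```javascript
// Hypothetical helper: true when the extracted text has at least minChars
// characters after collapsing whitespace.
function hasUsableText(item, minChars = 200) {
  const text = typeof item?.json?.text === 'string' ? item.json.text : '';
  return text.replace(/\s+/g, ' ').trim().length >= minChars;
}
```

Scanned PDFs typically yield empty or whitespace-only text, so they fail this check and can be routed to an alert instead of the agent.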
7. Pre-Production Checklist ✅
- Double-check all node connections match the described flow.
- Test PDF uploads with diverse file sizes and content complexity.
- Verify that the OpenAI and Ghost credentials are valid.
- Run test executions to confirm the AI returns well-formed JSON with title and content fields.
- Review error handling paths and confirm no incomplete content publishes.
8. Deployment Guide
Activate the workflow in n8n and share the form URL for PDF uploads. Monitor executions via n8n’s execution list to confirm seamless operations.
Optionally, enable alerts on failures by adding email or Slack notifications on error branches.
9. FAQs
Q: Can I use a different AI model than GPT-4o-mini?
A: Yes, you can switch models by adjusting the model parameter in the OpenAI node, but ensure the output format matches the expected JSON schema.
Q: Does publishing consume many API credits?
A: The OpenAI usage depends on input text length. The Ghost API calls are minimal as they only create a post, so costs are usually low.
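For a rough sense of how input length translates into usage, a common rule of thumb for English text is about four characters per token. The sketch below applies that heuristic; it is only an approximation, the function name `estimateTokens` is hypothetical, and actual counts depend on the model's tokenizer.

```javascript
// Hypothetical helper: rough token estimate using ~4 characters per token.
function estimateTokens(text, charsPerToken = 4) {
  return Math.ceil(String(text).length / charsPerToken);
}
```

Multiplying the estimate by your model's current per-token price gives a ballpark cost for the input side of each run.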
Q: Is this workflow suitable for high-volume PDF processing?
A: It’s best for moderate volumes. Heavy bulk processing might require queuing logic or segmented workflows.
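One simple form of the queuing logic mentioned above is to split a backlog of files into fixed-size batches and process one batch at a time. The sketch below shows only the batching step; the function name `chunk` is hypothetical, and inside n8n a loop node can play the same role.

```javascript
// Hypothetical helper: splits an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Smaller batches keep each run within API rate limits at the cost of longer total processing time.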
10. Conclusion
By following this detailed workflow, you’ve automated the tedious process of transforming PDF reports into polished, structured blog posts published directly to Ghost CMS. This saves you hours of manual copy-pasting and editing per article, reducing errors and maintaining consistent quality. Next, consider automating social media promotion of your blogs or implementing keyword tagging for SEO enhancements. With n8n and GPT-4o-mini, turning static PDFs into engaging blogs has never been easier.