What This Automation Does
This workflow takes a video from a public link and makes a spoken narration audio automatically.
It solves the problem of writing scripts for videos which takes many hours.
You get a voiceover audio file from the video fast, with a script written by AI looking at the images.
Tools and Services Used
- OpenAI API: For generating narration scripts and text-to-speech audio.
- Google Drive API: To upload and store the final audio file.
- Python with OpenCV: To extract frames evenly from the downloaded video.
- n8n Automation Platform: To connect all steps in a workflow.
- HTTP Request node: To download the video.
Workflow Input, Process, and Output
Inputs
- Video file accessed from a public URL.
Processing Steps
- Download video using HTTP Request node.
- Extract up to 90 evenly spaced frames from the video using Python and OpenCV in Code node.
- Split frames into individual items for further processing.
- Batch frames into groups of 15 to fit language model token limits.
- Convert base64 frames to binary and resize images for AI input.
- Generate a continuous narration script in the style of a famous nature narrator using OpenAI’s multimodal GPT-4o LLM.
- Combine partial scripts into full narration text.
- Use OpenAI’s text-to-speech to make an MP3 voiceover from the script.
- Upload the final audio file to Google Drive for storage and sharing.
Output
- MP3 audio file with a voiceover narration of the input video.
Beginner Step-by-Step: How to Use This Workflow in n8n
Download and Import
- Inside the n8n editor, click the Download button on this page to get the workflow file.
- Import the workflow file via the “Import from File” button in n8n.
Configure Credentials and Settings
- Add your OpenAI API Key in the credentials section to allow script generation and TTS.
- Add Google Drive API credentials to upload the voiceover audio.
- Check and update any IDs like Google Drive folder ID or email/channel fields if applicable.
- If URLs, prompts, or code snippets appear in the input fields, copy and paste exactly from the description to these fields.
Test the Workflow
- Click the Manual Trigger node (Manual Trigger) and run the workflow once.
- Check the output to make sure the MP3 audio has uploaded successfully.
Activate for Production
- Once tests pass, activate the workflow to run whenever needed or schedule runs.
This quick method lets you use and benefit from the workflow without building it from zero.
Key Workflow Details
Extracting Frames
The Code node (Python) uses OpenCV to select frames evenly spaced through the video.
It outputs a list of base64 JPEG strings representing these frames.
Batching and Image Preparation
Splitting frames into batches of 15 keeps the input size manageable for the AI model.
Frames get converted to binary and resized to 768px for better processing.
Narration Script Generation
The multimodal GPT-4o model reads image batches and produces script parts.
Scripts keep context between batches for a smooth narration.
Building the Final Audio
Partial scripts combine into a single composed text.
OpenAI’s text-to-speech service creates the MP3 file from the whole script.
Uploading and Access
The voiceover MP3 uploads to Google Drive.
The user can then easily find and share the audio file.
Common Issues and Solutions
- Video Frame Extraction Failure: Happens if video format is unsupported or Python OpenCV is missing.
Fix it by using MP4 videos and ensure OpenCV is installed. - OpenAI Rate Limit Errors: May occur when sending many requests fast.
Use a Wait node to slow requests or upgrade the OpenAI plan. - Google Drive Upload Fails: Caused by expired or wrong OAuth credentials.
Re-authenticate with Google Drive and check folder permissions.
Customization Ideas
- Change narration style by editing prompt in the script generation node.
- Increase or decrease frame count in the Python node for detail vs speed.
- Adjust batch size in the frame batching node for token limit management.
- Switch upload folder in Google Drive node or use other cloud storage.
- Remove or modify the frame resize node depending on desired image quality.
Pre-Production Checklist
- Confirm the video download URL is accessible and uses a supported format.
- Run the frame extraction Python code separately first.
- Validate OpenAI credentials including quota limits.
- Verify Google Drive API connection and destination folder.
- Test the workflow on a small video to check timing and outputs.
- Backup generated files if re-running for multiple videos.
Deployment Notes
After testing the workflow, activate it in n8n editor.
You can run it manually or schedule for batch jobs.
Watch execution logs to catch API errors or slowdowns.
Use the Wait node to avoid hitting OpenAI call limits.
Consider self-host n8n for more control and stability at scale.
Summary
✓ The workflow transforms a public video URL into narrated MP3 audio.
✓ It extracts key video frames, writes AI narration scripts, and produces speech.
✓ Saves hours of manual work on scripting and voiceover creation.
✓ Outputs an MP3 stored on Google Drive for easy access.
✓ Beginners can import and configure the workflow in n8n without building it themselves.
