Use Google Gemini 2.0 for Precise Image Object Detection with n8n

Solve the hassle of manual image tagging by automatically detecting objects like rabbits with Google Gemini 2.0 and n8n. This workflow downloads images, detects objects via prompts, and draws bounding boxes, saving hours and improving accuracy.
Workflow Identifier: 2293
NODES in Use: Manual Trigger, HTTP Request, Edit Image, Code, Set

1. Opening Problem Statement

Meet Jenny, the manager of a petting zoo who maintains a website with photos of the animals and events. Jenny spends hours manually tagging images to highlight important subjects—like rabbits—in her photos for promotional materials. Every time she updates the site, she repeats this tedious task, wasting valuable time and risking errors in object labeling. Manually drawing bounding boxes over multiple images leads to inconsistencies and slows down her marketing efforts, resulting in lost opportunities and increased frustration.

This is where automation can save Jenny a considerable amount of effort by letting AI do the heavy lifting. Specifically, using the latest capabilities of Google’s Gemini 2.0 multimodal model, she can detect objects in images by simply prompting what to look for, and automatically visualize these detections on the photos.

2. What This Automation Does

This n8n workflow automates image object detection using Google Gemini 2.0’s prompt-based bounding box feature. Here’s what happens when you run it:

  • Downloads a test image of a petting zoo with multiple rabbits.
  • Uses the Gemini 2.0 API to prompt detection of all rabbits in the image.
  • Retrieves bounding box coordinates normalized to a 0-1000 scale from the AI response.
  • Scales the normalized coordinates to fit the actual image dimensions.
  • Draws colored bounding boxes over the detected rabbits in the original image.
  • Outputs the annotated image, ready for quick validation and further use.

This eliminates hours of manual image annotation, drastically reduces human error, and empowers users like Jenny to handle various complex detection tasks just by adjusting the prompt.

3. Prerequisites ⚙️

  • n8n account: Access to an n8n instance, either n8n Cloud or a self-hosted installation (for example, on a hosting provider such as Hostinger).
  • Google Gemini (PaLM) API account 🔑: API credentials for Google Gemini 2.0, the model that performs the object detection.
  • Image URL source 🔌: A publicly accessible image URL to test the workflow. This guide uses a petting zoo photo.

4. Step-by-Step Guide

Step 1: Trigger the Workflow Manually

In n8n, find the Manual Trigger node (When clicking ‘Test workflow’). Click the “Test workflow” button (labelled “Execute Workflow” in some n8n versions) to start the process on demand. This is useful for testing and debugging your automation. You should see the node execute successfully.

Step 2: Download the Test Image

Next, the workflow uses the HTTP Request node (Get Test Image) to fetch an image from a specified URL. In this case, the URL is:

https://www.stonhambarns.co.uk/wp-content/uploads/jennys-ark-petting-zoo-for-website-6.jpg

Ensure the URL returns a valid image. After this step runs, the image data will be available in binary format to subsequent nodes.
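
One way to check that a download really is an image is to look at its first few bytes rather than trusting the URL. The sketch below is not part of the original workflow: sniffImageType is a hypothetical helper, written as plain Node.js that could be adapted to a Code node, and it only recognises four common formats.

```javascript
// Hypothetical helper (not in the original workflow): identify an image by
// its leading "magic bytes" instead of trusting the URL or file extension.
function sniffImageType(buffer) {
  const startsWith = (bytes) =>
    buffer.length >= bytes.length && bytes.every((b, i) => buffer[i] === b);

  if (startsWith([0xff, 0xd8, 0xff])) return 'image/jpeg';
  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'image/png';
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'image/gif';
  // WebP files start with "RIFF", four size bytes, then "WEBP".
  if (
    startsWith([0x52, 0x49, 0x46, 0x46]) &&
    buffer.length >= 12 &&
    buffer.subarray(8, 12).toString('latin1') === 'WEBP'
  ) {
    return 'image/webp';
  }
  // Anything else, such as an HTML error page, is treated as "not an image".
  return null;
}
```

If the helper returns null for the downloaded data, the URL most likely served an error page or a redirect rather than the photo.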

Step 3: Extract Image Information

The Edit Image node (Get Image Info) retrieves metadata about the downloaded image, especially the width and height in pixels. This information is crucial for later scaling of bounding box coordinates.

Step 4: Use Google Gemini 2.0 for Object Detection

The workflow calls the HTTP Request node (Gemini 2.0 Object Detection) to send the image to the Google Gemini API. It posts a JSON request with a prompt: “I want to see all bounding boxes of rabbits in this image.”

The request embeds the image inline as base64-encoded data. The API responds with the detected objects, each with bounding box coordinates normalized to a 0-1000 scale.
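
A request body along these lines could carry the prompt and the inline image. Treat it as a sketch under assumptions: buildDetectionRequest is a hypothetical helper, the field names follow the public Gemini generateContent REST format as commonly documented, and the response schema is one plausible way to ask for box_2d arrays, since this guide does not show the node's exact body. The model ID belongs in the endpoint URL, which is also not shown here.

```javascript
// Hypothetical helper: build a generateContent body with a text prompt
// and the image embedded as base64 (field names are assumptions, see above).
function buildDetectionRequest(imageBuffer, mimeType, subject = 'rabbits') {
  return {
    contents: [
      {
        parts: [
          { text: `I want to see all bounding boxes of ${subject} in this image.` },
          {
            inline_data: {
              mime_type: mimeType,
              data: imageBuffer.toString('base64'),
            },
          },
        ],
      },
    ],
    // Ask for machine-readable JSON: an array of { box_2d, label } objects.
    generationConfig: {
      response_mime_type: 'application/json',
      response_schema: {
        type: 'ARRAY',
        items: {
          type: 'OBJECT',
          properties: {
            box_2d: { type: 'ARRAY', items: { type: 'NUMBER' } },
            label: { type: 'STRING' },
          },
          required: ['box_2d', 'label'],
        },
      },
    },
  };
}
```

Passing a different subject, for example buildDetectionRequest(image, 'image/jpeg', 'dogs'), changes only the prompt text and leaves the rest of the body untouched.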

Step 5: Set Image Dimensions and Coordinates Variables

The Set node (Get Variables) assigns variables for the coordinates array, as well as the image width and height, to be used in the next calculations.

Here, we parse the JSON response’s bounding box data and keep the width and height from the image info.
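
The parsing described above might look like the following in plain JavaScript. This is a sketch, not the Set node's actual expressions: extractVariables is a hypothetical name, the candidates/content/parts path is the usual shape of a generateContent response, and reading the dimensions from imageInfo.size assumes that is where the Get Image Info node puts them.

```javascript
// Hypothetical helper: pull the detections out of a Gemini response and
// pair them with the image dimensions (both input shapes are assumptions).
function extractVariables(geminiResponse, imageInfo) {
  // The model's answer arrives as a JSON string inside the first text part.
  const text = geminiResponse?.candidates?.[0]?.content?.parts?.[0]?.text ?? '[]';

  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    throw new Error(`Gemini response was not valid JSON: ${err.message}`);
  }

  return {
    coords: Array.isArray(parsed) ? parsed : [],
    width: imageInfo.size.width,
    height: imageInfo.size.height,
  };
}
```

An empty or missing candidate list yields an empty coords array, so later steps simply draw nothing instead of failing.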

Step 6: Scale Normalized Coordinates to Actual Pixels

Using the Code node (Scale Normalised Coords), a JavaScript snippet converts the normalized bounding box coordinates into pixel values for the original image dimensions:

// Inputs prepared by the Get Variables node.
const { coords, width, height } = $input.first().json;

// box_2d holds [ymin, xmin, ymax, xmax] on a 0-1000 scale.
const scale = 1000;
const scaleCoordX = (val) => (val * width) / scale;
const scaleCoordY = (val) => (val * height) / scale;

const normalisedOutput = coords
  // Skip entries that lack a complete four-value box.
  .filter(coord => Array.isArray(coord.box_2d) && coord.box_2d.length === 4)
  .map(coord => {
    // Falsy values (such as 0) are passed through unchanged.
    return {
      xmin: coord.box_2d[1] ? scaleCoordX(coord.box_2d[1]) : coord.box_2d[1],
      xmax: coord.box_2d[3] ? scaleCoordX(coord.box_2d[3]) : coord.box_2d[3],
      ymin: coord.box_2d[0] ? scaleCoordY(coord.box_2d[0]) : coord.box_2d[0],
      ymax: coord.box_2d[2] ? scaleCoordY(coord.box_2d[2]) : coord.box_2d[2],
    }
  });

// Return the pixel coordinates and re-attach the original image binary
// so the next node can draw on it.
return {
  json: {
    coords: normalisedOutput
  },
  binary: $('Get Test Image').first().binary
}

This step converts the AI’s relative coordinates into pixel values matching the actual photo size.
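
As a worked example of the arithmetic, take a made-up box on a made-up 1200 x 800 pixel image (neither value comes from the test photo). The scaleBox helper below is a standalone restatement of the same formula for illustration only.

```javascript
// Standalone restatement of the scaling formula, for illustration only.
// Input order is [ymin, xmin, ymax, xmax] on a 0-1000 scale.
const scaleBox = ([ymin, xmin, ymax, xmax], width, height, scale = 1000) => ({
  xmin: (xmin * width) / scale,
  xmax: (xmax * width) / scale,
  ymin: (ymin * height) / scale,
  ymax: (ymax * height) / scale,
});

// Hypothetical detection on a hypothetical 1200 x 800 image.
const box = scaleBox([250, 100, 750, 500], 1200, 800);
console.log(box); // { xmin: 120, xmax: 600, ymin: 200, ymax: 600 }
```

The x values are multiplied by the width and the y values by the height, so the box covers pixels 120 to 600 horizontally and 200 to 600 vertically.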

Step 7: Draw Bounding Boxes on the Image

The final Edit Image node (Draw Bounding Boxes) receives the scaled coordinates and draws colorful bounding boxes around each detected rabbit. The node is configured with multiple draw operations specifying start and end X/Y pixels and color code #ff00f277. The output is an image with visible highlights around target objects.
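
To make the mapping from coordinates to draw operations concrete, here is a sketch that turns each scaled box into one rectangle description. The helper name toDrawOperations and the property names (startPositionX, endPositionY and so on) are assumptions made for illustration; check them against the fields shown in your Draw Bounding Boxes node. Only the color value comes from this guide.

```javascript
// Hypothetical helper: one rectangle draw operation per detected box.
// Property names are illustrative assumptions, not a confirmed n8n schema.
function toDrawOperations(coords, width, height, color = '#ff00f277') {
  // Round to whole pixels and keep every value inside the image.
  const clamp = (value, max) => Math.min(Math.max(Math.round(value), 0), max);

  return coords.map(({ xmin, ymin, xmax, ymax }) => ({
    operation: 'draw',
    primitive: 'rectangle',
    color,
    startPositionX: clamp(xmin, width),
    startPositionY: clamp(ymin, height),
    endPositionX: clamp(xmax, width),
    endPositionY: clamp(ymax, height),
  }));
}
```

Rounding and clamping are a design choice here: they keep a slightly out-of-range detection from producing a rectangle that starts or ends outside the photo.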

Step 8: Review the Output

You can add further nodes to save or share the resulting image. In this demo, the workflow ends after drawing the bounding boxes, but you can easily extend it to upload the image to cloud storage or send via email.

5. Customizations ✏️

Customize the Object Detection Prompt

In the Gemini 2.0 Object Detection HTTP node, change the prompt text inside the JSON body from “all bounding boxes of rabbits” to any other subject you want to detect, such as “cars,” “dogs,” or “people with umbrellas.” This allows flexible context-based image detection.

Adjust the Image Source URL

Update the HTTP Request node (Get Test Image) to fetch a different image by modifying the URL parameter. This is useful for testing different photos or your own data.

Modify Bounding Box Appearance

Within the Draw Bounding Boxes node, you can change the color or add more draw operations to highlight additional objects. Adjust stroke thickness or corner radius if supported for better visuals.

Extend to Save or Send Results

Add a cloud storage node (e.g., Google Drive, Dropbox) or email node (e.g., Gmail) after the drawing step to automatically archive or share processed images.

Increase Detection Accuracy with Different Prompts

Experiment with different prompts or multiple API calls for detecting complex or overlapping objects to improve detection robustness.
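
If you do run several API calls, the results will often contain near-duplicate boxes for the same object. One common way to merge them, not part of the original workflow, is to drop any box that overlaps an already kept box above an intersection-over-union (IoU) threshold. The helper names and the 0.5 threshold below are illustrative assumptions.

```javascript
// Intersection-over-union of two pixel boxes { xmin, ymin, xmax, ymax }.
function iou(a, b) {
  const w = Math.max(0, Math.min(a.xmax, b.xmax) - Math.max(a.xmin, b.xmin));
  const h = Math.max(0, Math.min(a.ymax, b.ymax) - Math.max(a.ymin, b.ymin));
  const intersection = w * h;
  const areaA = (a.xmax - a.xmin) * (a.ymax - a.ymin);
  const areaB = (b.xmax - b.xmin) * (b.ymax - b.ymin);
  const union = areaA + areaB - intersection;
  return union > 0 ? intersection / union : 0;
}

// Hypothetical helper: combine the boxes from several passes, keeping the
// first box seen for each object and skipping heavily overlapping repeats.
function mergeDetections(passes, threshold = 0.5) {
  const kept = [];
  for (const box of passes.flat()) {
    if (!kept.some((existing) => iou(existing, box) >= threshold)) {
      kept.push(box);
    }
  }
  return kept;
}
```

A lower threshold merges more aggressively, which can wrongly collapse two rabbits sitting close together, so it is worth tuning on your own images.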

6. Troubleshooting 🔧

Problem: API returns no bounding boxes

Cause: The prompt is unclear or the image content does not match requested objects.

Solution: Refine your prompt to make the request more specific (e.g., “all rabbits” instead of “all animals”) and verify that the image actually contains the objects you are asking for.

Problem: Coordinates do not align with image

Cause: Image dimensions used for scaling are incorrect or not updated.

Solution: Confirm the Edit Image node (Get Image Info) correctly extracts width and height, and the scaling logic in the Code node (Scale Normalised Coords) matches those values.

Problem: Workflow fails at API call step

Cause: Invalid or expired Google Gemini API credentials.

Solution: Reconfigure your Google Gemini (PaLM) API credentials in n8n under the Gemini 2.0 Object Detection HTTP Request node settings.

7. Pre-Production Checklist ✅

  • Verify your Google Gemini API credentials are active and permissions granted.
  • Test the image URL to confirm it returns a valid image accessible without authentication.
  • Run the workflow manually and watch each node output in n8n editor to confirm data flow.
  • Validate the scaling code outputs reasonable bounding box coordinates compared to the image size.
  • Prepare backup workflow versions before deploying with custom images or prompts.
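
For the coordinate validation item in this checklist, a small sanity check can flag boxes that cannot be right for the image. The sketch below is an illustration rather than part of the workflow, and findInvalidBoxes is a hypothetical helper name.

```javascript
// Hypothetical helper: return the boxes that fail basic sanity checks
// against the image size. An empty result means every box looks plausible.
function findInvalidBoxes(coords, width, height) {
  return coords.filter(({ xmin, ymin, xmax, ymax }) => {
    const finite = [xmin, ymin, xmax, ymax].every(Number.isFinite);
    const ordered = xmin < xmax && ymin < ymax;
    const inBounds = xmin >= 0 && ymin >= 0 && xmax <= width && ymax <= height;
    return !(finite && ordered && inBounds);
  });
}
```

If many boxes are flagged, compare the width and height passed in with the values reported by the Get Image Info node before suspecting the model.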

8. Deployment Guide

After testing, choose the trigger that matches how you want to run the workflow. The included Manual Trigger suits simple on-demand runs from the editor; for unattended runs, use a different trigger node and activate the workflow.

For ongoing use, integrate other trigger nodes like Webhooks or Schedules to automate detection for new images uploaded to your systems.

Monitor recent workflow executions in n8n to catch any API changes or errors.

9. FAQs

Can I use images stored locally instead of URLs?

Yes. You can either upload the images to a web-accessible location, or use n8n’s binary data handling to pass them directly into the HTTP Request node that calls the Gemini API.

Does this consume my Google Gemini API credits?

Yes, each API call to Gemini 2.0 for object detection uses your quota according to Google’s pricing and limits.

Is my data secure when using this API?

Google Gemini API uses secure HTTPS connections. Always safeguard your API keys and avoid exposing sensitive images unnecessarily.

Can this handle detection in complex images with many objects?

While Gemini 2.0 is advanced, very crowded scenes might require multiple passes or more specific prompts for best results.

10. Conclusion

By following this guide, you’ve set up an advanced image object detection workflow using Google Gemini 2.0 inside n8n. You automated tagging of rabbit objects from images, cutting down hours of manual labor and boosting accuracy.

This approach scales to many detection scenarios just by changing prompts in the HTTP node, empowering you to create responsive AI-assisted image processing pipelines.

Next, consider adding automated storage and sharing of annotated images or linking this with real-time content updates on websites or marketing platforms.

Keep experimenting with different object detection prompts and image sources to unlock new creative automation possibilities! Happy automating!
