Depth Estimation

YOLO Depth Estimation, Built for Production

Detect objects and measure distance from a single camera with Roboflow. Commercial-safe licensing, enterprise security, and edge-to-cloud deployment in one platform.

Detect objects and estimate distance from one camera

YOLO-Depth is a roadmap item for September 2026. The capability it promises, detection plus depth from a single camera, is something you can build in Roboflow Workflows today.

1. Detect objects

Add an Object Detection block to a Workflow. We recommend RF-DETR, either trained on your own classes (person, forklift, part) or loaded from a pre-trained checkpoint. Tighter boxes give more reliable depth samples, because depth is read from inside each box.

2. Estimate depth

Add the Depth Estimation block and run it on the same image. It runs Depth Anything 3 and returns a per-pixel depth map alongside your detections, with no new sensors required.

3. Fuse and calibrate

Sample the depth map at the center of each bounding box to get a distance per detected object. A one-time calibration against an object at a known distance converts relative depth to real-world units.
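The fuse-and-calibrate step can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not Roboflow's implementation: the function names (`sample_center_depth`, `scale_from_reference`), the `(x_min, y_min, x_max, y_max)` box format, and the nested-list depth map are all hypothetical. It also assumes depth values grow in proportion to distance, so a single scale factor is enough; if your depth model emits inverse depth (larger means closer), invert the values first.

```python
def sample_center_depth(depth_map, box):
    """Return the depth value at the center pixel of a bounding box.

    depth_map: list of rows (H x W) of relative depth values (assumed layout).
    box: (x_min, y_min, x_max, y_max) in pixel coordinates (assumed format).
    """
    height, width = len(depth_map), len(depth_map[0])
    x_min, y_min, x_max, y_max = box
    cx = int((x_min + x_max) / 2)
    cy = int((y_min + y_max) / 2)
    # Clamp so boxes touching the image border still index a valid pixel.
    cx = min(max(cx, 0), width - 1)
    cy = min(max(cy, 0), height - 1)
    return depth_map[cy][cx]


def scale_from_reference(reference_depth, known_distance_m):
    """One-time calibration: meters per unit of relative depth."""
    if reference_depth <= 0:
        raise ValueError("reference depth must be positive")
    return known_distance_m / reference_depth


# Toy 4x6 depth map; values increase from left to right.
depth_map = [[0.25 * (col + 1) for col in range(6)] for _ in range(4)]

# Calibration: an object whose box is centered on pixel (1, 1) stands 2.0 m away.
reference_box = (0, 0, 3, 3)
scale = scale_from_reference(sample_center_depth(depth_map, reference_box), 2.0)

# New detection whose box is centered on pixel (4, 2).
detection_box = (3, 1, 5, 3)
distance_m = scale * sample_center_depth(depth_map, detection_box)
```

A single center pixel is sensitive to noise and to boxes whose center falls on the background; taking the median over a small window around the center is a common variation on the same idea.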

4. Deploy to the edge

Deploy the Workflow with Inference to run on a live video stream at the edge, on-prem, in your VPC, or via API. Proximity alerts, spacing checks, grasp distance, and object sizing are all logic layered on top of the same pipeline.
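As a rough illustration of what such downstream logic can look like, the sketch below raises a proximity alert for watched classes that come closer to the camera than a threshold. Everything in it is hypothetical: the `Detection` record, the class names, the `distance_m` field (assumed to hold the calibrated distance from the camera), and the 2.0 m threshold are illustrative choices, not part of any Roboflow API.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    class_name: str
    distance_m: float  # calibrated distance from the camera (assumed field)


def proximity_alerts(detections, watched_classes, threshold_m):
    """Return watched detections closer than threshold_m, nearest first."""
    close = [
        d for d in detections
        if d.class_name in watched_classes and d.distance_m < threshold_m
    ]
    return sorted(close, key=lambda d: d.distance_m)


# One frame's worth of detections with per-object distances.
frame = [
    Detection("person", 1.2),
    Detection("forklift", 4.8),
    Detection("person", 3.5),
    Detection("part", 0.6),
]
alerts = proximity_alerts(frame, watched_classes={"person"}, threshold_m=2.0)
```

On a video stream the same function would run once per frame; in practice you would likely require several consecutive frames below the threshold before alerting, to avoid reacting to a single noisy depth reading.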

Try RF-DETR live in the model playground

Depth from a camera comes two ways. Roboflow handles both.

Monocular depth (one camera)

A single RGB camera is the cheapest, most widely deployed sensor in the world. Monocular depth estimation predicts per-pixel depth from one camera image. The output is relative depth, which a one-time calibration turns into real-world units. This runs today in Roboflow Workflows with Depth Anything 3.

Stereo depth (two cameras)

A calibrated stereo pair computes absolute, metric depth from binocular disparity, a camera-native alternative to lidar for robotics. YOLO-StereoDepth was announced for September 2026; until then, a calibrated stereo setup combined with the same detect-then-sample Workflow pattern gives you metric distance per object.
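For a rectified stereo pair, the standard pinhole relation between disparity and depth is Z = f * B / d, where f is the focal length in pixels, B is the baseline between the two cameras, and d is the disparity in pixels. The snippet below is a minimal sketch of that relation only. The function name and the example numbers are illustrative, and it does not represent how YOLO-StereoDepth or any Roboflow block computes depth.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Metric depth for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive (zero means a point at infinity)")
    return focal_length_px * baseline_m / disparity_px


# Illustrative numbers: 800 px focal length, 12.5 cm baseline, 50 px disparity.
depth_m = depth_from_disparity(800.0, 0.125, 50.0)
```

Because depth is inversely proportional to disparity, a fixed one-pixel matching error costs more accuracy at long range than at short range, which is worth keeping in mind when choosing a baseline.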

Either way, Roboflow gives you 3D understanding from the cameras you already have, with detection and depth as separate blocks you can swap as better models ship, no rebuild required.

Your models and data stay yours

Commercial-safe by license, secure by architecture, and available today rather than on a 2026 roadmap.

Commercial-safe licensing by default

RF-DETR, the recommended detector for a detection-plus-depth pipeline, ships under the permissive Apache 2.0 license, with no copyleft obligations. YOLO-Depth licensing is unannounced, and previous YOLO releases shipped under AGPL-3.0. Build on a license you can trust in production.

Enterprise security and full data sovereignty

Roboflow is a US-based platform with SOC 2 Type II compliance, encryption in transit and at rest, and an uptime SLA. Deploy on-prem, in your own VPC, or fully air-gapped, so your camera feeds, depth data, and trained models never leave your infrastructure and never cross a border you did not choose.

Depth from cameras you already have

Extract 3D understanding from the single RGB cameras already mounted on your line, in your warehouse, or on your vehicle. No stereo rigs, no lidar, no new sensors, and no new capex to get distance from the frame.

Available today, not a 2026 roadmap item

YOLO-Depth is planned for September 2026. Detection plus depth from one camera works now in Roboflow Workflows with RF-DETR and Depth Anything 3, so you can build proximity alerts, spacing, and grasp planning this quarter.

Vision AI is already running in production

Half the Fortune 100 build computer vision with Roboflow, with models deployed in warehouses, on robots, on vehicles, and on plant floors.

55B+ model inferences run in production across critical industries
1M+ engineers and 16,000+ organizations building on the platform
Edge to cloud: models deployed to edge, on-prem, VPC, and API from one platform

Trusted by teams at BNSF, Rivian, GE Vernova, Cummins, USG, Pella, and Peer Robotics.

Frequently asked questions

What is YOLO depth estimation?

YOLO depth estimation refers to YOLO-Depth, an announced monocular depth estimation model in the YOLO family, part of the YOLO27 generation and planned for September 2026. Monocular depth estimation takes a standard 2D image from a single camera and produces a depth map in which each pixel value encodes distance from the camera. This adds a third dimension, how far away each object is, to the what and where that detection already provides. Monocular depth is typically relative rather than absolute, so converting it to real-world units like meters requires a one-time calibration against a known reference.

How do I detect objects and estimate distance from one camera today?

You can build detection plus depth from a single camera in Roboflow Workflows right now. Add an Object Detection block (RF-DETR, either trained on your own classes or loaded from a pre-trained checkpoint), add the Depth Estimation block that runs Depth Anything 3 on the same image, then fuse the two by sampling the depth map at the center of each bounding box. A one-time calibration converts the sampled values to real-world units. Deploy the Workflow with Inference to process a video stream at the edge, on-prem, or in the cloud.

Is the licensing safe for commercial and embedded products?

RF-DETR, the recommended detector for a detection-plus-depth Workflow, is released under Apache 2.0, a permissive license with no copyleft obligations, so you can build it into commercial and embedded products without a separate per-deployment license. YOLO-Depth licensing has not been announced, and previous YOLO releases shipped under AGPL-3.0, which requires open-sourcing derivative works unless you buy a commercial license. If you are evaluating models for commercial deployment, confirm the license terms before you build on them.

Can I get metric depth, and what about stereo?

Monocular depth from a single camera is relative, but you can convert it to metric distances with a one-time calibration: record the depth value of an object at a known distance, then scale new readings against that reference. If your deployment can mount a calibrated stereo pair, binocular disparity produces absolute depth without that step, and YOLO-StereoDepth was announced alongside YOLO-Depth for September 2026. Because Roboflow Workflows keeps detection and depth as separate blocks, you can swap in better depth models as they ship without rebuilding the pipeline.

Build detection plus depth today

Detect objects and measure distance from a single camera with Roboflow Workflows. Turn detections into spatial decisions.

