Question 1

What is YOLO depth estimation?

Accepted Answer

YOLO depth estimation refers to YOLO-Depth, an announced monocular depth estimation model in the YOLO family, part of the YOLO27 generation and planned for September 2026. Monocular depth estimation takes a standard 2D image from a single camera and produces a depth map, where every pixel value corresponds to distance from the camera. It adds the third dimension, how far away each object is, to the usual what and where. Monocular depth is typically relative rather than absolute, so converting it to real-world units like meters requires a one-time calibration against a known reference.

Question 2

How do I detect objects and estimate distance from one camera today?

Accepted Answer

You can build detection plus depth from a single camera in Roboflow Workflows right now. Add an Object Detection block (RF-DETR, trained on your own classes or a pre-trained checkpoint), add the Depth Estimation block that runs Depth Anything 3 on the same image, then fuse the two by sampling the depth map at the center of each bounding box. A one-time calibration converts the values to real-world units. Deploy the Workflow with Inference on a video stream, at the edge, on-prem, or in the cloud.
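The fusion step described above, sampling the depth map at the center of each bounding box, can be sketched in plain Python. This is a minimal illustration and not the Workflows implementation: the function name `depth_at_box_centers`, the corner-style box layout `(x_min, y_min, x_max, y_max)`, and the toy depth map are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical box layout for this sketch: pixel corners (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def depth_at_box_centers(
    depth_map: Sequence[Sequence[float]], boxes: Sequence[Box]
) -> List[float]:
    """Return the depth value found at the center pixel of each box."""
    height = len(depth_map)
    width = len(depth_map[0])
    samples = []
    for x_min, y_min, x_max, y_max in boxes:
        # Center of the box, truncated to an integer pixel index.
        cx = int((x_min + x_max) / 2)
        cy = int((y_min + y_max) / 2)
        # Clamp so boxes that extend past the image edge still index safely.
        cx = min(max(cx, 0), width - 1)
        cy = min(max(cy, 0), height - 1)
        # Depth maps are indexed row first (y), then column (x).
        samples.append(depth_map[cy][cx])
    return samples


# Toy 4x4 relative depth map and two detections (assumed values).
depth_map = [
    [0.1, 0.1, 0.2, 0.2],
    [0.1, 0.5, 0.5, 0.2],
    [0.1, 0.5, 0.9, 0.2],
    [0.1, 0.1, 0.2, 0.2],
]
boxes = [(0, 0, 2, 2), (2, 2, 4, 4)]
print(depth_at_box_centers(depth_map, boxes))  # [0.5, 0.2]
```

A single center pixel is the simplest choice and matches the description above. If the center of a box often falls on background rather than on the object, taking the median over a small patch around the center is a common alternative.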

Question 3

Is the licensing safe for commercial and embedded products?

Accepted Answer

RF-DETR, the recommended detector for a detection-plus-depth Workflow, is released under Apache 2.0, a permissive license with no copyleft obligations, so you can build it into commercial and embedded products without a separate per-deployment license. YOLO-Depth licensing has not been announced, and previous similar YOLO releases shipped under AGPL-3.0, which requires open-sourcing derivative works unless you buy a commercial license. If you are evaluating models for commercial deployment, confirm the YOLO-Depth license terms before you build on it.

Question 4

Can I get metric depth, and what about stereo?

Accepted Answer

Monocular depth is relative, but you can convert it to metric distances with a one-time calibration: record the depth value of an object at a known distance, then scale new readings against that reference. If your deployment can mount a calibrated stereo pair, binocular disparity yields absolute depth without that calibration step, and YOLO-StereoDepth was announced alongside YOLO-Depth for September 2026. Because Roboflow Workflows keeps detection and depth as separate blocks, you can swap in better depth models as they ship without rebuilding the pipeline.
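The one-time calibration can be sketched as a single scale factor. This example assumes the depth model's output grows in proportion to true distance; if a model emits inverse depth (larger values for closer objects), the readings would need inverting first. The helper name `make_scale_calibration` and the sample numbers are hypothetical.

```python
from typing import Callable


def make_scale_calibration(
    reference_reading: float, known_distance_m: float
) -> Callable[[float], float]:
    """Build a converter from relative depth readings to meters.

    Assumes readings are proportional to distance, so one reference
    measurement is enough to fix the scale.
    """
    if reference_reading <= 0:
        raise ValueError("reference_reading must be positive")
    scale = known_distance_m / reference_reading

    def to_meters(reading: float) -> float:
        return reading * scale

    return to_meters


# Calibrate once: an object known to be 2.0 m away read 0.5 in the depth map.
to_meters = make_scale_calibration(reference_reading=0.5, known_distance_m=2.0)

print(to_meters(0.25))  # 1.0
print(to_meters(0.75))  # 3.0
```

A single reference point fixes only the scale. If readings also carry an offset, a second reference at a different known distance would be needed to fit both terms.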

YOLO Depth Estimation, Built for Production

Detect objects and estimate distance from one camera

Detect objects

Estimate depth

Fuse and calibrate

Deploy to the edge

Depth from a camera comes two ways. Roboflow handles both.

Monocular depth (one camera)

Stereo depth (two cameras)

Your models and data stay yours

Commercial-safe licensing by default

Enterprise security and full data sovereignty

Depth from cameras you already have

Available today, not a 2026 roadmap item

Vision AI is already running in production

Frequently asked questions

Build detection plus depth today
