Monocular Depth Estimation for UAV Perch Landing
No depth sensor — just math. Recovering 3D pose from a single camera using image moments and projective geometry.
0.67 cm
Error @ 25 cm
5.7 cm
Error @ 1 m
3.22 mm
Focal Length
Tech Stack: C++ · OpenCV
The Challenge
Autonomous UAV landing on a perch requires knowing the perch's 3D position and the camera's attitude relative to it — without a depth sensor, IMU, or GPS. The system had to recover all of this from a single video frame using only the geometric properties of the detected blob: its centroid, principal axes, and apparent size. The depth estimation formula Z = f · X_real / X_pixels is simple in principle but sensitive to blob detection noise, so the pipeline needed to be robust to partial occlusion and varying lighting.
Architecture & System Design

A single camera estimates the 3D target pose using computer vision: image processing extracts target features, and geometric math recovers position and orientation from 2D image moments. This enables closed-loop visual control without a depth sensor.
The pipeline runs in real time on each video frame:
1. Convert BGR → HSV and threshold for the target colour (yellow: H=20–30, S=50–255, V=100–255) to isolate the blob.
2. Compute zeroth-, first-, and second-order image moments (M00, M10, M01, M20, M02, M11) to recover the centroid (cx, cy) and shape covariance.
3. Eigendecompose the 2×2 covariance matrix to get the principal axes and orientation angle θ — this gives camera roll/pitch relative to the perch.
4. Depth: Z = f · X_real / X_pixels, where the focal length f = 3.22 mm was measured via calibration.
The output [cx, cy, θ, Z] is written to a position matrix file and used as the visual error signal for the PID controller. A complementary OpenCV toolkit was developed alongside: Canny edge detection with interactive threshold sliders, an HSV range calibration tool, and SURF feature matching.
Code Walkthrough
3-step walk-through of the production implementation — file paths and intent shown above each block.
- 01
Step 1 of 3
HSV colour segmentation and morphological cleanup
vision/blob.cpp

The perch marker is a known yellow colour. Converting to HSV decouples hue (colour identity) from brightness, making the threshold robust to shadows and changing illumination that would break an RGB threshold. Morphological open+close removes noise speckles and fills small holes before computing moments — noisy blobs produce wildly wrong centroids.
```cpp
// blob.cpp — Step 1: Isolate the yellow perch marker via HSV threshold.
// Focal length f = 3.22 mm measured via checkerboard calibration.
cv::Mat hsv;
cv::cvtColor(frame, hsv, CV_BGR2HSV);

// HSV range calibrated with the interactive calibration tool
cv::Mat mask;
cv::inRange(hsv,
            cv::Scalar(20, 50, 100),   // lower: H=20, S=50,  V=100
            cv::Scalar(30, 255, 255),  // upper: H=30, S=255, V=255
            mask);

// Morphological cleanup: open removes speckle, close fills holes
cv::Mat kernel = cv::getStructuringElement(
    cv::MORPH_ELLIPSE, cv::Size(5, 5));
cv::morphologyEx(mask, mask, cv::MORPH_OPEN, kernel);
cv::morphologyEx(mask, mask, cv::MORPH_CLOSE, kernel);
```

Takeaway: HSV separation lets you tune H for colour identity and S/V for saturation/brightness independently — three sliders instead of six, and the threshold stays stable under lab lighting changes.
- 02
Step 2 of 3
Image moments → centroid and principal axis angle
vision/blob.cpp

Image moments up to second order give the centroid (M10/M00, M01/M00) and the covariance of the blob shape. Eigendecomposing the 2×2 covariance matrix recovers the principal axis angle θ — this is the camera's roll/pitch attitude relative to the perch surface, extracted from a single frame with no IMU or depth sensor.
```cpp
// blob.cpp — Step 2: Image moments → centroid and orientation angle.
cv::Moments m = cv::moments(mask, true);
if (m.m00 < 500) {
    // Blob too small — target not in frame or heavily occluded; skip frame
    posMatrix << "0 0 0 0 ";
    continue;
}
double cx = m.m10 / m.m00;  // centroid x [pixels]
double cy = m.m01 / m.m00;  // centroid y [pixels]

// Normalised second-order central moments (covariance of blob shape)
double mu20 = m.mu20 / m.m00;
double mu02 = m.mu02 / m.m00;
double mu11 = m.mu11 / m.m00;

// Principal axis angle from covariance eigendecomposition
// θ = 0.5 · atan2(2μ₁₁, μ₂₀ − μ₀₂)   [radians]
double theta = 0.5 * std::atan2(2.0 * mu11, mu20 - mu02);
```

Takeaway: The atan2 formula for the principal axis is a direct eigendecomposition of the 2×2 blob covariance — no matrix library needed, and atan2 correctly handles the degenerate case when mu20 == mu02.
- 03
Step 3 of 3
Monocular depth from focal-length triangulation
vision/blob.cpp

With no depth sensor, depth is recovered from apparent blob size: Z = f · X_real / X_pixels, where X_real is the known physical perch diameter. The second moment √μ₂₀ approximates the blob's pixel half-width, giving depth without a stereo pair. Results are written to a position file consumed by the PID controller node.
```cpp
// blob.cpp — Step 3: Depth from focal-length triangulation.
// Z = f · X_real / X_pixels  (f = 3.22 mm, X_real = 25 mm perch diameter)
// Calibrated proportionality constant k = 8e-4 [m·px] absorbs f and X_real
double blob_half_width = std::sqrt(mu20);  // pixel std-dev ≈ half-width

// Guard against divide-by-zero when blob is tiny or fully occluded
if (blob_half_width < 2.0) {
    posMatrix << "0 0 0 0 ";
    continue;
}
double Z_est = 8e-4 / blob_half_width;  // estimated depth [m]

// Write [cx, cy, theta, Z] — consumed by PID servo controller
posMatrix << cx << " " << cy << " " << theta << " " << Z_est << " ";

// Debug overlay: centroid dot + depth annotation
cv::circle(frame, cv::Point(cx, cy), 4, cv::Scalar(0, 255, 0), -1);
cv::putText(frame, cv::format("Z=%.2fm", Z_est), cv::Point(10, 30),
            cv::FONT_HERSHEY_SIMPLEX, 0.7, cv::Scalar(0, 255, 0), 2);
```

Takeaway: The constant k (8e-4) encodes both the focal length and the known object size — recalibrate just k when the perch diameter changes, and the depth formula stays the same.
Results
Depth estimation accuracy: 0.67 cm error at 25 cm, 5.7 cm at 1 m, and 25 cm at 2 m — relative error grows with distance, as expected from the triangulation model's sensitivity to pixel noise at small apparent sizes. Orientation estimation successfully recovered camera attitude (roll/pitch) relative to the perch surface at all tested distances. The full pipeline ran at real-time video rates in C++ on a laptop. Results were validated against a Vicon motion-capture ground-truth system and documented in the final project report.
Gallery & Demos
More from University of Utah
Vision-Based Autonomous Quadrotor
MS thesis project: a quadrotor capable of autonomously taking off, navigating, and perching on branch-like structures using only visual feedback — designed for autonomous crop monitoring in agricultural fields.
Multi-Arm Coordination — 2-DOF QUANSER
Dual-arm robotic manipulation system using 2-DOF QUANSER robots with a master-slave architecture — one arm controlling position, the other controlling force — to collaboratively manipulate objects with precision.
Adaptive Backstepping — Indoor Micro-Quadrotor
Nonlinear controller design for an indoor micro-quadrotor with a suspended pendulum mass — a highly unstable configuration. Adaptive backstepping outperformed classical PID in robustness tests across multiple flight regimes.
Sensor-Based SLAM Navigation — iRobot Create
Autonomous mapping and navigation system on an iRobot Create platform using IR rangefinders and servo-mounted sensors for 360° SLAM — with RRT path planning to navigate complex maze environments.
PUMA 6-DOF Robot Arm — Forward & Inverse Kinematics
Full forward and inverse kinematics solver for a 6-DOF PUMA 762 robot arm, built from scratch using Denavit-Hartenberg parameters — with an interactive 3D MATLAB GUI featuring joint sliders, motion trail, and collision detection.
Sampling-Based & Graph-Search Motion Planning
MATLAB implementations of four canonical path-planning algorithms — Dijkstra, A*, PRM, and RRT — applied to a differential-drive robot navigating bitmap maps in configuration space, with real hardware execution on an iRobot Create.
RC Fixed-Wing Glider — Servo Actuation & Aerodynamics
Fixed-wing RC glider designed and built from scratch — two servos providing roll and pitch authority via aileron and flap control surfaces, with a brushless DC motor and ESC delivering forward thrust. Flight-tested outdoors.
Interested in this work?
Full architecture walkthrough and code review available during interviews.


