
Monocular Depth Estimation for UAV Perch Landing

No depth sensor — just math. Recovering 3D pose from a single camera using image moments and projective geometry.

0.67 cm

Error @ 25 cm

5.7 cm

Error @ 1 m

3.22 mm

Focal Length

Tech Stack

C++ · OpenCV · HSV Segmentation · Image Moments · Kalman Filter · CMake · Visual Servoing

The Challenge

Autonomous UAV landing on a perch requires knowing the perch's 3D position and the camera's attitude relative to it — without a depth sensor, IMU, or GPS. The system had to recover all of this from a single video frame using only the geometric properties of the detected blob: its centroid, principal axes, and apparent size. The depth estimation formula Z = f·X/X₀ is simple in principle but sensitive to blob detection noise, so the pipeline needed to be robust to partial occlusion and varying lighting.

Architecture & System Design

System architecture diagram

Single camera estimates 3D target pose using computer vision: image processing extracts target features, geometric math recovers position and orientation from 2D image moments. Enables closed-loop visual control without depth sensor.

Full system schematic available upon request

The pipeline runs in real time on each video frame:

  1. Convert BGR → HSV and threshold for the target colour (yellow: H=20–30, S=50–255, V=100–255) to isolate the blob.

  2. Compute zeroth-, first-, and second-order image moments (M00, M10, M01, M20, M02, M11) to recover the centroid (cx, cy) and shape covariance.

  3. Eigendecompose the 2×2 covariance matrix to get the principal axes and orientation angle θ — the camera's roll/pitch relative to the perch.

  4. Compute depth as Z = f · X_real / X_pixels, where the focal length f = 3.22 mm was measured via calibration.

The output [cx, cy, θ, Z] is written to a position matrix file and used as the visual error signal for the PID controller. A complementary OpenCV toolkit was developed alongside the pipeline: Canny edge detection with interactive threshold sliders, an HSV range calibration tool, and SURF feature matching.
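Steps 2–4 of the pipeline are pure arithmetic on the binary mask, so they can be sketched without OpenCV. The sketch below is a minimal illustration, not the production code — the function and struct names are hypothetical, and the depth constant k is passed in rather than hard-coded:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Pose2D { double cx, cy, theta, Z; bool valid; };

// Steps 2-4 on a binary mask (1 = blob pixel): raw moments, central
// moments, principal-axis angle, and depth Z = k / sqrt(mu20).
Pose2D poseFromMask(const std::vector<std::vector<int>>& mask, double k) {
    double m00 = 0, m10 = 0, m01 = 0, m20 = 0, m02 = 0, m11 = 0;
    for (std::size_t y = 0; y < mask.size(); ++y)
        for (std::size_t x = 0; x < mask[y].size(); ++x)
            if (mask[y][x]) {
                m00 += 1;  m10 += x;  m01 += y;
                m20 += double(x) * x;  m02 += double(y) * y;  m11 += double(x) * y;
            }
    if (m00 < 1) return {0, 0, 0, 0, false};       // blob absent

    double cx = m10 / m00, cy = m01 / m00;         // centroid [px]
    double mu20 = m20 / m00 - cx * cx;             // normalised central moments
    double mu02 = m02 / m00 - cy * cy;             // (= blob shape covariance)
    double mu11 = m11 / m00 - cx * cy;

    double theta = 0.5 * std::atan2(2.0 * mu11, mu20 - mu02);  // principal axis
    double Z = k / std::sqrt(std::max(mu20, 1e-9));            // depth
    return {cx, cy, theta, Z, true};
}
```

For an axis-aligned rectangle of ones, mu11 = 0 and mu20 > mu02 when it is wider than tall, so theta comes out 0 and the centroid lands on the rectangle's centre — a handy sanity check before pointing the camera at a real perch.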

Code Walkthrough

3-step walk-through of the production implementation — file paths and intent shown above each block.

  1. Step 1 of 3

    HSV colour segmentation and morphological cleanup

    vision/blob.cpp

    The perch marker is a known yellow colour. Converting to HSV decouples hue (colour identity) from brightness, making the threshold robust to shadows and changing illumination that would break an RGB threshold. Morphological open+close removes noise speckles and fills small holes before computing moments — noisy blobs produce wildly wrong centroids.

    cpp
    // blob.cpp — Step 1: Isolate the yellow perch marker via HSV threshold.
    // Focal length f = 3.22 mm measured via checkerboard calibration.
    
    cv::Mat hsv;
    cv::cvtColor(frame, hsv, cv::COLOR_BGR2HSV);
    
    // HSV range calibrated with the interactive calibration tool
    cv::Mat mask;
    cv::inRange(hsv,
        cv::Scalar(20,  50, 100),   // lower: H=20, S=50,  V=100
        cv::Scalar(30, 255, 255),   // upper: H=30, S=255, V=255
        mask);
    
    // Morphological cleanup: open removes speckle, close fills holes
    cv::Mat kernel = cv::getStructuringElement(
        cv::MORPH_ELLIPSE, cv::Size(5, 5));
    cv::morphologyEx(mask, mask, cv::MORPH_OPEN,  kernel);
    cv::morphologyEx(mask, mask, cv::MORPH_CLOSE, kernel);
    Takeaway

    HSV separation lets you tune H for colour identity and S/V for saturation/brightness independently — three sliders instead of six, and the threshold stays stable under lab lighting changes.
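To see why the threshold survives brightness changes, it helps to trace OpenCV's 8-bit HSV mapping by hand. This dependency-free sketch (hypothetical function names) reproduces the convention — H stored as degrees/2 in [0, 180), S and V in [0, 255] — and checks a pixel against the Step-1 range:

```cpp
#include <algorithm>
#include <cmath>

struct HSV { double h, s, v; };   // h in [0,180), s and v in [0,255]

// OpenCV-style BGR -> HSV for one pixel (channels in [0,255]).
HSV bgrToHsv(double b, double g, double r) {
    double v  = std::max({b, g, r});
    double mn = std::min({b, g, r});
    double d  = v - mn;
    double s  = (v == 0) ? 0 : 255.0 * d / v;
    double h  = 0;
    if (d > 0) {
        if (v == r)      h = 60.0 * (g - b) / d;
        else if (v == g) h = 120.0 + 60.0 * (b - r) / d;
        else             h = 240.0 + 60.0 * (r - g) / d;
        if (h < 0) h += 360.0;
    }
    return {h / 2.0, s, v};       // OpenCV halves H so it fits in 8 bits
}

// Step-1 threshold: H=20-30, S=50-255, V=100-255 (inclusive, like cv::inRange)
bool isPerchYellow(double b, double g, double r) {
    HSV p = bgrToHsv(b, g, r);
    return p.h >= 20 && p.h <= 30 && p.s >= 50 && p.v >= 100;
}
```

Note that halving a yellow pixel's brightness changes only V, not H or S — which is precisely why the HSV threshold holds up under lighting changes that would break an RGB box threshold.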

  2. Step 2 of 3

    Image moments → centroid and principal axis angle

    vision/blob.cpp

    Image moments up to second order give the centroid (M10/M00, M01/M00) and the covariance of the blob shape. Eigendecomposing the 2×2 covariance matrix recovers the principal axis angle θ — this is the camera's roll/pitch attitude relative to the perch surface, extracted from a single frame with no IMU or depth sensor.

    cpp
    // blob.cpp — Step 2: Image moments → centroid and orientation angle.
    
    cv::Moments m = cv::moments(mask, true);
    
    if (m.m00 < 500) {
        // Blob too small — target not in frame or heavily occluded; skip frame
        posMatrix << "0 0 0 0\n";
        continue;
    }
    
    double cx = m.m10 / m.m00;   // centroid x [pixels]
    double cy = m.m01 / m.m00;   // centroid y [pixels]
    
    // Normalised second-order central moments (covariance of blob shape)
    double mu20 = m.mu20 / m.m00;
    double mu02 = m.mu02 / m.m00;
    double mu11 = m.mu11 / m.m00;
    
    // Principal axis angle from covariance eigendecomposition
    // θ = 0.5 · atan2(2μ₁₁, μ₂₀ − μ₀₂)  [radians]
    double theta = 0.5 * std::atan2(2.0 * mu11, mu20 - mu02);
    Takeaway

    The atan2 formula for principal axis is a direct eigendecomposition of the 2×2 blob covariance — no matrix library needed, and it correctly handles the degenerate case when mu20 == mu02.
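The claim that the half-angle formula is an eigendecomposition in disguise is easy to check numerically. This sketch (hypothetical function names) extracts the dominant eigenvector of the blob covariance [[μ20, μ11], [μ11, μ02]] explicitly and compares its angle with the closed form:

```cpp
#include <cmath>

// Principal-axis angle via the closed form used in Step 2.
double thetaHalfAngle(double mu20, double mu02, double mu11) {
    return 0.5 * std::atan2(2.0 * mu11, mu20 - mu02);
}

// Same angle via explicit eigendecomposition of [[mu20, mu11], [mu11, mu02]].
double thetaEigen(double mu20, double mu02, double mu11) {
    double tr   = mu20 + mu02;
    double disc = std::sqrt((mu20 - mu02) * (mu20 - mu02) + 4.0 * mu11 * mu11);
    double lmax = 0.5 * (tr + disc);               // largest eigenvalue
    // Eigenvector from row 2 of (A - lmax*I); fall back to row 1 if degenerate.
    double ex = lmax - mu02, ey = mu11;
    if (std::hypot(ex, ey) < 1e-12) { ex = mu11; ey = lmax - mu20; }
    if (std::hypot(ex, ey) < 1e-12) return 0.0;    // isotropic blob: no axis
    return std::atan2(ey, ex);
}
```

The two agree for elongated, tilted, and axis-aligned blobs alike; for a perfectly circular blob (mu20 == mu02, mu11 == 0) neither has a meaningful axis, and atan2(0, 0) conveniently returns 0 rather than crashing.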

  3. Step 3 of 3

    Monocular depth from focal-length triangulation

    vision/blob.cpp

    With no depth sensor, depth is recovered from apparent blob size: Z = f · X_real / X_pixels, where X_real is the known physical perch diameter. The second moment √μ₂₀ approximates the blob's pixel half-width, giving depth without a stereo pair. Results are written to a position file consumed by the PID controller node.

    cpp
    // blob.cpp — Step 3: Depth from focal-length triangulation.
    // Z = f · X_real / X_pixels   (f=3.22 mm, X_real=25 mm perch diameter)
    // Calibrated proportionality constant k = 8e-4, so Z [m] = k / width [px]
    
    double blob_half_width = std::sqrt(mu20);   // pixel std-dev ≈ half-width
    
    // Guard against divide-by-zero when blob is tiny or fully occluded
    if (blob_half_width < 2.0) {
        posMatrix << "0 0 0 0\n";
        continue;
    }
    
    double Z_est = 8e-4 / blob_half_width;   // estimated depth [m]
    
    // Write [cx, cy, theta, Z] — consumed by PID servo controller
    posMatrix << cx << " " << cy << " " << theta << " " << Z_est << "\n";
    
    // Debug overlay: centroid dot + depth annotation
    cv::circle(frame, cv::Point(cvRound(cx), cvRound(cy)), 4,
               cv::Scalar(0, 255, 0), -1);
    cv::putText(frame, cv::format("Z=%.2fm", Z_est),
                cv::Point(10, 30), cv::FONT_HERSHEY_SIMPLEX,
                0.7, cv::Scalar(0, 255, 0), 2);
    Takeaway

    The constant k (8e-4) encodes both the focal length and the known object size — recalibrate just k when the perch diameter changes, and the depth formula stays the same.
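Because Z·x is constant, recalibrating k needs only a single measurement: hold the perch at a known distance, read the blob half-width, and multiply. A minimal sketch with hypothetical readings (120 px at 0.25 m — the real constant depends on the camera and marker, and the production value was 8e-4):

```cpp
#include <cmath>

// One-shot calibration: since Z = k / halfwidth_px, k = Z_known * halfwidth_px.
double calibrateK(double Z_known_m, double halfwidth_px) {
    return Z_known_m * halfwidth_px;
}

double depthFromHalfWidth(double k, double halfwidth_px) {
    return k / halfwidth_px;
}
```

With the hypothetical 120 px reading at 0.25 m, k comes out to 30; a later half-width of 60 px then maps to 0.5 m. Swapping in a perch of twice the diameter simply doubles k, leaving the rest of the pipeline untouched.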

Results

Depth estimation accuracy: 0.67 cm error at 25 cm, 5.7 cm at 1 m, and 25 cm at 2 m — absolute error grows super-linearly with distance, as expected from the inverse-proportional triangulation model, since a fixed amount of pixel noise perturbs Z more when the blob is small. Orientation estimation successfully recovered camera attitude (roll/pitch) relative to the perch surface at all tested distances. The full pipeline ran at real-time video rates in C++ on a laptop. Results were validated against a Vicon motion capture ground-truth system and documented in the final project report.
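The error growth follows directly from the depth formula: with Z = k/x, a fixed detection noise dx in pixels propagates to |dZ| = Z²·dx/k, so absolute error scales with the square of range. A quick sketch of that first-order error model, using a hypothetical k = 30 and half-pixel noise:

```cpp
#include <cmath>

// First-order depth-error model for Z = k / x: |dZ| = Z^2 * dx / k.
double depthError(double Z_m, double k, double dx_px) {
    return Z_m * Z_m * dx_px / k;
}
```

Doubling the range quadruples the predicted absolute error — consistent with the super-linear trend in the measured 0.67 cm → 5.7 cm → 25 cm figures above.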

Gallery & Demos

Monocular Depth Estimation for UAV Perch Landing screenshot



More from University of Utah

Vision-Based Autonomous Quadrotor

Researcher

MS thesis project: a quadrotor capable of autonomously taking off, navigating, and perching on branch-like structures using only visual feedback — designed for autonomous crop monitoring in agricultural fields.

OpenCV · PID Control · ROS

Multi-Arm Coordination — 2-DOF QUANSER

Researcher

Dual-arm robotic manipulation system using 2-DOF QUANSER robots with a master-slave architecture — one arm controlling position, the other controlling force — to collaboratively manipulate objects with precision.

MATLAB/Simulink · QUARC · Impedance Control

Adaptive Backstepping — Indoor Micro-Quadrotor

Researcher

Nonlinear controller design for an indoor micro-quadrotor with a suspended pendulum mass — a highly unstable configuration. Adaptive backstepping outperformed classical PID in robustness tests across multiple flight regimes.

MATLAB/Simulink · Backstepping Control · Lyapunov Analysis

Sensor-Based SLAM Navigation — iRobot Create

Researcher

Autonomous mapping and navigation system on an iRobot Create platform using IR rangefinders and servo-mounted sensors for 360° SLAM — with RRT path planning to navigate complex maze environments.

MATLAB · iRobot Create · RRT Path Planning

PUMA 6-DOF Robot Arm — Forward & Inverse Kinematics

Researcher

Full forward and inverse kinematics solver for a 6-DOF PUMA 762 robot arm, built from scratch using Denavit-Hartenberg parameters — with an interactive 3D MATLAB GUI featuring joint sliders, motion trail, and collision detection.

MATLAB · Denavit-Hartenberg · Robotics Toolbox

Sampling-Based & Graph-Search Motion Planning

Researcher

MATLAB implementations of four canonical path-planning algorithms — Dijkstra, A*, PRM, and RRT — applied to a differential-drive robot navigating bitmap maps in configuration space, with real hardware execution on an iRobot Create.

MATLAB · RRT · PRM

RC Fixed-Wing Glider — Servo Actuation & Aerodynamics

Researcher

Fixed-wing RC glider designed and built from scratch — two servos providing roll and pitch authority via aileron and flap control surfaces, with a brushless DC motor and ESC delivering forward thrust. Flight-tested outdoors.

Servo Control · Brushless DC Motor · ESC

Interested in this work?

Full architecture walkthrough and code review available during interviews.