
Vision-Based Autonomous Quadrotor

Vision-guided autonomous flight and precision perching for agricultural monitoring.

Autonomous Perching

Control Mode: Visual Servo

University of Utah

MS Thesis

Tech Stack

OpenCV · PID Control · ROS · Python · C++ · Visual Servoing · MATLAB/Simulink

The Challenge

Commercial drones at the time required GPS for stable flight, making them unusable in dense vineyard canopies where satellite signal was occluded. The system needed to autonomously locate a branch-like structure, align itself, and perch using only camera input — with bird-leg mechanisms attached to the frame adding nonlinear mass distribution and making the dynamics significantly harder to stabilize.

Architecture & System Design

Vision-Based Autonomous Quadrotor system architecture

Camera-based visual control loop: drone sees target on ground, computes positioning error, adjusts flight path using feedback control. Perching mechanism attached to frame changes dynamics, requiring stability analysis and adaptive controller gains.

Full system schematic available upon request

The quadrotor used a downward-facing camera feeding OpenCV-based feature detection to compute a visual error signal. A visual servoing loop drove the PID controllers for x/y positioning and yaw alignment. The nonlinear dynamics introduced by the bird-leg perching structures required a Lyapunov-based stability analysis and controller gain-scheduling. MATLAB/Simulink was used for initial simulation; the validated controllers were then ported to the onboard flight controller via ROS.
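A minimal sketch of the gain-scheduling idea: interpolate PID gains over a small lookup table keyed on leg configuration. The breakpoints, gain values, and the `scheduled_gains` helper are illustrative assumptions, not the thesis numbers.

```python
import bisect

# Hypothetical gain table: leg deployment angle (deg) -> (Kp, Kd).
# Values are illustrative only -- more deployment means more added
# mass offset, so the schedule trades proportional gain for damping.
SCHEDULE = [
    (0.0,  (0.80, 0.30)),   # legs retracted: nominal dynamics
    (45.0, (0.65, 0.40)),   # legs half-deployed: extra damping
    (90.0, (0.50, 0.55)),   # legs fully deployed: slowest, most damped
]

def scheduled_gains(leg_angle_deg):
    """Linearly interpolate (Kp, Kd) between schedule breakpoints."""
    angles = [a for a, _ in SCHEDULE]
    if leg_angle_deg <= angles[0]:
        return SCHEDULE[0][1]
    if leg_angle_deg >= angles[-1]:
        return SCHEDULE[-1][1]
    i = bisect.bisect_right(angles, leg_angle_deg)
    a0, (kp0, kd0) = SCHEDULE[i - 1]
    a1, (kp1, kd1) = SCHEDULE[i]
    t = (leg_angle_deg - a0) / (a1 - a0)
    return kp0 + t * (kp1 - kp0), kd0 + t * (kd1 - kd0)
```

Interpolating rather than switching gains abruptly avoids injecting a step into the control command mid-flight.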

Code Walkthrough

3-step walk-through of the implementation — file paths and intent shown above each block.

  1. Step 1 of 3

    Target detection and visual error computation

    vision/visual_servo.py

    The only sensor is a downward-facing camera. The visual error — how far the branch centroid is from the frame centre — is the sole input to the flight controller. Finding contours on a thresholded frame and computing image moments is fast enough to run at video rate on the onboard computer, with no learned model needed for a constrained lab target.

    python
    import cv2

    def compute_visual_error(frame):
        """
        Detect branch centroid and return pixel-space error from frame centre.
        Returns (None, None) when the target is not visible.
        """
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
        contours, _ = cv2.findContours(
            thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
        )
        if not contours:
            return None, None
    
        # Largest contour = branch structure
        target = max(contours, key=cv2.contourArea)
        M = cv2.moments(target)
        if M['m00'] == 0:
            return None, None
    
        cx = int(M['m10'] / M['m00'])
        cy = int(M['m01'] / M['m00'])
    
        # Error = centroid offset from frame centre (pixels)
        frame_cx, frame_cy = frame.shape[1] // 2, frame.shape[0] // 2
        return cx - frame_cx, cy - frame_cy
    Takeaway

    Image moments give centroid in one pass with no learned model — sufficient for a lab target and fast enough to run every frame without buffering.
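The moment arithmetic can be checked by hand on a synthetic binary target, here in plain Python so no OpenCV is needed. The square target and its position are assumptions for illustration.

```python
# Synthetic 640x480 frame: a 40x40 "branch" target whose lit pixels
# are listed explicitly (a stand-in for the thresholded contour above).
W, H = 640, 480
pixels = [(x, y) for y in range(280, 320) for x in range(380, 420)]

# Raw moments -- for a binary target these match cv2.moments() up to
# the constant intensity scale, which cancels in the centroid.
m00 = len(pixels)
m10 = sum(x for x, _ in pixels)
m01 = sum(y for _, y in pixels)
cx, cy = m10 // m00, m01 // m00

# Error = centroid offset from frame centre, as in the function above.
err_x, err_y = cx - W // 2, cy - H // 2
print(err_x, err_y)   # positive: target sits right of and below centre
```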

  2. Step 2 of 3

    IBVS control law — pixel error to velocity command

    vision/visual_servo.py

    Image-Based Visual Servoing directly maps pixel error to body velocity, skipping 3D pose estimation. The PID controller runs in pixel space: proportional gain drives the drone toward the target, derivative damps oscillation as it closes in, and the tiny integral term removes steady-state offset when hover isn't perfectly trimmed.

    python
    def pid_update(error, prev_error, integral,
                   Kp=0.8, Ki=0.01, Kd=0.3, dt=0.033):
        """
        Single-axis discrete PID in pixel space.
        dt = 1/30 s at 30 fps camera rate.
        """
        integral   += error * dt
        derivative  = (error - prev_error) / dt
        output      = Kp * error + Ki * integral + Kd * derivative
        return output, integral
    
    
    # Servo loop — called every frame
    def servo_step(frame, state):
        err_x, err_y = compute_visual_error(frame)
        if err_x is None:
            return 0.0, 0.0, state   # hold position if target lost
    
        vx, state['ix'] = pid_update(err_x, state['px'], state['ix'])
        vy, state['iy'] = pid_update(err_y, state['py'], state['iy'])
        state['px'], state['py'] = err_x, err_y
        return vx, vy, state
    Takeaway

    Running PID in pixel space avoids the camera calibration needed for full IBVS — valid when the target is planar and the gain-to-scale conversion can be tuned empirically.
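A quick way to sanity-check the gains is to close the loop around a toy plant. The pure-integrator plant below (pixel error shrinks by commanded velocity times dt) is an assumption for illustration, not the thesis dynamics; pid_update is repeated from the block above so the snippet runs standalone.

```python
def pid_update(error, prev_error, integral,
               Kp=0.8, Ki=0.01, Kd=0.3, dt=0.033):
    integral += error * dt
    derivative = (error - prev_error) / dt
    return Kp * error + Ki * integral + Kd * derivative, integral

# Toy plant: the error is reduced directly by the commanded velocity.
err, prev, integral = 100.0, 100.0, 0.0   # start 100 px off-centre
for _ in range(300):                       # ~10 s at 30 fps
    v, integral = pid_update(err, prev, integral)
    prev, err = err, err - v * 0.033

print(abs(err) < 5.0)   # has the error settled to within 5 px?
```

Even on this crude plant the gain set converges without sustained oscillation, which is the behaviour the derivative term is there to guarantee.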

  3. Step 3 of 3

    ROS node — publish velocity commands to flight controller

    ros/visual_servo_node.py

    The vision loop needs to publish at camera rate (30 Hz) and respond to ROS shutdown cleanly. The node subscribes to the raw image topic, runs the servo step, and publishes a geometry_msgs/Twist to /cmd_vel — the standard interface the ArduPilot-based flight controller expects for guided velocity mode.

    python
    #!/usr/bin/env python
    """visual_servo_node.py — ROS node: camera frame → /cmd_vel velocity command."""
    import rospy
    from geometry_msgs.msg import Twist
    from sensor_msgs.msg import Image
    from cv_bridge import CvBridge
    from vision.visual_servo import servo_step   # step 2 above
    
    bridge = CvBridge()
    pub    = None
    state  = {'px': 0, 'py': 0, 'ix': 0.0, 'iy': 0.0}
    
    PIXEL_TO_MS = 0.008   # empirical: 1 px error ≈ 0.008 m/s command
    
    
    def image_callback(msg):
        frame = bridge.imgmsg_to_cv2(msg, "bgr8")
        vx, vy, _ = servo_step(frame, state)
    
        cmd = Twist()
        cmd.linear.x =  vx * PIXEL_TO_MS   # forward/back
        cmd.linear.y = -vy * PIXEL_TO_MS   # camera y inverted vs body y
        cmd.linear.z =  0.0
        pub.publish(cmd)
    
    
    if __name__ == "__main__":
        rospy.init_node("visual_servo")
        pub = rospy.Publisher("/cmd_vel", Twist, queue_size=1)
        rospy.Subscriber("/camera/image_raw", Image, image_callback)
        rospy.loginfo("Visual servo node started")
        rospy.spin()
    Takeaway

    The PIXEL_TO_MS constant is the only empirical tune between the vision math and the flight controller — isolating it makes gain adjustment a one-line change during field testing.
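One related safeguard worth sketching: in guided velocity mode, a large pixel error should not be allowed to demand an unsafe speed. The V_MAX limit and the to_velocity helper are assumptions for illustration, not part of the node above.

```python
PIXEL_TO_MS = 0.008   # same empirical gain as in the node above
V_MAX = 0.5           # assumed safety limit in m/s, not from the thesis

def to_velocity(err_px):
    """Pixel error -> clamped velocity command (m/s)."""
    return max(-V_MAX, min(V_MAX, err_px * PIXEL_TO_MS))

small = to_velocity(40)    # proportional region
large = to_velocity(400)   # saturates at V_MAX
```

Clamping at the conversion point keeps the PID state untouched, so the controller resumes proportional behaviour the moment the error re-enters the linear region.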

Results

The system successfully demonstrated autonomous perching on simulated vineyard branch structures in lab conditions. The visual servoing controller achieved stable hover within 5 cm of the target using only camera feedback, with the nonlinear perching-leg dynamics compensated through gain scheduling. The work contributed to the understanding of vision-only navigation in GPS-denied agricultural environments.

Gallery & Demos

Vision-Based Autonomous Quadrotor screenshot

More from University of Utah

Multi-Arm Coordination — 2-DOF QUANSER

Researcher

Dual-arm robotic manipulation system using 2-DOF QUANSER robots with a master-slave architecture — one arm controlling position, the other controlling force — to collaboratively manipulate objects with precision.

MATLAB/Simulink · QUARC · Impedance Control

Adaptive Backstepping — Indoor Micro-Quadrotor

Researcher

Nonlinear controller design for an indoor micro-quadrotor with a suspended pendulum mass — a highly unstable configuration. Adaptive backstepping outperformed classical PID in robustness tests across multiple flight regimes.

MATLAB/Simulink · Backstepping Control · Lyapunov Analysis

Sensor-Based SLAM Navigation — iRobot Create

Researcher

Autonomous mapping and navigation system on an iRobot Create platform using IR rangefinders and servo-mounted sensors for 360° SLAM — with RRT path planning to navigate complex maze environments.

MATLAB · iRobot Create · RRT Path Planning

PUMA 6-DOF Robot Arm — Forward & Inverse Kinematics

Researcher

Full forward and inverse kinematics solver for a 6-DOF PUMA 762 robot arm, built from scratch using Denavit-Hartenberg parameters — with an interactive 3D MATLAB GUI featuring joint sliders, motion trail, and collision detection.

MATLAB · Denavit-Hartenberg · Robotics Toolbox

Sampling-Based & Graph-Search Motion Planning

Researcher

MATLAB implementations of four canonical path-planning algorithms — Dijkstra, A*, PRM, and RRT — applied to a differential-drive robot navigating bitmap maps in configuration space, with real hardware execution on an iRobot Create.

MATLAB · RRT · PRM

Monocular Depth Estimation for UAV Perch Landing

Researcher

C++/OpenCV vision system that estimates the 3D position and orientation of a landing perch from a single monocular camera — using image moments, covariance eigendecomposition for attitude, and focal-length triangulation for depth — enabling closed-loop visual servoing on a quadrotor.

C++ · OpenCV · HSV Segmentation

RC Fixed-Wing Glider — Servo Actuation & Aerodynamics

Researcher

Fixed-wing RC glider designed and built from scratch — two servos providing roll and pitch authority via aileron and flap control surfaces, with a brushless DC motor and ESC delivering forward thrust. Flight-tested outdoors.

Servo Control · Brushless DC Motor · ESC

Interested in this work?

Full architecture walkthrough and code review available during interviews.