University of Zurich Researchers Introduce Swift: An Autonomous Vision-Based Drone That Can Beat Human World Champions in Several Fair Head-to-Head Races

First-person view (FPV) drone racing is an exhilarating and rapidly growing sport where pilots control racing drones from a first-person perspective using specialized FPV goggles. The drones have powerful motors, lightweight frames, and high-quality cameras for low-latency video transmission in this sport. Pilots wear FPV goggles that provide a live video feed from the drone’s camera. This immersive experience allows them to see what the drone sees in real time.

Can an autonomous vision-based drone beat human champions in a race? Researchers in the Robotics and Perception Group at the University of Zurich built a drone system called Swift that can race physical vehicles at the level of human world champions. Swift flies at its physical limits while estimating its speed and location on the circuit from onboard sensors.

Swift combines deep reinforcement learning (RL) in simulation with data collected in the physical world. It consists of a perception system that translates high-dimensional visual and inertial information into a low-dimensional representation, and a control policy that ingests this low-dimensional representation and produces control commands.

The perception system includes a visual-inertial estimator and a gate detector (a CNN that detects the racing gates in the onboard camera image). The detected gates are used to estimate the drone's global position and orientation along the track; Swift performs this estimation with a camera-resectioning algorithm combined with a map of the track. To obtain a more accurate state estimate, the global pose derived from the gate detector is fused with the visual-inertial estimate using a filter.
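To make the fusion step concrete, here is a minimal sketch of combining a drifting visual-inertial position estimate with an absolute gate-based estimate using a Kalman-style weighted update. The variances and positions are made-up illustrative numbers, and the real system fuses full 6-DoF poses, not just position.

```python
import numpy as np

def fuse(vio_pos, vio_var, gate_pos, gate_var):
    """Combine two noisy position estimates; the lower-variance
    measurement receives the higher weight (Kalman-style update)."""
    k = vio_var / (vio_var + gate_var)          # gain: trust in the gate measurement
    fused = vio_pos + k * (gate_pos - vio_pos)  # corrected position estimate
    fused_var = (1.0 - k) * vio_var             # uncertainty shrinks after fusion
    return fused, fused_var

# VIO drifts over time (higher variance); gate detections give an
# absolute fix against the known track map (lower variance here).
pos, var = fuse(vio_pos=np.array([10.2, 4.9, 1.5]), vio_var=0.5,
                gate_pos=np.array([10.0, 5.0, 1.5]), gate_var=0.1)
print(pos.round(3), round(var, 3))
```

Note how the fused variance is smaller than either input variance: each gate detection pulls the drifting odometry back toward the true track geometry.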

The control policy is a two-layer perceptron that maps the filter output to control commands for the drone. During training, the policy is also rewarded for keeping the next gate in the camera's field of view, since seeing the next gate improves the accuracy of pose estimation. However, a policy trained purely in simulation performs poorly when there are discrepancies between simulation and reality.
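A two-layer perceptron of this kind can be sketched in a few lines. The state and action dimensions, hidden width, and random weights below are illustrative assumptions; in Swift the weights are learned with model-free RL rather than set randomly.

```python
import numpy as np

rng = np.random.default_rng(42)

class TwoLayerPolicy:
    """Toy two-layer perceptron: low-dimensional state -> control commands
    (e.g. collective thrust plus three body rates). Sizes are illustrative."""

    def __init__(self, state_dim=15, hidden=128, action_dim=4):
        self.w1 = rng.normal(scale=0.1, size=(state_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(scale=0.1, size=(hidden, action_dim))
        self.b2 = np.zeros(action_dim)

    def act(self, state):
        h = np.tanh(state @ self.w1 + self.b1)  # hidden layer
        return np.tanh(h @ self.w2 + self.b2)   # commands bounded in [-1, 1]

policy = TwoLayerPolicy()
cmd = policy.act(rng.normal(size=15))
print(cmd.shape)  # (4,)
```

The appeal of such a small network is latency: a forward pass of two matrix multiplies runs comfortably within a high-rate onboard control loop.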

Differences between the simulated and real dynamics can cause the drone to choose the wrong trajectories, leading to a crash. Noisy estimation of the drone's state also undermines safe trajectories. The team mitigates these effects by collecting a small amount of real-world data and using it to increase the simulator's realism: while the drone races through the track, they record the onboard sensor readings together with highly accurate ground-truth estimates from a motion-capture system.
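The idea of correcting a simulator with real data can be illustrated with a toy residual model. Everything below is a simplified assumption, not the paper's method: we pretend the "real" accelerations come from motion capture, fit a linear residual between simulator and reality, and add it back to the simulator.

```python
import numpy as np

rng = np.random.default_rng(7)

def sim_accel(cmd):
    """Idealized simulator: acceleration proportional to thrust command."""
    return 20.0 * cmd

# Hypothetical "real" accelerations deviate from the ideal model
# (standing in for effects like drag and motor lag), plus sensor noise.
cmds = rng.uniform(0.2, 1.0, size=50)
real = 18.5 * cmds - 1.2 + rng.normal(scale=0.05, size=50)

# Fit the residual (real - sim) as a linear function of the command.
residual = real - sim_accel(cmds)
A = np.stack([cmds, np.ones_like(cmds)], axis=1)
coef, *_ = np.linalg.lstsq(A, residual, rcond=None)

def augmented_sim(cmd):
    """Simulator plus learned residual correction."""
    return sim_accel(cmd) + coef[0] * cmd + coef[1]

err_before = np.abs(sim_accel(cmds) - real).mean()
err_after = np.abs(augmented_sim(cmds) - real).mean()
print(err_after < err_before)  # the corrected simulator tracks reality better
```

Training the policy against the corrected simulator narrows the sim-to-real gap without requiring large amounts of risky real-world flight data.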

The researchers report that Swift wins most of its races against each human pilot and achieves the fastest race time recorded, with a lead of half a second over the best time clocked by a human pilot. It is consistently faster than the human pilots in the turns and reacts faster at the start, taking off from the podium an average of 120 ms before the human pilots.


Check out the Paper. All credit for this research goes to the researchers on this project.


Arshad is an intern at MarktechPost. He is currently pursuing his Int. MSc in Physics at the Indian Institute of Technology Kharagpur. He believes that understanding things at a fundamental level leads to new discoveries, which in turn drive advances in technology. He is passionate about understanding nature at a fundamental level with the help of tools like mathematical models, ML models, and AI.