UCLA Researchers Propose PhyCV: A Physics-Inspired Computer Vision Python Library

Artificial intelligence is making noteworthy strides in computer vision. One key area of development is deep learning, where neural networks are trained on large image datasets to recognize and classify objects, scenes, and events. This has produced significant improvements in image recognition and object detection. Integrating computer vision with other technologies is also opening up new applications for AI.

Jalali-Lab at UCLA has now developed PhyCV, which its authors describe as the first physics-based computer vision Python library. The library uses algorithms derived from the laws and equations of physics to analyze image data. These algorithms emulate how light propagates through physical media with engineered diffractive properties, and they are built on mathematical equations rather than a series of hand-crafted rules. The algorithms in PhyCV are built on the principles of a rapid data acquisition method called the photonic time stretch.

PhyCV includes three algorithms: the Phase-Stretch Transform (PST), the Phase-Stretch Adaptive Gradient-Field Extractor (PAGE), and Vision Enhancement via Virtual diffraction and coherent Detection (VEViD).

Phase-Stretch Transform (PST) algorithm

The PST algorithm detects edges and textures in images. It simulates light propagating through a device with engineered diffractive properties and then coherently detects the resulting image. The algorithm is particularly well suited to visually impaired images and has been used in applications such as enhancing the resolution of MRI scans and identifying blood vessels in retina images.
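To make the idea concrete, here is a minimal NumPy sketch of a phase-stretch style edge map. It is an illustration under stated assumptions, not PhyCV's API: the function name `pst_phase`, the parameter names `S`, `W`, and `sigma_lpf`, their default values, and the exact kernel normalization are all choices made for this example. The image is low-pass filtered in the frequency domain, multiplied by a frequency-dependent phase kernel (the simulated diffraction), transformed back, and the phase of the complex result is read out (the coherent detection step).

```python
import numpy as np

def pst_phase(img, S=0.5, W=15.0, sigma_lpf=0.1):
    """Phase-stretch style feature map for a 2D grayscale array (illustrative sketch)."""
    rows, cols = img.shape
    # Normalized spatial-frequency grid in cycles per pixel.
    u = np.fft.fftfreq(rows)[:, None]
    v = np.fft.fftfreq(cols)[None, :]
    r = np.sqrt(u**2 + v**2)
    # Gaussian low-pass filter to suppress noise before the phase operation.
    lpf = np.exp(-(r**2) / (2 * sigma_lpf**2))
    # Phase kernel that grows with frequency; S sets its strength, W its warp.
    wr = W * r
    kernel = wr * np.arctan(wr) - 0.5 * np.log1p(wr**2)
    kernel = S * kernel / kernel.max()
    # Apply the phase in the frequency domain, then return to the image domain.
    spectrum = np.fft.fft2(img) * lpf
    out = np.fft.ifft2(spectrum * np.exp(-1j * kernel))
    # The phase of the complex output is near zero in flat regions
    # and departs from zero around edges.
    return np.angle(out)
```

In this sketch, thresholding the absolute value of the returned phase map would give a binary edge map. The image should not contain large regions of exactly zero brightness, because the phase of a near-zero complex value is numerically unstable.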

Phase-Stretch Adaptive Gradient-Field Extractor (PAGE) algorithm

The PAGE algorithm detects edges and their orientations in images using principles of physics. In essence, PAGE emulates light passing through a device with a specific diffractive structure, which converts the image into a complex-valued function. Information about the edges is contained in the real and imaginary components of the result. The researchers note that PAGE can be used as a preprocessing step in various machine learning problems.
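The orientation-selective idea can be illustrated with a small bank of directional phase filters. The sketch below is a simplified stand-in and not the actual PAGE kernel or PhyCV's API: the function name `directional_phase_bank`, the Gaussian angular window, and every parameter name and default are assumptions made for this example. Each filter applies a phase only to frequencies near one orientation, so the phase of the complex output responds mainly to edges of that orientation.

```python
import numpy as np

def directional_phase_bank(img, n_angles=4, S=0.5, W=15.0,
                           sigma_lpf=0.1, sigma_theta=0.3):
    """Return one phase map per orientation bin (illustrative sketch)."""
    rows, cols = img.shape
    u = np.fft.fftfreq(rows)[:, None]   # vertical frequency
    v = np.fft.fftfreq(cols)[None, :]   # horizontal frequency
    r = np.sqrt(u**2 + v**2)
    theta = np.arctan2(u, v)            # angle of each frequency vector
    # Gaussian low-pass filter shared by all orientation bins.
    lpf = np.exp(-(r**2) / (2 * sigma_lpf**2))
    # Radial phase profile, normalized to a maximum of 1.
    wr = W * r
    radial = wr * np.arctan(wr) - 0.5 * np.log1p(wr**2)
    radial = radial / radial.max()
    spectrum = np.fft.fft2(img) * lpf
    responses = []
    for k in range(n_angles):
        theta_k = k * np.pi / n_angles
        # Angular distance modulo pi, so opposite frequencies share a bin.
        d = np.angle(np.exp(2j * (theta - theta_k))) / 2
        window = np.exp(-(d**2) / (2 * sigma_theta**2))
        # Phase is applied only near orientation theta_k.
        kernel = S * radial * window
        out = np.fft.ifft2(spectrum * np.exp(-1j * kernel))
        responses.append(np.angle(out))
    return np.stack(responses)          # shape: (n_angles, rows, cols)
```

With this convention, bin 0 responds to intensity changes along the horizontal axis (vertical edges), and the bin at 90 degrees responds to horizontal edges. Taking the strongest bin per pixel gives a rough orientation label.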

Vision Enhancement via Virtual diffraction and coherent Detection (VEViD) algorithm

The VEViD algorithm enhances low-light and color images by treating them as a spatially varying light field and applying physical processes such as diffraction and coherent detection. It does so with minimal latency and can therefore increase the accuracy of a computer vision model in low-light conditions. An approximation of VEViD, known as VEViD-lite, can enhance 4K video at up to 200 frames per second. The research team compared VEViD with popular neural network models and reports exceptional image quality at processing speeds one to two orders of magnitude faster.
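A closed-form approximation of this kind can be sketched in a few lines of NumPy. The sketch below is in the spirit of a lite, transform-free variant and is not PhyCV's API: the function name `phase_tone_curve`, the parameters `b` (bias) and `G` (gain), their defaults, and the min-max normalization are assumptions made for this example. It works on a single brightness channel scaled to [0, 1], such as the V channel of an HSV image, and reads out the phase of a complex field built from that channel.

```python
import numpy as np

def phase_tone_curve(value, b=0.16, G=1.4):
    """Brighten dark pixels of a brightness channel in [0, 1] (illustrative sketch)."""
    value = np.asarray(value, dtype=float)
    # Use the original brightness as the real part and a biased,
    # gain-scaled copy as the imaginary part, then read out the phase.
    phase = np.arctan2(-G * (value + b), value)
    # Rescale the phase to [0, 1]; a constant image is returned unchanged.
    span = phase.max() - phase.min()
    if span == 0:
        return value.copy()
    return (phase - phase.min()) / span
```

The resulting curve rises steeply for dark pixels and flattens for bright ones, which is why it lifts shadows without a Fourier transform. Because it is a per-pixel operation with no neighborhood lookups, it is cheap enough to be a plausible fit for high-frame-rate video.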

PhyCV is available on GitHub and can be installed via pip. The algorithms in PhyCV can also be implemented in actual physical devices for more efficient computation. PhyCV looks like a promising development in computer vision, and it illustrates how advances in AI and computer vision continue to drive a wide range of new applications.


Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.
