Computer Vision

Meta AI Open-Sources DINOv2: A New AI Method for Training High-Performance Computer Vision Models Based on Self-Supervised Learning

Due to recent developments in AI, foundational computer vision models may now be pretrained using massive datasets. Producing general-purpose visual features, or features that...

Meta AI Releases the Segment Anything Model (SAM): A New AI Model That Can Cut Out Any Object In An Image/Video With A Single...

Computer vision relies heavily on segmentation, the process of determining which pixels in an image represent a particular object, for uses ranging from analyzing...

If You Can Say It, Now You Can See It: Runway’s Latest Artificial Intelligence Tool Can Generate Videos With Nothing But Words

Runway, an artificial intelligence (AI) platform, has recently released its latest software, Gen-2, which can create full videos from text descriptions. Gen-2 was developed...

Researchers From ETH Zurich and Microsoft Propose X-Avatar: An Animatable Implicit Human Avatar Model Capable of Capturing Human Body Pose and Facial Expressions

Pose, look, facial expression, hand gestures, etc., collectively called "body language," have been the subject of many academic investigations. Accurately recording, interpreting, and creating non-verbal signals...

Microsoft AI Proposes MM-REACT: A System Paradigm that Combines ChatGPT and Vision Experts for Advanced Multimodal Reasoning and Action

Large Language Models (LLMs) are rapidly advancing and contributing to notable economic and social transformations. With many artificial intelligence (AI) tools getting released on...

Unified Understanding: This AI Approach Provides a Better 3D Mapping for Robots

Developing robots that can do daily tasks for us is a long-standing dream of humanity. We want them to walk around and help us...

Google AI Introduces A Vision-Only Approach That Aims To Achieve General UI Understanding Completely From Raw Pixels

For UI/UX designers, getting a better computational understanding of user interfaces is the primary step toward more advanced and intelligent UI behaviors. This...

Divide and Track: This AI Model Can Track 3D Human Motion in Videos by Decoupling

Deep learning has been a game-changer in the field of computer vision, enabling unprecedented advances in numerous applications. One of these applications is tracking...

Mimicking is the Way: Innovative AI Model Lets Robots Learn Tasks by Watching Human Videos

Robots are incredible. They have already revolutionized the way we live and work, and they still have the potential to do it again. They...

A New Artificial Intelligence (AI) Study Proposes A 3D-Aware Blending Technique With Generative NeRFs

Image blending is a primary method in computer vision, one of the best-known branches of artificial intelligence. The goal is to...

A New AI Research Proposes VoxFormer: A Transformer-Based 3D Semantic Scene Completion Framework

Perceiving a holistic 3D scene is a significant challenge for autonomous vehicles (AVs). It directly influences later activities like planning and map...

A New Artificial Intelligence (AI) Study From CMU and Meta Proposes a Framework for Efficient Neural Relighting of Articulated Hand Models

Neural rendering is a cutting-edge technology that uses artificial intelligence and deep learning to create photorealistic images and animations. Unlike traditional rendering techniques that...
