Microsoft Introduces ‘PeopleLens’: An Open-Ended Artificial Intelligence System That Uses Computer Vision Algorithms To Help Young People Who Are Blind To Engage With Their Immediate Social Surroundings

Social engagement can be challenging for children who are born blind. Despite a strong desire to do so, many blind children and young people with low vision struggle to engage with and befriend people in their age group. This can be extremely difficult for the child or adolescent, as well as for the family members and teachers in their support network who want to help them make these crucial connections.

A Microsoft team has recently developed “PeopleLens,” an open-ended AI system that gives people who are blind or have low vision additional resources to make sense of and interact with their immediate social surroundings. The system runs on Nreal Light augmented reality glasses connected to a smartphone, allowing wearers to extend their existing skills and abilities.

The researchers draw on studies and experience from psychology and speech and language therapy to provide technology-related tasks. The project was inspired by an ethnographic study with Paralympic competitors and spectators, which revealed the various sense-making skills that people with low vision use to orient to and communicate with others. The PeopleLens is based on the idea that many of the social interaction difficulties blind children face stem from differences in how children with and without vision acquire core attentional processes as babies and young children.

Children with vision, for example, learn to internalize a joint visual dialogue of attention as they get older. Through these interactions, young children learn how to direct others’ attention. However, there is not enough research to know what joint attention looks like in blind children. Most studies fail to account for a missing sense. Further, research on visual impairment does not provide a framework for joint attention beyond the age of three. The researchers’ work focuses on understanding how technology might aid the development of joint attention in early education.

The system employs a head-mounted augmented reality device and four cutting-edge computer vision algorithms to continually locate, identify, track, and capture the gaze directions of people in the area. The information is then presented to the wearer via spatialized audio: each sound appears to come from the direction of the person it refers to, which helps the learner determine their peers’ relative position and distance. The PeopleLens assists learners in building a People Map, a mental map of the people in their immediate vicinity that is needed to signal communicative intent effectively. The technology, in turn, lets the learner’s peers know when they have been “seen” and are free to interact, a replacement for the eye contact that typically initiates a human connection.
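
As a rough illustration of how a sound can be made to appear to come from a person's direction, the sketch below converts a person's position relative to the wearer into a bearing and a distance, then derives left and right channel gains with constant-power stereo panning. This is not the PeopleLens implementation, which is not described in this article. The coordinate convention (x to the wearer's right, z straight ahead) and the function names are assumptions made for the example, and real spatialized audio typically uses far richer models than two-channel panning.

```python
import math
from typing import Tuple


def bearing_and_distance(x_right_m: float, z_forward_m: float) -> Tuple[float, float]:
    """Return (azimuth in degrees, distance in meters) of a person.

    Assumed convention: the wearer is at the origin, x points to the
    wearer's right and z points straight ahead. 0 degrees is dead ahead,
    positive angles are to the right.
    """
    azimuth_deg = math.degrees(math.atan2(x_right_m, z_forward_m))
    distance_m = math.hypot(x_right_m, z_forward_m)
    return azimuth_deg, distance_m


def stereo_gains(azimuth_deg: float) -> Tuple[float, float]:
    """Constant-power pan: map an azimuth to (left_gain, right_gain).

    Angles beyond +/-90 degrees are clamped, a simplification that
    ignores sounds coming from behind the wearer.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    pan = (clamped + 90.0) / 180.0      # 0.0 = hard left, 1.0 = hard right
    angle = pan * math.pi / 2.0
    return math.cos(angle), math.sin(angle)


# A peer standing 3 m to the right and 3 m ahead of the wearer.
azimuth, distance = bearing_and_distance(3.0, 3.0)
left, right = stereo_gains(azimuth)
print(f"azimuth={azimuth:.1f} deg, distance={distance:.2f} m, L={left:.2f}, R={right:.2f}")
```

With constant-power panning the squared gains always sum to one, so the perceived loudness stays roughly steady as a person moves across the wearer's field of view.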

The system is also designed to protect personal information. One of its underlying algorithms is facial recognition of people who have registered in the system. A person registers by taking several photographs of themselves with the phone connected to the PeopleLens. Rather than being stored, the photographs are converted into a vector of integers that represents a face. Because these vectors are distinct from those of any other system, recognition by the PeopleLens does not imply recognition by any other system. The system captures no video or identifiable information, so the photographs cannot be misused.
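
To make the idea of recognizing a face from a stored vector more concrete, here is a minimal sketch that compares a query vector against registered vectors using cosine similarity. The article does not say how the PeopleLens compares vectors, so the similarity measure, the threshold of 0.8, the three-element vectors, and the names are all assumptions chosen for illustration. The step that turns a photograph into a vector is not shown.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(
    query: Sequence[float],
    registered: Dict[str, Sequence[float]],
    threshold: float = 0.8,  # hypothetical cutoff, not taken from the article
) -> Optional[str]:
    """Return the best-matching registered name, or None if nobody is close enough."""
    best_name, best_score = None, threshold
    for name, vector in registered.items():
        score = cosine_similarity(query, vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Only vectors are kept after registration; no photographs are stored.
registered_faces = {"Ana": [10, 0, 0], "Ben": [0, 10, 0]}

print(identify([9, 1, 0], registered_faces))   # close to Ana's vector
print(identify([1, 1, 0], registered_faces))   # too ambiguous to name anyone
```

Returning no name when the best score falls below the threshold mirrors the article's distinction between people the system has identified and people it has only seen.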

The technology uses a series of sounds to help the wearer locate people in the environment. The sound of woodblocks helps the wearer locate and center on the face of a person the system has seen for 1 second but has not identified, shifting in pitch to help the wearer adjust their gaze accordingly. When the wearer’s gaze crosses a person up to 10 meters away, they hear a percussive bump. If that person is registered in the system, is within 4 meters of the wearer, and both of their ears can be detected, their name follows the bump. In addition, a gaze notification lets the wearer know when someone is looking at them.
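
The cue rules described above can be sketched as a small decision function. The thresholds (1 second, 10 meters, 4 meters) come from the article. The data structure, the field names, and the way the rules combine when several apply at once are assumptions for illustration, not a description of the actual PeopleLens logic.

```python
from dataclasses import dataclass
from typing import List, Optional

WOODBLOCK_AFTER_S = 1.0   # seen this long without being identified
BUMP_RANGE_M = 10.0       # gaze crossing a person triggers a bump
NAME_RANGE_M = 4.0        # a name is spoken only within this range


@dataclass
class TrackedPerson:
    """Hypothetical per-person state produced by the vision algorithms."""
    distance_m: float
    seconds_tracked: float
    gaze_crossing: bool          # the wearer's gaze currently crosses this person
    both_ears_visible: bool
    name: Optional[str] = None   # None if not registered or not yet identified


def select_cues(person: TrackedPerson) -> List[str]:
    """Return the audio cues to play for one tracked person, in order."""
    cues: List[str] = []
    # Woodblocks guide the wearer toward a face that is seen but unidentified.
    if person.name is None and person.seconds_tracked >= WOODBLOCK_AFTER_S:
        cues.append("woodblock")
    # A percussive bump marks the wearer's gaze crossing someone nearby.
    if person.gaze_crossing and person.distance_m <= BUMP_RANGE_M:
        cues.append("bump")
        # The name follows the bump only for close, clearly visible, known people.
        if (
            person.name is not None
            and person.distance_m <= NAME_RANGE_M
            and person.both_ears_visible
        ):
            cues.append(f"name:{person.name}")
    return cues


print(select_cues(TrackedPerson(3.0, 2.0, True, True, "Ana")))
print(select_cues(TrackedPerson(6.0, 2.0, True, True, "Ana")))
print(select_cues(TrackedPerson(5.0, 1.5, False, False)))
```

Keeping the thresholds as named constants makes it easy to see which values come from the article and to adjust them without touching the rule logic.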

The PeopleLens is a tool that helps blind children and young people find their friends. It is also a way for teachers and parents to help them build competence and confidence in social interaction.

Paper: https://www.microsoft.com/en-us/research/uploads/prod/2021/06/Morrison-Interactions2021_PeopleLens.pdf

Reference: https://www.microsoft.com/en-us/research/blog/peoplelens-using-ai-to-support-social-interaction-between-children-who-are-blind-and-their-peers/