Researchers Introduce A New Hand Gesture Recognition Algorithm Combining Hand-Type Adaptive Algorithm And Effective-Area Ratio For Efficient Edge Computing

Almost all of our computer interaction occurs via mice, keyboards, and touch screens. An essential step in making human-computer interaction more efficient would be moving toward more contactless forms of communication such as speech, facial expressions, and gestures, which we already use naturally when communicating with other humans. However, past studies in hand gesture recognition have struggled to achieve high accuracy while maintaining low computational complexity. Poor recognition accuracy is primarily due to the rotation, translation, and scaling that often occur when hand gesture images are captured, and to person-to-person differences in hand shape.

To improve hand gesture recognition capability while keeping the computational burden low, a group of researchers from Sun Yat-sen University has proposed a hand-type adaptive algorithm. The algorithm first classifies input images by hand type into slim, normal, and broad categories, using three features: palm length, palm width, and finger length. The recognition task addressed in the work covers 9 hand gestures. To build the gesture library, 360 hand gesture images were captured (9 from each of the 40 participating volunteers). After the hand-type classification step, the dedicated library for the matched hand type is used for further classification. The research group selected the area-perimeter ratio (C) and the effective area ratio (E) as features, along with the seven Hu moment invariants, all of which offer low complexity and are largely invariant to rotation, translation, and scaling. The feature vector of each hand gesture image can thus be denoted as {C, E, Hu1, …, Hu7}.
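To make the feature vector concrete, here is a minimal pure-Python sketch of extracting {C, E, Hu1, …, Hu7} from a binary hand mask. The article does not give the exact definitions of C and E, so the stand-ins below are assumptions: C is taken as area / perimeter² (a dimensionless area-perimeter ratio) and E as area / bounding-box area; the seven Hu invariants follow Hu's standard formulas over normalized central moments.

```python
def raw_moment(mask, p, q):
    """Raw image moment m_pq of a binary mask (list of 0/1 rows)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(mask)
               for x, v in enumerate(row))

def hu_moments(mask):
    """The seven Hu moment invariants of a binary mask."""
    m00 = raw_moment(mask, 0, 0)
    cx, cy = raw_moment(mask, 1, 0) / m00, raw_moment(mask, 0, 1) / m00

    def mu(p, q):  # central moment (translation-invariant)
        return sum(((x - cx) ** p) * ((y - cy) ** q) * v
                   for y, row in enumerate(mask)
                   for x, v in enumerate(row))

    def eta(p, q):  # normalized central moment (also scale-invariant)
        return mu(p, q) / (m00 ** (1 + (p + q) / 2))

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    h1 = n20 + n02
    h2 = (n20 - n02) ** 2 + 4 * n11 ** 2
    h3 = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    h4 = (n30 + n12) ** 2 + (n21 + n03) ** 2
    h5 = ((n30 - 3 * n12) * (n30 + n12)
          * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          + (3 * n21 - n03) * (n21 + n03)
          * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    h6 = ((n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
          + 4 * n11 * (n30 + n12) * (n21 + n03))
    h7 = ((3 * n21 - n03) * (n30 + n12)
          * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
          - (n30 - 3 * n12) * (n21 + n03)
          * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    return [h1, h2, h3, h4, h5, h6, h7]

def feature_vector(mask):
    """Build {C, E, Hu1..Hu7}. NOTE: the paper's exact C and E
    definitions are not given in the article; these are stand-ins:
    C = area / perimeter**2, E = area / bounding-box area."""
    area = sum(sum(row) for row in mask)
    h, w = len(mask), len(mask[0])
    # Perimeter: foreground pixels with a background (or border) 4-neighbour.
    perim = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and (y == 0 or x == 0 or y == h - 1 or x == w - 1
                               or not mask[y - 1][x] or not mask[y + 1][x]
                               or not mask[y][x - 1] or not mask[y][x + 1]):
                perim += 1
    ys = [y for y in range(h) if any(mask[y])]
    xs = [x for x in range(w) if any(mask[y][x] for y in range(h))]
    box = (max(ys) - min(ys) + 1) * (max(xs) - min(xs) + 1)
    return [area / perim ** 2, area / box] + hu_moments(mask)
```

Because the Hu invariants are built from normalized central moments, the resulting vector changes little under translation, scaling, and rotation of the hand region, which is exactly why the researchers favor them over raw moments.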

The first step in the recognition pipeline is hand-type adaptation, done using the area-perimeter ratio: the user's area-perimeter ratio (C) is taken for all 9 gestures and compared, by least Euclidean distance, against the slim, normal, and broad classes. Next, gesture pre-recognition (a shortcut step) uses only the effective area ratio (E) to select 3 candidate gestures out of the total 9, again by Euclidean distance. Finally, a high-precision stage computes the Euclidean distance over the seven Hu moments for only the 3 candidates to recognize the final hand gesture. This pipeline design keeps the computational burden low, increasing recognition speed and improving accuracy, provided the set of hand gestures is chosen carefully.
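The three-stage pipeline above can be sketched as a small nearest-neighbor cascade. The data layout (per-hand-type template vectors and per-type gesture libraries) and all names here are illustrative assumptions, not the paper's actual interfaces:

```python
import math

def recognize(user_C, query_E, query_Hu, type_templates, libraries):
    """Three-stage cascade sketched from the article's description.
    user_C:         the user's 9-element area-perimeter-ratio vector
    type_templates: {'slim'|'normal'|'broad': reference 9-element C vector}
    libraries:      {hand_type: {gesture: (E, [Hu1..Hu7])}}
    All names/layouts are illustrative assumptions."""
    # 1) Hand-type adaptation: nearest reference C vector (Euclidean).
    hand_type = min(type_templates,
                    key=lambda t: math.dist(user_C, type_templates[t]))
    lib = libraries[hand_type]
    # 2) Pre-recognition shortcut: keep the 3 gestures with closest E.
    candidates = sorted(lib, key=lambda g: abs(query_E - lib[g][0]))[:3]
    # 3) High-precision stage: Euclidean distance over the 7 Hu moments,
    #    evaluated only for the surviving candidates.
    best = min(candidates, key=lambda g: math.dist(query_Hu, lib[g][1]))
    return hand_type, best
```

The cost saving comes from stage 3: the comparatively expensive 7-dimensional Hu-moment matching runs against 3 candidates instead of all 9 gestures, while the cheap scalar E comparison does the coarse filtering.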

With the images in fixed positions, recognition accuracy was over 94% across different test scenarios; even with deformations added to the gesture images, accuracy remained above 93%. The algorithm was also implemented on an FPGA to verify its operation under resource-constrained hardware, achieving an accuracy of 94.99%, comparable to results obtained on an Intel Core-series processor. With the spread of smart technologies in our daily lives, such as smart TVs, touchless vending machines, virtual in-store displays, or any device with a camera sensor, controlling these appliances with hand gestures is becoming increasingly important. Hence, there is a need for lightweight algorithms capable of running on the embedded systems installed in these smart appliances.

Contrary to the current trend of AI and deep learning (DL) based algorithms, the method proposed by the researchers has low hardware requirements and is highly suitable for embedded computing platforms. The team plans to continue this line of work by increasing the number of recognizable hand gestures and adding robustness to more complex backgrounds and poor illumination conditions. One can expect incredible breakthroughs in gesture-based human-machine interaction in the near future.



Archishman Biswas is currently a 4th-year undergraduate pursuing a Dual Degree in Electrical Engineering at the Indian Institute of Technology, Bombay. His specialization is in Communication Systems and Signal Processing. He is interested in recent and emerging Deep Learning architectures and is enthusiastic about exploring applications of Deep Learning and other AI techniques in the fields of Image Processing and Computer Vision.
