Microsoft Researchers Unveil ‘EmotionPrompt’: Enhancing AI Emotional Intelligence Across Multiple Language Models

Emotional intelligence has long been regarded as a cornerstone of human qualities. It is the ability to recognize and correctly process emotional information and then use that information to guide logical and analytical processes such as problem solving and behavior management. Reflexes, perception, cognition, and behavior all give rise to emotions, and various internal and external factors can influence these components. Work on self-monitoring, social cognitive theory, and the importance of positive emotions indicates that emotion regulation can influence human problem-solving skills. Because of its wide-ranging effects on people, emotion regulation theory has been applied in fields as diverse as education and health.

New research by CAS, Microsoft, William & Mary, Beijing Normal University, and HKUST investigates the connection between emotional intelligence and sophisticated AI models. Emerging large language models (LLMs) have exhibited impressive performance across various tasks, including reasoning, natural language processing and generation, and STEM problem-solving, making them one of the most promising research directions toward artificial general intelligence (AGI). By having GPT-4 carry out several difficult tasks devised by humans, a recent study suggested that LLMs show remarkable potential toward AGI. However, it is still unknown whether LLMs can interpret psychological emotional stimuli, a fundamental advantage that helps humans improve their problem-solving abilities. Researchers have made large strides in various areas using in-context learning methods, but given the differences in their capacities, not all LLMs will benefit equally from the currently available methods. While recent research has shown evidence that LLMs can recognize and process emotional cues, that research has not assessed whether LLMs' emotional intelligence plays a significant role in improving their performance.

This new work takes the first step in investigating LLMs' potential to comprehend and exploit emotional stimuli. Previous psychological research has shown that emotional cues associated with hope, self-assurance, and peer approval have a positive effect. Real-world applications of this phenomenon include using uplifting language to improve academic performance and increase physical well-being. The researchers took inspiration from these psychological findings and presented EmotionPrompt, a simple yet powerful method for investigating LLMs' emotional intelligence. In particular, they designed 11 sentences as emotional stimuli that are added to the original prompt given to an LLM.
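The idea of attaching an emotional stimulus to a prompt can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `build_emotion_prompt`, the example task, and the stimulus wording are all chosen here for demonstration, and the sketch assumes the stimulus is simply appended to the end of the task prompt.

```python
def build_emotion_prompt(task_prompt: str, stimulus: str) -> str:
    """Return the task prompt with an emotional stimulus appended.

    Hypothetical helper: assumes the stimulus is plain text that is
    placed after the original prompt, separated by a single space.
    """
    task = task_prompt.strip()
    cue = stimulus.strip()
    # With an empty stimulus, fall back to the unchanged (vanilla) prompt.
    return f"{task} {cue}" if cue else task


# Illustrative wording only; not verified against the paper's list of 11 stimuli.
STIMULUS = "This is very important to my career."
vanilla_prompt = "Determine whether a movie review is positive or negative."

emotional_prompt = build_emotion_prompt(vanilla_prompt, STIMULUS)
print(emotional_prompt)
```

Sending both `vanilla_prompt` and `emotional_prompt` to the same model and comparing the scored outputs would be one way to set up the kind of comparison the article describes.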

Their extensive experiments cover both deterministic and generative tasks, which together span a wide range of difficulty levels. They ran trials with several LLMs, including Flan-T5-Large, Vicuna, Llama 2, BLOOM, ChatGPT, and GPT-4, on 24 Instruction Induction tasks and 21 curated BIG-Bench tasks, all of which are deterministic and can be evaluated with common metrics. Because generative tasks do not lend themselves to traditional automatic evaluation, they also ran a human study with 106 participants to judge the quality of GPT-4 outputs on generative tasks using both vanilla and emotional prompts. The human study shows that emotional prompts significantly boost performance on generative tasks (an average improvement of 10.9% across the performance, truthfulness, and responsibility metrics). The standard experiments on deterministic tasks, in turn, show that LLMs possess emotional intelligence and can be enhanced by emotional stimuli.

The researchers also analyzed why EmotionPrompt helps LLMs by assessing the effect of emotional stimuli on the final outputs through input attention. The findings show that emotional stimuli contribute to the gradients in LLMs by receiving larger weights, which benefits the outcomes by improving the representation of the original prompts. To learn more about how model size and temperature affect EmotionPrompt's efficacy, they conducted an ablation study.

Finally, they examined how combining several emotional stimuli affects performance and found that doing so can significantly improve outcomes. Based on the findings, EP02 is the best stimulus in Instruction Induction, outperforming the poorest stimulus by 6.06 percent, whereas EP06 is the best stimulus in BIG-Bench. It is important to remember that several factors, such as task complexity, task type, and the metrics used, might affect a stimulus's performance.
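A comparison such as "the best stimulus outperforms the poorest by some margin" can be computed by averaging each stimulus's scores across tasks and taking the gap between the extremes. The sketch below shows that arithmetic under stated assumptions: the function `rank_stimuli`, the stimulus names `EP_A` to `EP_C`, and all scores are made up for illustration and are not results from the paper.

```python
from statistics import mean


def rank_stimuli(scores: dict[str, list[float]]) -> tuple[str, str, float]:
    """Return (best, worst, gap) based on each stimulus's mean task score.

    Hypothetical helper: `scores` maps a stimulus name to its per-task
    scores; the gap is the best mean minus the worst mean.
    """
    averages = {name: mean(values) for name, values in scores.items()}
    best = max(averages, key=averages.get)
    worst = min(averages, key=averages.get)
    return best, worst, averages[best] - averages[worst]


# Made-up per-task scores for three placeholder stimuli (not paper data).
example_scores = {
    "EP_A": [0.50, 0.70],
    "EP_B": [0.40, 0.60],
    "EP_C": [0.45, 0.65],
}

best, worst, gap = rank_stimuli(example_scores)
print(f"best={best}, worst={worst}, gap={gap:.2f}")
```

Because the ranking depends on which tasks and metrics go into the averages, the same stimulus can come out best on one benchmark and not on another, which matches the caveat above.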


Check out the Paper. All credit for this research goes to the researchers of this project.

Dhanshree Shenwai is a Computer Science Engineer with experience in FinTech companies covering the Financial, Cards & Payments, and Banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements that make everyone's life easier in today's evolving world.