New AI Research Introduces REV: An Information-Theoretic Measure for Evaluating Novel, Label-Relevant Information in Free-Text Rationales

Model explanations have proved essential for trust and interpretability in natural language processing (NLP). Free-text rationales, which explain a model prediction in natural language, have gained popularity because they can flexibly convey the reasoning behind the model’s choice, bringing them closer to human explanations. However, existing metrics for evaluating free-text explanations are still mostly accuracy-based and narrowly focused on how well a rationale helps a (proxy) model predict the label it explains. These metrics give no insight into the new information the rationale adds to the original input to explain why the label was chosen, which is precisely the function a rationale is meant to fulfill.

For instance, even though they provide differing amounts of new and relevant information, the two rationales r*1 and r^1,a in Fig. 1 would be deemed equally valuable under existing metrics. To address this issue, the authors of the paper introduce an automatic evaluation of free-text rationales along two dimensions: (1) whether the rationale supports (i.e., is predictive of) the intended label, and (2) how much new information it adds to the justification of the label beyond what is already present in the input.

For instance, the rationale r^1,b in Fig. 1 violates (1) because it does not predict the label “enjoy nature.” Although rationale r^1,a does support the label, it adds no new information beyond what is already stated in the input x; as a result, it violates (2). The rationale r*1 meets both requirements: it provides additional, relevant information that goes beyond the input to support the label. Both r^1,a and r^1,b will be penalized in the evaluation, while r*1 will be rewarded. In this study, researchers from the University of Virginia, the Allen Institute for AI, the University of Southern California, and the University of Washington present REV, an information-theoretic framework for assessing free-text rationales along the two dimensions described above.

Figure 1: The metric REV can distinguish all three rationales by measuring how much new and label-relevant information each adds over a vacuous rationale

REV is based on conditional V-information, which measures the extent to which a representation contains information, accessible to a model family V, beyond that of a baseline representation. As their baseline representation, the authors use a vacuous rationale: one that simply combines an input with a given label in declarative form, without adding any new information that would shed light on the decision-making process behind the label. REV adapts conditional V-information to the evaluation of rationales by comparing two representations: one from an evaluation model trained to produce the label given the input and the rationale, and the other from a second evaluation model for the same task that considers only the input (in the form of a vacuous rationale).
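
To make the comparison concrete, here is a minimal Python sketch of how such a score could be computed once the two evaluation models exist. It assumes each model has already produced the log-probability it assigns to the gold label; the function names (pointwise_rev, dataset_rev) and the example probabilities are hypothetical and are not taken from the paper or its code release. The score is simply the log-probability under the rationale-aware model minus the log-probability under the vacuous-baseline model, averaged over examples.

```python
import math
from typing import Sequence


def pointwise_rev(logp_with_rationale: float, logp_vacuous: float) -> float:
    """Score for one example (hypothetical helper).

    logp_with_rationale: log-probability of the gold label from the
        evaluation model that sees the rationale.
    logp_vacuous: log-probability of the gold label from the evaluation
        model that sees only the vacuous baseline rationale.
    """
    return logp_with_rationale - logp_vacuous


def dataset_rev(
    logps_with_rationale: Sequence[float],
    logps_vacuous: Sequence[float],
) -> float:
    """Average of the pointwise scores over a set of examples."""
    if len(logps_with_rationale) != len(logps_vacuous):
        raise ValueError("both sequences must have the same length")
    if len(logps_with_rationale) == 0:
        raise ValueError("at least one example is required")
    scores = [
        pointwise_rev(w, v)
        for w, v in zip(logps_with_rationale, logps_vacuous)
    ]
    return sum(scores) / len(scores)


# Made-up probabilities, for illustration only (not results from the paper).
p_vacuous = 0.5  # baseline model's probability for the gold label
examples = {
    "informative": 0.9,        # rationale adds new, label-relevant information
    "restates_input": 0.5,     # rationale adds nothing beyond the baseline
    "contradicts_label": 0.1,  # rationale points away from the gold label
}
scores = {
    name: pointwise_rev(math.log(p), math.log(p_vacuous))
    for name, p in examples.items()
}
for name, score in scores.items():
    print(f"{name}: {score:+.3f}")
```

Under these assumed numbers, a rationale that raises the probability of the gold label above the baseline gets a positive score, one that leaves it unchanged scores zero, and one that lowers it scores negative, which mirrors the three cases discussed for Fig. 1.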

Other metrics cannot assess new, label-relevant information in rationales because they do not account for vacuous rationales. In their experiments, the authors evaluate rationales with REV on two reasoning tasks, commonsense question answering and natural language inference, across four benchmarks. Several quantitative evaluations show that REV can score free-text rationales along new dimensions while being more aligned with human judgments than existing metrics. They also provide comparisons showing how sensitive REV is to different levels of input perturbation. Additionally, evaluation with REV sheds light on why rationales obtained through chain-of-thought prompting do not always improve prediction performance.


Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence at the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.
