Researchers at Meta AI Introduce EditEval: An Instruction-Based Benchmark for Text Improvements

Large pre-trained language models have demonstrated impressive text generation capabilities across applications such as question answering, textual entailment, and summarization. However, most work with language models has concentrated on producing text in a single, immutable pass. This contrasts with how humans naturally write: an iterative process of small steps, each serving a specific purpose. Often a change becomes necessary only after a substantial portion of the text has been written, for example when reorganizing content or eliminating contradictions and inconsistencies.

The existing paradigm of producing text passages in a single pass can be very constraining in this regard. Moreover, today’s left-to-right generation paradigm offers little control and is inflexible to human collaboration and input, and a lack of skilled human mediation in the writing process can hurt the quality of the final product. Writing tools such as Google’s Smart Compose and Microsoft’s text predictions can collaborate with humans to compose articles and emails, but these primarily focus on sentence completion and were not designed to improve text that has already been written.

A more capable editing assistant would be able to suggest both continuations of the work and enhancements to the existing material, such as improving tone and diction or adding more interesting information. These AI techniques should also accommodate the naturally iterative and non-sequential way text evolves.

A new study by Meta AI Research, Carnegie Mellon University, Inria & ENS, PSL University, and University College London lists several tasks and datasets pertinent to iterative text improvement, such as text clarification and information addition, and offers a pipeline to obtain these datasets and combine them into a common format. Many datasets for natural language problems are annotated at the sentence or paragraph level rather than at the document or article level, which lends itself naturally to evaluating repeated modifications.

They have developed EDITEVAL, a benchmark and evaluation suite that uses both newly introduced and existing high-quality datasets to automatically assess editing abilities. Many relevant datasets are currently housed in separate packages and often have markedly different formats. EDITEVAL downloads the most recent version of each dataset and standardizes it into a single format suitable for evaluation. To accurately assess a model’s capacity to carry out each editing task when instructed, they have also incorporated widely used metrics for each task and a collection of human-written prompts. They also present WAFER-INSERT, a new dataset based on the WAFER dataset, for assessing a model’s capacity to update information in a text.
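To make the idea of a unified instruction-based editing format concrete, here is a minimal sketch in Python. The field names (`task`, `instruction`, `source`, `reference`) and the token-overlap metric are illustrative assumptions for this article, not EDITEVAL’s actual schema or scoring code.

```python
# Hypothetical sketch of a standardized edit-task instance and a simple
# metric; field names and the metric are assumptions, not EDITEVAL's API.

def token_f1(prediction: str, reference: str) -> float:
    """Token-level F1 between a model edit and a reference edit,
    using unique-token overlap as a rough similarity measure."""
    pred, ref = prediction.split(), reference.split()
    overlap = len(set(pred) & set(ref))
    if overlap == 0:
        return 0.0
    precision = overlap / len(set(pred))
    recall = overlap / len(set(ref))
    return 2 * precision * recall / (precision + recall)

# One task instance: an instruction, a source text, and a gold edit.
example = {
    "task": "clarification",
    "instruction": "Make this sentence easier to read.",
    "source": "The model attained efficacious results on the corpus.",
    "reference": "The model performed well on the dataset.",
}

# A model's proposed edit is scored against the reference.
model_output = "The model performed well on the dataset."
score = token_f1(model_output, example["reference"])
print(f"{score:.2f}")  # an exact match scores 1.00
```

Standardizing every dataset into one such record shape is what lets a single evaluation loop run across otherwise incompatible editing datasets.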

The researchers use these prompts to analyze and compare several state-of-the-art language models, including GPT-3, OPT, and PEER. These baselines generally lag behind the supervised state of the art, notably on the updating and neutralization tasks. Their findings also reveal that numerous widely used metrics correlate poorly with one another, at times even contradicting each other, and that minor changes in a prompt’s wording can significantly affect model performance and robustness. Based on this, the team believes more research is required to create models capable of carrying out editing tasks, to provide a uniform way of evaluating editing skills, and to systematically choose prompts.

This Article is written as a research summary article by Marktechpost Staff based on the research paper 'EDITEVAL: An Instruction-Based Benchmark for Text Improvements'. All Credit For This Research Goes To Researchers on This Project. Check out the paper, project and github.


Tanushree Shenwai is a consulting intern at MarktechPost. She is currently pursuing her B.Tech from the Indian Institute of Technology (IIT), Bhubaneswar. She is a data science enthusiast with a keen interest in the applications of artificial intelligence across various fields. She is passionate about exploring new advancements in technology and their real-life applications.