A New Artificial Intelligence (AI) Benchmark Called ‘OpenCQA’ Answers Open-Ended Questions About Charts Using Descriptive Texts

Discovering and communicating key insights from data using visualizations such as bar charts and line graphs is essential to many activities, but it can be time-consuming and labor-intensive. Charts are commonly used both to analyze data and to communicate key findings, and they are frequently consulted to answer questions that lack a simple yes/no response. Answering such questions demands considerable perceptual and cognitive effort, and can therefore take significant time.

To address these issues, the Chart Question Answering (CQA) task was developed: it takes a chart and a natural-language question as input and produces an answer as output. Many studies have examined CQA in the past few years. The limitation, however, is that most existing datasets only include examples where the answer is a single word or phrase.

Because few freely available data sources pair charts with related textual descriptions, prior work had not attempted to build datasets of open-ended questions with annotator-written answer statements. The researchers therefore used charts from Pew Research (pewresearch.org), where experts combine a variety of charts and summaries in articles on market research, public opinion, and social issues.


From roughly 4,000 articles on the site, the researchers extracted 9,285 chart-summary pairs; after filtering by summary length, 7,724 examples remained in the final dataset. Its charts span a wide range of subjects, from politics and economics to technology and beyond.

OpenCQA covers four types of questions, each answered with descriptive output text:

  • Identify: questions that ask about a specific target, such as a particular bar within a set of bars.
  • Compare: questions that compare elements across a chart.
  • Summarize: questions that ask for a summary of the data shown in the chart.
  • Discover: undirected questions that require drawing conclusions across the whole chart.
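To make the task setup concrete, here is a minimal sketch of what a single OpenCQA-style example might look like in code. The field names and category labels are illustrative assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class ChartQARecord:
    """Illustrative (hypothetical) structure for one chart-question-answer example."""
    chart_id: str       # identifier for the source chart image
    ocr_text: str       # text extracted from the chart by OCR
    question: str       # open-ended question about the chart
    question_type: str  # one of: "identify", "compare", "summarize", "discover"
    answer: str         # annotator-written descriptive answer

# A made-up example record in the spirit of the dataset's Pew Research sources.
record = ChartQARecord(
    chart_id="pew_0001",
    ocr_text="Share of adults who own a smartphone: 2011 35% ... 2021 85%",
    question="How has smartphone ownership changed over the past decade?",
    question_type="discover",
    answer="Smartphone ownership rose sharply, from 35% of adults in 2011 to 85% in 2021.",
)
```

The "discover" category here corresponds to the undirected questions above, which require reasoning over the whole chart rather than a single element.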

Baseline Models

Seven preexisting models were evaluated as baselines on the new dataset:

  • BERTQA, which improves on the standard BERT model by adding directed attention layers.
  • ELECTRA, a self-supervised representation-learning model, and GPT-2, a Transformer-based language model that generates text by predicting the next word from the preceding context.
  • BART, which uses a standard encoder-decoder Transformer framework and has been shown to achieve state-of-the-art performance on text generation tasks such as summarization.
  • T5, a unified encoder-decoder Transformer that casts language tasks in a text-to-text format; VLT5, a T5-based framework that unifies vision-language tasks as text generation conditioned on multimodal input; and CODR, a model built for a document-grounded generation task in which text generation is improved with information from an accompanying document.
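Encoder-decoder baselines such as T5 and BART consume a single text sequence, so the chart's extracted text and the question must be linearized into one input. Below is a minimal sketch of such a formatting step; the prompt template and the character limit are illustrative assumptions, not the paper's actual preprocessing.

```python
def build_model_input(question: str, ocr_text: str, max_chars: int = 2000) -> str:
    """Concatenate an open-ended question with OCR-extracted chart text
    into one sequence, in the style a T5/BART baseline would expect.
    The "question: ... chart: ..." template is a hypothetical choice."""
    prompt = f"question: {question} chart: {ocr_text}"
    # Truncate long chart text so the sequence fits the model's input budget.
    return prompt[:max_chars]
```

A real pipeline would truncate by tokenizer tokens rather than characters, but the idea is the same: the multimodal chart content is flattened into text before generation.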

Challenges and Limitations

Many ethical considerations arose for the researchers while gathering and annotating the data. They used only freely accessible charts from publicly available resources that permit redistribution of downloaded content for educational purposes, so as not to infringe the intellectual property of the charts' original producers. Pew Research Center permits use of its data provided proper credit is given to the organization and no other entity is cited as the source.

The researchers also caution that such models could be used to spread misinformation. Although current model outputs may read naturally, they contain several inaccuracies discussed in the paper; releasing these erroneous outputs as-is could misinform the public.

However, because of the task's design, the dataset is restricted to charts from Pew Research (pewresearch.org). It could be expanded in the future as more relevant data becomes available. The researchers also did not explore long-range sequence models such as the Linformer or the recently proposed Memorizing Transformer.

The task setup is also constrained by its reliance on automatically produced OCR data, which is typically noisy. To feed cleaner data into the model, future work could focus on improving OCR extraction for this task.
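As a rough illustration of the kind of cleanup such a pipeline might apply before modeling, here is a minimal normalization pass over noisy OCR text. The specific rules below are assumptions for illustration, not part of the OpenCQA pipeline.

```python
import re

def clean_ocr_tokens(raw: str) -> str:
    """Collapse whitespace and drop stray symbols from noisy OCR output,
    keeping characters that commonly carry chart meaning (%, ., ,, $, -)."""
    text = re.sub(r"[^\w\s%.,$-]", " ", raw)   # replace stray OCR artifacts with spaces
    text = re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace/newlines
    return text
```

For example, OCR output containing broken line breaks and stray symbols like `"35%\n\n  ###2011"` reduces to `"35% 2011"`, which is far easier for a text-based model to consume.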

In conclusion, OpenCQA is proposed as a benchmark for providing detailed answers to free-form questions about charts, together with several state-of-the-art baselines and evaluation metrics. The results show that while the most advanced generative models can produce natural-sounding text, substantial work remains before they can consistently make valid arguments grounded in both numbers and logic.


Check out the Paper. All credit for this research goes to the researchers on this project. Also, don’t forget to join our Reddit page, Discord channel, and email newsletter, where we share the latest AI research news, cool AI projects, and more.

Dhanshree Shenwai is a Computer Science Engineer with solid experience in FinTech companies covering the financial, cards & payments, and banking domains, and a keen interest in applications of AI. She is enthusiastic about exploring new technologies and advancements in today’s evolving world that make everyone’s life easier.