Whose Opinions Do LLMs Reflect? This AI Paper From Stanford Examines the Opinions Reflected by Language Models LMs Through the Lens of Public Opinion Polls

Over the past few years, it has been observed that language models, or LMs, have been extremely instrumental in accelerating the pace of natural language processing applications in a variety of industries, such as healthcare, software development, finance, and many more. The use of LMs in writing software code, assisting authors in improving their writing style and storyline, etc., is among the transformer-based models’ most successful and popular applications. This is not all, though! Research has shown that LMs are increasingly being used in open-ended contexts when it comes to their applications in chatbots and dialogue assistants by asking them subjective questions. For instance, some examples of such subjective queries include asking a dialogue agent whether AI will take over the world in the coming years or whether legalizing euthanasia is a good idea. In such a situation, the opinions expressed by LMs in response to subjective questions can significantly impact not just determining whether an LM succumbs to particular prejudices and biases but also in shaping society’s overall views.

At present, it is quite challenging to accurately predict how LMs will respond to such subjective queries in order to evaluate their performance in open-ended tasks. The primary reason behind this is that the people responsible for designing and fine-tuning these models come from different walks of life and hold different viewpoints. Moreover, when it comes to subjective queries, there is no “correct” response that can be used to judge a model. As a result, any kind of viewpoint exhibited by the model can significantly affect user satisfaction and how they form their opinions. Thus, in order to correctly evaluate LMs in open-ended tasks, it is crucial to identify exactly whose opinions are being reflected by LMs and how they are aligned with the majority of the general population. For this purpose, a team of postdoctoral researchers from Stanford University and Columbia University have developed an extensive quantitative framework to study the spectrum of opinions generated by LMs and their alignment with different groups of human populations. In order to analyze human views, the team utilized expert-chosen public opinion surveys and their responses which were collected from individuals belonging to different demographic groups. Moreover, the team developed a novel dataset called OpinionQA to assess how closely an LM’s ideas correspond with other demographic groups on a range of issues, including abortion and gun violence.

For their use case, the researchers relied on carefully designed public opinion surveys whose topics were chosen by experts. Moreover, the questions were designed in a multiple-choice format to overcome the challenges associated with open-ended responses and for easy adaptation to an LM prompt. These surveys collected opinions of individuals belonging to different democratic groups in the US and helped the Stanford and Columbia researchers in creating evaluation metrics for quantifying the alignment of LM responses w.r.t. human opinions. The basic foundation behind the proposed framework by the researchers is to convert multiple-choice public opinion surveys into datasets for evaluating LM opinions. Each survey consists of several questions wherein each question can have several possible responses belonging to a wide range of topics. As a part of their study, the researchers first had to create a distribution of human opinions against which the LM responses could be compared. The team then applied this methodology to Pew Research’s American Trends Panels polls to build the OpinionQA dataset. The poll consists of 1498 multiple-choice questions and their responses collected from different demographic groups across the US covering various topics such as science, politics, personal relationships, healthcare, etc. 

The team assessed 9 LMs from AI21 Labs and OpenAI with parameters ranging from 350M to 178B using the resulting OpinionQA dataset by contrasting the model’s opinion with that of the overall US population and 60 different demographic groupings (which included democrats, individuals over 65 in age, widowed, etc.). The researchers primarily looked at three aspects of the findings: representativeness, steerability, and consistency. “Representativeness” refers to how closely the default LM beliefs match those of the US populace as a whole or a particular segment. It was discovered that there is a significant divergence between contemporary LMs’ views and those of American demographic groupings on various topics such as climate change, etc. Moreover, this misalignment only seemed to be amplified by using human feedback-based fine-tuning on the models in order to make them more human-aligned. Also, it was found that current LMs did not adequately represent the viewpoints of some groups, like those over 65 and widows. When it comes to steerability (whether an LM follows the opinion distribution of a group when appropriately prompted), it has been found that most LMs tend to become more in line with a group when encouraged to act in a certain way. The researchers placed a lot of emphasis on determining if the opinions of the various democratic groupings are consistent with LM across a range of issues. On this front, it was found that while some LMs did align well with particular groups, the distribution did not hold across all topics.

In a nutshell, a group of researchers from Stanford and Columbia University has put forward a remarkable framework that can analyze the opinions reflected by LMs with the help of public opinion surveys. Their framework resulted in a novel dataset called OpinionQA that helped identify ways in which LMs misaligned with human opinions on several fronts, including overall representativeness with respect to majority of the US popluation, subgroup representativeness on different groups (which included 65+ and widowed) and steerability. The researchers also pointed out that although the OpinionQA dataset is US-centric, their framework uses a general methodology and can be extended to datasets for different regions as well. The team strongly hopes that their work will drive further research on evaluating LMs on open-ended tasks and help create LMs that are free of bias and stereotypes. Further details regarding the OpinionQA dataset can be accessed here.


Check out the Paper and Github. All Credit For This Research Goes To the Researchers on This Project. Also, don’t forget to join our 17k+ ML SubRedditDiscord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

🐝 Join the Fastest Growing AI Research Newsletter Read by Researchers from Google + NVIDIA + Meta + Stanford + MIT + Microsoft and many others...