Can Artificial Intelligence Match Human Creativity? A New Study Compares The Generation Of Original Ideas Between Humans and Generative Artificial Intelligence Chatbots

Recently, Artificial Intelligence has been able to perform tasks by imitating humans. With the development of generative models such as ChatGPT and DALL-E and the growing popularity of generative AI, generating human-like content is no longer a dream. Many tasks are now possible, from question answering, code completion, and text generation to generating an image from a text description or from another image. AI has lately approached human creativity on some tasks, and it has even outperformed humans in games like chess.

In a recent research paper, researchers compared ideas produced by humans with those generated by generative Artificial Intelligence. The six generative AI chatbots used for the comparison include ChatGPT (versions 3 and 4) and YouChat. To determine the similarities and differences between the creativity of AI-generated and human-generated ideas, both the quality and quantity of ideas were independently evaluated. The ideas were assessed by both humans and an AI explicitly trained for this purpose.

The team compared the ideas and their creativity using the Alternative Uses Test (AUT). The Alternative Uses Test assesses divergent thinking by asking participants to list creative, not-so-obvious uses for a common object. The team applied the AUT to 100 human participants and five generative AI chatbots. The test required humans and AI to come up with a variety of unique uses for five common objects: pants, ball, tire, fork, and toothbrush. These five objects were termed the prompts.

The team evaluated the responses on the basis of their originality and fluency. To rate originality, they used both intuitive human evaluation (the Consensual Assessment Technique) and a large language model trained specifically to assess AUT responses. To determine the reliability between the six human raters, the team calculated intraclass correlations using the R package irr. The results indicated that the human raters generally agreed on which responses were original.
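To make the rater-agreement step more concrete, here is a minimal Python sketch of an intraclass correlation. It is not the irr implementation the researchers used, and the article does not say which ICC variant they chose; the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single rater). The function name `icc2_1` and the ratings are hypothetical and for illustration only.

```python
from statistics import mean


def icc2_1(ratings):
    """ICC(2,1) for a table with one row per response and one column per rater.

    Assumption: two-way random effects, absolute agreement, single rater.
    """
    n = len(ratings)       # number of rated responses
    k = len(ratings[0])    # number of raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                  # between-response mean square
    ms_cols = ss_cols / (k - 1)                  # between-rater mean square
    ms_error = ss_error / ((n - 1) * (k - 1))    # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Made-up originality scores: 6 responses (rows) rated by 4 raters (columns).
# These numbers are illustrative and are not data from the study.
example_ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]

print(round(icc2_1(example_ratings), 2))
```

Values closer to 1 mean the raters largely agree on which responses are original. Because this variant measures absolute agreement, a rater who is consistently stricter than the others lowers the score even when all raters rank the responses in the same order.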

For the comparison, two linear mixed-effects models with random intercepts and random slopes for the five prompts were used. In the first model, with the human ratings of the responses as the dependent variable, no overall difference was found between human and generative AI-generated ideas. The second model, with the AI ratings of the responses as the dependent variable, also found no overall difference. However, humans outperformed the generative AI on the human-rated responses for fork and on the AI-rated responses for toothbrush.

Because GPT-4 was released in mid-March 2023, the researchers conducted an additional analysis. GPT-4 completed the AUT, and its responses were rated by the AI only, as human raters could be biased by knowing that the responses were not human. GPT-4 outperformed all five other generative AIs (GAIs), except on the prompt ball, where it ranked second. When comparing GPT-4's performance to humans, only two humans were more creative than the most creative AI for pants, 29 were more creative for ball, none were more creative for tire, three were more creative for fork, and 13 were more creative for toothbrush. Overall, 9.4% of humans were more creative than GPT-4 across all prompts. Consequently, there was no significant difference in creativity between humans and AI in terms of originality and fluency, except for a small percentage of human participants who were found to be more creative.
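The overall 9.4 figure reads most naturally as a percentage, and it can be reproduced from the per-prompt counts above. The short sketch below assumes each of the 100 participants responded to all five prompts, giving 500 human-versus-GPT-4 comparisons; that assumption and the variable names are ours, not the paper's.

```python
# Number of humans rated as more creative than GPT-4 on each prompt,
# taken from the counts reported in the article.
more_creative = {"pants": 2, "ball": 29, "tire": 0, "fork": 3, "toothbrush": 13}

participants = 100  # human participants who completed the AUT

# Assumption: every participant answered every prompt.
comparisons = participants * len(more_creative)
share = sum(more_creative.values()) / comparisons

print(f"{sum(more_creative.values())} of {comparisons} = {share:.1%}")
```

Under that assumption, 47 of the 500 comparisons favor the human, which matches the reported 9.4%.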

Check out the Paper. All credit for this research goes to the researchers on this project.

Tanya Malhotra is a final year undergrad from the University of Petroleum & Energy Studies, Dehradun, pursuing BTech in Computer Science Engineering with a specialization in Artificial Intelligence and Machine Learning.
She is a Data Science enthusiast with good analytical and critical thinking, along with an ardent interest in acquiring new skills, leading groups, and managing work in an organized manner.
