You have probably heard about, or even used, ChatGPT by now. OpenAI’s new magical tool is there to answer your questions, help you write documents, write executable code, give you recipes based on the ingredients you have, and much more, all with a human-like capacity.
ChatGPT is probably the most famous example of large language models (LLMs). These models are trained on large-scale datasets, and they can understand requests and generate text replies to them. And when we say large datasets, we really mean it.
As these LLMs become more advanced, we may need a way to identify whether a given text was written by an LLM or by a human. “But why?” you might ask. Although these tools are extremely useful for augmenting our skills, we cannot expect everyone to use them innocently; there could be malicious use cases where we cannot allow them to operate.
For example, one can use it to generate fake news, and ChatGPT can be really convincing. Imagine your Twitter feed flooded with LLM bots propagating the same misinformation, all sounding realistic. This could be a huge problem. Moreover, academic writing assignments are no longer safe. How can you be sure whether an article was written by the student or by an LLM? In fact, how can you be sure this very article was not written by ChatGPT? (P.S.: it’s not 🙂)
On the other hand, LLMs are trained on data obtained from the Internet. What will happen if most of that data becomes synthetic, AI-generated content? That would reduce the quality of future LLMs, as synthetic data is usually inferior to human-generated content.
We could keep going about the importance of detecting AI-generated content, but let’s stop here and think about how it can be done. Since we are talking about LLMs, why not ask ChatGPT what it recommends for detecting AI-generated text?
We thank ChatGPT for its honest answer, but none of these approaches can give us high confidence in detection.
Fake content is not a new issue; we have dealt with it for years wherever the stakes are high. Money counterfeiting, for example, was once a huge problem, but nowadays we can be 99% sure that our money is genuine. But how? The answer is hidden inside the money itself. You have probably noticed those tiny numbers and symbols that are only visible under certain conditions. These are watermarks: hidden signatures embedded by the mint that indicate authenticity.
Well, since we have a method that has proven useful for multiple use cases, why not take it and apply it to AI-generated content? This was the very idea the authors of this paper had, and they came up with a convenient solution.
They study watermarking of LLM output. The watermark is a hidden pattern that human writers are unlikely to produce. It is imperceptible to humans, but it signals that the text was generated by an LLM. The watermarking algorithm can be made public, so anyone can check whether a certain LLM wrote a text, or it can be kept private, so that only the LLM’s publisher can verify it.
Moreover, the proposed watermarking can be integrated into any LLM without requiring it to be retrained. The watermark can also be detected from a small span of the generated text, which prevents someone from generating a long text and using only parts of it to avoid detection. Finally, removing the watermark requires significantly altering the text; minor modifications would not avoid detection.
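The article does not spell out the mechanism, but the core idea from the paper can be sketched: at each step, the previous token pseudo-randomly splits the vocabulary into a “green” list the model prefers, and a detector checks whether a text contains suspiciously many green tokens. Everything below — the toy vocabulary, the hash-based partition, and the threshold — is a simplified illustration under my own assumptions, not the authors’ implementation:

```python
import hashlib
import math

# Toy vocabulary (assumption); a real LLM has tens of thousands of tokens.
VOCAB = [f"tok{i}" for i in range(1000)]
GREEN_FRACTION = 0.5  # fraction of the vocabulary marked "green" at each step


def green_list(prev_token: str, vocab=VOCAB, frac=GREEN_FRACTION) -> set:
    """Pseudo-randomly partition the vocabulary, seeded by the previous token.

    A watermarking LLM would softly bias its sampling toward these tokens.
    """
    greens = set()
    for tok in vocab:
        digest = hashlib.sha256(f"{prev_token}|{tok}".encode()).digest()
        if digest[0] < 256 * frac:  # deterministic pseudo-random coin flip
            greens.add(tok)
    return greens


def detect(tokens: list, frac=GREEN_FRACTION) -> float:
    """z-score for "too many green tokens"; a high value suggests a watermark.

    Human text should hit the green list about `frac` of the time by chance;
    watermarked text hits it far more often.
    """
    n = len(tokens) - 1  # number of (previous token, token) transitions
    hits = sum(1 for prev, tok in zip(tokens, tokens[1:])
               if tok in green_list(prev))
    expected = frac * n
    return (hits - expected) / math.sqrt(n * frac * (1 - frac))
```

Because detection only counts green-list hits, it needs no access to the model itself, and the z-score grows with the length of the watermarked span — which is why even a small excerpt can be enough to flag a text.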
The proposed watermarking algorithm works well but is not bulletproof, as the authors discuss several types of attacks. For example, one can prompt the LLM to insert a certain emoji after each word and then strip those emojis from the generated text. This way, the watermark can be evaded.
The rise of capable LLMs eases many tasks, but it also poses certain threats. This paper proposes a method to identify LLM-generated text by watermarking it.
Check out the Paper and Github. All credit for this research goes to the researchers on this project.
Ekrem Çetinkaya received his B.Sc. in 2018 and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis about image denoising using deep convolutional networks. He is currently pursuing a Ph.D. degree at the University of Klagenfurt, Austria, and working as a researcher on the ATHENA project. His research interests include deep learning, computer vision, and multimedia networking.