Privacy Concerns Surrounding LLMs like ChatGPT: This AI Paper Unveils Potential Risks and Safeguarding Measures

While ChatGPT is breaking records, some questions are raised about the security of personal information utilized in OpenAI’s ChatGPT. Recently, researchers from Google DeepMind, University of Washington, Cornell, CMU, UC Berkeley, and ETH Zurich discovered a possible issue: using certain instructions, one may trick ChatGPT into disclosing sensitive user information.

Within two months of its launch, OpenAI’s ChatGPT has amassed over 100 million users, demonstrating its growing popularity. More than 300 billion pieces of data are used by the program from a variety of internet sources, including books, journals, websites, posts, and articles. Even with OpenAI’s best efforts to protect privacy, regular posts and conversations add to a sizable amount of personal information that shouldn’t be publicly disclosed.

Google researchers found a way to deceive ChatGPT into accessing and revealing training data not meant for public consumption. They extracted over 10,000 distinct memorized training instances by applying specified keywords. This implies that additional data could be obtained by enemies who are determined.

The research team showed how they could drive the model to expose personal information by forcing ChatGPT to repeat a word, such as “poem” or “company,” incessantly. For example, they may have extracted addresses, phone numbers, and names in this way, which could have led to data breaches.

Some businesses have put limitations on employing huge language models like ChatGPT in response to these worries. For instance, Apple has prohibited its staff members from using ChatGPT and other AI tools. Furthermore, as a precaution, OpenAI added a function that lets users disable conversation history. However, the retained data is kept for 30 days before being permanently erased.

Google researchers stress the significance of extra care when deploying large language models for privacy-sensitive applications, even with the additional safeguards. Their findings emphasize the need for careful consideration, improved security measures in developing future AI models, and the potential risks associated with the widespread use of ChatGPT and similar models.

In conclusion, the revelation of potential data vulnerabilities in ChatGPT serves as a cautionary tale for users and developers alike. The widespread use of this language model, with millions of people interacting with it regularly, underscores the importance of prioritizing privacy and implementing robust safeguards to prevent unauthorized data disclosures.


Check out the Paper and Reference Article. All credit for this research goes to the researchers of this project. Also, don’t forget to join our 34k+ ML SubReddit, 41k+ Facebook Community, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more.

If you like our work, you will love our newsletter..

Niharika is a Technical consulting intern at Marktechpost. She is a third year undergraduate, currently pursuing her B.Tech from Indian Institute of Technology(IIT), Kharagpur. She is a highly enthusiastic individual with a keen interest in Machine learning, Data science and AI and an avid reader of the latest developments in these fields.

🚀 LLMWare Launches SLIMs: Small Specialized Function-Calling Models for Multi-Step Automation [Check out all the models]