Opt-Me-Out From Diffusion: This AI Model Can Remove Copyrighted Concepts from Text-to-Image Diffusion Models

Text-to-image models have taken the AI domain by storm over the last couple of months. They have demonstrated superb image generation performance, producing outputs from text prompts that can be difficult to distinguish from real images. These models are quickly becoming an essential part of content generation.

Nowadays, it is possible to use AI models to generate images for our applications, say, webpage design. We can simply take one of these models, such as Midjourney, DALL-E, or Stable Diffusion, and ask it to generate images for us.

Let us, for a second, assume we are on the other side of the equation. Imagine you are an artist who has poured hours of hard work into creating digital art. You publish it on digital channels, making sure to file all the required copyright information so that your art is not stolen in any way. Then, the next day, you see one of these large-scale models generate an image that looks identical to your piece of art. How would you react to that?

This is one of the overlooked problems of large-scale image generation models. The datasets used to train these models often include copyrighted materials, personal photos, and the art pieces of individual artists. We need a way to remove such concepts and materials from large-scale models. But how can we do that without retraining the model from scratch? And what if we want to keep the related concepts while removing only the copyrighted ones?

In response to these concerns, a team of researchers has proposed a method for the ablation, or removal, of specific concepts from text-conditioned diffusion models. 

The proposed method modifies the images generated for a target concept so that they match a broader anchor concept, such as overwriting Star Wars' R2-D2 with a generic robot, or Monet paintings with generic paintings. This is called concept ablation, and it is the key contribution of the paper.

Overview of concept ablation. Source: https://arxiv.org/pdf/2303.13516.pdf

The goal here is to modify the model's conditional distribution for a given target concept so that it matches a distribution defined by the anchor concept, thus ablating the concept down to a more generic version.

The authors propose two different ways to define the target distribution, each leading to a different training objective. In the first case, the model is fine-tuned so that its prediction for a prompt containing the target concept matches its prediction for the corresponding anchor prompt; for example, it maps Cute Grumpy Cat to Cute Cat. In the second objective, the conditional distribution is defined by modified text-image pairs: the target concept prompt is paired with images of the anchor concept, taking Cute Grumpy Cat to a random cat image.
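To make the first objective concrete, here is a minimal NumPy sketch of the model-based fine-tuning idea, under loose assumptions: a toy linear "denoiser" stands in for the real diffusion U-Net, and random unit vectors stand in for real text embeddings (all names, shapes, and hyperparameters here are illustrative, not taken from the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy linear "noise predictor" eps(x_t, c) = [x_t ; c] @ W -- a hypothetical
# stand-in for the conditional diffusion U-Net used in the paper.
def predict(W, x_t, c):
    z = np.concatenate([x_t, np.broadcast_to(c, x_t.shape)], axis=1)
    return z @ W

# Unit-norm stand-ins for the text embeddings of the target prompt
# ("Cute Grumpy Cat") and the anchor prompt ("Cute Cat").
target_emb = rng.normal(size=dim)
target_emb /= np.linalg.norm(target_emb)
anchor_emb = rng.normal(size=dim)
anchor_emb /= np.linalg.norm(anchor_emb)

W_frozen = rng.normal(size=(2 * dim, dim)) * 0.3  # "pretrained" weights, kept frozen
W = W_frozen.copy()                               # copy that gets fine-tuned

x_t = rng.normal(size=(64, dim))                  # fixed batch of noised latents

def loss(W):
    # Model-based ablation objective: make the fine-tuned model's prediction
    # for the TARGET prompt match the frozen model's prediction for the
    # ANCHOR prompt (the frozen term acts as a stop-gradient target).
    diff = predict(W, x_t, target_emb) - predict(W_frozen, x_t, anchor_emb)
    return np.mean(diff ** 2)

initial = loss(W)
lr = 0.2
for _ in range(300):
    diff = predict(W, x_t, target_emb) - predict(W_frozen, x_t, anchor_emb)
    z = np.concatenate([x_t, np.broadcast_to(target_emb, x_t.shape)], axis=1)
    W -= lr * z.T @ diff / len(x_t)               # full-batch gradient step
final = loss(W)
print(f"ablation loss: {initial:.4f} -> {final:.2e}")
```

After fine-tuning, the toy model conditioned on the target prompt behaves like the frozen model conditioned on the anchor prompt, which is exactly the intuition behind taking Cute Grumpy Cat to Cute Cat.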

Overview of two evaluated ablation methods. Source: https://arxiv.org/pdf/2303.13516.pdf

Two different ablation variants are evaluated: model-based and noise-based. In the model-based approach, the anchor distribution is generated by the model itself, conditioned on the anchor concept, and the fine-tuned model is trained to match it. In the noise-based approach, the model is instead trained with the standard denoising objective on target concept prompts paired with noised images of the anchor concept.
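The noise-based variant can be illustrated with a similarly hedged toy sketch: the standard noise-prediction loss is minimized, but the target prompt is deliberately paired with anchor-concept images. As before, the linear denoiser, random embeddings, and random "cat" latents are toy stand-ins, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 8

# Toy linear denoiser eps(x_t, c) = [x_t ; c] @ W -- a hypothetical
# stand-in for the diffusion U-Net; the embedding is a placeholder
# for a real text encoder output.
def predict(W, x_t, c):
    z = np.concatenate([x_t, np.broadcast_to(c, x_t.shape)], axis=1)
    return z @ W

target_emb = rng.normal(size=dim)
target_emb /= np.linalg.norm(target_emb)   # embedding of "Cute Grumpy Cat"

W = rng.normal(size=(2 * dim, dim)) * 0.3  # "pretrained" weights to fine-tune

# Key redirection: the TARGET prompt is paired with ANCHOR-concept images
# (stand-ins for random cat photos), not with Grumpy Cat images.
anchor_images = rng.normal(size=(64, dim))

eval_noise = rng.normal(size=anchor_images.shape)  # fixed evaluation noise
def eval_loss(W):
    # Standard denoising objective: predict the noise that was added.
    pred = predict(W, anchor_images + eval_noise, target_emb)
    return np.mean((pred - eval_noise) ** 2)

initial = eval_loss(W)
lr = 0.1
for _ in range(500):
    noise = rng.normal(size=anchor_images.shape)
    x_t = anchor_images + noise                # simplistic forward noising
    diff = predict(W, x_t, target_emb) - noise
    z = np.concatenate([x_t, np.broadcast_to(target_emb, x_t.shape)], axis=1)
    W -= lr * z.T @ diff / len(x_t)            # SGD step on the MSE
final = eval_loss(W)
print(f"denoising loss on anchor images: {initial:.3f} -> {final:.3f}")
```

Because the denoising loss is optimized on anchor images under the target prompt, the target prompt is gradually retargeted toward the anchor distribution, which is the mechanism behind taking Cute Grumpy Cat to a random cat image.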

The proposed concept ablation method is evaluated on 16 tasks, including specific object instances, artistic styles, and memorized images. It successfully ablated target concepts while minimally affecting closely related surrounding concepts that should be preserved. The method takes around five minutes per concept and is robust to misspellings in the text prompt.

In conclusion, this method presents a promising approach for addressing concerns about the use of copyrighted materials and personal photos in large-scale text-to-image models. 

Check out the Paper and GitHub. All credit for this research goes to the researchers on this project.

Ekrem Çetinkaya received his B.Sc. in 2018, and M.Sc. in 2019 from Ozyegin University, Istanbul, Türkiye. He wrote his M.Sc. thesis about image denoising using deep convolutional networks. He received his Ph.D. degree in 2023 from the University of Klagenfurt, Austria, with his dissertation titled "Video Coding Enhancements for HTTP Adaptive Streaming Using Machine Learning." His research interests include deep learning, computer vision, video encoding, and multimedia networking.
