Stable Diffusion (SD) is an open-source latent diffusion model for text-to-image generation that produces photorealistic images from a text prompt in a matter of seconds. For instance, the prompt “a portrait of a cyborg dressed in a golden suit” yields an image matching that description. Since its introduction in August 2022, SD has been embraced by many creators, developers, and enthusiasts. As a result of this widespread adoption, the research community is more interested than ever in developing tools and extensions that build on SD to produce novel visual material from as little as a text prompt. Some established techniques already customize SD, for example by adapting it to languages other than English, using open-source initiatives such as Hugging Face diffusers.
Recent research uses SD for more than creating images from text prompts. These additional tasks range from image editing, in-painting, and out-painting to super-resolution, style transfer, and even creating color palettes. With the proliferation of SD applications, it is more important than ever for developers to make the most of this technology and build apps that are easily accessible to users everywhere. A primary question for any SD application is where the model is executed. On-device deployment has several advantages over a server-based strategy. First, it offers more privacy, because all user-provided input stays on the device itself. Second, after the initial download, users can run the model without an internet connection. Third, it is more cost-effective, because local deployment lets developers cut out server-related expenses.
However, the biggest concern with on-device deployment is that SD generates an image over many iterations, which makes it computationally expensive and potentially slow. It therefore becomes imperative to ensure that the model can produce results quickly enough on a device. Doing so means running a complicated pipeline of 4 separate neural networks with about 1.275 billion parameters in total, which demands optimization. To address this problem, Apple recently released optimizations that enable the Stable Diffusion AI image generator to run on Apple Silicon using Core ML, Apple’s own framework for machine-learning models. These improvements let programmers run SD on Apple Neural Engine hardware nearly twice as fast as they could with older Mac-based techniques.
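To make the scale concrete, here is a back-of-envelope sketch of what 1.275 billion parameters mean for memory, and of why iteration count drives latency. Only the parameter count comes from the article; the byte sizes are the standard ones for each precision, and the step count and per-step time are hypothetical placeholders, not measurements.

```python
# Back-of-envelope estimate. Only PARAMS is taken from the article;
# the step count and per-step timing used below are hypothetical.
PARAMS = 1.275e9  # total parameters across the four networks

# Standard storage size of one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_footprint_gb(params: float, dtype: str) -> float:
    """Size of the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * BYTES_PER_PARAM[dtype] / 1e9


def denoising_time_s(steps: int, seconds_per_step: float) -> float:
    """Time spent in the iterative loop: cost grows linearly with steps."""
    return steps * seconds_per_step


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_footprint_gb(PARAMS, dtype):.2f} GB of weights")
    # Hypothetical example: 50 steps at 0.5 s per step.
    print(f"denoising loop: {denoising_time_s(50, 0.5):.1f} s")
```

Under these assumptions, halving either the per-step time or the number of steps halves the time spent in the loop, which is why per-step optimization matters so much on a device.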
The current release includes a Python script for converting SD models from PyTorch to Core ML, using diffusers and coremltools, and a Swift package for model deployment. Comprehensive benchmarking and deployment instructions are available in the Core ML Stable Diffusion GitHub repository.
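As a rough sketch of how the conversion script is driven, the snippet below assembles a command line for the repository's `torch2coreml` module without running it. The module path and flag names follow the Core ML Stable Diffusion repository's README as of its initial release and should be checked against the current version. The helper name `build_conversion_command` and the output directory are hypothetical.

```python
import shlex

# One flag per network in the pipeline. Names follow the repository README
# at release time (an assumption to verify against the current README).
COMPONENT_FLAGS = [
    "--convert-text-encoder",
    "--convert-unet",
    "--convert-vae-decoder",
    "--convert-safety-checker",
]


def build_conversion_command(output_dir: str) -> list[str]:
    """Return the argument list for the PyTorch-to-Core ML conversion.

    Hypothetical helper: it only builds the command and does not execute it.
    """
    return [
        "python",
        "-m",
        "python_coreml_stable_diffusion.torch2coreml",
        *COMPONENT_FLAGS,
        "-o",
        output_dir,
    ]


if __name__ == "__main__":
    # "coreml-models" is a placeholder output directory.
    print(shlex.join(build_conversion_command("coreml-models")))
```

Building the command as a list rather than a single string keeps paths with spaces intact if the list is later passed to `subprocess.run`.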
Previously, SD generated images fastest on top-tier Nvidia GPUs, with the model running locally on a Windows or Linux PC. Running SD on an Apple Silicon Mac the traditional way was, in contrast, much slower. Apple’s latest Core ML SD optimizations cut generation time on the M1 almost in half. For now, setting up SD with Core ML locally on a Mac still requires some command-line knowledge. Hugging Face, however, has written a thorough tutorial to make this easier, even for people who simply want to explore SD for fun.
By making it simpler for developers to integrate this technology into their apps in a privacy-preserving and financially viable way without compromising performance, Apple hopes to unleash the full potential of image synthesis on its devices. Running the generation model locally on a Mac or another Apple device, such as an iPhone or iPad, also avoids additional cloud computing costs.
Khushboo Gupta is a consulting intern at MarktechPost. She is currently pursuing her B.Tech at the Indian Institute of Technology (IIT), Goa. She is passionate about the fields of Machine Learning, Natural Language Processing, and Web Development. She enjoys learning more about the technical field by participating in several challenges.