Stability.AI and Stable Diffusion
The artificial intelligence (AI) image generation sector has experienced rapid growth in recent years. From early models for sequence prediction and image synthesis to the widely popular generative adversarial networks (commonly referred to as GANs) and, more recently, diffusion models, AI image generation has come a long way since its early days.
Diffusion models have gained in popularity over the past few months, and two names dominating the space are Stability AI and Stable Diffusion. Many people are asking who and what these are, and how they are changing the face of AI image generation.
Stability AI and Stable Diffusion
Stable Diffusion is a new text-to-image latent diffusion model created by researchers and engineers at Stability AI, a startup based in London and Los Altos, California. Stable Diffusion differs from other AI image generators, such as DALL-E 2, in that it is open-source and has no mandatory content filters. Because it is open-source, the original source code is available for free and may be redistributed and modified.
The developer communities employed or sponsored by Stability AI currently exceed 20,000 members, who focus on building real-world applications of AI for the future. In addition to developing cutting-edge open AI models for images, Stability AI also develops intelligent solutions across language, audio, video, 3D, and biology.
Founded by Emad Mostaque, Stability AI created Stable Diffusion and first publicly released the software and its open-source code on August 22, 2022. Researchers and engineers from its developer communities (including RunwayML, LMU Munich, EleutherAI, and LAION) contributed to the creation of Stable Diffusion.
Stability AI announced that it also plans to launch a hosted version of Stable Diffusion for testing on the web. Using Stable Diffusion online is easy—all you need to do is enter your prompt as clearly as possible into our Stable Diffusion generator and wait for your image to be generated. While processing may take a few seconds, the resulting images are stunning and can be true to life, depending on your description.
How Does Stable Diffusion Work?
Stable Diffusion is built on the latent diffusion model, a text-to-image architecture developed in large part by the machine learning research group CompVis.
Early diffusion models (DMs) worked by decomposing the image formation process into a sequential application of denoising autoencoders, achieving state-of-the-art synthesis results on image data and beyond. However, because these DMs operate directly in pixel space, training them typically consumes hundreds of graphics processing unit (GPU) days, and inference is expensive due to its sequential evaluations.
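The "sequential application of denoising" idea can be sketched with a toy example: noise is added to a clean signal in small steps, then removed by applying a denoiser step by step. The simple averaging denoiser below is purely a stand-in for the learned neural network a real DM would use, and the 1-D signal stands in for image data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "image": a smooth signal standing in for clean image data.
x_clean = np.sin(np.linspace(0, 2 * np.pi, 256))

# Forward process: add Gaussian noise in several small steps,
# mimicking a diffusion model's fixed noising schedule.
T = 10
x_noisy = x_clean.copy()
for _ in range(T):
    x_noisy = x_noisy + rng.normal(scale=0.1, size=x_noisy.shape)

def denoise_step(x):
    """Stand-in denoiser: simple neighbourhood averaging.
    A real diffusion model uses a trained neural network here."""
    kernel = np.ones(5) / 5
    return np.convolve(x, kernel, mode="same")

# Reverse process: apply the denoiser sequentially, one pass per
# noising step, just as DMs chain denoising autoencoders.
x_rec = x_noisy.copy()
for _ in range(T):
    x_rec = denoise_step(x_rec)

mse_before = np.mean((x_noisy - x_clean) ** 2)
mse_after = np.mean((x_rec - x_clean) ** 2)
print(f"MSE vs clean signal, before: {mse_before:.4f}, after: {mse_after:.4f}")
```

The sequential reverse loop is also why inference is costly: every output requires many full passes of the denoiser, not one.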
The latent diffusion models (LDMs) used by Stability AI instead operate in a compressed latent space, not in pixel space, allowing these models to be more powerful and flexible. As a result, Stable Diffusion is capable of saving computational resources and producing higher-resolution images compared with other AI image generation models.
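The saving from working in latent space is easy to quantify. For Stable Diffusion v1, the autoencoder compresses each side of a 512x512 RGB image by a factor of 8 into a 4-channel latent, so each denoising step touches roughly 48 times fewer values:

```python
# Pixel space: a 512x512 RGB image that a pixel-space DM would denoise directly.
pixel_values = 512 * 512 * 3        # 786,432 values per image

# Latent space: Stable Diffusion's autoencoder downsamples by 8x per side
# into 4 channels, so denoising operates on a 64x64x4 tensor instead.
latent_values = 64 * 64 * 4         # 16,384 values per image

print(pixel_values // latent_values)  # → 48
```

The denoising network therefore runs on a tensor about 48x smaller, which is where most of the reported compute savings come from.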
Stable Diffusion was trained on 512x512 images from a subset of the LAION-5B database and, in its reference release, runs from a command-line interface. LAION-5B is the largest freely accessible multi-modal dataset currently available to open-source developers.
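As an illustration of that command-line workflow, the reference text-to-image script from the CompVis/stable-diffusion repository is invoked along these lines (exact flags, prompt, and checkpoint setup depend on the repository version, so treat this as a sketch rather than a guaranteed recipe):

```shell
# Run the reference txt2img script after installing the repo's
# environment and downloading the model weights.
python scripts/txt2img.py \
    --prompt "a photograph of an astronaut riding a horse" \
    --plms
```

By default the script writes the generated 512x512 images to an output directory inside the repository.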
Why Is Stable Diffusion Unique?
Stable Diffusion is similar to other text-to-image diffusion models. However, by implementing latent diffusion models, Stable Diffusion is able to produce comparable results (e.g., text-to-image generation, inpainting, and super-resolution) while significantly reducing computational requirements.
The computational requirements were reduced in 2022 to such an extent that you can now install and operate Stable Diffusion on a relatively accessible home computer. Stable Diffusion requires a PC with a GPU with at least six gigabytes (GB) of VRAM and ten or more GB of storage space on your hard drive or solid-state drive. By contrast, the huge computational requirements of pixel-space DMs make them practical only in a cloud environment.
In addition, our Stable Diffusion implementation has a highly lenient licensing structure, allowing users to install and run it on their PCs for free. With this permissive licence, we don’t claim rights to the images you create with the software.
You are free to use them however you like, but you are accountable for how they are used. Stability AI warns that the model should not be used to deliberately produce or share illegal or harmful content, and we mirror that warning in our policies.
Stable Diffusion lives up to the hype it received even before it launched, bringing local AI art generation capabilities to the average PC. As Stability AI continues to update the model, the user interface is bound to improve to the level of other, more established AI art generators.