Stable Diffusion Prompt Syntax Explained

Machine learning models that can transform user-written text descriptions into images have rapidly increased in popularity over the past year, and several of these models are now freely available online. The two most widely used platforms are Stable Diffusion and DALL-E. Users can create hundreds of images in a matter of hours even with little to no prior experience, but it is worth learning the correct prompt syntax before getting started so you don’t get frustrated.

Understanding How Stable Diffusion Works

Stable Diffusion is a text-to-image model that uses a frozen CLIP ViT-L/14 text encoder to condition image generation on your prompt. It generates images through a diffusion process at runtime. Diffusion models are trained by gradually corrupting training images with random Gaussian noise over a series of steps, then learning to reverse that corruption. At generation time, the model starts from pure noise and progressively denoises it, step by step, until a coherent image that matches the prompt emerges.
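The forward (noising) half of that training process has a simple closed form. Here is a minimal NumPy sketch of it; the schedule values and array sizes are illustrative assumptions, not Stable Diffusion's actual configuration:

```python
import numpy as np

def add_noise(x0, t, betas):
    """Forward diffusion step: corrupt a clean image x0 to timestep t.

    Uses the standard Gaussian diffusion closed form
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps,
    where alpha_bar_t is the cumulative product of (1 - beta).
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]        # how much of the original survives
    eps = np.random.randn(*x0.shape)         # fresh Gaussian noise
    return np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps

# A toy linear noise schedule over 1000 steps
betas = np.linspace(1e-4, 0.02, 1000)
image = np.random.rand(8, 8)                 # stand-in for a clean image
noisy = add_noise(image, t=999, betas=betas) # near-pure noise at the final step
```

The model is trained to predict and remove `eps`; generation simply runs this process in reverse, starting from random noise.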


The Stable Diffusion online generator uses Contrastive Language-Image Pre-training (CLIP). CLIP is a machine learning model developed by OpenAI that learns to perform various tasks by training on large datasets of paired text and images. The model is trained to maximise the match between text and image inputs. CLIP does not generate images itself; it scores how well a piece of text matches an image, and that learned association between language and pictures is what lets the diffusion model steer generation toward your prompt.
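CLIP's core trick is to embed images and captions in the same vector space and train so that matching pairs score highest. This toy NumPy sketch mimics that matching step with made-up embeddings (the real model uses learned neural encoders, not random vectors):

```python
import numpy as np

def contrastive_logits(img_emb, txt_emb, temperature=0.07):
    """Toy version of CLIP's image-text matching.

    Both embedding matrices are L2-normalised, then every image is
    compared against every caption via a cosine-similarity matrix.
    Training pushes the diagonal (the true pairs) to dominate each row.
    """
    img = img_emb / np.linalg.norm(img_emb, axis=1, keepdims=True)
    txt = txt_emb / np.linalg.norm(txt_emb, axis=1, keepdims=True)
    return img @ txt.T / temperature

rng = np.random.default_rng(0)
images = rng.normal(size=(4, 512))                       # stand-in image embeddings
texts = images + rng.normal(scale=0.1, size=(4, 512))    # matching captions, slightly perturbed
logits = contrastive_logits(images, texts)
best = logits.argmax(axis=1)  # each image should pick its own caption
```

In the trained model, each row's highest score lands on the correct caption, which is exactly the behaviour the toy data reproduces here.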


ViT-L/14 refers to the specific Vision Transformer used as CLIP's image encoder: a "Large" model that splits each input image into 14x14-pixel patches. This version of CLIP was trained on roughly 400 million image-text pairs collected from the web, and it delivers strong performance on a variety of tasks, including image classification, image retrieval, and text-to-image generation.

Prompt Fundamentals

In order to produce graphics with a specific style, the text prompts must be made in a particular format. This is usually accomplished by adding prompt modifiers, or keywords and key phrases added to prompts. There is a burgeoning online ecosystem of resources, such as lists of awesome AI art prompts and guides to help beginners get started. However, it takes a lot of trial and error and experimentation to master prompt engineering.

Keep in mind that prompt modifiers are weighted: words near the beginning of a prompt carry more weight than words near the end. Here are some basics about creating prompts that will help you on your exciting AI art journey:
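Because earlier words count for more, it helps to assemble prompts with the subject first and modifiers after. The hypothetical helper below sketches that ordering; real interfaces just take the final string, so this is only one reasonable way to organise your prompt-building:

```python
def build_prompt(subject, style=None, adjectives=(), action=None, scene=None):
    """Assemble a prompt with the most important terms first.

    The subject leads (it gets the most weight), followed by action and
    scene context, then mood adjectives, and finally style modifiers.
    """
    parts = [subject]
    if action:
        parts.append(action)
    if scene:
        parts.append(scene)
    parts.extend(adjectives)
    if style:
        parts.append(style)
    return ", ".join(parts)

prompt = build_prompt(
    subject="a garden mouse",
    action="nibbling a strawberry",
    scene="in a sunlit cottage garden",
    adjectives=("whimsical", "highly detailed"),
    style="oil on canvas in the style of van Gogh",
)
# → "a garden mouse, nibbling a strawberry, in a sunlit cottage garden,
#    whimsical, highly detailed, oil on canvas in the style of van Gogh"
```

Reordering the same pieces (style first, subject last) will usually produce a noticeably different image, which is a quick way to see the weighting effect for yourself.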

Subjects

The concept of identifying subjects is pretty straightforward. Use your imagination to describe the scene, object, or character you want in your art; the subject is the most basic building block of any prompt. Although a prompt can be written without a subject, it will be difficult to control the image generation process and you might get undesirable results. 

Style Modifiers

To get the outcomes you want, it's useful to include clear and detailed style modifiers. You can add style modifiers to generate images with specific genres, such as oil on canvas paintings, cartoons, anime, cubism, photography, etc. You can also mention specific artists such as Francisco Goya, van Gogh, Pablo Picasso, or Yayoi Kusama. Style modifiers can include information about the time period, material, medium, or technique. 

Adjectives

Adjectives can boost aesthetic qualities and the level of detail of your project. Examples of the type of adjectives you can use to add mood to your art are fantastic, epic, elegant, melancholic, etc. 

Action and Scene

Actions indicate what the subject is doing, while scenes describe where it is taking place.

Repetition 

Repetition can reinforce associations formed by generative systems. For instance, including both phrases "a garden mouse" and "a mouse in a garden" in a prompt will likely generate better results than using either phrase by itself. Describing the same object in different ways, including with synonyms, also tends to make the text-to-image process more accurate by reinforcing the concept for the model.

Negative Prompts

Negative prompts tell the AI to remove certain subjects and styles from the results. For example, VQGAN-CLIP tends to generate red heart-shaped images when "love" is included in the prompt. Adding "heart:-1" to the prompt assigns that word a negative weight and discourages hearts from appearing. (Many Stable Diffusion interfaces also provide a dedicated negative prompt field for the same purpose.)
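The "keyword:weight" notation is just text until something parses it. This hypothetical parser shows how a VQGAN-CLIP-style weighted prompt might be split into phrase/weight pairs (the "|" separator and default weight of 1.0 are assumptions about that style of interface, not a universal standard):

```python
def parse_weighted_prompt(prompt):
    """Split a weighted prompt into (phrase, weight) pairs.

    Phrases are separated by '|'; an optional ':w' suffix sets the
    weight (default 1.0). Negative weights steer the model away
    from that concept.
    """
    def is_number(s):
        try:
            float(s)
            return True
        except ValueError:
            return False

    parts = []
    for chunk in prompt.split("|"):
        phrase, sep, weight = chunk.rpartition(":")
        if sep and is_number(weight.strip()):
            parts.append((phrase.strip(), float(weight)))
        else:
            parts.append((chunk.strip(), 1.0))
    return parts

pairs = parse_weighted_prompt("a painting of love | heart:-1")
# → [('a painting of love', 1.0), ('heart', -1.0)]
```

Everything with a positive weight pulls the image toward that concept; the negative-weighted "heart" pushes it away.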

In Conclusion

AI presents many possibilities to transform the creative economy, including using Stable Diffusion for commercial purposes. Professionals and hobbyists can now quickly bring their ideas to life with this open-source software and use it to illustrate a children’s book, make a storyboard for a movie, create images for stock photos, or develop a new logo for a company. It is genuinely a game-changer.


Join the fun and start exploring the world of AI-generated art by signing up for a free account on NightCafe today!