This article covers building prompts in Perchance's text-to-image-plugin. If you want to know how diffusion models work, see this cool website!
The text-to-image-plugin is funded with ads, so ads will appear on your generator for non-logged-in users if you import this plugin. See the plugin-page for other notes.
The Text-to-Image model uses Stable Diffusion and by using it you agree to the Licensing and Terms and Conditions stated in this license.
Per Section III Paragraph 6, Perchance claims no rights in the Output you generate with the model. You (the user) are accountable for the Output you generate and its subsequent uses.
Here are the Use-based restrictions stated in the License (see Attachment A at the bottom of the license):
You agree not to use the Model or Derivatives of the Model:
First we will talk about terms that we will be using in this article:
Based on the text-to-image-plugin page, we can do two things with prompt settings: (1) have them all inline or (2) have them in a list.
We would recommend using the list method like so:
promptData
  prompt = painting of [character] in [place], [season]
  seed = 123
  size = 400
  style = border:4px solid blue; margin-top:20px;
Compared to:
prompt
  [character] in [place] (size:::400) (seed:::123)
In the long run, the list form will be more readable than the inline method.
The settings that are important are (in order of importance):
1. prompt
2. negativePrompt
3. guidanceScale
4. resolution
5. seed
6. width (for portrait, height if landscape, size if square)
Here is a template of those settings.
promptSetting
  prompt =
  negativePrompt =
  guidanceScale =
  resolution =
  seed =
  width =
For managing the prompt and tags, we also recommend writing them across multiple lines instead of inline or on one line.
If you haven't checked out the User Input and Output Formatting article, which talks about the $output keyword, we recommend reading through it first, as it will make the following formatting easier to understand.
The following are the benefits of having the tags like so:
Here is an example of the multiline Prompt and Tags Management:
rawPrompt = ...
prompt
  $output = [this.selectAll.joinItems(", ")]
  [rawPrompt]
  [tags]
tags
  $output = [this.selectAll.joinItems(", ")]
  tag1
  tag2
  tag3
  tag4
  ...
negPrompt
  $output = [this.selectAll.joinItems(", ")]
  tag1
  tag2
  tag3
  tag4
  ...
promptSetting
  prompt = [prompt]
  negativePrompt = [negPrompt]
  guidanceScale = 7
  resolution = 512x512
  seed = -1
  size = 400
Notice that in the prompt list we join the items of each list with a comma. Since calling a list outputs its joined items, the result is immediately one long list of words delimited by commas.
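As a rough illustration of what the joined lists evaluate to, the sketch below mimics the result in plain JavaScript. The joinTags helper and the sample values are hypothetical; the real joining is done by Perchance's selectAll.joinItems, not by this code.

```javascript
// Hypothetical stand-in for [this.selectAll.joinItems(", ")].
// It only mimics the joined result; it is not Perchance's implementation.
function joinTags(items) {
  return items.filter((item) => item.trim() !== "").join(", ");
}

// Sample values (assumptions for illustration only).
const rawPrompt = "painting of a fox";
const tags = joinTags(["tag1", "tag2", "tag3"]);

// The prompt list joins its own items, which are already joined strings.
const prompt = joinTags([rawPrompt, tags]);
console.log(prompt); // painting of a fox, tag1, tag2, tag3
```

Because every list outputs an already joined string, nesting lists inside lists still ends up as one flat, comma-separated prompt.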
Here is an example of difference between prompts and seeds to get variations.
On the first row, we have the same seed but different prompts or tags, where the color differs. Looking at them, the images are almost the same, though with some variations.
On the second row, we have the same prompt but different seeds. Each image differs from the others, since each has a different base 'noise' to make an image from.
If you want a different image, change the seed but keep the same prompt. If you want a variation of the same image, change the prompt but keep the same seed.
By default, the seed is randomized.
Change Seed but Same Prompt = Different Image, Change Prompt but Same Seed = Different Variation.
Here is the generator used for the previous image.
NOTE: The Image Seed can't return the exact same picture*
*The model and generation settings use an Ancestral Sampler, which doesn't converge. This means slight variations/differences between images, though the main 'essence' of the image is still there. See this thread for an explanation/demo.
NOTE: Since the AI model will be updated when a newer version is available, your Seed might not return the same result in the future!
Here is an example of how negative prompting affects the image.
Then, upon applying the negativePrompt with the same prompt and seed, we can see that the items specified in the negativePrompt are changed, removed, or lessened, while the essence of the base image (the one without the negative prompt) is retained.
Here is the generator used for the previous image.
In adding tags and negative prompts, order matters.
Here is the generator used for the previous image.
For creating a good image, a usual order is:
First is the raw prompt: what we want in our image. Then comes the content type: what type of image we want. Next is the description: how we would describe our image, what details it has, etc. Then comes the style of the image: do we want it to look like something a popular artist makes, etc. Lastly comes the composition: how the image is composed or laid out.
This tag/prompt order is just a guideline. You can rearrange the tags according to what you want applied to the image first.
For example, if you want to set the composition of the image before specifying the style, you can rearrange them and see what suits your image best.
Here is an example of the prompt robot, geometric, 90s style, cinematic shot rearranged:
Here is the generator used for the previous image.
We would also recommend placing the tag categories in their own lists like so:
tag_contentType
  $output = [this.selectAll.joinItems(",")]
  ...
tag_description
  $output = [this.selectAll.joinItems(",")]
  ...
tag_style
  $output = [this.selectAll.joinItems(",")]
  ...
tag_composition
  $output = [this.selectAll.joinItems(",")]
  ...
tags
  $output = [this.selectAll.joinItems(",")]
  [tag_contentType]
  [tag_description]
  [tag_style]
  [tag_composition]
Then, you can use this Permutation Calculator to generate possible arrangements of your tag categories.
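If you would rather generate the arrangements yourself, a small sketch like the following lists every ordering of the four category lists. The permutations helper is our own illustration and is not part of Perchance or the plugin.

```javascript
// Return every ordering of the given items (n! arrangements).
function permutations(items) {
  if (items.length <= 1) return [items.slice()];
  const result = [];
  items.forEach((item, index) => {
    // Everything except the item chosen for the first position.
    const rest = [...items.slice(0, index), ...items.slice(index + 1)];
    for (const tail of permutations(rest)) {
      result.push([item, ...tail]);
    }
  });
  return result;
}

const categories = ["tag_contentType", "tag_description", "tag_style", "tag_composition"];
const arrangements = permutations(categories);
console.log(arrangements.length); // 24
console.log(arrangements[0].join(", "));
```

Four categories give 24 arrangements, so it is practical to try them all against a fixed seed.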
Here is a template with all of the settings to get you started with the prompting.
Guidance Scale determines how strongly the AI will try to match the given prompt. Here is an example of it.
The prompt is woman portrait, goth aesthetic, 90s style, cinematic shot. The guidance scale ranges from 1 to 30, and as we increase it, the AI tries harder to match the given prompt.
Here is a prompt: woman portrait, hippie aesthetic, 90s style, cinematic shot with steps of 5 in guidance scale (5 to 30):
Here is the generator used for the previous image.
The default guidance scale is 7.
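A sweep like the 5 to 30 example can be prepared programmatically. The sketch below only builds plain settings objects; the guidanceSweep helper is hypothetical, and the fixed seed is an assumption so that only the scale changes between images.

```javascript
// List the guidance scales from start to end (inclusive) in fixed steps.
function guidanceSweep(start, end, step) {
  const scales = [];
  for (let scale = start; scale <= end; scale += step) {
    scales.push(scale);
  }
  return scales;
}

const scales = guidanceSweep(5, 30, 5);

// One settings object per scale. The seed value is an assumption:
// keeping it fixed isolates the effect of guidanceScale.
const sweepSettings = scales.map((guidanceScale) => ({
  prompt: "woman portrait, hippie aesthetic, 90s style, cinematic shot",
  guidanceScale,
  seed: 123,
}));
console.log(scales.join(", ")); // 5, 10, 15, 20, 25, 30
```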
Resolution determines the size of the image that will be generated. Here is an example of it:
The prompt is woman portrait, hippie aesthetic, 90s style, cinematic shot. Here we can see that 512x768 is good for portraits, while 768x512 is good for a cinematic portrait shot. 512x512 is the default.
The Guidance Scale here is the default (7).
Here is the goth aesthetic prompt:
Here is the robot, geometric prompt:
Here is the abstract geometric red prompt:
Here is the generator used for the previous images.
Since the image mostly depends on the seed, we can 'hunt' for seeds by either setting a start and end seed or generating a series of numbers to input to the seed setting of the prompt.
NOTE: Since the AI model will be updated when a newer version is available, your Seed might not return the same result in the future!
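Both ways of hunting can be sketched in a few lines. The seedRange and randomSeeds helpers are hypothetical, and the upper bound for random seeds is an assumption, since this article does not state the plugin's valid seed range.

```javascript
// Option 1: every seed between a start and an end value (inclusive).
function seedRange(start, end) {
  const seeds = [];
  for (let seed = start; seed <= end; seed++) {
    seeds.push(seed);
  }
  return seeds;
}

// Option 2: a batch of random seeds.
// maxSeed is an assumed bound, not a documented plugin limit.
function randomSeeds(count, maxSeed = 1000000) {
  return Array.from({ length: count }, () => Math.floor(Math.random() * maxSeed));
}

// Pair each candidate seed with a simple prompt to judge its 'feel'.
const candidates = seedRange(100, 110).map((seed) => ({ prompt: "portrait", seed }));
console.log(candidates.length); // 11
```

Once a seed looks promising, keep it and start changing the prompt and tags instead.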
Suppose we have seed: 104 and resolution: 768x512. Changing the prompt while retaining those parameters will look like this:
For example, if you are searching for a good portrait seed, you can use just the prompt portrait and see the 'feel' of the seed with that simple prompt.
Here is the 'basic' portrait from the seed 104 and resolution 512x768 and the result of the original prompts:
Here is the generator for the example of seed hunting (might be better to create your own)
Now that we have the seed and resolution, we can now test the effects of tags. For example, what would happen after a tag is added or rearranged.
goth aesthetic:
You can also apply this to the negativePrompt. Here is an example:
Here we use painting, monochrome, cosmetics, worse quality as the negative prompt. We can see that adding monochrome reduced the black and white in the image. Here it is on the hippie aesthetic prompt:
Here is what happens when the tags in the prompt are sequential while the negativePrompt is static, and when the prompt is static while the negative prompt is not:
Here is the generator used in the previous examples.
As previously stated, guidance scale is how strongly the model tries to match the given prompt. Previously, we changed the 'global' guidance scale, but we can also change an individual tag's guidance scale, also called emphasis.
By default, a tag's scale is 1. To increase it, enclose the tag in parentheses, which multiplies its scale by 1.1, giving 1*1.1. Enclosing it in two pairs of parentheses, i.e. ((tag)), multiplies it by 1.1 once again, giving 1*1.1*1.1.
The equation is 1.1**N, where N is the number of pairs of parentheses.
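To see the numbers, the sketch below evaluates 1.1**N for the first few nesting levels. The helper names are ours and purely illustrative.

```javascript
// Emphasis produced by N pairs of parentheses: 1.1 ** N.
function emphasisWeight(pairs) {
  return 1.1 ** pairs;
}

// Wrap a tag in the given number of parenthesis pairs.
function wrapTag(tag, pairs) {
  return "(".repeat(pairs) + tag + ")".repeat(pairs);
}

for (let pairs = 0; pairs <= 3; pairs++) {
  console.log(wrapTag("goth aesthetic", pairs), "=>", emphasisWeight(pairs).toFixed(2));
}
// goth aesthetic => 1.00
// (goth aesthetic) => 1.10
// ((goth aesthetic)) => 1.21
// (((goth aesthetic))) => 1.33
```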
Here is an example of it in the goth aesthetic tag:
We can also set the value directly with (tag:value), which we can see here:
We can also emphasize the tags in the negative prompt like so:
Here the monochrome tag is emphasized, and we can see that the lipstick of the portrait turned red and the dress is somewhat purplish due to the brownish color filter.
Here are both in the hippie aesthetic:
Here are both for man portrait:
Here is the generator used for the previous images.
We recommend using the (tag:value) method, since multiple parentheses hurt readability. You can achieve the same values as multiple parentheses using the following equation, where value is the number of pairs of parentheses:
  (tag:[(1.1**value).toFixed(2)])
You can also group multiple tags, like (tag1, tag2, tag3:value), so that they all have the same emphasis.
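As a sketch of that conversion, the helper below rewrites a tag wrapped in nested parentheses into the explicit (tag:value) form, and a second helper builds a grouped emphasis. Both functions are hypothetical illustrations; they assume balanced parentheses and a tag that does not already carry an explicit value.

```javascript
// Convert "((tag))" style emphasis into the explicit "(tag:value)" form.
function toExplicitEmphasis(text) {
  const open = (text.match(/^\(+/) || [""])[0].length;
  const close = (text.match(/\)+$/) || [""])[0].length;
  // Leave the text alone if it is not wrapped in balanced parentheses.
  if (open === 0 || open !== close) return text;
  const tag = text.slice(open, text.length - close);
  // Leave tags that already have an explicit value untouched.
  if (tag.includes(":")) return text;
  return `(${tag}:${(1.1 ** open).toFixed(2)})`;
}

// Give several tags the same emphasis: (tag1, tag2, tag3:value).
function groupEmphasis(tags, value) {
  return `(${tags.join(", ")}:${value})`;
}

console.log(toExplicitEmphasis("((goth aesthetic))")); // (goth aesthetic:1.21)
console.log(groupEmphasis(["tag1", "tag2", "tag3"], 1.2)); // (tag1, tag2, tag3:1.2)
```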
Automatic1111 uses [from:to:value] to blend tags. This does not work out of the box in Perchance, but we can blend tags using [from:to:value] by adding .map(a => a.getRawListText) in the $output like so:
tags
  $output = [this.selectAll.map(a => a.getRawListText).joinItems(",")]
  ...
To format the blended tag, we need to escape the [] in Perchance like so:
tags
  $output = [this.selectAll.map(a => a.getRawListText).joinItems(",")]
  \[from:to:amount\]
The value determines how much of the from tag is present in the blend. For example:
- [from:to:0] - which means 0 of the from tag, and 1 of the to tag.
- [from:to:0.25] - which means 0.25 of the from tag, and 0.75 of the to tag.
- [from:to:0.75] - which means 0.75 of the from tag, and 0.25 of the to tag.
- [from:to:1] - which means 1 of the from tag, and 0 of the to tag.
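The shares above can be written down as a tiny calculation, together with a helper that produces the escaped text for a Perchance list. Both helpers are hypothetical; they only follow the description in this article and do not call the plugin.

```javascript
// Shares of each tag for a given blend value, as described above.
function blendShares(value) {
  if (value < 0 || value > 1) {
    throw new RangeError("value must be between 0 and 1");
  }
  return { from: value, to: 1 - value };
}

// Build the escaped blend text to place inside a Perchance list.
function blendTag(from, to, value) {
  return `\\[${from}:${to}:${value.toFixed(2)}\\]`;
}

console.log(blendShares(0.25)); // { from: 0.25, to: 0.75 }
console.log(blendTag("hippie", "goth", 0.25)); // \[hippie:goth:0.25\]

// A sweep from 0.00 to 1.00 in 0.1 increments (11 entries).
const sweep = Array.from({ length: 11 }, (_, i) => blendTag("hippie", "goth", i / 10));
```

Dividing an integer counter by 10 avoids the rounding drift you would get from repeatedly adding 0.1.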
NOTE: If you are going to display the prompt in the generator, only single-word tags are allowed in the blend (no tags with spaces, etc.), since anything else will throw an error.
Here is [hippie:goth:value], where value goes from 0.00 to 1.00 in 0.1 increments:
To see the [from:to:value] that was used, hover on the image, since Perchance evaluates it upon displaying it on the page.
Here is with [goth:hippie:value]:
Here is with 0.00 to 1.00 with 0.05 increment:
goth:hippie
hippie:goth
Here is the generator used in the images above.
Another way to blend them is like the following:
  (tag1:1) (tag2:1.2)
You can emphasize the tag that you want to apply on top of the first tag. Order still matters: which tag is first and which is second. Here is an example of it:
Here, goth aesthetic is first and hippie aesthetic is second. Here is hippie aesthetic first and goth aesthetic second.
Here is the generator used in the images above.
This also works for the negative prompts.
Here is another example of blending with de-emphasizing.
Here are man portrait and woman portrait, each with de-emphasis on the other.
Here is the generator used in the images above.
Another way of 'blending' tags is using AND or OR on the tags. Here is an example.
With goth aesthetic and hippie aesthetic, we can see that AND tries to incorporate both, while OR tries to incorporate either. The OR method looks like the (hippie aesthetic:1.0) (goth aesthetic:1.0) method above.
Tag order still seems to matter.
Here is the generator used in the image above.
Here are the images with the tags de-emphasized and emphasized (from 0.5 to 1.5) and ANDed.
Here it is with goth aesthetic first.
Here is the generator used in the images above.
Some say that using commas to separate the tags in a prompt doesn't affect the overall image. However, sometimes using commas to separate the prompt does have a significant effect on the image:
- (woman portrait goth aesthetic, 90s style, cinematic shot) seems to have more contrast in the image.
- (woman portrait goth aesthetic 90s style, cinematic shot) seems to be the only one without black lipstick.
hippie aesthetic:
Feel free to change up your commas and see a difference in the images.
Here is the generator used in the images above.
NOTE: You can also try using . instead of , and putting a space between the , or . and a tag, i.e. tag1(space),(space)tag2 or tag1 . tag2
You can add and remove tags at a certain point on the generation using the following syntax:
- [to:when] - adds to to the prompt once the generation reaches step when.
  A hill [with a monolith:0.5] means:
  - A hill - for the first half of the generation steps
  - A hill with a monolith - for the remaining generation steps
- [from::when] - removes from from the prompt once the generation reaches step when.
  A hill [with a monolith::0.5] means:
  - A hill with a monolith - for the first half of the generation steps
  - A hill - for the remaining generation steps
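To make the timing concrete, here is a small sketch that resolves such a prompt at a given point of the generation. It is our own illustration, not the model's actual parser: promptAt is a hypothetical helper, when is assumed to be a fraction between 0 and 1, and the exact behavior at the boundary step is an assumption.

```javascript
// Resolve [to:when] and [from::when] blocks at a given progress value
// (0 = start of generation, 1 = end). Simplified illustration only.
function promptAt(prompt, progress) {
  return prompt
    // [from::when]: keep "from" until progress reaches "when".
    .replace(/\[([^\[\]:|]+)::([\d.]+)\]/g, (_, from, when) =>
      progress < Number(when) ? from : "")
    // [to:when]: add "to" once progress reaches "when".
    .replace(/\[([^\[\]:|]+):([\d.]+)\]/g, (_, to, when) =>
      progress >= Number(when) ? to : "")
    // Tidy up the spaces left behind by removed blocks.
    .replace(/\s+/g, " ")
    .trim();
}

console.log(promptAt("A hill [with a monolith:0.5]", 0.25)); // A hill
console.log(promptAt("A hill [with a monolith:0.5]", 0.75)); // A hill with a monolith
console.log(promptAt("A hill [with a monolith::0.5]", 0.25)); // A hill with a monolith
console.log(promptAt("A hill [with a monolith::0.5]", 0.75)); // A hill
```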
NOTE: Remember to escape the [] with \[ and \] so Perchance wouldn't throw an Error.
In the examples below, when changes by 0.05 per step.
Here is the hippie aesthetic being added at different steps upon generation.
Here is the goth aesthetic being removed at different steps upon generation.
Here is the generator used in the images above.
You can change when the tag is applied depending on the step. This is also known as 'Alternating Words' in Automatic1111. The syntax is:
- [tag1|tag2] - the generation starts with tag1, the next step uses tag2, then it alternates back to tag1, and so on.
- [tag1|tag2|...|tagN] - the generation alternates between the tags per step until tagN, then loops back to the start.
- [tag1|] - the generation starts with tag1, the next step is blank, then tag1, and so on, i.e. the tag is applied every two steps, on even steps: 0,2,4,...
- [|tag1] - the generation starts with blank, then tag1, then blank, i.e. the tag is applied every two steps, on odd steps: 1,3,5,...
- [tag1||] - tag1 is applied every three steps starting from 0, i.e. 0,3,6,...
- [|tag1|] - tag1 is applied every three steps starting from 1, i.e. 1,4,7,...
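The pattern above is a simple modulo over the options. The sketch below reproduces the step numbers from the list; alternatingAt and activeSteps are hypothetical helpers for illustration, and steps are counted from 0.

```javascript
// Which option of an alternating block is active at a given step (0-based).
function alternatingAt(block, step) {
  const options = block.replace(/^\[|\]$/g, "").split("|");
  return options[step % options.length];
}

// Steps (0-based) at which a tag is active within the first `steps` steps.
function activeSteps(block, tag, steps) {
  const active = [];
  for (let step = 0; step < steps; step++) {
    if (alternatingAt(block, step) === tag) active.push(step);
  }
  return active;
}

console.log(activeSteps("[tag1|]", "tag1", 6));  // [ 0, 2, 4 ]
console.log(activeSteps("[|tag1]", "tag1", 6));  // [ 1, 3, 5 ]
console.log(activeSteps("[tag1||]", "tag1", 7)); // [ 0, 3, 6 ]
console.log(activeSteps("[|tag1|]", "tag1", 8)); // [ 1, 4, 7 ]
```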
NOTE: Remember to escape the [] with \[ and \] so Perchance wouldn't throw an Error.
Here is goth aesthetic being applied at different intervals starting at 0:
Here is the generator used in the image above.
Here is goth being alternated with different keywords:
Here is goth being alternated with increasing keywords:
Here is the generator used in the images above.
BREAK Keyword
According to Automatic1111, tokens are separated into chunks of 75. That is, if your prompt has over 75 tokens, another chunk is created.
We can use the BREAK keyword to force that chunk separation without needing to fill the remaining tokens.
For example, woman portrait, goth aesthetic, 90s style BREAK cinematic shot would be something like:
  ['woman','portrait',',','goth','aesthetic',',','90s','style',0,0,0,0,...,0]
  ['cinematic','shot']
Here is the prompt with BREAK in different places of the prompt:
NOTE: You can also add a , before the BREAK, after it, or both, for example , BREAK or BREAK , or , BREAK ,. Use whichever generates a good image for you.
Here is the generator used in the images above.
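The chunking illustrated earlier can be sketched as follows. This is a simplified model: roughTokens is a hypothetical word-and-comma splitter, while the real model uses its own tokenizer, so actual token counts will differ.

```javascript
const CHUNK_SIZE = 75;

// Very rough tokenizer: words and commas only (an assumption for
// illustration; the model's real tokenizer splits text differently).
function roughTokens(text) {
  return text.match(/[^\s,]+|,/g) || [];
}

// Split a prompt on BREAK and report how much padding each chunk needs.
function chunkPrompt(prompt) {
  return prompt.split(/\s*\bBREAK\b\s*/).map((part) => {
    const tokens = roughTokens(part);
    return { tokens, padding: Math.max(0, CHUNK_SIZE - tokens.length) };
  });
}

const chunks = chunkPrompt("woman portrait, goth aesthetic, 90s style BREAK cinematic shot");
console.log(chunks[0].tokens.length, chunks[0].padding); // 8 67
console.log(chunks[1].tokens.join(" ")); // cinematic shot
```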
Here is another prompt example from this reddit thread about the BREAK keyword:
This example adds a BREAK after each prompt. It also showcases the BREAKs with a , before, after, or not at all.
Here is the generator used in the images above.
Based on the dev's comment about the models, the AI model used will be changed based on the keywords/prompt.
Here are some 'normal' keywords that trigger a different model, which will output quite a different result due to the model being used.
- my little pony, sonic, chimera, faun, goblin, dryad, mer(maid|man), humanoid, imp, android, cyborg, anthropomorphic, anthro, + any gender, slime girl, aron, twili, centaur, fursuit, paw, avian
- abra
- anime, slime girl, danbooru, pixiv
NOTE: These keywords aren't final, and may be changed without notice.
The list of available AI models isn't public, and further testing may be needed to know which keywords trigger which models.
You can visit the following sites for prompt inspirations:
Here is a generator that showcases artists and their effect on the Text to Image Prompt:
Less is usually more, but you can fine-tune with hundreds of tags/negative prompts.
t2i-framework Plugin
You can use the prompt techniques here in the textarea of the t2i-framework generators like so:
You can get the input settings upon generation finish like so:
promptSetting
  prompt = [prompt]
  negativePrompt = [negPrompt]
  guidanceScale = 7
  resolution = 512x512
  seed = -1
  size = 400
  onFinish(data) =>
    console.log(data.inputs)
    // You can access the inputted settings
    // with `data.inputs.prompt`, `data.inputs.seed`, etc.
This guide will not talk about how to use the gallery. Please see the Plugin Page for the settings and how to use the gallery.