Text-to-Image Chains
This page shows how to create text-to-image workflows using chains on Flush.
Example 1: Prompt Enhancement
Many complex image generation workflows use LLMs to substantially improve prompts before they reach Stable Diffusion. Below, we show a simple example of using an LLM to enhance a starter prompt, then generating images with a text-to-image model on Flush.
First, let's initialize our models. We'll use Flush's wrapper for OpenAI's GPT-4 as the LLM and Stable Diffusion XL as the text-to-image model.
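A minimal sketch of this setup step. The class names, constructors, and `generate` methods below are illustrative stand-ins, not Flush's actual API; they echo canned responses so the sketch runs offline.

```python
class GPT4:
    """Stand-in for Flush's GPT-4 wrapper (assumed name)."""

    def __init__(self, api_key: str):
        self.api_key = api_key

    def generate(self, prompt: str) -> str:
        # A real wrapper would call the OpenAI API; here we return a
        # canned "enhanced" prompt so the example is self-contained.
        return f"Highly detailed, cinematic photograph of {prompt}, 8k, sharp focus"


class StableDiffusionXL:
    """Stand-in for Flush's Stable Diffusion XL wrapper (assumed name)."""

    def __init__(self, api_key: str):
        self.api_key = api_key

    def generate(self, prompt: str) -> str:
        # A real wrapper would return image data or a URL.
        return f"<image generated from: {prompt}>"


llm = GPT4(api_key="YOUR_API_KEY")
diffusion = StableDiffusionXL(api_key="YOUR_API_KEY")
```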
Now, let's define a prompt to pass into this chain. We use Flush's prompt templates to create a prompt that asks GPT-4 to produce a Stable Diffusion-optimized prompt from just a topic:
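A sketch of such a template. Flush's prompt templates use Python f-string syntax with named placeholders; the instruction wording below is our own, not taken from Flush.

```python
# Template with a {subject} placeholder, filled in when the chain runs.
template = (
    "Write a detailed, high-quality prompt for a text-to-image model "
    "about the following subject: {subject}. Include style, lighting, "
    "and composition details."
)

# Example of how the placeholder gets substituted:
print(template.format(subject="urban photography"))
```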
We can then pass a topic into the subject parameter of our chain, and GPT-4 will generate a prompt optimized for Stable Diffusion. Now, let's put everything together with our Chain.
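A self-contained sketch of the chain: the filled-in template becomes the LLM's prompt, and the LLM's output (llm_output) becomes the diffusion model's prompt. The `Chain` class and the stub models are assumptions for illustration, not Flush's actual classes.

```python
class EchoLLM:
    """Stub LLM that marks its input as 'enhanced'."""
    def generate(self, prompt: str) -> str:
        return f"enhanced: {prompt}"


class EchoDiffusion:
    """Stub diffusion model that wraps its prompt in an image marker."""
    def generate(self, prompt: str) -> str:
        return f"image<{prompt}>"


class Chain:
    """Minimal chain: template -> LLM -> diffusion model."""

    def __init__(self, llm, diffusion, template: str):
        self.llm = llm
        self.diffusion = diffusion
        self.template = template

    def run(self, **params) -> str:
        # Fill the template, send it to the LLM...
        llm_output = self.llm.generate(self.template.format(**params))
        # ...then use the LLM's output as the diffusion prompt.
        diffusion_output = self.diffusion.generate(llm_output)
        return diffusion_output


chain = Chain(EchoLLM(), EchoDiffusion(), "A prompt about {subject}")
print(chain.run(subject="urban photography"))
```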
In the chain we constructed above, llm_output is the output from the LLM (refer here for more information about formatting chains); it is then passed as the prompt to the Stable Diffusion model.
Finally, diffusion_output is the final output returned from this chain.
Note that the first model in the chain accepts either a PromptTemplate or a string, but every prompt after that must be a string in PromptTemplate format, i.e. a Python f-string. We can also incorporate the previous llm_output into this prompt if desired:
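A sketch of referencing the previous model's output: a later prompt in the chain can use an llm_output placeholder (f-string style) that the chain fills with the preceding LLM's result. The extra style keywords here are our own example.

```python
# Later prompt in the chain: {llm_output} is replaced with the
# previous LLM's output before being sent to the next model.
followup = "{llm_output}, trending on artstation, golden hour lighting"

# Hypothetical previous LLM output, for illustration:
llm_output = "A moody street scene in the rain"
print(followup.format(llm_output=llm_output))
```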
We can then run this chain with the following command:
Note how we pass in “urban photography” as the subject due to the prompt template we defined above. This gives us the following output:
This is a significant improvement over simply passing "urban photography" as the prompt into Stable Diffusion XL: