Image to Prompt - How to reverse engineer a prompt from just an image
Let's learn a few tricks and a few tools for creating your own images from inspiration you find
Hello from sunny Utah where some of the anticipated flooding is unfortunately happening. Hasn’t been too bad yet, hopefully that continues. Let’s dive into today’s topic.
I recently saw this sunflower explorer image online and thought “I love the feeling of this image, I want to make something similar.”
With AI image generators, this happens to me constantly, especially because I know there is some magic combination of model, prompt, and settings that would let me actually create something similar.
Today let’s examine the tools, tricks, and techniques we can use to go from image to prompt. We will examine how to:
Directly look the prompt up in metadata
Recognize art styles, which lets us describe the image we want to create
Use tools to generate a prompt from an image
But first, let’s talk about the morality of creating similar images.
Is it wrong to create similar images?
Steve Jobs once proclaimed, "Good artists copy, great artists steal." Although Steve attributed the quote to Picasso, its true origin remains uncertain. Nevertheless, the idea sheds light on the vigorous, ongoing debate between inspiration and theft.
Artists, like all humans, continuously learn from one another. In fact, artists copy each other so frequently that there's a specific term for this practice: master study.
A master study involves examining the process and creative choices of an accomplished painter to gain insight into how they arrived at their finished piece. Artists select another artwork, typically a famous or well-regarded one, and replicate it.
They do this to learn. And importantly, they don’t try to pass off their output as an original.
All of human culture is people observing the world, including things made by others, and creating their own versions.
Stealing is bad. Getting inspired by others is good and impossible to avoid. But the line between stealing art and getting inspired by it is a blurry one, and often comes down to intent and execution.
Here’s a famous example. In Macbeth, an apparition prophesies that “Macbeth shall never vanquished be” until the very forest marches on his castle. But then the English army marches on the castle holding branches from the forest, and Macbeth is vanquished.
J.R.R. Tolkien felt such “bitter disappointment and disgust” at this “shabby use” that, as he told the poet W.H. Auden, he invented a moving, talking forest, which actually uproots and goes to war in The Lord of the Rings. And for now, the public knows Tolkien’s trees better than Shakespeare’s. He stole like an artist.
Going back to painters, counterfeiters copy, and conceal that they are doing so. Students copy, as artistic training. Assistants copy, as labor and extra hands for more famous artists.
But where is the line?
Sturtevant is a famous artist who blatantly copied others, and she serves as an intriguing example.
Her rendition of Johns' Target with Four Faces recently sold for $314,000. The buyer wasn't deceived by a counterfeiter; rather, they saw value in Sturtevant's art and considered it to be genuine.
As Sturtevant shows, the border between original and copy, invention and plagiarism, is constantly up for negotiation. You can read more about Sturtevant here.
So is it wrong to create an image similar to someone else’s? No. It is part of the process of learning and creating.
Is it wrong to create similar images in an attempt to take that artist’s rewards, whether that be fame, money, or prestige? Yes.
But I can’t tell you where good copying ends and bad copying begins. You have to determine that for yourself.
From this point on in this article, let us assume together that we are creating similar images for good reasons: to learn, remix, and create art. Now let’s learn how to do it.
Directly look up the prompt for AI art
It may surprise you that a lot of images made with Stable Diffusion include the prompt and related settings in the metadata of the image itself. Just like in Zoolander, ‘the prompt is inside the image.’
This website explains how a PNG stores metadata and lets you upload images and read the data out.
Luckily it is even easier than that. Most of the Stable Diffusion GUIs can also read this information.
For example, in the popular Auto1111, you just go to the PNG Info tab and drag the image in. It will read the metadata, and you can then create your own image with just a click.
So if you see an image you like online that was created by AI art, try reading the metadata to see if the creation parameters are included.
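If you are curious what that metadata lookup involves, here is a minimal sketch that reads the text chunks out of a PNG using only the Python standard library. The helper name read_png_text_chunks and the file name in the usage note are my own placeholders. I am also assuming the image came from Auto1111, which as far as I know stores its settings under a key called parameters. Other tools may use different keys, so it is worth printing everything you find.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_text_chunks(data):
    """Return a dict of the text metadata (tEXt and iTXt chunks) in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    found = {}
    pos = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
    while pos + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        pos += 12 + length
        if chunk_type == b"tEXt":
            # keyword, a null byte, then Latin-1 text
            key, _, value = body.partition(b"\x00")
            found[key.decode("latin-1")] = value.decode("latin-1")
        elif chunk_type == b"iTXt":
            # keyword, null, compression flag, compression method,
            # language tag, null, translated keyword, null, UTF-8 text
            key, _, rest = body.partition(b"\x00")
            compressed = rest[0] == 1
            _, _, rest = rest[2:].partition(b"\x00")
            _, _, text = rest.partition(b"\x00")
            if compressed:
                text = zlib.decompress(text)
            found[key.decode("latin-1")] = text.decode("utf-8")
        elif chunk_type == b"IEND":
            break
    return found
```

To try it, open an image in binary mode, for example open("my_image.png", "rb").read(), pass the bytes to read_png_text_chunks, and look for a parameters entry in the result. An empty result usually means the metadata was stripped somewhere along the way.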
If the image has been shared a lot, you might be seeing a copy of a copy, often with metadata missing. You can use tools like TinEye, Google Images, or SauceNAO to try and find the earliest and hopefully original image.
But if the image was human-created, or the prompt info was not included, you can use knowledge of art styles to describe the image.
Being able to describe an image better lets you create other images like it
I would highly recommend you read this short essay ‘Magic for English Majors’ about prompting and AI art: https://oneusefulthing.substack.com/p/magic-for-english-majors
In it the author argues that the more you know about the history of art, and are aware of art styles in general, the more powerful image generation systems become.
He gives this image as an example.
If you don’t know how to describe these styles of art, you can’t replicate them.
But if you do, you can use them as inspiration to make your own.
From top right to bottom left, we have:
Spider-Man in the style of ukiyo-e
Spider-Man in the style of Mucha
Spider-Man in the style of a halftone offset lithograph
Spider-Man in the style of Man Ray
Spider-Man with a batik costume
Spider-Man in a German expressionist style
If you know the 5th image is Spider-Man in a batik costume, you can create something similar.
But you also gain so much more than the ability to create your own image: you gain a richer understanding of the world.
For example, batik is an Indonesian technique of wax-resist dyeing applied to the whole cloth. This technique originated from the island of Java, Indonesia.
Now you have a seed of inspiration. What else can you create in the batik style?
This is why there has been an explosion of interest in art styles, and artist styles.
For example this Notion doc lists various artists and their styles for Stable Diffusion.
This Google Doc does the same for Midjourney.
Note - there is another moral debate going on in addition to ‘steal vs inspire’: the use of artists’ images in training data. We will save that topic for another episode.
Learning art styles, art movements, and even technical terms, such as bokeh (the blurry backgrounds produced by high-end cameras), can help you create images.
With AI, prompts are like spells. Learning how to describe the world is like learning new spells.
Learning to describe the world, or imaginary worlds, better gives you more power.
There are also tools that can help you describe the world by asking the computer to describe it.
Tools to generate a prompt from an image
One of the breakthroughs that helped create AI image generators was OpenAI's CLIP.
CLIP (Contrastive Language-Image Pre-Training) is a neural network trained on a variety of (image, text) pairs. Given an image and a set of candidate text snippets, it predicts which snippet is the most relevant. That means you can give it an image, and tools built on it can describe the image.
This article does a good job of describing the technical details of CLIP.
CLIP is actually pretty wild. We can give the computer the task of describing an image, and it actually does it very well.
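The invention of CLIP is one of the things that allows AI image generators to work at all. A CLIP-guided image generator takes a prompt, generates an image, then feeds that image to CLIP to see if it is getting close to the prompt. Stable Diffusion takes a related route and uses a CLIP text encoder to turn the prompt into the signal that steers generation.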
We can also just use it to describe an image.
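To make the "most relevant text snippet" idea concrete, here is a toy sketch of the scoring step in plain Python. The three-number embeddings are made up purely for illustration. A real CLIP model produces vectors with hundreds of dimensions from its image and text encoders, and the function names here are my own. The core idea is that the image and each candidate phrase become vectors, and the phrase whose vector points in the most similar direction wins. CLIP interrogators work roughly this way, ranking long lists of candidate phrases against your image.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_captions(image_embedding, caption_embeddings):
    """Sort candidate captions from most to least similar to the image."""
    scores = {
        caption: cosine_similarity(image_embedding, embedding)
        for caption, embedding in caption_embeddings.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up embeddings, standing in for the output of CLIP's encoders.
image = [0.9, 0.1, 0.3]
captions = {
    "a field of sunflowers": [0.8, 0.2, 0.4],
    "a city street at night": [0.1, 0.9, 0.2],
    "a portrait of a cat": [0.2, 0.3, 0.9],
}

ranked = rank_captions(image, captions)
print(ranked[0][0])  # a field of sunflowers
```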
Here are 5 tools that can help generate prompts from images:
Auto1111 CLIP built-in
Replicate Image to Prompt
Midjourney /describe feature
Unprompt Image Search + prompts
Other CLIP Interrogators
Auto1111 CLIP built-in
Once again Auto1111 has CLIP description built in.
Go to the Interrogator tab and drag in your image, select the options, and click generate to describe the image.
The first time you run CLIP interrogator it will download a few gigabytes of models.
Replicate Image to Prompt
URL: https://replicate.com/methexis-inc/img2prompt
Image in, description out. methexis-inc hosts a public img2prompt tool on Replicate, which makes for a nice online option.
Just drag your image into the drop zone, and hit Submit. After some time, usually less than a few minutes, you get the output description.
Midjourney /describe feature
Midjourney recently released a new command, /describe.
If you need a Midjourney primer or refresher, you can read my article here.
Just type /describe, which will give you a drop zone for your image.
Drag your image in and Midjourney will make a Midjourney prompt for it.
Luckily, prompts are pretty interchangeable. Just discard the Midjourney parameters, which are always preceded by a double dash (--).
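If you are cleaning up a lot of prompts, here is a small sketch of that discard step. It assumes the parameters all sit at the end of the prompt, so it simply cuts everything from the first double dash onward. The function name and the sample prompt are my own, made up for illustration.

```python
import re


def strip_midjourney_params(prompt):
    """Return the prompt text with trailing --parameters removed."""
    # Cut at the first double dash that starts a parameter name. A single
    # hyphen inside a word (like "well-lit") is left alone.
    return re.split(r"(?:^|\s)--[A-Za-z]", prompt, maxsplit=1)[0].strip()


example = "a sunflower explorer, storybook illustration --ar 16:9 --v 5"
print(strip_midjourney_params(example))
# a sunflower explorer, storybook illustration
```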
In Midjourney you can click the number of the prompt you think will work best and it will submit that to be generated by Midjourney.
I will note that you are probably training Midjourney on what a good prompt is, but since any action taken on Midjourney trains it, I don’t think this is an issue.
Unprompt Image Search + prompts
URL: https://unprompt.ai/
Basically, a reverse image search, where the results include prompts.
Upload your image and see images like it, with their prompts. This can help you figure out how to describe your image.
Mousing over an image or clicking on it will show you the prompt details.
Sometimes it is missing the model, which for Stable Diffusion is very important, but it can be useful to gain additional ways of describing the image.
Other CLIP Interrogators
There are many CLIP Interrogators.
A few of the ones I have used include:
Results of image to prompt experiment
Using all the methods above, let’s try to recreate the original sunflower image. I’ll give the prompt for each, with bold being what changed.
I like the results. If I had more time to spend, I would continue to make the character more cartoonish, improve the background, and work on a warmer feeling.
However, I do think my version captures the feeling and movement of the original.
Using these tools and techniques, you can create your own images from inspiration you find online.
Good luck!
See you next week
-Josh
PS - a number of you indicated you would like a free mini-course on how to create apps with ChatGPT. I am going to make it, well two actually.
How to make apps with ChatGPT for non-coders and
How to make apps with ChatGPT for coders.