Stable diffusion for waitbutwhy
You thought I would be satisfied with my waitbutwhy AI generator in Look at my Cutie Pie? Quite the opposite. My friends complained that the stick man doesn’t look similar enough to the original, so I started training a Stable Diffusion model on it. Do you think I’m crazy? Funny content still demands quality!
My original drawing
I was using DALL·E 2 to generate Tim Urban’s stick man, and I tested lots of prompts. This is the output I was able to generate. It’s not too far off, but it is definitely a stretch to say that Tim drew it. Therefore, I decided to train a Stable Diffusion model on waitbutwhy’s drawings.
Stable diffusion
Here is a short overview of what Stable Diffusion is. Stable Diffusion is a deep learning text-to-image model released in 2022. Before getting to Stable Diffusion itself, it helps to know that a diffusion model is built on two processes.
Forward Diffusion Process → add noise to the image.
Reverse Diffusion Process → remove noise from the image.
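These two processes can be sketched in a few lines of plain Python. This is only a toy illustration, not the actual Stable Diffusion code: it assumes the linear noise schedule commonly used for DDPM-style models (1000 steps, beta going from 0.0001 to 0.02), it treats an "image" as a short list of numbers, and the function names are made up for this post. The reverse step here cheats by reusing the exact noise that was added; a real model trains a neural network to predict that noise instead.

```python
import math
import random


def make_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear noise schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(x0, t, alpha_bars, rng):
    """Forward process: jump to step t by mixing the clean values with Gaussian noise."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal_scale * x + noise_scale * n for x, n in zip(x0, noise)]
    return xt, noise


def reverse_with_known_noise(xt, noise, t, alpha_bars):
    """Reverse process (toy version): remove the noise, assuming we know it exactly.

    A trained diffusion model would predict `noise` with a neural network.
    """
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [(x - noise_scale * n) / signal_scale for x, n in zip(xt, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = make_alpha_bars()
    pixels = [0.1, -0.5, 0.9]  # a pretend 3-pixel image
    noisy, noise = forward_diffuse(pixels, 500, alpha_bars, rng)
    restored = reverse_with_known_noise(noisy, noise, 500, alpha_bars)
    print("noisy:   ", noisy)
    print("restored:", restored)
```

The later the step `t`, the smaller `alpha_bars[t]` gets, so less of the original image survives and more of the result is pure noise.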
Stable Diffusion is a “Latent Diffusion Model” (LDM), which means that the diffusion process happens in the latent space rather than directly on the pixels. That makes it faster than a diffusion model that works in pixel space.
What is a latent space? It is simply a compressed representation of the data in which similar data points sit closer together.
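To get a feel for how much smaller the latent space is, here is a rough back-of-the-envelope sketch. The numbers are assumptions taken from Stable Diffusion v1 (its encoder shrinks height and width by a factor of 8 and produces 4 latent channels), not something measured in this post, and the helper names are hypothetical.

```python
def latent_shape(height, width, downsample=8, latent_channels=4):
    """Shape (channels, height, width) of the latent for an image of the given size.

    downsample=8 and latent_channels=4 are assumed Stable Diffusion v1 values.
    """
    if height % downsample or width % downsample:
        raise ValueError("height and width must be multiples of the downsample factor")
    return (latent_channels, height // downsample, width // downsample)


def compression_ratio(height, width, image_channels=3, downsample=8, latent_channels=4):
    """How many times fewer numbers the latent holds compared with the RGB image."""
    c, h, w = latent_shape(height, width, downsample, latent_channels)
    return (image_channels * height * width) / (c * h * w)


if __name__ == "__main__":
    print(latent_shape(512, 512))       # (4, 64, 64)
    print(compression_ratio(512, 512))  # 48.0
```

Under these assumptions a 512×512 image is described by 48 times fewer numbers, which is why running the diffusion steps in latent space is cheaper.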
In short, we need to encode the images into latent data, and the forward and reverse diffusion processes are then done in the latent space. I know you will say you don’t understand it. That’s okay, because our main dish today is the AI-generated output from Stable Diffusion!
First-generation AI stick-man output
Input: random photos from waitbutwhy.
Training sample output (Sooooo cute 😍😍😍)
AI testing output
Okay…😅 The AI gets most things right. It draws a stick man, but sometimes with many eyes? One thing I really like is the color choice. It’s not related to Tim Urban’s drawings anymore, but I appreciate the AI’s creativity. As for the speech bubbles, the AI doesn’t understand English. It is talking nonsense.
Second-generation AI stick-man output
Input: clean photos with only one stick man in each.
Training sample output
AI testing output
People! You really need to take a look at these cute drawings! They are somehow waitbutwhy drawings, but with even more personality!
Third-generation AI stick-man output
Input: clean photos with one or more stick men.
Training sample output
AI testing output
It is good too! In general, the quality is higher, though the output becomes less creative.
AI Stickman Award of the Day
Award 🏆 The most Tim Urban stickman
Left (AI generation 2): “Happy stick man.” Simple but neat!
Middle (AI generation 2): “Stick man loves marriage.” Apparently, the AI knows the essence of marriage much better.
Right (AI generation 3): “Stick man turns on.” It doesn’t seem to be relevant to the prompt, but it shows two people together, and they both look pretty good!
Award 🏆 The creative stick man
Left (AI generation 2): “Stick man loves edamame.” Simple, but it hits the right spots!
Middle (AI generation 2): “Stick man loves fruit.” AI is so creative. The head becomes an apple and it is adorable!
Right (AI generation 2): “Stick man loves baby.” Though it’s not about a baby, it intelligently puts the most waitbutwhy element on the balloon. With the kids kissing, it does deliver feelings of love and cuteness.
Resources
waitbutwhy as usual
All the images about stable diffusion: https://medium.com/@steinsfu/stable-diffusion-clearly-explained-ed008044e07e