Stable diffusion for waitbutwhy

Jul 03, 2023

You thought I would be satisfied with my waitbutwhy AI generator in Look at my Cutie Pie? Quite the opposite. My friends are complaining that the stick man doesn’t look similar enough, so I started training a Stable Diffusion model on waitbutwhy drawings. Do you think I’m crazy? Funny content still demands quality!

My original drawing

I was using DALL·E 2 to generate Tim Urban’s stick man, and I tried lots of prompts. This is the output I was able to generate. It’s not too far off, but it is definitely a stretch to say that Tim drew this. Therefore, I decided to use waitbutwhy’s drawings to train my own Stable Diffusion model.

Stable diffusion

Here is a short overview of what Stable Diffusion is. Stable Diffusion is a deep learning text-to-image model released in 2022. Before getting into Stable Diffusion itself, it helps to know that a diffusion model is built on two processes.

Forward Diffusion Process → add noise to the image.
Reverse Diffusion Process → remove noise from the image.
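
If you want to see the two processes in numbers, here is a minimal sketch in plain Python. It is my own toy illustration, not how Stable Diffusion is actually implemented: the linear noise schedule, the function names (`make_alpha_bars`, `forward_diffuse`, `reverse_with_known_noise`) and the four-value "image" are all assumptions. A real model uses a neural network to guess the noise; here the reverse step cheats and reuses the true noise, just to show that knowing the noise is enough to get the image back.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Running product of (1 - beta_t) for an assumed linear noise schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(pixels, t, alpha_bars, rng):
    """Forward process: mix the clean pixels with Gaussian noise at step t."""
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [signal_scale * p + noise_scale * n for p, n in zip(pixels, noise)]
    return noisy, noise


def reverse_with_known_noise(noisy, noise, t, alpha_bars):
    """Reverse direction, assuming the noise is known exactly.

    In a trained model the noise would be predicted by a network instead.
    """
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    return [(x - noise_scale * n) / signal_scale for x, n in zip(noisy, noise)]


rng = random.Random(0)
alpha_bars = make_alpha_bars(1000)
clean = [0.9, -0.3, 0.5, 0.0]  # a tiny pretend "image"

noisy, noise = forward_diffuse(clean, 500, alpha_bars, rng)
recovered = reverse_with_known_noise(noisy, noise, 500, alpha_bars)
```

The later the step `t`, the smaller `alpha_bars[t]` gets, so less of the original image survives and more of the result is pure noise.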

Stable Diffusion is a “Latent Diffusion Model” (LDM), which means that the diffusion process happens in the latent space rather than on the full-size image. That makes it faster than a diffusion model that works directly on pixels.

What is a latent space? The latent space is simply a compressed representation of the data in which similar data points sit closer together.
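
To get a feel for how much smaller the latent space is, here is a rough back-of-the-envelope sketch. It assumes the figures used by Stable Diffusion v1 (a 512×512 RGB image, an encoder that shrinks each side by 8, and 4 latent channels); the helper names `latent_shape` and `num_values` are made up for this example.

```python
def latent_shape(height, width, downsample=8, latent_channels=4):
    """Shape of the latent for an image, under the assumed encoder settings."""
    return (latent_channels, height // downsample, width // downsample)


def num_values(shape):
    """Total number of numbers stored in a tensor of this shape."""
    total = 1
    for dim in shape:
        total *= dim
    return total


image = (3, 512, 512)            # RGB image: channels, height, width
latent = latent_shape(512, 512)  # what the diffusion process works on

ratio = num_values(image) / num_values(latent)
print(latent, ratio)  # (4, 64, 64) 48.0
```

Under these assumptions, every denoising step touches about 48 times fewer numbers than it would on the raw pixels.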

In short, we need to encode the images into latent data, and the forward and reverse diffusion processes are then done in the latent space. I know you will say you don’t understand it. That’s okay, because our main dish today is the AI-generated output from Stable Diffusion!

First-generation AI stick-man output

Input: random photos from waitbutwhy.

Training sample output (Sooooo cute 😍😍😍)

AI testing output

Okay…😅 The AI gets most things right. It draws a stick man, but sometimes he has many eyes? One thing I really like is the color choice. It’s not related to Tim Urban’s drawings anymore, but I appreciate the AI’s creativity. As for the speech bubbles, the AI doesn’t understand English. It is talking nonsense.

Second-generation AI stick-man output

Input: clean photos with only one stick man in each.

Training sample output

AI testing output

People! You really need to take a look at these cute drawings! They somehow look like waitbutwhy drawings, but with even more personality!

Prompt: Happy stick man (This one is soooo good!)

Prompt: stick man loves Edamame (actually quite good!)

Third-generation AI stick-man output

Input: clean photos with one or more stick men.

Training sample output

AI testing output

It is good too! In general, the quality is higher, though the output becomes less creative.

Prompt: stick man loves Edamame (so cute!)

AI Stickman Award of the Day

Award 🏆 The most Tim Urban stickman

Left (AI generation 2): “Happy stick man.” Simple but neat!
Middle (AI generation 2): “Stick man loves marriage.” Apparently, the AI knows the essence of marriage much better.
Right (AI generation 3): “Stick man turns on.” It doesn’t seem to be relevant to the prompt, but there are two people together, and they both look pretty good!

Award 🏆 The creative stick man

Left (AI generation 2): “Stick man loves edamame.” Simple, but it hits the right spots!
Middle (AI generation 2): “Stick man loves fruit.” AI is so creative. The head becomes an apple and it is adorable!
Right (AI generation 2): “Stick man loves baby.” Though it’s not about a baby, it cleverly puts the most waitbutwhy element on the balloon. With the kids kissing, it does deliver a loving, cute feeling.

Resources

waitbutwhy as usual
All the images about stable diffusion: https://medium.com/@steinsfu/stable-diffusion-clearly-explained-ed008044e07e

Esther is a confused human being
