When I teach my primary school class, I always read during the lunch break. In my opinion, the best lunch break stories contain three ingredients.
- They're funny
- They're magical
- They contain beautiful pictures
Sometimes it's hard to find such an ideal story! So... :) can we generate them using text/image generation? This is a mini-project, playing around with Huggingface-hosted models. Based on a prompt, I use GPT2 to generate the rest of the story and a Stable Diffusion model to generate an accompanying picture. Hopefully the magic and fun will then follow automatically....
Open notebooks/generate_text_from_prompt.ipynb to generate the story based on a prompt. Currently the prompt is
"The goat was just riding his bicycle through the fields when he saw"
The resulting text snippets will be saved in generated_output/*.txt
Open notebooks/generate_images_from_prompt.ipynb to generate the pictures accompanying the story.
Currently, the prompt is
"a goat on a bicycle"
The resulting images will be saved in generated_output/*.png
The cycling goat lunch break story.
- The images generated by the Stable Diffusion model. They are indeed goats, on things that are somewhat close to bikes.

Some candidate stories generated by GPT2.
The goat was just riding his bicycle through the fields when he saw ichthyosaurs on the horizon, the tattered carcass of which he'd seen in the film. "I just thought ichthyosaurs were pretty big!" he admits. "I didn't know this was real. All I knew was that a lot of them were there. " He has heard lots of stories of hairy-heeled hippos and crocodiles, but had never been involved in a monster...
The goat was just riding his bicycle through the fields when he saw ixelas standing at the end of a hill. He looked up and said, 'what the hell is going to happen to me here?' The goat said, 'I want to be a scientist, not a farmer'. He was very nice and he never told anybody that he was a scientist. The goat said, 'how do scientists know there are not all the goats?' The goat said, 'well, we know that all the goats come from a land before it has dried and we see them when they come. But there are not all the goats coming back and there are still more goats waiting for us...'
The goat was just riding his bicycle through the fields when he saw the goat's horn. He had never seen any goats before so he stopped at the gate to bring him. I asked him if he had ever heard a goat tell someone about him being a goat but he was reluctant to answer. He told me he had only heard about goat's horns once but now had heard it twice before. We all agreed to follow him until we reached and at that point my friend called to tell me the goat was dead. I had hoped that he would talk to the goat but he didn't. When I called the dead goat, he told me he had the horns
- the quality of the images can easily be improved by increasing the number of inference steps in the stable diffusion model. This would take some time (or GPUs) though.
- I have seen that never transformer models (e.g. GPT3) provides stories of much higher quality than these. Hopefully we can soon use this to also make the stories a bit less random.
- Ideally the story is accompanied by multiple images. We could for example create a script which first generates the story using GPT2 and then picks multiple prompts at multiple points in the text (we can select the sentences either randomly or by using a summarization model), and generates an image based on those. The result would be similar to having an image on each page of a book.