Lab For AI

How to Build a Custom Storybook Generator that Replicates Gemini's Storybook

A Quick Tutorial for Building a Storybook Generator Using AG2 and an Image Model

Yeyu Huang
Sep 02, 2025

You have probably heard about Gemini's new Storybook feature. It turns a single prompt into an engaging story with pictures, text, and sound. Do you want a clone that matches its quality and consistency while being more extensible? This post covers building your own version of a children's storybook generator. It's easy to customize and commercialize, and it works anywhere, with no regional restrictions.

Gemini Storybook

Gemini Storybook is a tool developed by Google and released in August 2025. It creates personalized, illustrated stories about anything, with read-aloud narration. Just describe the story you want, add files or photos if you like, and Gemini makes a unique 10-page storybook. It uses Gemini's models in a straightforward way. It's free in the Gemini app as a Gem, but, like Gemini itself, it is only available in some countries. If you haven't tried it, give it a go first.

For example, consider a seven-year-old who can't sleep at their grandmother's house. You can create a storybook to help with that. The Gem might produce something called "Willow's Sleepover Star," about a fox's journey to its grandma's house. The characters stay largely consistent from page to page, although small details, such as Grandma's hair, sometimes vary. The result is good overall, and you can ask Gemini for revisions. Still, you can't change much about how it works, so it is hard to scale for business use or to extend with new ideas.

Why Build This?

That's why I built my own application to create storybooks. I wanted something I could tweak: adding more pages, changing the style freely or within restrictions, or using different models. That flexibility makes it possible to deploy in countries or regions where commercial or free models are restricted, and it can cut costs when you generate books in volume. In this demo, I kept it simple and focused on kids' stories. Users input the child's name, age, and interests, and the story makes the kid the main character.

Let’s quickly see the live demo.

The Agents Setup

Think of this like a small book studio team. The user gives the basic idea, then the team handles it.

  • Concept Developer: Lead author or editor. Builds the big picture as a story brief: title, summary, key characters with exact looks (hair color, clothes), setting, three-part plot, picture style. Ensures the kid is the hero and the tone is fun and hopeful, with small lessons like kindness. Limits the cast to 1-3 main characters. Output: JSON brief for the other agents.

  • Story Writer: Narrative writer. Turns the brief into full story text, split into a user-defined number of pages. Follows the plot, uses age-appropriate words, ensures flow. Focuses on text only. Output: JSON with page numbers and text.

  • Art Director: Visual planner. Describes each page based on text and brief: character positions, expressions, background, lighting, mood. For cover, specifies title placement. Copies character details exactly. Limits scene text to 2-3 words. Output: JSON with page visuals.

  • Prompt Illustrator: AI image prompt expert. Makes detailed prompts from art directions and style. Copies character details word-for-word for consistency. Focuses on main characters, simple backgrounds. Cover is bright with title. Output: JSON with cover and page prompts.
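The handoff between these roles can be sketched in plain Python. This is only an illustration of the idea, not the implementation from this post: the class names, field names, and the build_prompts helper are all hypothetical, and the sketch calls neither AG2 nor an image model. It shows two rules from the list above: the cast is capped at 1-3 main characters, and each character's appearance text is copied word-for-word into every image prompt so the illustrations stay consistent.

```python
import json
from dataclasses import dataclass


@dataclass
class Character:
    name: str
    appearance: str  # exact look, e.g. hair color and clothes


@dataclass
class StoryBrief:
    # Hypothetical subset of the Concept Developer's JSON brief.
    title: str
    style: str
    characters: list

    def __post_init__(self):
        # The Concept Developer limits the cast to 1-3 main characters.
        if not 1 <= len(self.characters) <= 3:
            raise ValueError("a brief needs 1 to 3 main characters")


def build_page_prompt(brief, art_direction):
    """Compose one image prompt, copying character looks verbatim."""
    cast = "; ".join(f"{c.name}: {c.appearance}" for c in brief.characters)
    return f"{brief.style}. {art_direction} Characters: {cast}. Simple background."


def build_prompts(brief, art_directions):
    """Return JSON with a cover prompt plus one prompt per page."""
    cover = build_page_prompt(
        brief, f'Bright cover showing the title "{brief.title}".'
    )
    pages = [
        {"page": number, "prompt": build_page_prompt(brief, direction)}
        for number, direction in enumerate(art_directions, start=1)
    ]
    return json.dumps({"cover": cover, "pages": pages}, indent=2)


# Made-up example data, for illustration only.
brief = StoryBrief(
    title="Mia and the Moon Kite",
    style="Soft watercolor children's book illustration",
    characters=[Character("Mia", "curly black hair, yellow raincoat, red boots")],
)
print(build_prompts(brief, ["Mia runs up a grassy hill, smiling, at sunset."]))
```

Because the appearance string is reused unchanged rather than paraphrased on each page, the image model always sees the same description, which is the point of the word-for-word rule.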

© 2025 Yeyu Huang