How to create images and infographics with GPT APIs

3 min read · Apr 19, 2023

Image generation models like DALL-E, Midjourney, or Stable Diffusion generate an image starting from a prompt.

One problem with these models is that they cannot generate images that contain text, such as infographics or social media posts. Tools like Canva work really well for those use cases.

Using Canva templates and brand features, we can create great social media posts in minutes, all with a consistent communication style.

Unfortunately, this is not possible yet using generative AIs. So, how can we create infographics using GPT (and DALL-E) APIs?

The concept is quite simple. Based on my last post, we can now generate structured content from GPT (like JSON). That’s all we need to succeed.

Before we start, let me give you an output example:

In this really simple example, we’re creating a LinkedIn carousel with:

a title
an image
7 slides each with a title and description

How does it work?

Before we jump into a detailed description, let me briefly outline the process.

The first component is a template library. Each template is a file with dynamic placeholders, paired with a JSON descriptor. In this example, I used Handlebars templates with HTML and CSS. The JSON descriptor describes the content needed to fill the template.
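To make the descriptor idea concrete, here is a minimal sketch of what one could look like for the carousel above, together with a small check that generated content fits it. The descriptor layout, the field names, and the validate_content helper are all assumptions made for illustration; they are not the schema used in the original project.

```python
import json

# Hypothetical descriptor for the carousel template. The layout and field
# names are assumptions for this sketch, not the original project's schema.
CAROUSEL_DESCRIPTOR = json.loads("""
{
  "template": "carousel.hbs",
  "fields": {
    "title": "string",
    "image": "image",
    "slides": {"type": "list", "length": 7, "item": ["title", "description"]}
  }
}
""")


def validate_content(descriptor: dict, content: dict) -> list:
    """Return a list of problems; an empty list means the content fits."""
    problems = []
    for name, spec in descriptor["fields"].items():
        if name not in content:
            problems.append(f"missing field: {name}")
            continue
        # Only list fields need a deeper check in this sketch.
        if isinstance(spec, dict) and spec.get("type") == "list":
            items = content[name]
            if len(items) != spec["length"]:
                problems.append(
                    f"{name}: expected {spec['length']} items, got {len(items)}"
                )
            for i, item in enumerate(items):
                for key in spec["item"]:
                    if key not in item:
                        problems.append(f"{name}[{i}]: missing {key}")
    return problems
```

Running a check like this before rendering catches cases where the model returns fewer slides than the template expects, or leaves out a field.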

The second component is the content engine. It requests text content from GPT and images from DALL-E. Of course, we can write multiple adapters to make the engine work with different external services.

The content engine takes a main idea as input and generates a JSON document containing everything the template requires. The images are saved to disk and referenced in the JSON by their file location.
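The adapter idea can be sketched roughly as follows. This is an illustration under assumptions, not the original implementation: TextAdapter, ImageAdapter, and build_content are hypothetical names, and the two stub classes stand in for real GPT and DALL-E calls so the example runs without network access or API keys.

```python
import json
import uuid
from pathlib import Path
from typing import Protocol


class TextAdapter(Protocol):
    """Anything that turns a prompt into a JSON string (e.g. a GPT wrapper)."""
    def generate(self, prompt: str) -> str: ...


class ImageAdapter(Protocol):
    """Anything that turns a prompt into image bytes (e.g. a DALL-E wrapper)."""
    def generate(self, prompt: str) -> bytes: ...


class StubText:
    """Stand-in for a GPT adapter: always returns the same JSON string."""
    def generate(self, prompt: str) -> str:
        slides = [
            {"title": f"Tip {i}", "description": "Placeholder text"}
            for i in range(1, 8)
        ]
        return json.dumps({"title": "Stub title", "slides": slides})


class StubImage:
    """Stand-in for a DALL-E adapter: returns placeholder bytes."""
    def generate(self, prompt: str) -> bytes:
        return b"not a real image"


def build_content(idea: str, text: TextAdapter, image: ImageAdapter,
                  out_dir: Path) -> dict:
    """Generate the JSON for one post and save its image to disk."""
    content = json.loads(text.generate(f"Write a 7-slide carousel about: {idea}"))
    content["id"] = uuid.uuid4().hex  # auto-generated id, used later for review
    out_dir.mkdir(parents=True, exist_ok=True)
    image_path = out_dir / f"{content['id']}.png"
    image_path.write_bytes(image.generate(f"Cover illustration for: {idea}"))
    content["image"] = str(image_path)  # the JSON stores a location, not bytes
    return content
```

Because build_content only depends on the two small interfaces, switching to another text or image service means writing a new adapter class rather than touching the engine.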

At this point, we can save everything to a file in order to review it manually. Once it is reviewed, we can take the auto-generated id and use it to feed the render engine.
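A minimal sketch of this review step, assuming one JSON file per generated post, named after its id. The function names and the file layout are assumptions made for illustration.

```python
import json
from pathlib import Path


def save_for_review(content: dict, review_dir: Path) -> Path:
    """Write the generated content to <review_dir>/<id>.json for manual editing."""
    review_dir.mkdir(parents=True, exist_ok=True)
    path = review_dir / f"{content['id']}.json"
    path.write_text(
        json.dumps(content, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return path


def load_reviewed(content_id: str, review_dir: Path) -> dict:
    """Read the (possibly hand-edited) content back by its id."""
    path = review_dir / f"{content_id}.json"
    return json.loads(path.read_text(encoding="utf-8"))
```

Indented JSON keeps the file easy to edit by hand, and loading by id means the render step always picks up whatever the reviewer last saved.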

The render engine takes valid JSON and a template as input and generates the image or carousel.
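The rendering step can be approximated like this. To keep the sketch dependency-free, Python's built-in string.Template stands in for Handlebars, and the output is an HTML string; turning that HTML into an actual image would need an extra step (for example a headless browser), which is left out here. The markup and the render_carousel name are assumptions.

```python
import html
from string import Template

# string.Template is used here only as a stand-in for Handlebars.
SLIDE_TEMPLATE = Template(
    '<section class="slide"><h2>$title</h2><p>$description</p></section>'
)
PAGE_TEMPLATE = Template(
    '<html><body><h1>$title</h1><img src="$image">$slides</body></html>'
)


def render_carousel(content: dict) -> str:
    """Fill the templates with reviewed content and return the HTML."""
    slides = "".join(
        SLIDE_TEMPLATE.substitute(
            title=html.escape(slide["title"]),
            description=html.escape(slide["description"]),
        )
        for slide in content["slides"]
    )
    return PAGE_TEMPLATE.substitute(
        title=html.escape(content["title"]),
        image=html.escape(content["image"]),
        slides=slides,  # already escaped slide markup
    )
```

Escaping every generated string before substitution matters here, because the text comes from a model and may contain characters such as & or <.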

Thanks to this auto-generation and manual review, we can still have control over what we publish.

In the demo below you can see an example. In this case, I’m not editing the content, to show how it works without further adjustment. Interestingly, using the same input multiple times, we get really different (but always useful) results.

How to expand this tool?

The first possible expansion consists of an idea generator. First of all, we start by generating macro themes. For example, the prompt I used is:

I’m a software engineer. Create for me a list as long as you can of macro-areas that I can discuss in my Linkedin posts. You have to cover both hard skills, soft skills, career advice, study advice, and everything else.

This generated 20 topics. Then you can loop over each topic and generate ideas. One iteration of this process generated roughly 400 ideas. Since they are stored in a JSON file, topics and ideas can be manually reviewed before the posts are generated.
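The two-step loop might look roughly like this. The ask parameter is a hypothetical function that sends a prompt to the model and returns its reply as a JSON array string; fake_ask stands in for the real API call, and the prompts are shortened versions written for this sketch.

```python
import json
from typing import Callable

MACRO_PROMPT = (
    "List macro-areas a software engineer can discuss in LinkedIn posts. "
    "Reply with a JSON array of strings."
)
IDEA_PROMPT = 'Give me post ideas about "{topic}". Reply with a JSON array of strings.'


def generate_ideas(ask: Callable[[str], str]) -> dict:
    """Ask for macro topics first, then ask for ideas once per topic."""
    topics = json.loads(ask(MACRO_PROMPT))
    return {
        topic: json.loads(ask(IDEA_PROMPT.format(topic=topic)))
        for topic in topics
    }


def fake_ask(prompt: str) -> str:
    """Stand-in for a real model call, so the sketch runs offline."""
    if prompt == MACRO_PROMPT:
        return json.dumps(["Testing", "Career advice"])
    return json.dumps(["Idea A", "Idea B"])


ideas = generate_ideas(fake_ask)
ideas_json = json.dumps(ideas, indent=2)  # ready to be written to a review file
```

The result maps each topic to its list of ideas, so a reviewer can delete weak topics or single ideas in the JSON before any post is generated.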

I hope you enjoyed this idea and this proof of concept. More content on this will follow. Cheers :D

Written by Alfonso Graziano