The new txt2imghd project is based on the “GOBIG” mode from another offshoot of Stable Diffusion, the model used to create most of the AI art you’ve probably seen lately. Images created with txt2imghd can be larger than those from most other generators: the demo images are 1536×1536, while Stable Diffusion is usually limited to 1024×768, and Midjourney defaults to 512×512 (with optional upscaling to 1664×1664).

Txt2imghd takes a clever approach to upscaling. According to the project’s documentation, it “creates detailed, higher-resolution images by first generating an image from a prompt, upscaling it, and then running img2img on smaller pieces of the upscaled image, and blending the result back into the original image.” Because each piece is small, the method works around the memory limits of video cards, but as you might expect, the result takes longer to generate than a single low-resolution image.
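
To make the tiling step concrete, here is a minimal Python sketch of how an upscaled image could be split into overlapping pieces and how a blend weight could fade each piece into its neighbors. It is an illustration only, not txt2imghd's actual code: the function names, the 512-pixel tile size, and the 64-pixel overlap are all assumptions, and the img2img pass itself is left out.

```python
def tile_starts(length, tile, overlap):
    """Start offsets for tiles of size `tile` that cover `length` pixels.

    Neighboring tiles share at least `overlap` pixels. All values here are
    illustrative assumptions, not settings taken from txt2imghd.
    """
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    if tile >= length:
        return [0]  # one tile already covers the whole axis
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile sits flush with the far edge
    return starts


def tile_boxes(width, height, tile=512, overlap=64):
    """Return (left, top, right, bottom) boxes covering the whole image."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_starts(height, tile, overlap)
        for x in tile_starts(width, tile, overlap)
    ]


def feather_weight(pos, size, overlap):
    """Blend weight in (0, 1] for a pixel at offset `pos` inside a tile.

    The weight ramps up linearly over the first and last `overlap` pixels,
    so a reworked tile fades in instead of leaving a hard seam.
    """
    edge = min(pos, size - 1 - pos)  # distance to the nearest tile edge
    return min(1.0, (edge + 1) / (overlap + 1))


if __name__ == "__main__":
    boxes = tile_boxes(1536, 1536)
    print(len(boxes))  # 16 tiles for a 1536x1536 image with these settings
    print(boxes[0], boxes[-1])
```

In a full pipeline, each box would be cropped out, passed through img2img, and pasted back with per-pixel weights like the ones above. That is also why the process is slower: with these assumed settings, a single 1536×1536 image requires 16 extra generation passes.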

Txt2imghd has roughly the same system requirements as regular Stable Diffusion, which recommends a graphics card with at least 10 GB of video memory (VRAM). If you’re interested in trying it out, you can run the model in your browser (a free GitHub account is required). You can also download the code from the source link below and run it on your own computer.

Source: GitHub