vluz

vluz@kbin.social · 1 year ago

I do SDXL generation in 4GB at extreme expense of speed, by using a number of memory optimizations.
I’ve done this kind of stuff since SD 1.4, for the fun of it. I like to see how low I can push vram use.

SDXL takes around 3 to 4 minutes per generation including refiner but it works within constraints.
Graphics cards used are hilariously bad for the task, a 1050ti with 4GB and a 1060 with 3GB vram.

Have an implementation running on the 3GB card, inside a podman container, with no ram offloading, 1 vcpu and 4GB ram.
Graphical UI (streamlit) run on a laptop outside of server to save resources.

Working on a example implementation of SDXL as we speak and also working on SDXL generation on mobile.
That is the reason I’ve looked into this news, SSD-1B might be a good candidate for my dumb experiments.

vluz@kbin.social · 1 year ago

Don’t trust brave, never will.

vluz@kbin.social · 1 year ago

DS1 to DS3, I lost count of the hours.