Connect with us

Technology

MIT scientists have just figured out how to make the most popular AI image generators 30 times faster

Published

on

/ 5112 Views

Popular artificial intelligence (AI) powered image generators can run up to 30 times faster thanks to a technique that condenses an entire 100-stage process into one step, new research shows.

Scientists have devised a technique called "distribution matching distillation" (DMD) that teaches new AI models to mimic established image generators, known as diffusion models, such as DALL·E 3, Midjourney and Stable Diffusion. 

This framework results in smaller and leaner AI models that can generate images much more quickly while retaining the same quality of the final image. The scientists detailed their findings in a study uploaded Dec. 5, 2023, to the preprint server arXiv.

"Our work is a novel method that accelerates current diffusion models such as Stable Diffusion and DALLE-3 by 30 times," study co-lead author Tianwei Yin, a doctoral student in electrical engineering and computer Science at MIT, said in a statement. "This advancement not only significantly reduces computational time but also retains, if not surpasses, the quality of the generated visual content.

Diffusion models generate images via a multi-stage process. Using images with descriptive text captions and other metadata as the training data, the AI is trained to better understand the context and meaning behind the images — so it can respond to text prompts accurately. 

Related: New AI image generator is 8 times faster than OpenAI's best tool — and can run on cheap computers

In practice, these models work by taking a random image and encoding it with a field of random noise so it is destroyed, explained AI scientist Jay Alammar in a blog post. This is called "forward diffusion," and is a key step in the training process. Next, the image undergoes up to 100 steps to clear up the noise, known as "reverse diffusion" to produce a clear image based on the text prompt.

Trending