Is there a for-dummies explanation of what diffusion is?

benanne · on March 26, 2024

I've since moved on to work primarily on diffusion models, so I have a series of blog posts about that topic as well!

- https://sander.ai/2022/01/31/diffusion.html is about the link between diffusion models and denoising autoencoders, IMO the easiest to understand out of all interpretations;
- https://sander.ai/2023/07/20/perspectives.html covers a slew of different perspectives on diffusion models (including the "autoencoder" one).

In a nutshell, diffusion models break up the difficult task of generating natural signals (such as images or sound) into many smaller partial denoising tasks. This is done by defining a corruption process that gradually adds noise to an input until all of the signal is drowned out (this is the "diffusion"), and then learning how to invert that process step-by-step.
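To make the corruption process a bit more concrete, here is a minimal sketch in plain Python. It assumes one common formulation (a variance-preserving Gaussian process with a linear noise schedule); the names `make_alpha_bar` and `corrupt`, the schedule values and the toy sine-wave "signal" are all illustrative choices of mine, not something the posts prescribe.

```python
import math
import random

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of signal variance that survives up to each step (assumed linear schedule)."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def corrupt(x0, t, alpha_bar, rng):
    """Jump straight to noise level t: shrink the signal and mix in Gaussian noise."""
    signal_scale = math.sqrt(alpha_bar[t])
    noise_scale = math.sqrt(1.0 - alpha_bar[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal_scale * x + noise_scale * n for x, n in zip(x0, noise)]
    return x_t, noise

rng = random.Random(0)
alpha_bar = make_alpha_bar()
x0 = [math.sin(2.0 * math.pi * i / 64) for i in range(64)]  # toy "natural signal"

for t in (0, 250, 999):
    x_t, _ = corrupt(x0, t, alpha_bar, rng)
    print(f"t={t:4d}  signal scale={math.sqrt(alpha_bar[t]):.4f}")
```

The signal scale starts close to 1 and shrinks towards 0 as `t` grows, which is the "signal gets drowned out" part. Training then amounts to showing a model `x_t` and `t` and asking it to recover the noise (or the clean signal) that went in.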

This is not dissimilar to how modern language models work: they break up the task of generating text into a series of easier next-word-prediction tasks. In both cases, the model only solves a small part of the problem at a time, and you apply it repeatedly to generate a signal.
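The "apply it repeatedly" part can be sketched as a loop too. This is only a toy under loud assumptions: `predict_noise` is a stand-in for a trained network and cheats by knowing the one signal (`TARGET`) it is supposed to generate, and the update is a deterministic DDIM-style step that I picked for brevity. All names and schedule values are hypothetical.

```python
import math
import random

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Fraction of signal variance that survives up to each step (assumed linear schedule)."""
    alpha_bar, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

# The only "data" this toy knows about. A real model learns a whole distribution.
TARGET = [math.sin(2.0 * math.pi * i / 16) for i in range(16)]

def predict_noise(x_t, t, alpha_bar):
    """Stand-in for a trained network: it cheats by looking at TARGET."""
    a = math.sqrt(alpha_bar[t])
    b = math.sqrt(1.0 - alpha_bar[t])
    return [(x - a * x0) / b for x, x0 in zip(x_t, TARGET)]

def generate(alpha_bar, rng):
    """Start from pure noise and remove a little of it at every step."""
    x = [rng.gauss(0.0, 1.0) for _ in TARGET]
    for t in reversed(range(len(alpha_bar))):
        eps = predict_noise(x, t, alpha_bar)
        a = math.sqrt(alpha_bar[t])
        b = math.sqrt(1.0 - alpha_bar[t])
        # Current best guess of the clean signal, given the predicted noise.
        x0_hat = [(xi - b * e) / a for xi, e in zip(x, eps)]
        if t == 0:
            x = x0_hat
        else:
            # Re-noise the guess to the slightly lower noise level t - 1.
            a_prev = math.sqrt(alpha_bar[t - 1])
            b_prev = math.sqrt(1.0 - alpha_bar[t - 1])
            x = [a_prev * x0 + b_prev * e for x0, e in zip(x0_hat, eps)]
    return x

alpha_bar = make_alpha_bar()
sample = generate(alpha_bar, random.Random(0))
print(max(abs(s - x0) for s, x0 in zip(sample, TARGET)))
```

The shape of the loop is the point here: like a language model emitting one token per call, the denoiser is called once per step and each call only has to move the sample a little closer to something plausible.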