Stable Diffusion & open-source AI
With the public launch of Stable Diffusion’s model and training set under a permissive license, anyone can now use AI models of this kind directly. That matters because training the model cost $600,000, which puts it out of reach for most people who would otherwise want to experiment locally. For now, setting it up on your own computer is a relatively complex and technical process, but this will improve. We can expect to see more and more people playing with the model, tuning it, and building mash-ups between it and other AI tooling.
The first thing that happens with any great new open-source resource is a proliferation of experiments and derivative projects that tie things together in interesting ways, and we’re already starting to see that here. For example, Andreas Jansson created a project that animates Stable Diffusion images by interpolating between two prompts. When it was discussed on Hacker News, Jansson (the project’s author) described the experience of creating it as follows:
“The thing that really strikes me is that open source ML is starting to behave like open source software. I was able to take a pretrained text-to-image model and combine it with a pretrained video frame interpolation model and the two actually fit together! I didn’t have to re-train or fine tune or map between incompatible embedding spaces, because these models can generalize to basically any image. I could treat these models as modular building blocks.”
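To make the idea of interpolating between two prompts more concrete, here is a minimal sketch in Python. It is not taken from Jansson’s project, and the source does not say which interpolation method he used; spherical interpolation (slerp) between two embedding vectors is simply one common choice, assumed here for illustration. The function names `slerp` and `interpolation_path` are hypothetical, and the short lists of floats stand in for the much larger prompt embeddings a real text encoder would produce.

```python
import math
from typing import List, Sequence


def slerp(a: Sequence[float], b: Sequence[float], t: float) -> List[float]:
    """Spherically interpolate between vectors a and b at fraction t in [0, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("vectors must be non-zero")
    # Cosine of the angle between the vectors, clamped against rounding error.
    cos_omega = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    cos_omega = max(-1.0, min(1.0, cos_omega))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < 1e-6:
        # Nearly parallel or anti-parallel: fall back to linear interpolation.
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    w_a = math.sin((1.0 - t) * omega) / sin_omega
    w_b = math.sin(t * omega) / sin_omega
    return [w_a * x + w_b * y for x, y in zip(a, b)]


def interpolation_path(
    start: Sequence[float], end: Sequence[float], num_frames: int
) -> List[List[float]]:
    """Return num_frames vectors stepping evenly from start to end."""
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    return [slerp(start, end, i / (num_frames - 1)) for i in range(num_frames)]


if __name__ == "__main__":
    # Toy 2-D "embeddings" standing in for two encoded prompts (assumption).
    prompt_a = [1.0, 0.0]
    prompt_b = [0.0, 1.0]
    for frame in interpolation_path(prompt_a, prompt_b, 5):
        print([round(x, 3) for x in frame])
```

In a real pipeline, each intermediate vector would be handed to the text-to-image model to render one keyframe, and a separate frame interpolation model could then smooth the motion between keyframes, which is the kind of modular combination the quote describes.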
This TikTok video from Karen X is another example: it ties several AI projects together into a single, very interesting video, and shows what becomes possible when individuals are given tools that enhance their creativity.
We are entering a period where more and more models and training sets are becoming available for all kinds of commercial, intellectual, and even NSFW purposes, and this presents a potentially thorny ethical landscape. Simon Willison’s article “Stable Diffusion is a Really Big Deal” has a section dedicated to the ethics of all of this. It highlights that the model’s training data was a public set of images scraped from the web, that the creators of those images did not give consent, and that the resulting model could be considered a direct threat to their livelihoods.
It’s going to be interesting to follow how these models evolve and deal with the various ethical considerations in play. In the meantime, anyone can try a public web interface for Stable Diffusion on DreamStudio, or browse the public source repos on GitHub. If you’d like to see the kinds of things Stable Diffusion is capable of, there’s a great index of examples on Lexica. The license itself is viewable here.