Latent Diffusion

4 Posts

Generated Video Gets Real(er): OpenAI’s Sora, a new player in text-to-video generation

OpenAI’s new video generator raises the bar for detail and realism in generated videos — but the company released few details about how it built the system.

Better Images, Less Training: Würstchen, a speedy, high-quality image generator

The longer text-to-image models train, the better their output — but the training is costly. Researchers built a system that produced superior images after far less training.

Graphic model of Stable Audio diffusion and transcoding process

Music Generation For the Masses: Stability.ai launches Stable Audio, a text-to-music generator.

Stability.ai, maker of the Stable Diffusion image generator and StableLM text generator, launched Stable Audio, a system that generates music and sound effects from text. You can play with it and listen to examples here. The service is free for up to 20 generations per month, each up to 45 seconds long.

What the Brain Sees: How a text-to-image model generates images from brain scans

A pretrained text-to-image generator enabled researchers to see — roughly — what other people looked at based on brain scans. Yu Takagi and Shinji Nishimoto developed a method that uses Stable Diffusion to reconstruct images viewed by test subjects...
