Under Review

M-VADER

Create images from multimodal input

M-VADER is a diffusion model for image generation whose output can be specified by arbitrary combinations of images and text. A prompt can pair an image with text, for example, or combine multiple images.
