Freemium

DALL-E

Create a new picture from a text input, or create variations of a given picture.

DALL-E 3 is the latest iteration by OpenAI. It generates images that follow the provided text much more closely than DALL-E 2 did¹². DALL-E 3 is integrated with ChatGPT, so users can refine image requests with more detailed prompts and use the chat to brainstorm image ideas³². This version also strengthens safety features: it is designed to prevent the generation of violent, hateful, or adult content, and it declines requests for images in the style of living artists or of public figures identified by name⁴². It is available through ChatGPT Plus, ChatGPT Enterprise, Bing's AI Image Creator, and Microsoft Designer⁵.

Technology

DALL-E 3 continues to rely on the Transformer architecture and deep learning to translate text descriptions into images. It is built natively on ChatGPT, which supports a more nuanced understanding of detailed text prompts and a more faithful rendering of them as images. It builds on the approach of its predecessors: the original DALL-E was a multimodal implementation of GPT-3 that worked together with CLIP, and DALL-E 2 used a diffusion model conditioned on CLIP image embeddings.

Capabilities

DALL-E 3 produces images that adhere more closely to the text prompt than earlier versions did. It handles detailed prompts significantly better, and its direct integration with ChatGPT lets users refine and tweak image generation requests. Concerns about potential misuse for deepfakes and misinformation, and about the implications for technological unemployment, remain relevant.
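As a rough illustration of turning a text prompt into an image request, the sketch below builds (but does not send) an HTTP request for OpenAI's image generation endpoint using only the Python standard library. The helper name `build_image_request`, the placeholder API key, and the list of accepted sizes are assumptions made for this example and are not taken from the text above; check the current API documentation before relying on them.

```python
import json
import urllib.request

# Image sizes assumed to be accepted for DALL-E 3 (an assumption; verify
# against the current API documentation).
DALLE3_SIZES = {"1024x1024", "1792x1024", "1024x1792"}


def build_image_request(prompt, api_key, size="1024x1024"):
    """Build, without sending, a POST request asking DALL-E 3 for one image.

    Hypothetical helper for illustration only.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if size not in DALLE3_SIZES:
        raise ValueError(f"unsupported size: {size}")

    # Request body: the model name, the text prompt, and a single image.
    payload = {"model": "dall-e-3", "prompt": prompt, "n": 1, "size": size}

    return urllib.request.Request(
        "https://api.openai.com/v1/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Placeholder key: nothing is sent over the network in this sketch.
request = build_image_request(
    "A watercolor painting of a lighthouse at dawn",
    api_key="sk-placeholder",
)
print(request.full_url)
print(json.loads(request.data))
```

Passing the resulting object to `urllib.request.urlopen` would actually submit it, which requires a valid API key and network access. Keeping the construction step separate makes it easy to inspect the prompt and parameters before anything is sent.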

Ethical concerns

The ethical concerns surrounding DALL-E 3 mirror those of DALL-E 2, including algorithmic bias, potential for misinformation propagation, and the ease of bypassing content filters. The additional safety measures in DALL-E 3, such as declining requests related to public figures and living artists, aim to mitigate some of these concerns, although the broader ethical implications remain.

Technical limitations

The technical limitations concerning language understanding, complex sentence handling, and subject-specific image generation may persist in DALL-E 3. However, the integration with ChatGPT is aimed at addressing some of these limitations by refining text prompts for better image generation.

Open-source implementations

There have been several attempts to create open-source implementations of DALL-E. Craiyon, formerly known as DALL-E Mini, is an AI model based on the original DALL-E that was trained on unfiltered data from the internet. It gained attention in mid-2022 for its ability to generate humorous imagery.