New Stable Diffusion 3 Outperforms Peers: Midjourney and Dall-E

February 26, 2024

Stability AI has announced its latest generative AI tool, Stable Diffusion 3 (SD3), which reportedly outperforms competing AI image creators.

Although the model is not yet broadly available, Stability AI has opened a waitlist for an early preview of the image-generating tool, much to the delight of the AI community.

The “most capable” one

According to an article by Decrypt, this latest open-source model is well ahead of peers such as Midjourney, Dall-E, and Google ImageFX, as it “excels at prompt adherence and can understand natural language instead of keywords and tags.”

Stability AI itself has referred to its offering as the “most capable text-to-image model” to date, with “improved performance in multi-subject prompts, image quality, and spelling abilities.” SD3 follows the release of Stable Diffusion XL (SDXL).

According to the firm, the SD3 suite of models comes in different sizes, ranging from 800 million parameters to 8 billion parameters.

“This approach aims to align with our core values and democratize access, providing users with a variety of options for scalability and quality to best meet their creative needs,” said Stability AI in an announcement.

“Stable Diffusion 3 combines a diffusion transformer architecture and flow matching.”

The AI community loves the improvements

While SDXL, released last year, established itself as the “most advanced image generator,” Stability AI says SD3 improves on it. With strong prompt adherence and resistance to prompt leakage, SD3 is designed to generate content that closely matches what users request.

In a post on the X platform, Emad Mostaque, who leads Stability AI, said SD3 can also understand and create videos, although these features are still in the planning stage.

Some users are already smitten by this latest model and its capabilities.

AI-focused YouTuber MattVidPro said: “This AI image generator is the best we’ve ever seen in terms of prompt understanding and text generation.”

“It is leaps above the rest, and it’s truly mind-blowing.”

Another user, a machine learning engineer named Ralph Brooks, described it as “amazing.”

Mostaque’s X post (@EMostaque), dated February 22, 2024, read: “Pretty much. The SD3 arch can accept more than video and image, more details soon. We have 100x less (literally) the resources of some of the others in this field though, have to work hard.” https://t.co/6udkySZWMx

SD3 and its peers

Other users with access to the tool are also comparing it with competing AI image generators, including SDXL, Dall-E, and Midjourney. The comparisons favor SD3, and Decrypt ran its own tests, which supported this.

In their first test, Decrypt prompted: “Epic anime artwork of a wizard atop a mountain at night casting a cosmic spell into the dark sky that says ‘Stable Diffusion 3’ made out of colorful energy.”

The results showed that SD3’s output closely matched the request, while Midjourney “failed at the prompt generation, didn’t generate a mountain, and the wizard was not casting a cosmic spell.”

Caution against abusers

Stability AI acknowledged the risks that generative AI poses worldwide, such as deepfakes and the spread of misinformation by bad actors.

The firm indicated it has also taken measures to ensure safe and responsible use of its image-generating technology.

“This means we have taken and continue to take reasonable steps to prevent the misuse of Stable Diffusion 3 by bad actors,” said Stability AI.

“Safety starts when we begin training our model and continues throughout the testing, evaluation, and deployment. In preparation for this early preview, we’ve introduced numerous safeguards,” added the AI firm.

In line with this, Stability AI also pledged to continue collaborating with industry experts and researchers to ensure the integrity of the model ahead of its public release.

“Our commitment to ensuring generative AI is open, safe, and universally accessible remains steadfast.”