Sora AI Produces Eye-Popping Videos Instantly

Sora, an impressive new generative video model created by OpenAI, can take a brief text description and transform it into a detailed, high-definition video clip up to a minute long.

OpenAI, the maker of the ChatGPT chatbot and the still-image generator DALL-E, is among many companies vying to build instant video generators. Others include start-ups like Runway and tech giants like Google and Meta Platforms Inc., the owner of Facebook and Instagram.

The technology could speed up the work of seasoned moviemakers while threatening to replace less skilled digital artists entirely.

Releasing Sora

OpenAI named its new system Sora, the Japanese word for sky. The technology’s development team, including the researchers Tim Brooks and Bill Peebles, chose the name because it “evokes the idea of limitless creative potential.”

They also said the company had yet to release Sora to the public because it was still looking into the risks associated with the system. Rather, OpenAI is sharing the technology with a selected group of academics and other outside researchers who will “red team” it, a term to describe searching for potential misuses.

According to Dr. Brooks, the intention here is to give a preview of what is on the horizon so that people can see the capabilities of this technology and get feedback.

OpenAI Tags the Videos

OpenAI already tags videos created by the system with watermarks indicating they were generated by artificial intelligence (AI). However, the company acknowledges that these watermarks can be removed and can be difficult to spot.

According to OpenAI, they are teaching artificial intelligence (AI) to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction.

Additionally, they are granting access to several visual artists, designers, and filmmakers to gain feedback on how to advance the model to be most helpful for creative professionals.

They are sharing their research progress early to start working with and getting feedback from people outside of OpenAI and to give the public a sense of what AI capabilities are on the horizon.

Developing Sora

However, OpenAI declined to disclose the number of videos the system learned from or where they came from. They only stated that the training included both publicly available videos and videos licensed by copyright holders.

The company has been sued several times over its use of copyrighted content, and it likely withholds details about its training data to preserve an advantage over competitors.

Furthermore, the model has a deep understanding of language, enabling it to interpret prompts accurately and generate compelling characters that vividly convey emotion. Sora can also create multiple shots within a single generated video while keeping the visual style and characters consistent.

OpenAI shared prompts and the resulting videos on its X account, drawing a range of reactions from users.

The Model’s Weaknesses

According to OpenAI, the current model has weaknesses. It may struggle to accurately simulate the physics of a complex scene and may fail to understand specific instances of cause and effect. For example, a person might bite a cookie, but afterward the cookie may not have a bite mark.

The model may also confuse the spatial details of a prompt, for example mixing up left and right, and may struggle with precise descriptions of events that unfold over time, such as following a specific camera trajectory.

Image credits: Shutterstock, CC images, Midjourney, Unsplash.