ByteDance Introduces Boximator: A Text-Controlled Video Generation Model

February 20, 2024 – Today, reports emerged indicating that prior to Sora’s domination in the video generation space, ByteDance, a Chinese technology giant, had also introduced a disruptive video model known as Boximator.

Unlike other models such as Gen-2 and Pika 1.0, Boximator boasts the unique capability of precisely controlling the movements of characters or objects within generated videos, using box-shaped motion constraints in combination with text prompts.

In response to these reports, a representative from ByteDance clarified that Boximator is currently a research project focused on exploring techniques for controlling object motion in video generation and is not yet ready for commercialization as a fully-fledged product.

Moreover, the representative acknowledged that there are still significant gaps in terms of image quality, fidelity, and video duration when compared to leading video generation models developed internationally.

Meanwhile, OpenAI recently unveiled its first video generation model, Sora, which has already been hailed as a game-changer in the field. From a single prompt, Sora can generate one-minute high-definition videos, producing complex scenes with multiple characters and specific types of motion while accurately rendering details of objects and backgrounds.

On its official website, OpenAI has showcased 48 video examples, demonstrating Sora’s proficiency in accurately portraying video details and its profound understanding of how objects exist in the real world. The model’s ability to generate emotionally rich characters further underscores its potential to revolutionize the video generation landscape.
