ByteDance Introduces SDXL-Lightning Text-to-Image AI Model; Begins Beta Testing for AI-Generated Video Feature in CapCut

As OpenAI’s video generation tool Sora showcases the artificial intelligence giant’s capabilities, Chinese short video giant ByteDance is actively launching its own image and video AI tools.

In recent weeks, ByteDance has introduced a text-to-image generation AI model called SDXL-Lightning. Additionally, the Chinese firm’s editing app CapCut has rolled out an AI-generated video feature for its overseas version, enabling users to create videos.

Earlier this year, ByteDance released the ultra-high-definition text-to-video model MagicVideo-V2, which the company says produces higher-quality videos than rival text-to-video models such as Gen-2, Stable Video Diffusion, and Pika 1.0.

Furthermore, ByteDance announced that Zhang Nan has stepped down as CEO of its Douyin Group and will now concentrate on the development of CapCut. Zhang, a veteran who has led various business units at ByteDance, may help strengthen the company’s AI video capabilities in her new role.

ByteDance’s Text-to-Image SDXL-Lightning AI Model

ByteDance’s newly released text-to-image AI model, SDXL-Lightning, is currently trending among models on Hugging Face Spaces, alongside Google’s recently launched Gemma series and Stability AI’s Stable Cascade.

In image generation, advanced models rely on diffusion processes, which typically require 20 to 40 neural network calls per image. This consumes substantial computational resources and makes generation relatively slow.

ByteDance’s SDXL-Lightning model reportedly achieves faster generation through a technique called progressive adversarial distillation. The model can generate high-quality, high-resolution images in two or four steps, substantially reducing both generation time and computational cost.
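
As a rough back-of-the-envelope illustration of those figures, the Python sketch below compares the 20 to 40 network calls of a typical diffusion model with the two or four steps cited for SDXL-Lightning. It assumes generation cost scales linearly with the number of network calls, which is a simplification that ignores fixed costs such as text encoding and image decoding. The function name is hypothetical and is not part of any ByteDance release.

```python
def call_reduction(baseline_steps: int, distilled_steps: int) -> float:
    """Return how many times fewer network calls the distilled model makes.

    Assumption: one network call per step, and cost proportional to calls.
    """
    if baseline_steps <= 0 or distilled_steps <= 0:
        raise ValueError("step counts must be positive")
    return baseline_steps / distilled_steps


if __name__ == "__main__":
    # Step counts taken from the figures quoted in the article.
    for baseline in (20, 40):
        for distilled in (2, 4):
            factor = call_reduction(baseline, distilled)
            print(f"{baseline} steps -> {distilled} steps: {factor:.0f}x fewer calls")
```

Under that assumption, the reduction ranges from 5x (20 steps down to four) to 20x (40 steps down to two). Real-world speedups would depend on hardware and on the fixed costs left out here.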

The SDXL-Lightning model shows potential for use cases that require fast image generation, such as real-time advertising creative and game character design. The technology can also be applied to quickly generate high-quality video and audio.

SDXL-Lightning can be integrated as a speed-up plug-in into SDXL models of various styles, such as cartoon and animation. It also supports popular control plug-ins like ControlNet and generation software like ComfyUI.

ByteDance Begins Beta Testing of CapCut’s Video Generation Feature

Sora is capable of generating 60-second videos, a format central to ByteDance’s popular short-video apps TikTok and Douyin. It is therefore no surprise that ByteDance is striving to keep pace with industry leaders like OpenAI in AI video generation tools.

As a first step, ByteDance has initiated beta testing of some new features in its CapCut editing app, including an AI-generated video feature. Since CapCut is primarily an editing app, these features extend beyond video generation to other editing functions.

This latest move follows previous announcements. In November 2023, CapCut tested an AIGC tool called "Dreamina," enabling users to input text and generate four creative images using AI. These images can be produced in various styles, such as abstract or realistic. Reportedly, this tool will be used for content creation, including graphics or short videos, on ByteDance’s Douyin app.

ByteDance is expected to continue enhancing its AI video tools and integrating them into its popular short-video apps, a step crucial to maintaining TikTok and Douyin’s leading position in the video segment.

However, CapCut faces challenges similar to those of rival products like Runway, Pika, and Genmo, including unnatural movements, low fidelity, and other quality problems in the generated videos.

In this regard, CapCut and other AI video tools still lag behind Sora. Additionally, at the time of writing, CapCut’s video generation feature can take days to complete and is occasionally unavailable.

A Key Personnel Change Could Signal ByteDance’s Ambitions in AI Video Generation Tools

Just one week before the debut of Sora, it was announced that Zhang Nan had resigned as CEO of ByteDance’s Douyin Group. She will now focus her energy on the development of CapCut.

High-level personnel changes often accompany business adjustments, and entrusting the management of CapCut to someone familiar with the Douyin ecosystem underscores ByteDance’s goal of seizing new opportunities in AI video generation.

Insiders close to CapCut said that over the past year, Zhang Nan has devoted most of her energy to CapCut-related businesses. She has personally led the team in seeking breakthroughs in AIGC, and the team is about to launch a product for AI-generated videos.

Overall, ByteDance’s massive data resources in short video and social media provide it with a unique advantage in the development of text-to-video models. The release of MagicVideo-V2 and its significant improvement in performance have demonstrated ByteDance’s technological strength and innovative capabilities in this field.

