Chinese tech giants, AI ‘godmother’ Li Fei-Fei race to seize the edge in world models


A wave of companies, from the start-up launched by artificial intelligence “godmother” Li Fei-Fei to the largest Chinese tech firms, are racing to introduce their latest approaches to world models – an emerging field aimed at extending AI beyond language processing to learning from and comprehending physical reality.
Alibaba Group Holding on Thursday unveiled Happy Oyster, which it called an open-ended world model designed for real-time and “flowy” virtual world creation and interaction, according to a statement from the e-commerce group’s Alibaba Token Hub (ATH) business unit, newly formed to consolidate its core AI initiatives. Alibaba owns the South China Morning Post.

Happy Oyster supported two modes of virtual world creation, according to ATH: a directing mode for building a world based on text and image prompts and a wandering mode for exploring that world.

Unlike conventional AI video tools, which generate one-off clips that top out at a dozen seconds or a few minutes, Happy Oyster could generate video clips of up to three minutes showing virtual worlds, the company said. In addition, the model could continuously respond to instructions throughout the generation process, as opposed to the conventional, one-shot AI paradigm, the company said.

This meant users could keep developing their imaginative worlds with new ideas, ATH said. For instance, a demo video showed that during the generation process, a user could simply type “black crows fly past” to conjure up a flock of flying crows, or order characters to “talk to each other”.

The launch came a day after San Francisco-based World Labs, co-founded by Li, a Stanford professor, in early 2024, unveiled Spark 2.0, an open-source 3D Gaussian splatting rendering engine that aims to give even less powerful devices, such as smartphones, the ability to view large-scale and detailed 3D images.
