Multimodal Large Language Models for Multi-Subject In-Context Image Generation

Explore MUSIC, a novel multimodal large language model designed to solve multi-subject image generation challenges through advanced reasoning and spatial pla...

Level: advanced

By Yucheng Zhou

Category: research