🔥🔥🔥 A curated list of papers on LLM-based multimodal generation (image, video, 3D, and audio).
text-to-speech
multimodality
text-to-image
text-to-audio
text-to-video
text-to-music
multimodal-models
aigc
large-language-models
text-to-3d
multimodal-generation
text-to-sound
large-vision-language-models
multimodal-large-language-models
Updated Nov 6, 2024 - HTML