Repositories list
78 repositories
Show-o (Public): [ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.
- A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
ShowUI (Public): Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.
- 💻 A curated list of papers and resources for multi-modal Graphical User Interface (GUI) agents.
MovieSeq (Public)
FQGAN (Public)
VideoLISA (Public)
MovieBench (Public)
IDProtector (Public)
ROICtrl (Public)
videogui (Public)
Show-1 (Public)
- [ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
sparseformer (Public): (ICLR 2024, CVPR 2024) SparseFormer
LOVA3 (Public): (NeurIPS 2024) Learning to Visual Question Answering, Asking and Assessment
Exo2Ego-V (Public)
watermark-steganalysis (Public)
EvolveDirector (Public)
GUI-Narrator (Public)
RingID (Public)