From 6e5cf2361ee287c604c4d09eed37b7e91c4af267 Mon Sep 17 00:00:00 2001
From: Yi Jiang
Date: Sun, 12 Jan 2025 17:37:53 +0800
Subject: [PATCH] [upd format] Third-party Usage and Research
---
 README.md | 63 +++++++++++++++++++------------------------------------
 1 file changed, 22 insertions(+), 41 deletions(-)
diff --git a/README.md b/README.md
index 49ae835..eb7f33b 100644
--- a/README.md
+++ b/README.md
@@ -163,47 +163,28 @@ We'll provide the sampling script later.
 (`Note please report accuracy numbers and provide trained models in your new repository to facilitate others to get sense of correctness and model behavior`)
-[12/30/2024] Next Token Prediction Towards Multimodal Intelligence: https://github.com/LMM101/Awesome-Multimodal-Next-Token-Prediction
-
-[12/30/2024] Varformer: Adapting VAR’s Generative Prior for Image Restoration: https://arxiv.org/abs/2412.21063
-
-[12/22/2024] Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching: https://github.com/imagination-research/distilled-decoding
-
-[12/19/2024] FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching: https://github.com/OliverRensu/FlowAR
-
-[12/13/2024] 3D representation in 512-Byte: Variational tokenizer is the key for autoregressive 3D generation: https://github.com/sparse-mvs-2/VAT
-
-[12/19/2024] FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching: https://github.com/OliverRensu/FlowAR
-
-[12/9/2024] CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction: https://carp-robot.github.io/
-
-[12/5/2024] Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis: https://github.com/FoundationVision/Infinity
-
-[12/5/2024] Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis: https://github.com/yandex-research/switti
-
-[12/4/2024] TokenFlow🚀: Unified Image Tokenizer for Multimodal Understanding and Generation: https://github.com/ByteFlow-AI/TokenFlow
-
-[12/3/2024] XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation: https://github.com/lxa9867/ImageFolder
-
-[11/28/2024] CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient: https://github.com/czg1225/CoDe
-
-[11/27/2024] SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE: https://github.com/cyw-3d/SAR3D
-
-[11/26/2024] LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization: https://arxiv.org/abs/2411.17178
-
-[11/28/2024] Scalable Autoregressive Monocular Depth Estimation: https://arxiv.org/abs/2411.11361
-
-[11/15/2024] M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation: https://github.com/OliverRensu/MVAR
-
-[10/14/2024] HART: Efficient Visual Generation with Hybrid Autoregressive Transformer: https://github.com/mit-han-lab/hart
-
-[10/3/2024] ImageFolder🚀: Autoregressive Image Generation with Folded Tokens: https://github.com/lxa9867/ImageFolder
-
-[07/25/2024] ControlVAR: Exploring Controllable Visual Autoregressive Modeling: https://github.com/lxa9867/ControlVAR
-
-[07/3/2024] VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling: https://github.com/daixiangzi/VAR-CLIP
-
-[06/16/2024] STAR: Scale-wise Text-to-image generation via Auto-Regressive representations: https://arxiv.org/abs/2406.10797
+| **Time** | **Research** | **Link** |
+|--------------|--------------------------------------------------------------------------------------------------|--------------------------------------------------------------------|
+| [12/30/2024] | Next Token Prediction Towards Multimodal Intelligence | https://github.com/LMM101/Awesome-Multimodal-Next-Token-Prediction |
+| [12/30/2024] | Varformer: Adapting VAR’s Generative Prior for Image Restoration |https://arxiv.org/abs/2412.21063 |
+| [12/22/2024] | Distilled Decoding 1: One-step Sampling of Image Auto-regressive Models with Flow Matching | https://github.com/imagination-research/distilled-decoding |
+| [12/19/2024] | FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching | https://github.com/OliverRensu/FlowAR |
+| [12/13/2024] | 3D representation in 512-Byte: Variational tokenizer is the key for autoregressive 3D generation | https://github.com/sparse-mvs-2/VAT |
+| [12/9/2024] | CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction | https://carp-robot.github.io/ |
+| [12/5/2024] | Infinity ∞: Scaling Bitwise AutoRegressive Modeling for High-Resolution Image Synthesis | https://github.com/FoundationVision/Infinity |
+| [12/5/2024] | Switti: Designing Scale-Wise Transformers for Text-to-Image Synthesis | https://github.com/yandex-research/switti |
+| [12/4/2024] | TokenFlow🚀: Unified Image Tokenizer for Multimodal Understanding and Generation | https://github.com/ByteFlow-AI/TokenFlow |
+| [12/3/2024] | XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation | https://github.com/lxa9867/ImageFolder |
+| [11/28/2024] | CoDe: Collaborative Decoding Makes Visual Auto-Regressive Modeling Efficient | https://github.com/czg1225/CoDe |
+| [11/28/2024] | Scalable Autoregressive Monocular Depth Estimation | https://arxiv.org/abs/2411.11361 |
+| [11/27/2024] | SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE | https://github.com/cyw-3d/SAR3D |
+| [11/26/2024] | LiteVAR: Compressing Visual Autoregressive Modelling with Efficient Attention and Quantization | https://arxiv.org/abs/2411.17178 |
+| [11/15/2024] | M-VAR: Decoupled Scale-wise Autoregressive Modeling for High-Quality Image Generation | https://github.com/OliverRensu/MVAR |
+| [10/14/2024] | HART: Efficient Visual Generation with Hybrid Autoregressive Transformer | https://github.com/mit-han-lab/hart |
+| [10/3/2024] | ImageFolder🚀: Autoregressive Image Generation with Folded Tokens | https://github.com/lxa9867/ImageFolder |
+| [07/25/2024] | ControlVAR: Exploring Controllable Visual Autoregressive Modeling | https://github.com/lxa9867/ControlVAR |
+| [07/3/2024] | VAR-CLIP: Text-to-Image Generator with Visual Auto-Regressive Modeling | https://github.com/daixiangzi/VAR-CLIP |
+| [06/16/2024] | STAR: Scale-wise Text-to-image generation via Auto-Regressive representations | https://arxiv.org/abs/2406.10797 |
 ## License