ViTAE-Transformer

20 repositories

ViTAE-Transformer-Scene-Text-Detection
Public
A comprehensive list [GoMatching@NeurIPS'24, DeepSolo(++)@CVPR'23, DPText-DETR@AAAI'23, I3CL@IJCV'22] of our research works related to scene text detection, spotting, etc., including papers and code.
ocr deep-learning pytorch scene-text-detection vision-transformer scene-text-spotting
TeX
•3•79•0•0•Updated Nov 12, 2024
LeMeViT
Public
The official repo for [IJCAI'24] "LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation"
deep-learning remote-sensing attention object-detection semantic-segmentation scene-classification vision-transformer
Python
•3•41•1•0•Updated Nov 11, 2024
RSP
Public
The official repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining"
deep-learning remote-sensing classification imagenet object-detection transfer-learning semantic-segmentation change-detection pre-training foundation-models
Python
•
MIT License
•6•133•12•1•Updated Nov 7, 2024
DeepSolo
Public
The official repo for [CVPR'23] "DeepSolo: Let Transformer Decoder with Explicit Points Solo for Text Spotting" & [ArXiv'23] "DeepSolo++: Let Transformer Decoder with Explicit Points Solo for Multilingual Text Spotting"
detection-transformer scene-text-spotting chinese-text-spotting multilingual-text-spotting explicit-point-query
Python
•
Other
•34•248•27•4•Updated Aug 9, 2024
SAMRS
Public
The official repo for [NeurIPS'23] "SAMRS: Scaling-up Remote Sensing Segmentation Dataset with Segment Anything Model"
deep-learning sam transfer-learning semantic-segmentation pre-training segment-anything-model dataset remote-sensing
Python
•14•284•32•0•Updated Aug 5, 2024
ViTPose
Public
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
deep-learning pytorch pose-estimation mae distillation self-supervised-learning vision-transformer
Python
•
Apache License 2.0
•186•1.4k•87•6•Updated Jul 24, 2024
MTP
Public
The official repo for [JSTARS'24] "MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining"
deep-learning remote-sensing classification object-detection transfer-learning semantic-segmentation change-detection pre-training vision-transformer foundation-models
Python
•
MIT License
•11•170•15•0•Updated Jun 17, 2024
ViTAE-Transformer-Remote-Sensing
Public
A comprehensive list [SAMRS@NeurIPS'23, RVSA@TGRS'22, RSP@TGRS'22] of our research works related to remote sensing, including papers, code, and citations. Note: The repo for [TGRS'22] "An Empirical Study of Remote Sensing Pretraining" has been moved to: https://github.com/ViTAE-Transformer/RSP
deep-learning remote-sensing classification object-detection transfer-learning semantic-segmentation change-detection self-supervised-learning vision-transformer
TeX
•53•460•9•0•Updated Jun 6, 2024
SimDistill
Public
The official repo for [AAAI 2024] "SimDistill: Simulated Multi-modal Distillation for BEV 3D Object Detection"
deep-learning simulation distillation 3d-object-detection bird-view-image
Python
•
Apache License 2.0
•2•27•5•0•Updated May 16, 2024
APTv2
Public
The official repo for the extension of [NeurIPS'22] "APT-36K: A Large-scale Benchmark for Animal Pose Estimation and Tracking": https://github.com/pandorgan/APT-36K
benchmark deep-learning dataset transfer-learning pose-estimation few-shot-learning pre-training pose-tracking animal-pose-estimation vision-transformer
Python
•
Apache License 2.0
•0•12•1•0•Updated May 15, 2024
QFormer
Public
The official repo for [TPAMI'23] "Vision Transformer with Quadrangle Attention"
deep-learning backbone classification object-detection attention-mechanism semantic-segmentation pose-estimation vision-transformer
Python
•
MIT License
•10•175•3•0•Updated Apr 10, 2024
P3M-Net
Public
The official repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving"
image-matting vision-transformer deep-learning pytorch
Python
•
MIT License
•9•93•6•0•Updated Mar 30, 2024
Remote-Sensing-RVSA
Public
The official repo for [TGRS'22] "Advancing Plain Vision Transformer Towards Remote Sensing Foundation Model"
deep-learning pytorch remote-sensing object-detection transfer-learning semantic-segmentation scene-classification self-supervised-learning vision-transformer foundation-models
Python
•
MIT License
•32•418•25•2•Updated Dec 20, 2023
SAMText
Public
The official repo for the technical report "Scalable Mask Annotation for Video Text Spotting"
deep-learning sam dataset scene-text-spotting segment-anything-model video-text-spotting
•0•17•2•0•Updated May 3, 2023
I3CL
Public
The official repo for [IJCV'22] "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection"
scene-text-detection vision-transformer deep-learning pytorch
Python
•2•8•3•0•Updated Apr 12, 2023
ViTAE-Transformer-Matting
Public
A comprehensive list [AIM@IJCAI'21, P3M@MM'21, GFM@IJCV'22, RIM@CVPR'23, P3MNet@IJCV'23] of our research works related to image matting, including papers, code, datasets, demos, and citations. Note: The repo for [IJCV'23] "Rethinking Portrait Matting with Privacy Preserving" has been moved to: https://github.com/ViTAE-Transformer/P3M-Net
computer-vision deep-learning survey privacy-preserving image-matting vision-transformer
TeX
•24•231•1•0•Updated Apr 11, 2023
ViTAE-Transformer
Public
The official repo for [NeurIPS'21] "ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias" and [IJCV'22] "ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond"
imagenet object-detection semantic-segmentation mscoco ade20k imagenet-classification vision-transformer vitae-transformer deep-learning
Python
•28•253•11•0•Updated Apr 5, 2023
ViTAE-VSA
Public
The official repo for [ECCV'22] "VSA: Learning Varied-Size Window Attention in Vision Transformers"
deep-learning backbone classification attention-mechanism vision-transformer object-detection instance-segmentation
Python
•9•157•8•0•Updated Mar 17, 2023
VOS-LLB
Public
The official repo for [AAAI'23] "Learning to Learn Better for Video Object Segmentation"
video-object-segmentation vos vision-transformer deep-learning
Python
•0•11•2•0•Updated Feb 17, 2023
ViTDet
Public
Unofficial implementation for [ECCV'22] "Exploring Plain Vision Transformer Backbones for Object Detection"
vision-transformer deep-learning pytorch object-detection
Python
•
Apache License 2.0
•46•532•17•0•Updated Apr 24, 2022