- Paper: [Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions](https://arxiv.org/abs/2102.12122)
- Origin Repo: [whai362/PVT](https://github.com/whai362/PVT)
- Code: pvt.py
- Evaluate Transforms:
  ```python
  # backend: pil
  # input_size: 224x224
  transforms = T.Compose([
      T.Resize(248, interpolation='bicubic'),
      T.CenterCrop(224),
      T.ToTensor(),
      T.Normalize(mean=[0.485, 0.456, 0.406],
                  std=[0.229, 0.224, 0.225])
  ])
  ```
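
  The snippet above is only the preprocessing recipe. Below is a minimal sketch of applying it to a single image, assuming `T` is `paddle.vision.transforms` (suggested by the `pil` backend comment and the string-valued `interpolation` argument) and using a placeholder image path:

  ```python
  import paddle
  import paddle.vision.transforms as T
  from PIL import Image

  # Same evaluation preprocessing as listed above (ImageNet mean/std).
  transforms = T.Compose([
      T.Resize(248, interpolation='bicubic'),
      T.CenterCrop(224),
      T.ToTensor(),
      T.Normalize(mean=[0.485, 0.456, 0.406],
                  std=[0.229, 0.224, 0.225])
  ])

  img = Image.open('example.jpg').convert('RGB')  # placeholder path
  x = transforms(img)                             # float32 tensor, CHW
  x = paddle.unsqueeze(x, axis=0)                 # add batch dim -> [1, 3, 224, 224]
  ```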
- Model Details:

  | Model | Model Name | Params (M) | FLOPs (G) | Top-1 (%) | Top-5 (%) | Pretrained Model |
  |:-----:|:----------:|:----------:|:---------:|:---------:|:---------:|:----------------:|
  | PVT-Tiny   | pvt_ti | 13.2 | 1.9 | 74.96 | 92.47 | Download |
  | PVT-Small  | pvt_s  | 24.5 | 3.8 | 79.87 | 95.05 | Download |
  | PVT-Medium | pvt_m  | 44.2 | 6.7 | 81.48 | 95.75 | Download |
  | PVT-Large  | pvt_l  | 61.4 | 9.8 | 81.74 | 95.87 | Download |
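
  A hedged sketch of loading one of the models above for single-image inference. The constructor name `pvt_ti` and its `pretrained` flag are assumptions based on the "Model Name" column, not a confirmed API; check `pvt.py` for the actual loading interface:

  ```python
  import paddle
  import paddle.nn.functional as F
  from pvt import pvt_ti  # hypothetical constructor named after the table entry

  model = pvt_ti(pretrained=True)  # PVT-Tiny: 13.2M params, 74.96% top-1
  model.eval()

  x = paddle.randn([1, 3, 224, 224])  # stand-in; use the preprocessed image from the sketch above

  with paddle.no_grad():
      logits = model(x)
      probs = F.softmax(logits, axis=-1)
      scores, classes = paddle.topk(probs, k=5, axis=-1)
      print(classes.numpy(), scores.numpy())  # top-5 class ids and probabilities
  ```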
- Citation:

  ```bibtex
  @misc{wang2021pyramid,
        title={Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions},
        author={Wenhai Wang and Enze Xie and Xiang Li and Deng-Ping Fan and Kaitao Song and Ding Liang and Tong Lu and Ping Luo and Ling Shao},
        year={2021},
        eprint={2102.12122},
        archivePrefix={arXiv},
        primaryClass={cs.CV}
  }
  ```