| 03 |
An image is worth 16x16 words: Transformers for image recognition at scale |
pdf |
#Pre-Print |
2021 |
| 04 |
End-to-end object detection with transformers |
pdf |
#ECCV |
2020 |
| 05 |
Deformable DETR: Deformable Transformers for End-to-End Object Detection |
pdf |
#Pre-Print |
2021 |
| 06 |
Dynamic DETR: End-to-End Object Detection With Dynamic Attention |
pdf |
#ICCV |
2021 |
| 07 |
UP-DETR: Unsupervised Pre-Training for Object Detection With Transformers |
pdf |
#CVPR |
2021 |
| 08 |
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection |
pdf |
#Pre-Print |
2022 |
| 09 |
DINOv2: Learning Robust Visual Features without Supervision |
pdf |
#Pre-Print |
2024 |
| 10 |
Efficient DETR: Improving End-to-End Object Detector with Dense Prior |
|
|
2021 |
| 11 |
DAB-DETR: Dynamic Anchor Boxes Are Better Queries for DETR |
|
|
2022 |
| 12 |
Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity |
|
|
2022 |
| 13 |
Co-DETR: DETRs with Collaborative Hybrid Assignments Training |
|
|
2023 |
| 14 |
DETRs Beat YOLOs on Real-time Object Detection |
pdf |
#CVPR |
2024 |
| 15 |
PVT v2 |
|
|
|
| 16 |
Twins |
|
|
|
| 17 |
Swin Transformer: Hierarchical Vision Transformer Using Shifted Windows |
pdf |
#ICCV |
2021 |
| 18 |
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers |
pdf |
#NeurIPS |
2021 |
| 19 |
GC ViT: Global Context Vision Transformers |
pdf |
#PMLR |
2023 |
| 20 |
DynamicViT |
|
|
|
| 21 |
Focal Self-attention for Local-Global Interactions in Vision Transformers |
pdf |
#NeurIPS |
2021 |
| 22 |
CSWin Transformer |
|
|
|
| 23 |
MaxViT |
|
|
|
| 24 |
MinViT |
|
|
|
| 25 |
InternImage |
|
|
|
| 26 |
UFO (Unified Feature Optimization) Transformer |
|
|
|
| 27 |
LaVin-DiT: Large Vision Diffusion Transformer |
pdf |
#Pre-Print |
2024 |
|
|
|
|
|