- 1) Public Datasets and Challenges
- 2) Annotation Tools
- 3) Pioneers and Experts
- 4) Blogs and Videos
- 5) Papers and Source Codes
- (CVPR 2016) Cityscapes Dataset [github - CityscapesScripts]
- (CVPR 2016) TT100K (Traffic-Sign Detection and Classification in the Wild) (Chinese traffic-sign dataset)
- (arxiv 2018) CrowdHuman Dataset (A Benchmark for Detecting Human in a Crowd) (human detection in crowded scenes)
- (TPAMI 2023) SODA: A large-scale Small Object Detection dAtaset (small object detection) [method CFINet]
- CSAILVision/LabelMeAnnotationTool [Source code for the LabelMe annotation tool.]
- tzutalin/LabelImg [🖍️ LabelImg is a graphical image annotation tool for labeling object bounding boxes in images]
- wkentaro/Labelme [Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation)]
- openvinotoolkit/CVAT [A Powerful and efficient Computer Vision Annotation Tool]
- Ericsson/EVA [A web-based tool for efficient annotation of videos and image sequences, with additional tracking capabilities]
- (github) High-resolution networks (HRNets) for object detection
- (github) MMDetection: an open source object detection toolbox based on PyTorch by CUHK
- (github) TensorFlow Object Detection API
- (github) Yolov5 Yolov4 Yolov3 TensorRT Implementation
- (github) A Faster Pytorch Implementation of Faster R-CNN
- (github) FAIR's research platform for object detection research, implementing popular algorithms
- (github) Awesome Detection Transformer for Computer Vision (CV) (by `IDEA-Research`)
- (github) detrex is a research platform for Transformer-based Instance Recognition algorithms (by `IDEA-Research`)
- (github) (yolo-gradcam) YOLO model with Grad-CAM visualization. Plug and play, with no changes to the source code required!
- 👍(github) A collection of some awesome public YOLO object detection series projects
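- (CSDN blog) Understanding COCO evaluation metrics: AP, AP50, AP70, mAP, AP[.50:.05:.95]
- (CSDN blog) Object detection: One-stage and Two-stage detectors explained in detail
- (CSDN blog) Anchor-free object detection papers
- (CSDN blog) The anchor-free branch of object detection: keypoint-based object detection (the latest networks comprehensively surpass YOLOv3)
- (CSDN blog) An improved YOLO V4 Tiny is here! 294 FPS with no loss of accuracy relative to YOLO V4 Tiny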
- (blog) Custom Object Detection Tutorial with YOLO V5
- (zhihu) How would you evaluate YOLOv5?
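The `AP[.50:.05:.95]` notation in the COCO metrics blog entry above refers to averaging AP over ten IoU thresholds, from 0.50 to 0.95 in steps of 0.05. The following is a minimal sketch of that threshold sweep for a single predicted/ground-truth box pair. The function names are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)`, and a full COCO evaluation (precision-recall curves over a whole dataset) is deliberately not reproduced here.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def coco_iou_thresholds():
    """The ten IoU thresholds 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def matched_at_each_threshold(pred_box, gt_box):
    """For each threshold, whether the prediction would count as a true positive."""
    value = iou(pred_box, gt_box)
    return {t: value >= t for t in coco_iou_thresholds()}
```

A prediction with IoU around 0.82 counts as a match at the looser thresholds and as a miss at 0.85 and above, which is why the averaged metric rewards tighter localization than AP50 alone.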
- (ACM Computing Surveys 2022.09) Transformers in Vision: A Survey [paper link][arxiv link (2021.01)][`Australian`, `USA`]
- (ACM Computing Surveys 2022.12) Efficient Transformers: A Survey [paper link][arxiv link (2020.09)][`Google`]
- (AI Open 2022) A Survey of Transformers [paper link][arxiv link (2021.06)][`FDU`]
- (TPAMI 2022) A Survey on Vision Transformer [paper link][arxiv link (2020.12)][`Huawei`, `PKU`, `Dacheng Tao`]
- (TPAMI 2023) Towards Large-Scale Small Object Detection: Survey and Benchmarks [paper link][arxiv link (2022.07)][project link][datasets `SODA-D` and `SODA-A`]
- ❤R-FCN(NIPS2016) R-FCN: Object Detection via Region-based Fully Convolutional Networks [arxiv link][Codes|Caffe&MATLAB(official)][Codes|Caffe(unofficial)]
- DCN(ICCV2017) Deformable Convolutional Networks [arxiv link][Codes|MXNet(official based on R-FCN)][Codes|MXNet(unofficial based on R-FCN)]
- ❤MaskRCNN(ICCV2017) Mask R-CNN [paper link][codes|official]
- RetinaNet(ICCV2017) Focal Loss for Dense Object Detection [arxiv link][Codes|PyTorch(unofficial)][Codes|Keras(unofficial)]
- Repulsion(CVPR2018) Repulsion Loss: Detecting Pedestrians in a Crowd [arxiv link][Codes|PyTorch(unofficial using SSD)][Codes|PyTorch(unofficial using RetinaNet)][CSDN blog]
- CornerNet(ECCV2018) CornerNet: Detecting Objects as Paired Keypoints [arxiv link][Codes|PyTorch(official)][Codes|PyTorch(official CornerNet-Lite)]
- ❤CenterNet(arxiv2019) Objects as Points [arxiv link][Codes|PyTorch(official)]
- CenterNet(arxiv2019) CenterNet: Keypoint Triplets for Object Detection [arxiv link][Codes|PyTorch(official)]
- ❤FCOS(ICCV2019) FCOS: Fully Convolutional One-Stage Object Detection [arxiv link][Codes|PyTorch_MASK_RCNN(official)][Codes|PyTorch(unofficial improved)][Codes|PyTorch(unofficial using HRNet as backbone)][blog_zhihu]
- VFNet(arxiv2020) VarifocalNet: An IoU-aware Dense Object Detector [arxiv link][Codes|official with MMDetection & PyTorch]
- 👍ATSS(CVPR2020 Oral, Best Paper Nomination) Bridging the Gap Between Anchor-based and Anchor-free Detection via Adaptive Training Sample Selection [paper link][arxiv link][code|official][CSDN blog][`Shifeng Zhang`]
- GFLV1 (NIPS2020) Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection [paper link][arxiv link][code|official][`Li Xiang`, based on `ATSS`]
- GFLV2 (CVPR2021) Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection [paper link][arxiv link][code|official][`Li Xiang`, based on `ATSS`]
- 👍TOOD(ICCV2021 Oral) TOOD: Task-aligned One-stage Object Detection [paper link][arxiv link][project link][code|official]
- GFL(QFL + DFL)(TPAMI2022) Generalized Focal Loss: Towards Efficient Representation Learning for Dense Object Detection [paper link][`Li Xiang`, based on `ATSS`]
DETRs --> Pros: they eliminate the hand-designed anchor and NMS components. Cons: slow training convergence and hard-to-optimize queries. Efforts to improve transformer-based detectors should therefore focus on accelerating training convergence and reducing optimization difficulty.
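DETR-style detectors can drop NMS because training assigns each ground-truth box to exactly one prediction, so duplicate predictions are penalized instead of being filtered afterwards. The sketch below illustrates that one-to-one assignment under simplifying assumptions: the cost is only `1 - IoU` (the published DETR cost also includes class and box-regression terms), and a brute-force search over permutations stands in for a Hungarian solver such as SciPy's `linear_sum_assignment`. The function names are hypothetical.

```python
from itertools import permutations


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def one_to_one_match(pred_boxes, gt_boxes):
    """Return (pred_index, gt_index) pairs minimizing the total (1 - IoU) cost.

    Brute force, so only suitable for a handful of boxes. Assumes there are
    at least as many predictions as ground-truth boxes.
    """
    n_pred, n_gt = len(pred_boxes), len(gt_boxes)
    if n_gt > n_pred:
        raise ValueError("need at least as many predictions as ground truths")
    best_cost, best_assign = float("inf"), ()
    # Each candidate picks a distinct prediction for every ground-truth box.
    for pred_ids in permutations(range(n_pred), n_gt):
        cost = sum(1.0 - iou(pred_boxes[p], gt_boxes[g])
                   for g, p in enumerate(pred_ids))
        if cost < best_cost:
            best_cost, best_assign = cost, pred_ids
    return [(p, g) for g, p in enumerate(best_assign)]
```

Predictions left without a partner would be trained toward a "no object" class, which is what removes the need for a separate suppression step at inference time.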
- ❤DEtection TRansformer(DETR) (ECCV2020 BestPaper) End-to-End Object Detection with Transformers [paper link][codes|official][bilibili paper reading video][bilibili paper reading video 2][`facebookresearch`]
- ❤DeformableDETR (ICLR2021 OralPaper) Deformable DETR: Deformable Transformers for End-to-End Object Detection [paper link][codes|official][bilibili paper reading video]
- ConditionalDETR (ICCV2021) Conditional DETR for Fast Training Convergence [paper link][arxiv link][code|official]
- DeFCN (CVPR2021) End-to-End Object Detection with Fully Convolutional Network [paper link][arxiv link][codes|official][`Megvii BaseDetection`]
- SparseR-CNN (CVPR2021, TPAMI2023) Sparse R-CNN: End-to-End Object Detection with Learnable Proposals [paper link][arxiv link][code|official]
- Anchor-DETR (AAAI2022) Anchor DETR: Query design for transformer-based detector [paper link][`Megvii Technology`]
- SparseDETR (ICLR2022) Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity [openreview link][arxiv link][code|official]
- DAB-DETR (ICLR2022) DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR [paper link][codes|official]
- DN-DETR (CVPR2022 OralPaper) DN-DETR: Accelerate DETR Training by Introducing Query DeNoising [paper link][codes|official][`IDEA Research`, https://www.idea.edu.cn/]
- DINO (ICLR2023) DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection [openreview link][arxiv link][codes|official][`IDEA Research`]
- HDETR (CVPR2023) DETRs with Hybrid Matching [arxiv link][code|official][code|official (`H-Deformable-DETR`)][code|official (`H-PETR-3D`)][code|official (`H-PETR-Pose`)]
- DDQ-DETR (CVPR2023) Dense Distinct Query for End-to-End Object Detection [arxiv link][code|official]
- LiteDETR (CVPR2023) Lite DETR: An Interleaved Multi-Scale Encoder for Efficient DETR [paper link][arxiv link][code|official][`IDEA Research`, https://www.idea.edu.cn/]
- ❤RT-DETR(CVPR2024)(arxiv2023) DETRs Beat YOLOs on Real-time Object Detection [paper link][arxiv link][code|official whole-repo][code|official branch-repo][code|reproduced by ultralytics][`Baidu`, `PaddlePaddle`, `PaddleDetection`]
Anchor-based: `YOLOv2`, `YOLOv3`, `YOLOv4`, `Scaled-YOLOv4`, `YOLOv5`, `YOLOR`, `TPH-YOLOv5`, `YOLOv5-Lite`, `YOLOv7`; Anchor-free: `YOLOv1`, `YOLOX`, `PP-YOLOE`, `YOLOv5u`, `YOLOv6`, `YOLOv7u`, `YOLOv8`
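To make the anchor-based versus anchor-free split above concrete, here is a rough sketch of how a raw network output could be decoded into a box in each family. The anchor-based variant follows the YOLOv2/YOLOv3 parameterization (offsets relative to a grid cell and a prior box size). The anchor-free variant uses FCOS-style distances from a location to the four box sides as a representative example; individual models in the list differ in their exact heads, so treat this as an illustration and not as any specific repository's code. All function names and the pixel-unit anchor sizes are assumptions.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def decode_anchor_based(t, cell_xy, anchor_wh, stride):
    """YOLOv2/v3-style decode: the center is an offset inside a grid cell and
    the size is a scaling of a predefined anchor (prior) box, in pixels."""
    tx, ty, tw, th = t
    cx = (sigmoid(tx) + cell_xy[0]) * stride
    cy = (sigmoid(ty) + cell_xy[1]) * stride
    w = anchor_wh[0] * math.exp(tw)
    h = anchor_wh[1] * math.exp(th)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def decode_anchor_free(ltrb, point_xy):
    """FCOS-style decode: distances from a feature-map location (in pixels)
    to the left, top, right, and bottom sides. No prior box is involved."""
    left, top, right, bottom = ltrb
    x, y = point_xy
    return (x - left, y - top, x + right, y + bottom)
```

For example, with all-zero outputs at cell `(2, 3)`, stride 16, and a `32 x 64` anchor, the anchor-based decode returns a box centered in that cell with exactly the anchor's size, whereas the anchor-free decode depends only on the predicted distances and the location itself.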
- ❤YOLOv1(CVPR2016) You Only Look Once: Unified, Real-Time Object Detection [paper link][project link (darknet)][code|official by darknet][`Joseph Redmon`, `Santosh Divvala`, `Ross Girshick`, `Ali Farhadi`]
- ❤YOLOv2(CVPR2017) YOLO9000: Better, Faster, Stronger [paper link][project link (darknet)][code|unofficial PyTorch][`Joseph Redmon`, `Ali Farhadi`]
- YOLOv3(arxiv2018.04) YOLOv3: An Incremental Improvement [paper link][project link (darknet)][code|unofficial PyTorch (github 2018.12)][code|unofficial PyTorch][paper: `Joseph Redmon`, `Ali Farhadi`][code: `ultralytics`, `Glenn Jocher`]
- YOLOv4(arxiv2020.04) YOLOv4: Optimal Speed and Accuracy of Object Detection [paper link][code|official DarkNet][code|official PyTorch_YOLOv4][code|unofficial PyTorch][`Alexey Bochkovskiy`, `Chien-Yao Wang`, `Hong-Yuan Mark Liao`, `Taiwan + Intel`, `DarkNet`]
- Scaled-YOLOv4(CVPR2021) Scaled-YOLOv4: Scaling Cross Stage Partial Network [paper link][code|official][`Chien-Yao Wang`, `Alexey Bochkovskiy`, `Hong-Yuan Mark Liao`, `Taiwan + Intel`]
- ❤YOLOv5(github 2020.06) YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite [No published paper][code|official by ultralytics][Docs|official by ultralytics][author: Glenn Jocher][`ultralytics`, `Glenn Jocher`]
- YOLOR(arxiv2021.05) You Only Learn One Representation: Unified Network for Multiple Tasks [paper link][code|official][`Chien-Yao Wang`, `I-Hau Yeh`, `Hong-Yuan Mark Liao`, `Taiwan`]
- ❤YOLOX(arxiv2021.07) YOLOX: Exceeding YOLO Series in 2021 [paper link][code|official by MegEngine-BaseDetection][code|official by MegEngine][Docs|official][`Megvii`, `Jian Sun`]
- TPH-YOLOv5(ICCVW2021) TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-Captured Scenarios [paper link][arxiv link][code|official PyTorch][`Transformer`, `SwinTransformer`]
- YOLOv5-Lite(github 2021.08) YOLOv5-Lite: Lighter, faster and easier to deploy [No published paper][code|official]
- PP-YOLOE(arxiv2022.03) PP-YOLOE: An evolved version of YOLO [paper link][code|official][`PaddlePaddle`, `PaddleDetection`, `Baidu`, based on the `TAL (Task Alignment Learning)` proposed in `TOOD`]
- YOLOv6(arxiv2022.09) YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications [paper link][code|official by meituan][YOLOv6 v3.0: A Full-Scale Reloading (arxiv2023.01)][`meituan`]
- ❤YOLOv7(CVPR2023)(arxiv2022.07) YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors [paper link][code|official][code|official - pose, blog - ipynb][code|official - seg, blog - ipynb][code|YOLOv7-Seg based on YOLOv5][code|YOLOv7-Pose-Estimation by RizwanMunawar][code|YOLOv7-Segmentation by RizwanMunawar][blog|YOLOv7-Segmentation by RizwanMunawar][`Chien-Yao Wang`, `Alexey Bochkovskiy`, `Hong-Yuan Mark Liao`, `Taiwan + Intel`][`re-parameterized module` + `dynamic label assignment` + `trainable bag-of-freebies`][`Following the YOLOv5 project arch`]
- DAMO-YOLO(arxiv2022.11) DAMO-YOLO: A Report on Real-Time Object Detection Design [arxiv link][project link (github)]
- RTMDet(arxiv2022.12) RTMDet: An Empirical Study of Designing Real-Time Object Detectors [arxiv link][code|official][`Shanghai AI Laboratory + S-Lab, Nanyang Technological University + Tianjin University + SJTU`]
- ❤YOLOv8(github 2023.01) YOLOv8 🚀 in PyTorch > ONNX > CoreML > TFLite [No published paper][code|official by ultralytics][Docs|official by ultralytics][`ultralytics`, `Glenn Jocher`]
- YOLO-NAS(github 2023.05) A Next-Generation, Object Detection Foundational Model generated by Deci’s Neural Architecture Search Technology [project link][super-gradients github][zhihu blog]
- YOLO-MS(arxiv2023.08) YOLO-MS: Rethinking Multi-Scale Representation Learning for Real-time Object Detection [arxiv link][code|official][`Nankai University`]
- Gold-YOLO(NIPS2023)(arxiv2023.09) Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism [paper link][arxiv link][code|official][`Huawei`]
- YOLOv9(arxiv2024.02) YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information [arxiv link][code|official][`Chien-Yao Wang`, `I-Hau Yeh` and `Hong-Yuan Mark Liao` in `Taiwan`]
- YOLOv10(arxiv2024.05) YOLOv10: Real-Time End-to-End Object Detection [arxiv link][code|official][`THU`][The code base is built with `ultralytics` and `RT-DETR`.]