What is YOLOv4?
YOLO addresses the main disadvantage of R-CNN, its low FPS. In addition, because of SPP (Spatial Pyramid Pooling), YOLOv4 does not require fixed-size input images.
However, earlier YOLO versions were not as accurate as R-CNN. YOLOv4 attempts to solve both problems and strike a balance between speed and accuracy. Let's see how YOLOv4 is structured.
Structure of YOLOv4
Backbone : CSPDarknet53
BOF for backbone : CutMix, Mosaic data augmentation
BOS for backbone : CSP (Cross-Stage Partial connections), MiWRC (Multi-input Weighted Residual Connections)
Neck : SPP, modified PAN(Path Aggregation Network)
Head : YOLOv3
BOF for detector : CIoU-loss, CmBN, DropBlock regularization, Mosaic data augmentation, Self-Adversarial Training, Eliminating grid sensitivity, Using multiple anchors for a single ground truth, Cosine annealing scheduler, Optimal hyperparameters, Random training shapes
BOS for detector : Mish activation, SPP-block, SAM-block, PAN path-aggregation block, DIoU-NMS
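To make one item from the list above concrete, here is a minimal sketch of CIoU-loss for a single pair of boxes. It follows the commonly cited definition (IoU, minus a normalized center-distance term, minus an aspect-ratio term). The function name `ciou_loss`, the `(x1, y1, x2, y2)` box format and the `eps` parameter are assumptions made for this illustration, not code taken from YOLOv4.

```python
import math

def ciou_loss(box, gt, eps=1e-9):
    """CIoU loss for two boxes in (x1, y1, x2, y2) format (assumed format)."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU: intersection area over union area.
    iw = max(0.0, min(x2, gx2) - max(x1, gx1))
    ih = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = iw * ih
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Squared distance between box centers.
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps)) - math.atan(w / (h + eps))) ** 2
    alpha = v / (1 - iou + v + eps)

    return 1 - (iou - rho2 / c2 - alpha * v)
```

For identical boxes the loss is close to 0. For boxes that do not overlap, the loss is greater than 1 and still shrinks as the centers move closer, which plain IoU loss cannot express.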
The whole structure
BOF & BOS
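BOF (Bag of Freebies) = training methods that improve the accuracy of a conventional object detector without increasing the inference cost; they only change the training strategy or increase the training cost.
BOS (Bag of Specials) = plugin modules and post-processing methods that increase the inference cost by a small amount but significantly improve the accuracy of object detection.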
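The difference can be sketched with one item from each group. A cosine annealing scheduler (BOF) only changes the learning rate during training, so the deployed model runs at the same cost. Mish activation (BOS) is part of the network itself, so it adds a small amount of computation at inference. The function names and parameters below are illustrative assumptions, not YOLOv4's actual code.

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """BOF example: learning rate decays from lr_max to lr_min along a half cosine."""
    cos_term = 1 + math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * cos_term

def mish(x):
    """BOS example: Mish activation, x * tanh(softplus(x))."""
    # softplus(x) = ln(1 + e^x); for large x it is effectively x,
    # so skip exp() there to avoid overflow.
    softplus = x if x > 20 else math.log1p(math.exp(x))
    return x * math.tanh(softplus)

if __name__ == "__main__":
    # Learning rate at the start, middle and end of a 100-step schedule.
    for step in (0, 50, 100):
        print(step, cosine_annealing_lr(step, 100, lr_max=0.01))
    # Mish is smooth and lets small negative values pass through.
    for x in (-2.0, 0.0, 2.0):
        print(x, mish(x))
```

The scheduler is used only in the training loop, while `mish` would be evaluated on every forward pass, including at inference time.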