What is YOLOv4?

YOLO was designed to address the main disadvantage of R-CNN, which is its low FPS. YOLOv4 also does not require fixed-size input images, thanks to SPP (Spatial Pyramid Pooling).

However, earlier YOLO versions were not as accurate as R-CNN. YOLOv4 attempts to solve both problems and strike a balance between speed and accuracy. Let’s see how YOLOv4 is structured.
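
To make the SPP idea concrete, here is a minimal sketch in plain Python. It assumes the variant commonly used in YOLO-style necks: stride-1 max pooling with several kernel sizes (5, 9 and 13 here, chosen for illustration), with the pooled maps stacked next to the input. The names `max_pool_same` and `spp_block` are hypothetical, and a real implementation would operate on multi-channel tensors in a deep learning framework rather than on nested lists.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling with a k x k window; output has the input's size.

    Windows are clipped at the border, which is equivalent to padding
    the map with -infinity before pooling.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    out = []
    for i in range(h):
        row = []
        for j in range(w):
            window = [
                fmap[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            ]
            row.append(max(window))
        out.append(row)
    return out


def spp_block(fmap, kernels=(5, 9, 13)):
    """Return the input map plus one pooled map per kernel size.

    In a real network these maps would be concatenated along the
    channel axis; here they are simply returned as a list.
    """
    return [fmap] + [max_pool_same(fmap, k) for k in kernels]


# The same block works for feature maps of different sizes.
small = [[1, 2, 3], [4, 5, 6]]
large = [[(x * y) % 7 for x in range(9)] for y in range(5)]
print([(len(c), len(c[0])) for c in spp_block(small)])
print([(len(c), len(c[0])) for c in spp_block(large)])
```

Every output map keeps the height and width of its input, so nothing in the block depends on a particular image size.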

Structure of YOLOv4

Backbone : CSPDarknet53
- BOF for backbone : CutMix, Mosaic data augmentation
- BOS for backbone : CSP (Cross-Stage Partial connections), MiWRC (Multi-input Weighted Residual Connections)
Neck : SPP, modified PAN (Path Aggregation Network)
Head : YOLOv3
- BOF for detector : CIoU-loss, CmBN, DropBlock regularization, Mosaic data augmentation, Self-Adversarial Training, eliminating grid sensitivity, using multiple anchors for a single ground truth, cosine annealing scheduler, optimal hyperparameters, random training shapes
- BOS for detector : Mish activation, SPP-block, SAM-block, PAN path-aggregation block, DIoU-NMS
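
Two items from the detector lists, CIoU-loss and DIoU-NMS, are easy to illustrate in a few lines. The sketch below uses the standard published definitions (DIoU = IoU minus the normalized squared center distance; CIoU adds an aspect-ratio term), but the function names, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are assumptions for illustration, not the code used by YOLOv4 itself. It also assumes boxes with positive width and height.

```python
import math


def iou_and_center_penalty(a, b):
    """Return (IoU, rho^2 / c^2) for boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    iou = inter / (area_a + area_b - inter)
    # rho^2: squared distance between the two box centers.
    rho2 = ((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 \
         + ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2
    # c^2: squared diagonal of the smallest box enclosing both.
    c2 = (max(a[2], b[2]) - min(a[0], b[0])) ** 2 \
       + (max(a[3], b[3]) - min(a[1], b[1])) ** 2
    return iou, rho2 / c2


def ciou_loss(pred, gt):
    """CIoU loss: 1 - IoU + center penalty + aspect-ratio penalty."""
    iou, penalty = iou_and_center_penalty(pred, gt)
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / hg) - math.atan(w / h)) ** 2
    alpha = v / (1 - iou + v + 1e-9)  # small epsilon avoids 0/0 for identical boxes
    return 1 - iou + penalty + alpha * v


def diou_nms(boxes, scores, threshold=0.5):
    """Greedy NMS that suppresses a box when its DIoU with a kept box >= threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        suppressed = False
        for k in keep:
            iou, penalty = iou_and_center_penalty(boxes[i], boxes[k])
            if iou - penalty >= threshold:
                suppressed = True
                break
        if not suppressed:
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
print(ciou_loss(boxes[0], boxes[1]))
print(diou_nms(boxes, scores=[0.9, 0.8, 0.7]))
```

Unlike plain IoU, the center-distance term still gives a useful signal when two boxes do not overlap at all, which is the motivation for using it both in the loss and in NMS.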
The whole structure

BOF & BOS

BOF (Bag of Freebies) = training methods that improve the accuracy of a conventional object detector without increasing its inference cost.
BOS (Bag of Specials) = plugin modules and post-processing methods that increase the inference cost by a small amount but significantly improve the accuracy of object detection.
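
The difference is easier to see with one example from each bag. The sketch below shows a cosine annealing learning-rate schedule (a BOF item: it only changes how the model is trained) and the Mish activation (a BOS item: it runs at inference time and costs a little more than ReLU). The function names and parameter values are assumptions for illustration; the formulas are the standard ones, lr = lr_min + 0.5 (lr_max - lr_min)(1 + cos(pi t / T)) and mish(x) = x tanh(softplus(x)).

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate at `step`, decaying from lr_max to lr_min along a half cosine."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * step / total_steps))


def softplus(x):
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x):
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


# BOF: affects training only.
for step in (0, 50, 100):
    print(step, cosine_annealing_lr(step, total_steps=100, lr_max=0.01))

# BOS: evaluated on every forward pass, including inference.
for x in (-2.0, 0.0, 2.0):
    print(x, mish(x))
```

Once training is finished the schedule plays no further role, whereas Mish is evaluated on every forward pass. That is why the first counts as "free" and the second as a small inference cost.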

Implementation

Use YOLOv4
