What is scaled-YOLOv4?
Scaled-YOLOv4 is designed to run on every class of device, from low-end to high-end. It can perform real-time object detection on 16, 30, and 60 FPS video, as well as on embedded systems. As the name indicates, scaled-YOLOv4 includes YOLOv4 together with the upper and lower bounds of its linear scaling models, which are YOLOv4-large and YOLOv4-tiny. Next, let’s look at the structure of scaled-YOLOv4.
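Before moving on, a quick note on what those frame rates imply: to keep up with an N FPS video, each frame has to be processed within 1000/N milliseconds. Here is a minimal sketch of that arithmetic (the helper name `frame_budget_ms` is made up for illustration and is not part of any YOLO code base):

```python
def frame_budget_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# Frame rates mentioned in the text above.
for fps in (16, 30, 60):
    print(f"{fps} FPS -> {frame_budget_ms(fps):.1f} ms per frame")
```

So a model that is real-time on 60 FPS video has roughly a quarter of the per-frame time budget of one that is real-time on 16 FPS video.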
Structure of scaled-YOLOv4
Backbone : CSPDarknet53, where the down-sampling convolution of the cross-stage process is not included in the residual block. Scaled-YOLOv4 is divided into YOLOv4-tiny and YOLOv4-large.
YOLOv4-tiny : Uses CSPOSANet with the PCB architecture to form the backbone of YOLOv4-tiny.
To keep the computation of YOLOv4-tiny as low as possible, below the order of $O(whkb^2)$ for a feature map of width $w$ and height $h$, it performs model scaling on the computational block of YOLOv4-tiny, where:
b = number of channels in the base layer.
k = number of layers.
g = growth rate which is the number of filters used in each convolutional layer.
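To get a feel for the $O(whkb^2)$ bound, it helps to plug in numbers. The sketch below is only an illustration: the feature-map size and channel count are made-up values, and both the OSANet-style order $O(\max(whbg, whkg^2))$ and the setting g = b/2, k = 3 are assumptions taken from the scaled-YOLOv4 paper rather than from the text above.

```python
def bound_order(w: int, h: int, k: int, b: int) -> int:
    """Order of computation w*h*k*b^2, the bound YOLOv4-tiny should stay below."""
    return w * h * k * b ** 2


def osanet_order(w: int, h: int, k: int, b: int, g: int) -> int:
    """Assumed OSANet-style order: max(w*h*b*g, w*h*k*g^2)."""
    return max(w * h * b * g, w * h * k * g ** 2)


# Hypothetical values, chosen only for illustration.
w, h = 52, 52   # feature-map width and height
b = 64          # number of channels in the base layer
k = 3           # number of layers (assumed)
g = b // 2      # growth rate (assumed to be b/2)

bound = bound_order(w, h, k, b)
osa = osanet_order(w, h, k, b, g)

print(f"w*h*k*b^2 bound : {bound}")
print(f"OSANet-style    : {osa}")
print(f"ratio           : {osa / bound:.2f}")
```

With these assumed numbers the OSANet-style block needs about a quarter of the computation given by the bound, which is the kind of saving the tiny model is after.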
Neck : Uses the CSP-ized PAN architecture of YOLOv4 (shown as (a) in the image below) and 2 reversed CSP dark layers (shown as (b) in the image below). Unlike the original SPP module, the SPP module in scaled-YOLOv4 is inserted in the middle position of the first computation list group of the CSPPAN.
- YOLOv4-large : A fully CSP-ized model starting from YOLOv4-P5 and scaling up to YOLOv4-P6 and YOLOv4-P7.
YOLOv4-P6 reaches real-time performance on 30 FPS video when the width scaling factor is 1.
YOLOv4-P7 reaches real-time performance on 16 FPS video when the width scaling factor is 1.25.
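As a rough illustration of what a width scaling factor does, the sketch below multiplies a list of per-stage channel counts by the factor. The base channel counts, the function name `scale_width`, and the rounding to a multiple of 8 are all assumptions made for this example, not values taken from scaled-YOLOv4.

```python
def scale_width(channels, factor, divisor=8):
    """Scale each channel count by `factor`, rounded to a multiple of `divisor`."""
    scaled = []
    for c in channels:
        # Round to the nearest multiple of `divisor`, never below `divisor`.
        value = max(divisor, int(round(c * factor / divisor)) * divisor)
        scaled.append(value)
    return scaled


# Hypothetical per-stage channel counts, for illustration only.
base_channels = [64, 128, 256, 512, 1024]

print(scale_width(base_channels, 1.0))   # factor used for YOLOv4-P6 in the text
print(scale_width(base_channels, 1.25))  # factor used for YOLOv4-P7 in the text
```

A factor of 1 leaves the widths unchanged, while 1.25 makes every stage 25% wider, which increases the computation per frame and is consistent with the lower real-time frame rate quoted for YOLOv4-P7.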
BoF (Bag of Freebies) for backbone : CutMix, Mosaic data augmentation
BoS (Bag of Specials) for backbone : CSP, MiWRC
Head : YOLOv3
BoF for detector : CIoU-loss, CmBN, DropBlock regularization, Mosaic data augmentation, Self-Adversarial Training, Eliminate grid sensitivity, Using multiple anchors for a single ground truth, Cosine annealing scheduler, Optimal hyperparameters, Random training shapes
BoS for detector : Mish activation, SPP-block, SAM-block, PAN path-aggregation block, DIoU-NMS
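Two of the items listed above are simple enough to write down directly. The sketch below shows the standard formulas for the Mish activation, mish(x) = x · tanh(softplus(x)), and for a cosine annealing learning-rate schedule, in plain Python. The function names and the learning-rate values are made up for illustration; this is not the training code used by scaled-YOLOv4.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable softplus: log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def cosine_annealing_lr(step: int, total_steps: int,
                        lr_max: float, lr_min: float = 0.0) -> float:
    """Learning rate decayed from lr_max to lr_min along half a cosine wave."""
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + math.cos(math.pi * progress))


# Mish is smooth, close to the identity for large x, and near zero for very negative x.
for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"mish({x:+.1f}) = {mish(x):+.4f}")

# Hypothetical schedule: 100 steps, starting at a learning rate of 0.01.
for step in (0, 25, 50, 75, 100):
    print(f"step {step:3d}: lr = {cosine_annealing_lr(step, 100, 0.01):.5f}")
```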