What is CSP?
CSP (Cross Stage Partial) is a design that reduces the amount of computation while enabling richer gradient combinations. To achieve this, the input feature map of a dense block is split into two parts along the channel dimension. One part bypasses the block and is concatenated directly with the final output of the DB+TB (dense block + transition block) chain, while the other part is fed through the dense block as its input.
This concept provides two advantages:
It increases the number of gradient flow paths, which helps avoid the same gradient information being reused to update the weights of different layers in the dense block.
Splitting the base layer into two parts reduces the number of multiplications in the dense block, because only part of the channels passes through it, which in turn speeds up inference.
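The second advantage can be illustrated with a rough multiplication count. The sketch below is a simplification under stated assumptions, not the CSPNet implementation: each dense layer is modelled as a single 3x3 convolution, transition layers are not counted, and the 50/50 channel split is an assumed default. The function names `dense_block_mults` and `csp_dense_block_mults` are hypothetical.

```python
def dense_block_mults(in_channels, growth_rate, num_layers, height, width, kernel=3):
    """Count multiplications in a simplified dense block.

    Assumption: every layer is one kernel x kernel convolution that reads all
    channels accumulated so far and emits `growth_rate` new channels.
    Returns (multiplications, output_channels).
    """
    total = 0
    channels = in_channels
    for _ in range(num_layers):
        total += height * width * kernel * kernel * channels * growth_rate
        channels += growth_rate  # new feature maps are concatenated
    return total, channels


def csp_dense_block_mults(in_channels, growth_rate, num_layers, height, width,
                          split_ratio=0.5):
    """Same count when only part of the input enters the dense block.

    Assumption: `split_ratio` of the channels bypasses the block and is
    concatenated with the block output afterwards.
    """
    bypass = int(in_channels * split_ratio)   # part 1: skips the dense block
    processed = in_channels - bypass          # part 2: goes through the dense block
    mults, dense_out = dense_block_mults(processed, growth_rate, num_layers,
                                         height, width)
    return mults, bypass + dense_out


plain_mults, plain_out = dense_block_mults(64, 32, 4, 8, 8)
csp_mults, csp_out = csp_dense_block_mults(64, 32, 4, 8, 8)
print(plain_out, csp_out, csp_mults / plain_mults)
```

In this toy setting (64 input channels, growth rate 32, four layers) both variants end with 192 channels, while the split variant needs roughly 71% of the multiplications of the plain block.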
DenseBlock
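DenseNet is divided into dense blocks so that pooling can be used in the network. Within a dense block all feature maps keep the same spatial size so they can be concatenated, and downsampling is applied between blocks.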
DenseNet
Concatenates feature maps from all layers: the feature maps produced by each layer are passed as input to all subsequent layers. For the concatenation to work, every feature map within a dense block must have the same spatial size, and the number of feature maps each layer adds (the growth rate) is kept relatively small.
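Because every layer receives the concatenation of all earlier outputs, the input width of a layer grows linearly with its depth in the block. A minimal sketch of that bookkeeping follows; `dense_layer_inputs` is a hypothetical helper, and `growth_rate` denotes the number of feature maps each layer adds.

```python
def dense_layer_inputs(in_channels, growth_rate, num_layers):
    """Number of input channels seen by each layer of a dense block.

    Layer i (0-based) receives the block input plus the outputs of the
    i layers before it, each of which contributes `growth_rate` channels.
    """
    return [in_channels + i * growth_rate for i in range(num_layers)]


def dense_block_output_channels(in_channels, growth_rate, num_layers):
    """Channels after the last layer's output has been concatenated as well."""
    return in_channels + num_layers * growth_rate


print(dense_layer_inputs(64, 32, 4))           # [64, 96, 128, 160]
print(dense_block_output_channels(64, 32, 4))  # 192
```

This is why the growth rate is kept small: even a modest value leads to wide inputs in the later layers of a block.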
Performs pooling between dense blocks. The pooling is carried out by a transition layer consisting of BN, a 1x1 convolution, and 2x2 average pooling. If the transition layer receives $m$ input channels, it outputs $\lfloor \theta m \rfloor$ channels, where $0 < \theta \le 1$ is the compression factor. The transition layer therefore reduces both the spatial size of the feature maps and the number of channels.
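The effect of a transition layer on tensor shape can be sketched as follows. This is shape arithmetic only, assuming the 2x2 average pooling uses stride 2 and that the channel count is rounded down; `transition_output_shape` is a hypothetical helper and `theta=0.5` is an assumed default.

```python
import math


def transition_output_shape(channels, height, width, theta=0.5):
    """Shape after a transition layer: (channels, height, width).

    The 1x1 convolution compresses the channels by `theta`; the 2x2 average
    pooling (assumed stride 2) halves each spatial dimension.
    """
    if not 0 < theta <= 1:
        raise ValueError("theta must be in (0, 1]")
    out_channels = math.floor(theta * channels)
    return out_channels, height // 2, width // 2


print(transition_output_shape(192, 56, 56))  # (96, 28, 28)
```

With `theta=1` the channel count is unchanged and only the spatial size shrinks.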