Traditional neural network layers use matrix
multiplication to describe the interaction between each input unit and each
output unit, so every output unit interacts with every input unit.
Convolutional networks, however, typically have sparse interactions. This is accomplished
by making the kernel smaller than the input and applying that same kernel across the whole image.
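
As a rough illustration of sparse interactions, the sketch below writes a small 1-D convolution as an explicit matrix and counts how many entries are actually used. The helper `conv_as_matrix`, the kernel values and the input size are all assumptions chosen for the example, and the kernel is applied without flipping (cross-correlation), as is common in deep learning code.

```python
import numpy as np

def conv_as_matrix(kernel, input_size):
    """Write a no-padding 1-D convolution as a matrix (hypothetical helper)."""
    k = len(kernel)
    out_size = input_size - k + 1
    W = np.zeros((out_size, input_size))
    for i in range(out_size):
        # Each output unit is connected to only k neighbouring inputs.
        W[i, i:i + k] = kernel
    return W

kernel = np.array([1.0, 2.0, 1.0])        # assumed kernel values
W = conv_as_matrix(kernel, input_size=8)

nonzero_slots = np.count_nonzero(W)       # connections used by the convolution
dense_slots = W.size                      # connections a dense layer would use
print(nonzero_slots, "of", dense_slots, "possible connections are used")
```

With a kernel of width 3 and 8 inputs, only 3 connections per output are non-zero, whereas a fully connected layer of the same shape would use all 8.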

Parameter sharing
is another concept used in CNNs. It refers to using the same parameter for more than one
function in a model. In a convolutional neural net, each member
of the kernel is used at every position of the input. The parameter sharing
used by the convolution operation means that rather than learning a separate
set of parameters for every location, we learn only one set. This is also
known as tied weights; it is a separate property from the sparse interactions described above.
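
A minimal sketch of parameter sharing, assuming a 1-D input and a hypothetical helper `conv1d_shared`: the same two kernel weights are reused at every position, so the number of learnable parameters does not grow with the input length.

```python
import numpy as np

def conv1d_shared(x, kernel):
    """Slide one shared kernel over x (no padding, no kernel flip)."""
    k = len(kernel)
    # The same kernel weights are applied at every position i.
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

kernel = np.array([0.5, 0.5])                       # assumed averaging kernel
short_out = conv1d_shared(np.arange(4.0), kernel)   # 4 inputs
long_out = conv1d_shared(np.arange(100.0), kernel)  # 100 inputs

# The output length changes with the input, the parameter count does not.
print(len(short_out), len(long_out), kernel.size)
```

A fully connected layer mapping 100 inputs to 99 outputs would need 9,900 weights, while the shared kernel here still has only 2.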

If the function that a layer needs to learn is
indeed a local, translation-invariant function, then the layer will be
dramatically more efficient if it uses convolution rather than matrix
multiplication. If the necessary function does not have these properties, then
using a convolutional layer will cause the model to have high training error.

**Pooling**

A typical convolutional network layer consists of three stages. In the first
stage, the layer performs several convolutions
in parallel to produce a set of presynaptic activations. In the second stage, each
presynaptic activation is run through a nonlinear activation function, such as
the rectified linear activation function. This stage is sometimes called the detector
stage. In the third stage, we use a pooling function. A pooling function
replaces the output of the net at a certain location with a summary statistic
of the nearby outputs. For example, the max pooling operation reports the
maximum output within a rectangular neighborhood. Pooling helps to make the
representation invariant to small translations of
the input.
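
The three stages can be sketched in a few lines of NumPy. This is only an illustration under simplifying assumptions: a 1-D input, a single kernel, non-overlapping pooling windows, and hypothetical helper names (`relu`, `max_pool1d`, `conv_layer`). The trivial one-element kernel is chosen so that the effect of pooling is easy to see.

```python
import numpy as np

def relu(z):
    """Rectified linear activation (detector stage)."""
    return np.maximum(z, 0.0)

def max_pool1d(a, width):
    """Non-overlapping max pooling; leftover trailing elements are dropped."""
    n = (len(a) // width) * width
    return a[:n].reshape(-1, width).max(axis=1)

def conv_layer(x, kernel, pool_width=2):
    pre = np.correlate(x, kernel, mode="valid")  # stage 1: convolution
    detected = relu(pre)                         # stage 2: nonlinearity
    return max_pool1d(detected, pool_width)      # stage 3: pooling

kernel = np.array([1.0])                         # assumed identity kernel
x = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0])
shifted = np.roll(x, 1)                          # same signal moved by one step

pooled = conv_layer(x, kernel)
pooled_shifted = conv_layer(shifted, kernel)
print(pooled, pooled_shifted)
```

In this example the one-step shift stays inside the same pooling window, so both pooled outputs are identical. A larger shift, or one that crosses a window boundary, would change the result, which is why the invariance holds only for small translations.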

**Zero-padding**

One zero-padding
setting adds just enough zeros to keep the size of the
output equal to the size of the input. This is called same convolution. Another
setting is full convolution, in which enough zeros are added for every pixel to be visited k
times in each direction, where k is the width of the kernel.
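
The effect of these settings on the output size can be checked with NumPy's `np.convolve`, whose `mode` argument accepts `"same"` and `"full"`, plus `"valid"` for no padding at all. The input length of 8 and the kernel values are assumptions made for the example.

```python
import numpy as np

x = np.arange(8.0)                     # assumed input of width m = 8
kernel = np.array([1.0, 2.0, 1.0])     # assumed kernel of width k = 3

# Output width for each padding setting.
sizes = {mode: len(np.convolve(x, kernel, mode=mode))
         for mode in ("valid", "same", "full")}
print(sizes)

# "same" for an odd kernel is equivalent to padding (k - 1) / 2 zeros per side
# and then convolving without any further padding.
pad = (len(kernel) - 1) // 2
manual_same = np.convolve(np.pad(x, pad), kernel, mode="valid")
```

Here same convolution keeps the width at 8, while full convolution grows it to m + k - 1 = 10.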

**The CNN behavior analysis will be explained in further articles from the R&D team at SiliconMentor working in Computer Vision, Biomedical Signal Analysis, VLSI and their associated domains.**