What are Convolutional Neural Networks? Full Guide

Introduction To Multi-Layer Neural Networks

Efficient neural networks are built from multiple layers of neurons; such networks are also called deep learning neural networks. In such systems, each layer of neurons has a specific role, which depends on the design and on the targeted goal. A typical multi-layer neural network can have dozens of layers. The number of layers is called the depth of the neural network.

A layer can perform, for example, one of the following operations:

Filtering;
Detecting (Patterns);
Extracting (Patterns);
Subsampling;
Merging (fusion);
Fuzzing;
Pooling & Max-Pooling;
Convolution;
Classification (usually output layer);
Regression (usually output layer).

Each layer has a given number of neurons, called the width of that layer. The total number of neurons is called the size of the neural network.
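As a small illustration of these three terms, the sketch below computes the depth, widths and size of a network. The layer widths are hypothetical values chosen for the example, not taken from any real design.

```python
# Hypothetical layer widths for a small multi-layer network
# (illustrative values only).
layer_widths = [64, 32, 16, 10]

depth = len(layer_widths)   # number of layers
size = sum(layer_widths)    # total number of neurons

print(f"depth={depth}, widths={layer_widths}, size={size}")
```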

Given a classification problem, one needs to define the number of layers, the width of each layer, and the role of each layer.

Each layer transforms the data: extracting features, narrowing, shrinking, flattening or pruning it or, on the contrary, adding entropy or randomness to it. This is the complicated part when dealing with deep learning networks. Most must be designed from experience, intuition, trial and error, and experiments.

Many templates are also available from previous research by data scientists, and some blocks can be reused to create new deep learning designs.

A good design is one that has been proven robust and trustworthy in real-world use cases.

There are many sub-types of neural networks, for example Convolutional neural networks, Bayesian neural networks, etc. In what follows, we are interested in Convolutional networks.

Some Quick Facts About The Convolutional Neural Networks

Generalities

Convolutional Neural Networks (CNNs) are a special class of neural networks derived from multilayer perceptrons (i.e. feed-forward networks). CNNs are primarily based on convolution operations, i.e. ‘dot products’ between data represented as a matrix and a filter also represented as a matrix. The convolution operation can be seen as an alternative to the matrix product. The result can be seen as a regularization or ‘smoothing’ of the raw input data.

A typical Convolutional network will create a 3D matrix from raw data. These data can be photographic images, or images representing frequency over time with color codes for the various power intensities.

In the above representation, each cell represents a neuron.

This technique involves spatial data which are invariant under translation shifts. For this reason, Convolutional neural networks are also called shift invariant or space invariant artificial neural networks (SIANN).
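The shift property can be checked numerically. The sketch below is a simplified 1D illustration using NumPy's `np.convolve`; the signal and the filter are hypothetical values chosen for the example. It shows that moving a pattern in the input moves the filter response by the same amount and leaves it otherwise unchanged (strictly speaking, this behaviour of the convolution itself is usually called shift equivariance).

```python
import numpy as np

# Hypothetical 1D signal containing one small pattern, and a hypothetical filter.
signal = np.array([0, 0, 1, 2, 1, 0, 0, 0, 0])
kernel = np.array([1, 0, -1])
shift = 2

# The same pattern, moved two cells to the right.
shifted_signal = np.roll(signal, shift)

response = np.convolve(signal, kernel, mode="full")
shifted_response = np.convolve(shifted_signal, kernel, mode="full")

# The response to the shifted input is the shifted response to the original input.
same_up_to_shift = np.array_equal(np.roll(response, shift), shifted_response)
print(same_up_to_shift)
```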

The convolution layer is usually followed by a pooling layer, and that pair is repeated for as many rounds as necessary.

Convolution Operation

The Convolution operation is similar to the product between two matrices.

If M and N are two matrices, then (MN)_{ij} = \sum_{k} M_{ik} N_{kj}

The convolution operation between two matrices is more complicated. A matrix K, usually “small”, plays the role of a “filter” and regularizes another matrix, A, using the formula:

(A*K)_{ij} = \sum_{s,t} K_{st} A_{i-s,j-t}

In this formula, s and t run from -a to +a and from -b to +b respectively, which implies that the filter matrix has dimensions (2a+1) x (2b+1). Usually 2a+1 = 2b+1 = F.
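The convolution formula can be evaluated directly with nested loops. The following is a minimal sketch, assuming no padding and a stride of 1, so only the positions where the filter fits entirely inside A are computed. The function name `conv2d_valid` and the example matrices are illustrative choices, not part of any library.

```python
import numpy as np

def conv2d_valid(A, K):
    """Evaluate (A*K)_{ij} = sum_{s,t} K_{st} A_{i-s, j-t} directly.

    Only positions where the filter fits entirely inside A are computed
    (no padding, stride 1).
    """
    A = np.asarray(A, dtype=float)
    K = np.asarray(K, dtype=float)
    a, b = K.shape[0] // 2, K.shape[1] // 2   # K has shape (2a+1, 2b+1)
    rows, cols = A.shape
    out = np.zeros((rows - 2 * a, cols - 2 * b))
    for i in range(a, rows - a):
        for j in range(b, cols - b):
            total = 0.0
            for s in range(-a, a + 1):
                for t in range(-b, b + 1):
                    # K_{st} is stored at array index (s + a, t + b)
                    total += K[s + a, t + b] * A[i - s, j - t]
            out[i - a, j - b] = total
    return out

A = np.arange(25).reshape(5, 5)   # 5 x 5 input
K = np.ones((3, 3)) / 9.0         # 3 x 3 averaging ("smoothing") filter
result = conv2d_valid(A, K)
print(result.shape)
print(result)
```

The explicit loops are slow but mirror the formula term by term. Note that many deep learning libraries actually compute a cross-correlation (the filter is not flipped), which is equivalent up to a reversal of the filter.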

The size of the convolved matrix is given by C = ((L-F+2P)/S)+1.

L is the size of the input matrix A, F the size of the filter matrix, S the stride, and P the optional padding applied to the input matrix A.

We represent the geometrical meaning of that operation:

In the above representation, L=5, F=3, S=1, P=0 and therefore C=(5-3)/1+1=3.
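The size formula is easy to check in code. The sketch below uses integer (floor) division, which is an assumption for the case where the stride does not divide L-F+2P evenly; the function name `conv_output_size` is illustrative.

```python
def conv_output_size(L, F, S=1, P=0):
    """Return C = (L - F + 2P) / S + 1, using floor division (an assumption)."""
    if F > L + 2 * P:
        raise ValueError("filter is larger than the padded input")
    return (L - F + 2 * P) // S + 1

example = conv_output_size(L=5, F=3, S=1, P=0)   # the case worked out in the text
padded = conv_output_size(L=5, F=3, S=1, P=1)    # padding of 1 keeps the input size
strided = conv_output_size(L=7, F=3, S=2, P=0)   # a stride of 2 shrinks the output

print(example, padded, strided)
```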

The convolution operation is well known in image processing.

Main Operations Performed In A Convolutional Neural Network

As mentioned previously, a Convolutional neural network mainly performs convolution operations, but that is usually not enough for it to be efficient: it needs several other operations in addition to convolution. We detail these operations in what follows.

operation	definition
convolution	A “filter” is applied over the input data (for example a RADAR image). The filter scans a range of “cells” at a time and builds an output feature matrix, which later layers use to predict the class to which each feature belongs.
pooling	Pooling is a form of downsampling. It reduces the amount of information after convolution has been applied. There are different types of pooling, for instance max-pooling and average pooling. The idea is to find the maximum or average value in a “pool” of neighbouring values and replace that pool by that single value.
subsampling	Summarizes all data in a pool by giving all of them a single value. It is similar to pooling.
dropout	A technique to prevent overfitting, which often happens in fully connected layers. During training, dropout sets the outputs of some randomly chosen neurons to zero, which temporarily removes those neurons and their connections from the network.
flatten	Reshapes a multi-dimensional input into a one-dimensional vector.
dense	Fully connects one layer to another.
softmax	Maps the output to a normalized probability distribution giving the probability of belonging to each class.
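Several of the operations in the table can be sketched in a few lines of NumPy. The code below is a minimal illustration, not a production implementation: the function names and the small feature map are hypothetical, pooling windows are assumed not to overlap, and dropout is shown in its common "inverted" form, where the surviving values are rescaled during training.

```python
import numpy as np

def max_pool2d(x, pool=2):
    """Non-overlapping max-pooling over pool x pool windows."""
    h, w = x.shape
    h, w = h - h % pool, w - w % pool          # drop rows/columns that do not fit
    windows = x[:h, :w].reshape(h // pool, pool, w // pool, pool)
    return windows.max(axis=(1, 3))

def flatten(x):
    """Reshape a multi-dimensional array into a 1D vector."""
    return x.reshape(-1)

def softmax(z):
    """Map scores to a normalized probability distribution."""
    e = np.exp(z - np.max(z))                  # subtract the max for numerical stability
    return e / e.sum()

def dropout(x, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

# Hypothetical 4 x 4 feature map produced by a convolution layer.
feature_map = np.array([[1., 3., 2., 0.],
                        [4., 2., 1., 5.],
                        [0., 1., 3., 2.],
                        [2., 6., 1., 1.]])

pooled = max_pool2d(feature_map)               # 2 x 2 map of window maxima
vector = flatten(pooled)                       # 1D vector of length 4
probs = softmax(vector)                        # sums to 1
dropped = dropout(vector, rate=0.5, rng=np.random.default_rng(0))

print(pooled)
print(vector)
print(probs)
print(dropped)
```

Average pooling would follow the same pattern with `mean` in place of `max`.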

Some Classical Convolutional Neural Network Designs

Over the years, several CNN designs have been found to perform well on several classes of problems, most often image recognition. Here we list several neural network designs that have been used successfully for many kinds of data classification, especially image classification.

LeNet-5

The LeNet architecture is one of the first CNNs and was developed by Yann LeCun, Leon Bottou, Yoshua Bengio and Patrick Haffner in 1998. This design became very popular for the automatic recognition of handwritten digits and for document recognition in general. LeNet-5 was successfully used to classify handwritten digits on bank cheques in the U.S.A.

LeNet-5 has 6 hidden layers: C₁→S₂→C₃→S₄→C₅→F₆.

C₁, C₃ and C₅ are convolution layers, S₂ and S₄ are pooling layers, and the last layer, F₆, is fully connected.
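The way data shrink through these layers can be traced with the output size formula C = ((L-F+2P)/S)+1. The sketch below assumes the layer parameters commonly cited for the 1998 LeNet-5 design (a 32 x 32 input, 5 x 5 convolution filters, 2 x 2 pooling with stride 2, and 84 units in F₆); these numbers are not stated in this article and are included as assumptions.

```python
def output_size(L, F, S=1, P=0):
    """C = (L - F + 2P) / S + 1, with floor division."""
    return (L - F + 2 * P) // S + 1

# (layer name, filter size F, stride S, number of feature maps)
# Values as commonly cited for LeNet-5; treated as assumptions here.
layers = [
    ("C1", 5, 1, 6),
    ("S2", 2, 2, 6),
    ("C3", 5, 1, 16),
    ("S4", 2, 2, 16),
    ("C5", 5, 1, 120),
]

size = 32                      # assumed 32 x 32 input image
shapes = {}
for name, F, S, maps in layers:
    size = output_size(size, F, S)
    shapes[name] = (size, size, maps)
    print(f"{name}: {size} x {size} x {maps}")

shapes["F6"] = (84,)           # fully connected layer with 84 units (assumed)
print("F6: 84 units")
```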

This is a relatively simple design, at least from a modern perspective. It has some unusual features. For instance, not all the feature maps produced by layer S₂ are used by each feature map of the following convolution layer C₃, so that different feature maps produce different patterns from different inputs.

The LeNet designs are multilevel machine learning algorithms that use spatial relationships to reduce the number of parameters and therefore improve training performance.

The LeNet-5 model was successfully used to classify moving targets from RADAR Doppler data reflecting changes in velocity (e.g. acceleration/deceleration). The model was used as an image classifier on time-frequency image representations.

VGGNet

VGGNet is a ‘modern’ CNN whose best-known variant, VGG-16, consists of 16 weight layers (13 convolutional and 3 fully connected). It bears similarities to AlexNet but has more filters. VGG-16 has about 138 million parameters, so it consumes a lot of computing power.

GoogLeNet

The GoogLeNet design is based on a 22-layer deep CNN using a sub-module which performs small convolutions and which the designers call the “inception module”. Later versions of the Inception design use batch normalization and the gradient descent method known as RMSprop.

The number of parameters is around 4 million, and it also requires significant computing power to run.

AlexNet

AlexNet is based on five convolutional layers which are followed by three fully connected layers.

It also makes extensive use of the ReLU activation function. AlexNet has more than 60 million parameters, which makes it a large and powerful CNN design.

ResNet

The ResNet family includes designs such as ResNet18 and ResNet50.

ResNets are residual neural networks, which use a concept similar to the pyramidal cells found in the biological cerebral cortex.

ResNets use skip connections to jump from one layer to another ‘distant’ layer, so as to avoid the problem of vanishing gradients.
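The idea of a skip can be sketched as a block that adds its input back to its output, y = relu(F(x) + x). The example below is a simplified illustration in NumPy that uses two small dense transformations for F instead of convolutions; the weights, sizes and function names are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W1, W2):
    """y = relu(F(x) + x), with F(x) = W2 @ relu(W1 @ x).

    The "+ x" term is the skip connection: the input bypasses the two
    transformations and is added back to their output.
    """
    fx = W2 @ relu(W1 @ x)
    return relu(fx + x)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                      # hypothetical input vector
W1 = rng.normal(size=(4, 4)) * 0.1          # hypothetical small weights
W2 = rng.normal(size=(4, 4)) * 0.1

y = residual_block(x, W1, W2)

# With all-zero weights the block simply passes relu(x) through,
# so the signal still reaches the next layer.
zeros = np.zeros((4, 4))
identity_out = residual_block(x, zeros, zeros)

print(y)
print(identity_out)
```

Because the input always has a direct path to the output, the gradient also has a direct path back through the addition, which is the intuition behind the reduced vanishing-gradient problem.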

A typical residual network such as ResNet50 is 50 layers deep.

Other Designs

Other neural network designs, such as DivNet-15, ZFNet, DenseNet and U-Net, have also been applied.

Conclusion

Convolutional neural networks are among the most powerful machine learning algorithms currently available for image recognition and classification. Modern designs can involve hundreds of millions of parameters, so they can be very costly to run and can take a long time to train.
