08 Oct 2020

What are Artificial Neural Networks? A Complete Guide

In this article, we will explain the nature of Artificial Neural Networks (ANNs) and how they work.

Overview of the concept of ANNs

ANNs are made up of artificial neurons connected together to form a directed graph. They are designed to form a Machine Learning system that can learn to perform tasks such as discrimination and classification. ANNs are inspired by the architecture of the biological neurons inside the brain. They are especially good at pattern matching, and they are widely used for such purposes.

Some Quick Facts about the Biological Neurons and the Brain

As mentioned above, the architecture of neural networks is inspired by that of the biological neurons in the human brain.

However, it must be underlined that neural networks are by no means meant to be a realistic model of the way the human brain works.

The human brain is in fact a giant communication system made of billions of neurons (nerve cells). There are different types of neurons, but an individual neuron has a simple structure with three main parts:

  • The cell body;
  • The dendrites;
  • The axons.

A neuron can be seen as a unit that processes information, receiving electric impulses through its dendrites. What is important is that the dendrite signal varies in frequency but has a constant intensity.

If the cumulative amount of incoming impulses reaches a given ceiling value, the neuron outputs a signal via its axon.

Both activating and inhibiting impulses can be emitted, and they superpose, so that inhibiting signals reduce activating signals.

When a neuron outputs a signal, it sends the impulse through the axon and synapses to other neurons. 

The synapses are largely responsible for determining the frequency of the signal and its nature (i.e. either inhibiting or activating).

The building blocks of the brain are thus quite simple units that decide, from the input signals they receive, whether to output a signal.

The mechanism may look very simple, but the brain is extraordinarily complex because of the huge number of connections between these neurons ("basic processing units"): a single neuron can form thousands of connections with other neurons. Besides, neurons operate in parallel, which gives the whole system a lot of power. This massive connectivity is essential to the learning process of the brain.

The Artificial Neuron

We have already detailed what an artificial neuron is in the perceptron article; however, we will recall the main facts here.

An artificial neuron is an abstract computing structure that acts as a basic processing unit. It receives N input signals, combines them into a single value using a weighted sum, and then applies an activation function to fire the output signal.

If Φ is the activation function, then the output signal is Φ(∑i=0…N miXi),

where the mi's are the N weights plus the bias weight m0, and the Xi's are the N inputs plus the constant bias input X0 = 1.

The activation function can take different shapes. If the activation function is the Heaviside step function, then the artificial neuron is a perceptron.

Some common activation functions are:

  • Hyperbolic tangent (tanh)
  • Sigmoid (logistic function)
  • Linear functions
  • Rectified Linear Unit (ReLU) (variants: leaky, parametric…)
  • Others: Swish, Softmax…
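As a minimal illustration (the function names here are our own, and NumPy is assumed), the common activations above and the weighted-sum neuron Φ(∑ miXi) can be sketched as follows:

```python
import numpy as np

def heaviside(x):
    """Step function (the perceptron's activation): 1 if x >= 0, else 0."""
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):
    """Logistic function: squashes any input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """Rectified Linear Unit: max(0, x)."""
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: keeps a small slope alpha for negative inputs."""
    return np.where(x > 0, x, alpha * x)

def neuron_output(weights, inputs, activation):
    """Phi(sum of mi * Xi): a weighted sum followed by the activation."""
    return activation(np.dot(weights, inputs))

x = np.array([-2.0, 0.0, 2.0])
print(heaviside(x))   # [0. 1. 1.]
print(np.tanh(x))     # the hyperbolic tangent is built into NumPy
# weighted sum 0.5*2.0 + (-1.0)*1.0 = 0, so the sigmoid output is 0.5
print(neuron_output(np.array([0.5, -1.0]), np.array([2.0, 1.0]), sigmoid))
```

Swapping `sigmoid` for `heaviside` in the last call turns this neuron into a perceptron.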

The Basic Architecture of a Neural Network 

As a consequence of observing the way the brain works, Artificial Neural Networks have been given a similar – but not identical – design.

Many mathematical models of the human brain have been created, and while they all differ, they share – as a common minimal set – the following features:

  • Several inputs coming either from ‘outside’ or from other processing units (the dendrites);
  • ‘Weights’ indicating how each input signal influences the processing unit that receives it (the frequency and nature of the electric signal received via the synapses);
  • A function that sums up all the inputs (the addition of all input signals in the neuron);
  • A ceiling value: if the sum of the inputs exceeds that ceiling, the signal is transmitted; otherwise, it is not;
  • An outgoing signal (the signal sent to the outside through the axon).
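The minimal feature set above can be sketched as a tiny processing unit (a hypothetical class of our own, assuming NumPy):

```python
import numpy as np

class ThresholdUnit:
    """A minimal processing unit: weighted inputs, a summation,
    a ceiling (threshold), and a binary outgoing signal."""

    def __init__(self, weights, ceiling):
        self.weights = np.asarray(weights, dtype=float)  # synaptic strengths
        self.ceiling = ceiling                           # firing threshold

    def fire(self, inputs):
        total = float(np.dot(self.weights, inputs))      # sum of weighted inputs
        return 1 if total >= self.ceiling else 0         # transmit, or stay silent

unit = ThresholdUnit(weights=[0.5, 0.5], ceiling=0.8)
print(unit.fire([1, 1]))  # 1.0 >= 0.8, so the unit fires: 1
print(unit.fire([1, 0]))  # 0.5 <  0.8, so the unit stays silent: 0
```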

Multi-Layer

An artificial neural network is made up of layers. A layer is a generic term for a set of artificial neurons – considered as ‘nodes’ – operating at a specific depth inside the network.

Layers are divided into three categories: 

  • The input layer;
  • The hidden layers;
  • The output layer.

The input layer receives the input data to process, so each node of this layer can be seen as an input variable of a function.

The hidden layers perform the processing itself: they apply weights and thresholds to their inputs to generate the relevant outputs.

Each hidden layer is connected to the next one, so that together they form a processing chain.

The output layer comes at the end: it is the last layer and outputs the final result.

Here is an example of a multi-layer neural network.

The output layer can consist of a single neuron or – on the contrary – of many neurons, depending on the model chosen.
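A forward pass through such a layered chain can be sketched as follows (a hypothetical example of our own, assuming NumPy and sigmoid activations):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, layers):
    """Pass input x through a chain of (weights, bias) layers."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)   # each layer: weighted sum, then activation
    return a

rng = np.random.default_rng(0)
# 3 input nodes -> a hidden layer of 4 neurons -> a single output neuron
layers = [(rng.normal(size=(4, 3)), np.zeros(4)),
          (rng.normal(size=(1, 4)), np.zeros(1))]
out = forward(np.array([0.2, -0.5, 1.0]), layers)
print(out)  # a single sigmoid output, somewhere in (0, 1)
```

Adding more `(W, b)` pairs to `layers` deepens the network without changing `forward`.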

Back-Propagation

Back-propagation defines the way the neural network learns and auto-adjusts its parameters. It is based on the gradient descent optimization method.

In back-propagation, the error – i.e. the difference between the expected values and the computed values (during the training phase) – is ‘back-propagated’ to the previous layer of neurons so that they adjust their computations.

The re-computation of the parameters follows an optimization algorithm, which is essentially the gradient descent method.

The following illustration shows how a layer of neurons back-propagates the error to the previous layer so that weights and ceilings can be re-adjusted.

Delta Rule

The delta rule is a simplified version of gradient descent and back-propagation. It applies when the neural network has a single layer. The formula computes the “delta” that must be applied to the existing weights to obtain the new ones.

The delta rule specifies that the correction Δwij for the weight wij – defined as the ith weight of the jth neuron – is:

 Δwij = α(tj − yj)g′(hj)xi

  • α is a constant named the learning rate;
  • g(x) is the neuron’s activation function;
  • g’ is the derivative of g;
  • tj is the target (expected) output;
  • hj is the weighted sum of the inputs of the jth neuron;
  • yj is the jth output;
  • xi is the ith input.

Technically, the delta rule is obtained by minimizing the error in the output of the neural network through gradient descent.
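The update above can be sketched for a single sigmoid neuron (a minimal example of our own, assuming NumPy; with the sigmoid, g′(h) = y(1 − y)):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def delta_rule_step(w, x, t, alpha=0.1):
    """One delta-rule update: delta_w = alpha * (t - y) * g'(h) * x."""
    h = np.dot(w, x)          # h: weighted sum of the inputs
    y = sigmoid(h)            # y = g(h), the neuron's output
    g_prime = y * (1.0 - y)   # derivative of the sigmoid at h
    return w + alpha * (t - y) * g_prime * x

w = np.array([0.1, -0.2])     # initial weights
x = np.array([1.0, 0.5])      # inputs
t = 1.0                       # target output
for _ in range(200):
    w = delta_rule_step(w, x, t)
print(sigmoid(np.dot(w, x)))  # the output drifts toward the target t = 1
```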

Gradient Descent

Gradient descent is a general optimization algorithm based on the idea that a function F decreases quite fast when one moves along the direction of its negative gradient, i.e. from a point a in the direction of −∇F(a). In the context of neural networks, that method is used to perform backpropagation so that the error between the observed and expected values gets minimized. The way backpropagation and gradient descent work together will be detailed in a separate article.
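The idea of repeatedly stepping from a in the direction of −∇F(a) can be shown on a simple function, say F(a) = a² with gradient ∇F(a) = 2a (a toy example of our own):

```python
def grad_F(a):
    """Gradient of F(a) = a**2."""
    return 2 * a

a = 5.0    # starting point
lr = 0.1   # step size (learning rate) along the negative gradient
for _ in range(100):
    a = a - lr * grad_F(a)   # move from a in the direction of -grad F(a)
print(a)   # approaches 0, the minimum of F
```

In a neural network, the same update is applied to every weight, with ∇F supplied by backpropagation.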

Backpropagation is an advanced method for training a neural network, and several problems may occur, such as the “vanishing gradients” problem.

Perceptron vs Artificial Neural Networks

The perceptron is historically the first of the neural networks. The term “perceptron” often denotes different concepts: a machine, an algorithm, an artificial neuron equipped with the Heaviside activation function, or a single-layer neural network built from such neurons.

A perceptron network should be considered single-layer, because a multi-layer perceptron is nothing more than a feed-forward neural network.

In practice, the perceptron is not used anymore, but it remains important for historical reasons.

The perceptron provides linear classification, while general neural networks can perform non-linear classification and can therefore classify data that is not linearly separable.

Here we show the difference between classifying XOR values using a perceptron (impossible) and using a non-linear neural network.

As one can see, classifying the XOR data is possible with non-linear “general” neural networks.
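As a concrete sketch (our own toy implementation, assuming NumPy), a small two-layer sigmoid network trained by backpropagation can learn XOR, which no single perceptron can:

```python
import numpy as np

rng = np.random.default_rng(42)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# XOR: not linearly separable, so a single perceptron cannot classify it
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

# 2 inputs -> 4 hidden sigmoid neurons -> 1 sigmoid output
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)

lr = 1.0
for _ in range(5000):
    # forward pass through both layers
    H = sigmoid(X @ W1 + b1)
    Y = sigmoid(H @ W2 + b2)
    # backward pass: propagate the error layer by layer
    dY = (Y - T) * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ dY; b2 -= lr * dY.sum(axis=0)
    W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)

print(np.round(Y.ravel(), 2))  # should approach [0, 1, 1, 0]
```

The hidden layer gives the network the non-linear decision boundary that the XOR data requires.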

In the next articles, we shall detail how neural networks operate, and we shall also give several examples of ANNs using various data and activation functions.

Rithesh Raghavan

Rithesh Raghavan, Co-Founder and Director at Acodez IT Solutions, has over 16 years of experience in IT & Digital Marketing. Whenever his busy schedule allows, he writes up his thoughts on the latest trends and developments in the world of IT and software development.
