Neural Network Components
Neural networks reflect the behavior of the human brain, allowing computer programs to recognize patterns and solve common problems in the fields of AI, machine learning, and deep learning.
Artificial neural networks (ANNs) are composed of node layers: an input layer, one or more hidden layers, and an output layer.
Fully Connected Layers
Every neuron in one layer is connected to every neuron in the next layer. This is achieved by a large matrix multiplication.
Two main steps:
- Linear step: z = Wx + b, where W is the weight matrix and b is the bias vector.
- Activation step: a = g(z), where g is the activation function.
In the logistic regression case, the weight vector w has the same shape as the input x. Therefore, transposing is necessary: z = wᵀx + b.
(Figures: Logistic Regression, Fully Connected Neural Network, Matrix Dimensions)
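A minimal NumPy sketch of the two steps above for a single fully connected layer (the layer sizes, the random weights, and the choice of ReLU as g are illustrative assumptions, not values from the notes):

```python
import numpy as np

def fully_connected_forward(a_prev, W, b, g):
    """One fully connected layer: linear step followed by activation step."""
    z = W @ a_prev + b   # linear step: z = W a_prev + b
    return g(z)          # activation step: a = g(z)

# Illustrative example: 3 inputs, 4 units, ReLU activation.
rng = np.random.default_rng(0)
a_prev = rng.standard_normal((3, 1))   # input column vector, shape (3, 1)
W = rng.standard_normal((4, 3))        # weight matrix, shape (units, inputs)
b = np.zeros((4, 1))                   # bias vector, shape (units, 1)
relu = lambda z: np.maximum(0.0, z)
a = fully_connected_forward(a_prev, W, b, relu)
print(a.shape)                         # (4, 1)
```

Because W is stored as (units, inputs) here, no transpose is needed; the transpose only appears in the logistic regression case, where w is a column vector with the same shape as x.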
Convolutional Layers
The 2D convolution operation:
- Start with a kernel, which is simply a small matrix of weights.
- Slide the kernel over the 2D input data, performing an elementwise multiplication with the part of the input the kernel is currently on.
- Sum up the results into a single output pixel.
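A small NumPy sketch of the slide, multiply, and sum procedure described above (stride 1, no padding; the 5x5 input and the vertical-edge kernel are arbitrary example choices):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2D convolution (strictly, cross-correlation, as in most deep learning libraries)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]   # the part of the input the kernel is on
            out[i, j] = np.sum(patch * kernel)  # elementwise multiply, then sum to one pixel
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])              # a simple vertical edge detector
print(conv2d(image, kernel).shape)              # (3, 3)
```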
Convolutions over volumes/channels and multiple filters: when the input is a volume with several channels (e.g. an RGB image), each filter spans all of the input channels and produces a single-channel output; applying multiple filters and stacking their outputs produces a multi-channel output volume.
(Figure: Convolutions Over Volumes)
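Extending the same idea to volumes, here is a sketch under the usual convention that each filter has the same number of channels as the input and each filter yields one output channel (the input size, filter size, and number of filters below are assumptions for the example):

```python
import numpy as np

def conv2d_volume(volume, filters):
    """volume: (H, W, C_in); filters: (F, kh, kw, C_in) -> output: (H - kh + 1, W - kw + 1, F)."""
    n_f, kh, kw, _ = filters.shape
    out_h = volume.shape[0] - kh + 1
    out_w = volume.shape[1] - kw + 1
    out = np.zeros((out_h, out_w, n_f))
    for f in range(n_f):                                # one output channel per filter
        for i in range(out_h):
            for j in range(out_w):
                patch = volume[i:i + kh, j:j + kw, :]   # spans all input channels
                out[i, j, f] = np.sum(patch * filters[f])
    return out

rgb = np.random.default_rng(0).standard_normal((6, 6, 3))          # e.g. a 6x6 RGB image
filters = np.random.default_rng(1).standard_normal((2, 3, 3, 3))   # two 3x3x3 filters
print(conv2d_volume(rgb, filters).shape)                           # (4, 4, 2)
```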
Padding
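As a sketch of zero padding and its effect on output size (the formula and the "same" padding amount below assume stride 1 and an odd kernel size, which is the common convention; they are not stated in the notes above):

```python
import numpy as np

def zero_pad(image, p):
    """Add p rows/columns of zeros on every side of a 2D input."""
    return np.pad(image, ((p, p), (p, p)), mode="constant")

# Valid convolution of an n x n input, padded by p, with an f x f kernel gives
# an output of size (n + 2p - f + 1) per side.
n, f = 5, 3
p = (f - 1) // 2                            # "same" padding for stride 1
print(zero_pad(np.ones((n, n)), p).shape)   # (7, 7)
print(n + 2 * p - f + 1)                    # 5 -> the output keeps the input size
```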
Pooling Layers
Pooling operations are used to pool features together, often downsampling the feature map to a smaller size (i.e. reducing its dimensionality). They can also induce favorable properties such as translation invariance in image classification, and they can bring together information from different parts of a network in tasks like object detection (e.g. pooling features at different scales).
Hyperparameters: the filter size f, the stride s, and the choice of max or average pooling.
Pooling layers have no parameters to learn.
(Figures: Max Pooling, Average Pooling)
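A minimal sketch of both pooling operations on a single 2D feature map (the 2x2 window and stride of 2 are assumed example values):

```python
import numpy as np

def pool2d(feature_map, f=2, stride=2, mode="max"):
    """Downsample a 2D feature map with an f x f window; nothing here is learned."""
    out_h = (feature_map.shape[0] - f) // stride + 1
    out_w = (feature_map.shape[1] - f) // stride + 1
    out = np.zeros((out_h, out_w))
    reduce = np.max if mode == "max" else np.mean
    for i in range(out_h):
        for j in range(out_w):
            window = feature_map[i * stride:i * stride + f, j * stride:j * stride + f]
            out[i, j] = reduce(window)
    return out

fm = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(fm, mode="max"))   # each output pixel is the max of its 2x2 window
print(pool2d(fm, mode="avg"))   # each output pixel is the mean of its 2x2 window
```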
Activation Functions
An activation function introduces non-linearities, which helps the network learn complex patterns.
ReLU
Rectified Linear Unit, or ReLU, is a type of activation function that is linear in the positive dimension, but zero in the negative dimension. The kink in the function is the source of the non-linearity.
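A one-line NumPy sketch of the definition above:

```python
import numpy as np

def relu(z):
    """ReLU: identity for positive inputs, zero for negative inputs."""
    return np.maximum(0.0, z)

print(relu(np.array([-2.0, -0.5, 0.0, 0.5, 2.0])))   # [0.  0.  0.  0.5 2. ]
```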
Leaky ReLU
Leaky Rectified Linear Unit, or Leaky ReLU, is a type of activation function based on ReLU, but it has a small slope for negative values instead of a flat slope. The slope coefficient is determined before training, i.e. it is not learnt during training. This type of activation function is popular in tasks where we may suffer from sparse gradients, for example training generative adversarial networks.
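A sketch with the slope coefficient fixed before training (the default of 0.01 is a common choice, assumed here rather than taken from the notes):

```python
import numpy as np

def leaky_relu(z, alpha=0.01):
    """Leaky ReLU: a small fixed slope alpha for negative inputs instead of zero."""
    return np.where(z > 0, z, alpha * z)

print(leaky_relu(np.array([-2.0, 0.0, 2.0])))   # [-0.02  0.    2.  ]
```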
Sigmoid Activation
The sigmoid function σ(z) = 1 / (1 + e^(-z)) squashes its input into the output range (0, 1).
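A sketch of the function itself:

```python
import numpy as np

def sigmoid(z):
    """Sigmoid: maps any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(np.array([-10.0, 0.0, 10.0])))   # approximately [0.   0.5  1.]
```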
Tanh Activation
The tanh function squashes its input into the range (-1, 1) and is zero-centered. Historically, it became preferred over the sigmoid function as it gave better performance for multi-layer neural networks, but it did not solve the vanishing gradient problem that sigmoid suffered from; that problem was tackled more effectively with the introduction of ReLU activations.
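A quick comparison of the two output ranges, using NumPy's built-in tanh:

```python
import numpy as np

z = np.array([-10.0, 0.0, 10.0])
print(np.tanh(z))                 # approximately [-1.  0.  1.]  (zero-centered)
print(1.0 / (1.0 + np.exp(-z)))   # sigmoid of the same inputs: approximately [0.  0.5  1.]
```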
SoftMax Activation
Softmax converts a vector of raw scores (logits) into a probability distribution: each output lies in (0, 1) and the outputs sum to 1, which makes it the usual choice for the output layer in multi-class classification.
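A sketch of the standard softmax computation (the max-subtraction for numerical stability is a common implementation detail, not something stated in the notes):

```python
import numpy as np

def softmax(z):
    """Softmax: exponentiate, then normalize so the outputs sum to 1."""
    e = np.exp(z - np.max(z))   # subtracting the max avoids overflow without changing the result
    return e / np.sum(e)

probs = softmax(np.array([2.0, 1.0, 0.1]))
print(probs, probs.sum())       # a probability distribution over the three classes; sums to 1
```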
References
- Neural Networks - IBM
- Neural Networks and Deep Learning (Course 1 of the Deep Learning Specialization) - Andrew Ng
- Convolutional Neural Networks (Course 4 of the Deep Learning Specialization) - Andrew Ng
- Intuitively Understanding Convolutions for Deep Learning - towardsdatascience.com
- Pooling Operations - paperswithcode.com
- Activation Functions - paperswithcode.com