
Neural Network Components

Neural networks reflect the behavior of the human brain, allowing computer programs to recognize patterns and solve common problems in the fields of AI, machine learning, and deep learning.

Artificial neural networks (ANNs) are composed of node layers: an input layer, one or more hidden layers, and an output layer.

*Figure: a deep neural network.*

Fully Connected Layers

Every neuron in one layer is connected to every neuron in the next layer. This is achieved by a large matrix multiplication.

Two main steps:

  • Linear step: $z = w^T x + b$
  • Activation step: $a = \sigma(z)$


info

$w$ has the same shape as $x$. Therefore, transposing $w$ is necessary so that $w^T x$ is a valid dot product.
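A minimal NumPy sketch of the two steps for a single neuron, using sigmoid as $\sigma$ (the shapes are illustrative assumptions):

```python
import numpy as np

x = np.random.rand(4, 1)   # input: a column vector
w = np.random.rand(4, 1)   # weights: same shape as x, hence the transpose below
b = 0.5

z = w.T @ x + b            # linear step: (1, 4) @ (4, 1) -> (1, 1)
a = 1 / (1 + np.exp(-z))   # activation step: sigmoid
print(a.shape)             # (1, 1)
```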

*Figures: matrix dimensions for logistic regression and for a fully connected neural network.*
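A sketch of the usual dimension bookkeeping for one fully connected layer (the layer sizes are assumptions for illustration): the weight matrix has shape (units in this layer, units in the previous layer), so no per-example transpose is needed.

```python
import numpy as np

n_prev, n_l, m = 3, 5, 10            # previous layer size, this layer size, batch size

W = np.random.rand(n_l, n_prev)      # weights: (n_l, n_prev)
b = np.random.rand(n_l, 1)           # one bias per unit, broadcast over the batch
A_prev = np.random.rand(n_prev, m)   # activations from the previous layer

Z = W @ A_prev + b                   # (n_l, n_prev) @ (n_prev, m) -> (n_l, m)
print(Z.shape)                       # (5, 10)
```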

Convolutional Layers

The 2D convolution operation:

  1. Start with a kernel, which is simply a small matrix of weights.
  2. Slide the kernel over the 2D input, performing an elementwise multiplication with the part of the input it is currently on.
  3. Sum up the results into a single output pixel (see the sketch below).

*Figure: standard convolution.*
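A minimal NumPy sketch of these three steps for a single-channel input (the 5 × 5 image and 3 × 3 averaging kernel are illustrative assumptions; like most deep learning frameworks, it actually computes cross-correlation, i.e. the kernel is not flipped):

```python
import numpy as np

def conv2d(image, kernel):
    # "Valid" convolution: the kernel always stays fully inside the image.
    H, W = image.shape
    f, _ = kernel.shape
    out = np.zeros((H - f + 1, W - f + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # elementwise multiplication with the current window, summed
            out[i, j] = np.sum(image[i:i + f, j:j + f] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0            # a 3x3 averaging kernel
print(conv2d(image, kernel).shape)        # (3, 3)
```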

info

Convolutions over volumes/channels and multiple filters:

*Figure: convolutions over volumes.*
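A shape-level sketch of the same idea over volumes (the sizes are illustrative assumptions): each filter spans all C input channels and produces one output channel, so K filters produce K output channels.

```python
import numpy as np

H, W, C = 32, 32, 3                      # input height, width, channels
f, K = 3, 10                             # filter size, number of filters

x = np.random.rand(H, W, C)
filters = np.random.rand(K, f, f, C)     # each filter spans all C channels
biases = np.random.rand(K)

out = np.zeros((H - f + 1, W - f + 1, K))  # "valid" convolution, stride 1
for k in range(K):
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # multiply over the full f x f x C volume, sum to one number
            out[i, j, k] = np.sum(x[i:i + f, j:j + f, :] * filters[k]) + biases[k]

print(out.shape)                         # (30, 30, 10): one channel per filter
```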

note

Hyperparameters to tune in a convolutional layer:

  • Stride: apply filters at every pixel or skip some?
  • Padding.
  • Number of filters.

Parameters to learn in a convolutional layer:

  • Each filter could be X × Y × C in size.
    • C stands for the number of input channels.
  • Each filter comes with one bias value.
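As a quick check of the count (the sizes here are illustrative assumptions): 10 filters of size 3 × 3 × 3 give 10 × (3 · 3 · 3 + 1) = 280 learnable parameters, which PyTorch confirms:

```python
import torch.nn as nn

# Illustrative layer: 10 filters of size 3x3 over a 3-channel input.
conv = nn.Conv2d(in_channels=3, out_channels=10, kernel_size=3)
print(sum(p.numel() for p in conv.parameters()))  # 280 = 10 * (3*3*3 + 1)
```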

Padding

*Figure: same padding.*
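A standard relation ties padding to the other hyperparameters: for an $n \times n$ input, filter size $f$, padding $p$, and stride $s$, each output side has size

$\left\lfloor \frac{n + 2p - f}{s} \right\rfloor + 1$

"Same" padding picks $p$ so the output matches the input size (for stride 1, $p = (f - 1)/2$); "valid" padding uses $p = 0$.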

Pooling Layers

Pooling operations are used to pool features together, often downsampling the feature map to a smaller size (i.e., reducing dimensionality). They can also induce favorable properties such as translation invariance in image classification, and bring together information from different parts of a network in tasks like object detection (e.g., pooling at different scales).

paperswithcode.com

Hyperparameters:

  • $f$: filter size.
  • $s$: stride.
  • type: max or average pooling.
note

No parameters to learn.


Max Pooling

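A minimal NumPy sketch with the common $f = 2$, $s = 2$ setting (the input values are illustrative assumptions):

```python
import numpy as np

def max_pool2d(x, f=2, s=2):
    # Keep only the largest value in each f x f window, moving s pixels at a time.
    out_h = (x.shape[0] - f) // s + 1
    out_w = (x.shape[1] - f) // s + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = x[i * s:i * s + f, j * s:j * s + f].max()
    return out

x = np.array([[1., 3., 2., 4.],
              [5., 6., 7., 8.],
              [3., 2., 1., 0.],
              [1., 2., 3., 4.]])
print(max_pool2d(x))  # [[6. 8.]
                      #  [3. 4.]]
```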

Average Pooling

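Average pooling follows the same sketch as max pooling above, with `.max()` over each window replaced by `.mean()`, so each output pixel is the mean of its window.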

Activation Functions

An activation function introduces non-linearities, which helps the network learn complex patterns.

ReLU

Rectified Linear Unit, or ReLU, is a type of activation function that is linear in the positive dimension, but zero in the negative dimension. The kink in the function is the source of the non-linearity.

paperswithcode.com

$f(x) = \max(0, x)$


Leaky ReLU

Leaky Rectified Linear Unit, or Leaky ReLU, is a type of activation function based on ReLU, but with a small slope for negative values instead of a flat slope. The slope coefficient is determined before training, i.e. it is not learnt during training. This type of activation function is popular in tasks where we may suffer from sparse gradients, for example training generative adversarial networks.

paperswithcode.com
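For reference, its standard definition, where $\alpha$ is the small fixed slope (commonly $0.01$):

$f(x) = \max(\alpha x, x)$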


Sigmoid Activation

The output range: $(0, 1)$.

$\sigma(z) = \frac{1}{1 + e^{-z}}$


Tanh Activation

Historically, the tanh function became preferred over the sigmoid function as it gave better performance for multi-layer neural networks. But it did not solve the vanishing gradient problem that sigmoid suffered from, which was tackled more effectively with the introduction of ReLU activations.

paperswithcode.com
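For reference, its standard definition; the output range is $(-1, 1)$:

$\tanh(z) = \frac{e^{z} - e^{-z}}{e^{z} + e^{-z}}$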


SoftMax Activation

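SoftMax maps a vector of real-valued scores to a probability distribution, so it is typically used in the output layer for multi-class classification:

$\text{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$

A minimal NumPy sketch (subtracting the max is a common numerical-stability trick):

```python
import numpy as np

def softmax(z):
    # Shift by the max so np.exp cannot overflow; the result is unchanged.
    e = np.exp(z - z.max())
    return e / e.sum()

print(softmax(np.array([1.0, 2.0, 3.0])))  # [0.09 0.24 0.67], sums to 1
```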

