AutoEncoders


Autoencoders are neural networks that learn a compressed representation of their input. In other words, once trained on adequate training data, an autoencoder can produce a compressed copy of an input data point that retains most of the input's information (features) while using far fewer bits.

An autoencoder is a neural network with three components: an encoding function (which maps the data into a simpler space, such as a hidden layer with fewer nodes than the input layer), a decoding function (which reverses this process, such as an output layer with the same number of nodes as the input layer), and a distance metric (which measures the loss as a distance between the original input and its reconstruction). Restricted Boltzmann Machines are an early precursor of autoencoders, and they are significant enough in the history of deep learning to deserve their own name.

Autoencoders are the neural-network counterpart of simpler dimensionality-reduction techniques such as PCA and LDA. In deep learning, autoencoders were commonly used for pretraining, that is, establishing reasonable initial weights for a neural network so that the training algorithm does not have to work as hard to converge as it would starting from fully random weights. In practice, the development of faster-converging optimization algorithms has largely eliminated the need for pretraining.

As a result, autoencoders are rarely used in cutting-edge deep-learning research. In commercial applications they also do not outperform simpler (typically information-theoretic) compression techniques such as JPEG and MP3. Today they are mostly used as a preprocessing step on high-dimensional data before it is fed into algorithms such as t-SNE.

Architecture of Autoencoders

A typical autoencoder architecture has three key components:

  • Encoding Architecture: The encoder consists of a sequence of layers with a decreasing number of nodes, ending in the latent view representation.
  • Latent View Representation: The latent view is the lowest-dimensional space in which the input is compressed while its information is retained.
  • Decoding Architecture: The decoder is the mirror image of the encoder, with the number of nodes in each layer increasing, and it produces an output that is (nearly) identical to the input.

A well-tuned autoencoder model should be able to reconstruct the same input that was passed to the first layer.
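As a concrete illustration of these three components, here is a minimal sketch in Keras; the layer sizes and latent dimension are illustrative assumptions, not a prescribed configuration:

import tensorflow as tf
from tensorflow.keras import layers, models

latent_dim = 32  # size of the latent view (assumed)

# Encoding architecture: progressively fewer nodes down to the latent view.
encoder = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(latent_dim, activation="relu"),
])

# Decoding architecture: mirror image of the encoder, back to the input size.
decoder = models.Sequential([
    layers.Input(shape=(latent_dim,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
])

# The loss measures the distance between the input and its reconstruction.
autoencoder = models.Sequential([encoder, decoder])
autoencoder.compile(optimizer="adam", loss="mse")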

Working of Autoencoders

Let us explore the mathematics behind autoencoders. The primary idea underlying autoencoders is to learn a low-dimensional representation of high-dimensional data. Let us try to grasp the encoding process with an example. Consider a data representation space (an N-dimensional space used to represent the data), with data points described by two variables, x1 and x2. The data manifold is the region within the data representation space where the real data lives.

[Image: data points lying on a manifold within the 2-D data representation space]

To represent this data, we currently use two dimensions, x1 and x2. However, the dimensionality of this space can be reduced, for example to 1-D, if we can define the following:

  • A reference point A on the line
  • The angle L that the line makes with the horizontal axis

Then every other point on line A, say B, can be represented by its distance d from A together with the angle L.
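Geometrically, one such description is simply the parametric form of the line, B = A + d · (cos L, sin L). This explicit formula is added here only as an illustration of what the pair (d, L) encodes; as discussed next, an autoencoder does not use a fixed formula but learns the mapping from data.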

[Image: representing a point B on line A by its distance d from the reference point A and the angle L with the horizontal axis]

The crucial question here is what logic or rule can be used to express point B in terms of A, d, and the angle L; in other words, what is the equation relating them? The answer is that there is no fixed, hand-crafted equation: the unsupervised learning process produces the best possible mapping. Put simply, the learned encoding is the rule that converts B into its compact description in terms of A, d, and L. Let's look at this from an autoencoder's standpoint.

Consider an autoencoder without hidden layers: the inputs x1 and x2 are encoded into a lower-dimensional representation d, which is then projected back into x1 and x2.

Now we will explore sparse autoencoders, stacked autoencoders, and variational autoencoders. We will also visualize the autoencoders' latent encodings by mapping them to two dimensions with t-SNE, which will help us identify distinct clusters within the data.

Code:

Importing Libraries
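The original import statements are not reproduced here. A minimal set for the examples that follow, assuming TensorFlow/Keras, NumPy, Matplotlib, and scikit-learn, might look like this:

# Hypothetical imports; the exact libraries used in the original code are not shown.
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers
from sklearn.manifold import TSNE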

Reading the Dataset
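The original loading code is not shown. Assuming the MNIST digits dataset (consistent with the 0-9 digits discussed later), reading and normalizing it could look like this:

# Load the MNIST digits and scale pixel values to [0, 1] (assumed dataset).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Flattened views (28 x 28 = 784 pixels) for the dense autoencoders below.
x_train_flat = x_train.reshape(-1, 784)
x_test_flat = x_test.reshape(-1, 784)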


Output:

[image]

Stacked AutoEncoder

A stacked autoencoder is a neural network made up of several autoencoder layers stacked on top of one another. Like all autoencoders, it is an unsupervised model used for feature learning and dimensionality reduction.
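The original model definition is not shown; a sketch of a stacked autoencoder on flattened 28 x 28 inputs, with illustrative layer sizes, could be:

from tensorflow.keras import layers, models

# Stacked encoder: several Dense layers narrowing down to a 32-dimensional code.
stacked_encoder = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(256, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(32, activation="relu"),
])

# Stacked decoder: the mirror image, widening back to 784 output pixels.
stacked_decoder = models.Sequential([
    layers.Input(shape=(32,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
])

stacked_ae = models.Sequential([stacked_encoder, stacked_decoder])
stacked_ae.compile(optimizer="adam", loss="mse")
stacked_ae.fit(x_train_flat, x_train_flat, epochs=10, batch_size=256,
               validation_data=(x_test_flat, x_test_flat))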

Output:

[images]

Next, we will reconstruct images from the test data.
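A sketch of reconstructing a few test digits with the trained model and plotting them next to the originals (the plotting details are assumptions):

import matplotlib.pyplot as plt

# Reconstruct the test images with the trained stacked autoencoder.
reconstructions = stacked_ae.predict(x_test_flat)

# Original digits on the top row, reconstructions on the bottom row.
n = 8
fig, axes = plt.subplots(2, n, figsize=(2 * n, 4))
for i in range(n):
    axes[0, i].imshow(x_test_flat[i].reshape(28, 28), cmap="gray")
    axes[1, i].imshow(reconstructions[i].reshape(28, 28), cmap="gray")
    axes[0, i].axis("off")
    axes[1, i].axis("off")
plt.show()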

Output:

[image]

Now, we will map the latent representations to 2 dimensions.
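A sketch of the 2-D mapping using scikit-learn's TSNE (a subsample is used because t-SNE is slow on the full test set):

from sklearn.manifold import TSNE

# Encode a subsample of the test set into the latent space.
codes = stacked_encoder.predict(x_test_flat[:5000])

# Project the latent codes to 2 dimensions and colour the points by digit label.
codes_2d = TSNE(n_components=2).fit_transform(codes)
plt.figure(figsize=(8, 6))
plt.scatter(codes_2d[:, 0], codes_2d[:, 1], c=y_test[:5000], cmap="tab10", s=5)
plt.colorbar()
plt.show()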

Output:

[image]

The reconstructions look good, but we could add convolutions to improve their quality. First, however, we will add regularization by building sparse autoencoders; this regularization is applied to the coding layer.

Sparse AutoEncoder

It is a form of neural network that works like a regular autoencoder but with extra constraints that encourage sparsity in the learned representations. Sparse autoencoders are used for feature learning and dimensionality reduction, with the goal of learning a compact, sparse representation of the input data.

In a sparse autoencoder, we constrain the activations of the middle (coding) layer to be sparse by adding an L1 penalty to them. As a result, many of the middle-layer activations will be zero, and the autoencoder is forced to assign non-zero values only to the most important attributes of the data.
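A sketch of this idea in Keras, adding an L1 activity penalty to the coding layer (the penalty strength and layer sizes are assumptions):

from tensorflow.keras import layers, models, regularizers

# Sparse encoder: the L1 activity regularizer pushes most latent activations to zero.
sparse_encoder = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(32, activation="relu",
                 activity_regularizer=regularizers.l1(1e-4)),
])

sparse_decoder = models.Sequential([
    layers.Input(shape=(32,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
])

sparse_ae = models.Sequential([sparse_encoder, sparse_decoder])
sparse_ae.compile(optimizer="adam", loss="mse")
sparse_ae.fit(x_train_flat, x_train_flat, epochs=10, batch_size=256)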

Output:

[images]

Output:

[image]

Now we will view the latent representation learned by the sparse autoencoder by applying t-SNE to it.


Output:

[image]

Using CNNs to Build an AutoEncoder
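The original convolutional architecture is not shown; a minimal sketch of a convolutional autoencoder on 28 x 28 x 1 images (filter counts and depths are assumptions) could be:

from tensorflow.keras import layers, models

# Convolutional encoder: two conv + downsampling stages.
conv_encoder = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(2),
    layers.Conv2D(32, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(2),              # 7 x 7 x 32 latent feature map
])

# Convolutional decoder: transposed convolutions upsample back to 28 x 28.
conv_decoder = models.Sequential([
    layers.Input(shape=(7, 7, 32)),
    layers.Conv2DTranspose(32, 3, strides=2, activation="relu", padding="same"),
    layers.Conv2DTranspose(16, 3, strides=2, activation="relu", padding="same"),
    layers.Conv2D(1, 3, activation="sigmoid", padding="same"),
])

conv_ae = models.Sequential([conv_encoder, conv_decoder])
conv_ae.compile(optimizer="adam", loss="mse")
conv_ae.fit(x_train[..., None], x_train[..., None], epochs=10, batch_size=256)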


Output:

[image]

Output:

[image]

Now we will visualize a lower-dimensional representation of the data by applying t-SNE to the encoder output.


Output:

[image]

The clusters appear quite distinct. The 4s lie fairly close to the 9s, which makes sense given that their tops are similar. Similarly, 9 and 7 appear close, which makes sense given their similar structure. Therefore, our hidden representation makes sense.

Another Convolutional Autoencoder (With Different Filters)

Output:

[image]

Output:

[image]

Sparse Autoencoder - Using Convolutions
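The original code is not shown; a sketch that combines the convolutional encoder with the L1 activity penalty used earlier (the architecture and penalty strength are assumptions):

from tensorflow.keras import layers, models, regularizers

# Convolutional encoder with an L1 activity penalty on the deeper conv features.
sparse_conv_encoder = models.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(16, 3, activation="relu", padding="same"),
    layers.MaxPooling2D(2),
    layers.Conv2D(32, 3, activation="relu", padding="same",
                  activity_regularizer=regularizers.l1(1e-5)),
    layers.MaxPooling2D(2),
])

# Decoder mirrors the convolutional decoder shown earlier.
sparse_conv_decoder = models.Sequential([
    layers.Input(shape=(7, 7, 32)),
    layers.Conv2DTranspose(32, 3, strides=2, activation="relu", padding="same"),
    layers.Conv2DTranspose(16, 3, strides=2, activation="relu", padding="same"),
    layers.Conv2D(1, 3, activation="sigmoid", padding="same"),
])

sparse_conv_ae = models.Sequential([sparse_conv_encoder, sparse_conv_decoder])
sparse_conv_ae.compile(optimizer="adam", loss="mse")
sparse_conv_ae.fit(x_train[..., None], x_train[..., None], epochs=10, batch_size=256)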

Output:

[image]

Output:

[image]

Again, the 4s sit fairly close to the 9s, which makes sense given that their tops are similar, and 9 and 7 appear close given their similar structure. So this hidden representation also makes sense.

Denoising AutoEncoder

The goal is to contaminate the training data with noise and then use an autoencoder to restore it. In this method, the autoencoder learns to de-noise the data, allowing it to understand the data's important properties.

Denoising Using Random Noises
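The original code is not shown; a sketch of corrupting the images with Gaussian noise and training a fresh autoencoder to recover the clean versions (the noise level and architecture are assumptions):

import numpy as np
from tensorflow.keras import layers, models

# Corrupt the images with additive Gaussian noise and clip back to [0, 1].
noise_factor = 0.3
x_train_noisy = np.clip(
    x_train_flat + noise_factor * np.random.normal(size=x_train_flat.shape), 0.0, 1.0)
x_test_noisy = np.clip(
    x_test_flat + noise_factor * np.random.normal(size=x_test_flat.shape), 0.0, 1.0)

# A fresh autoencoder trained to map noisy inputs back to the clean originals.
denoise_ae = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dense(128, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
])
denoise_ae.compile(optimizer="adam", loss="mse")
denoise_ae.fit(x_train_noisy, x_train_flat, epochs=10, batch_size=256,
               validation_data=(x_test_noisy, x_test_flat))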


Output:

[image]

Output:

[image]

Output:

[image]

We observe the overlap between 9, 7, and 4, which we described previously. We also see a minor overlap between 2 and 7, which makes sense. Denoising enabled the neural network to correctly encode just the most critical aspects of the inputs. The unimportant characteristics could not be relied on since the data was noisy.

Denoising autoencoders not only improve dimensionality reduction by focusing on the most relevant aspects of the data, but they can also be used for data denoising, since they are trained to recover the original data from noisy inputs.

Denoising Using Dropout

Here, we apply dropout to the input pixels and let the network rebuild them. This is another type of denoising autoencoder. We expect that at the conclusion of this process, the network will have learned how to rebuild the missing pixels.
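A sketch of this variant, using a Dropout layer on the input pixels (the dropout rate and layer sizes are assumptions):

from tensorflow.keras import layers, models

# Dropout on the input pixels acts as the corruption process; the network is
# trained to reconstruct the full, uncorrupted image.
dropout_ae = models.Sequential([
    layers.Input(shape=(784,)),
    layers.Dropout(0.3),
    layers.Dense(128, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(784, activation="sigmoid"),
])
dropout_ae.compile(optimizer="adam", loss="mse")
dropout_ae.fit(x_train_flat, x_train_flat, epochs=10, batch_size=256)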

Output:

[image]

Output:

[image]

The denoising autoencoder improved the separation between clusters. Overlap remains only where it is reasonable, such as between 4 and 9, and between 3 and 8.

Generative Modelling

The original autoencoder we built does not perform well as a generative model: if we sample random vectors from the latent space and decode them, we will most likely get images that do not look like any digit from 0-9. Regular autoencoders are useful for anomaly detection but do not work well as generative models.

To make a good generative model, we need a Variational Autoencoder.

Output:

[images]

Let us store the statistical properties of the latent codes of the training data and use them to sample new latent vectors.
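The original code is not shown; a sketch that estimates the per-dimension mean and standard deviation of the training latent codes, samples random vectors from those statistics, and decodes them with the stacked decoder defined earlier:

import numpy as np

# Per-dimension mean and standard deviation of the training latent codes.
train_codes = stacked_encoder.predict(x_train_flat)
codes_mean = train_codes.mean(axis=0)
codes_std = train_codes.std(axis=0)

# Sample random latent vectors from these statistics and decode them.
random_codes = np.random.normal(codes_mean, codes_std,
                                size=(8, train_codes.shape[1]))
fake_images = stacked_decoder.predict(random_codes)

fig, axes = plt.subplots(1, 8, figsize=(16, 2))
for i, ax in enumerate(axes):
    ax.imshow(fake_images[i].reshape(28, 28), cmap="gray")
    ax.axis("off")
plt.show()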


Output:

[image]

As we can see, the autoencoder did not perform well as a generative model: the generated images do not look natural. However, it did capture the broad strokes.

Variational AutoEncoders (VAE)

Variational autoencoders are better suited for generative modeling, and we can use them to create new data. As we shall see, data generated by a VAE looks more realistic.
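The original VAE code is not shown. A compact sketch of a variational autoencoder in Keras follows; the architecture, latent size, and loss weighting are assumptions. The encoder outputs a mean and log-variance, a sampling layer draws z with the reparameterization trick and adds the KL-divergence term to the loss, and the model is trained with a pixel-wise reconstruction loss:

import tensorflow as tf
from tensorflow.keras import layers, models

latent_dim = 2  # small latent space, chosen for easy visualization (assumed)

class Sampling(layers.Layer):
    """Draws z = mean + exp(0.5 * log_var) * eps and adds the KL loss."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        # KL divergence between q(z | x) and the standard normal prior.
        kl = -0.5 * tf.reduce_mean(
            tf.reduce_sum(1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var),
                          axis=1))
        self.add_loss(kl)
        eps = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * eps

# Encoder: produces the mean and log-variance of q(z | x) and a sampled z.
enc_in = layers.Input(shape=(784,))
h = layers.Dense(256, activation="relu")(enc_in)
z_mean = layers.Dense(latent_dim)(h)
z_log_var = layers.Dense(latent_dim)(h)
z = Sampling()([z_mean, z_log_var])

# Decoder: maps a latent vector back to a 784-pixel image.
dec_in = layers.Input(shape=(latent_dim,))
h_dec = layers.Dense(256, activation="relu")(dec_in)
dec_out = layers.Dense(784, activation="sigmoid")(h_dec)
vae_decoder = models.Model(dec_in, dec_out)

# Full VAE: reconstruction loss plus the KL term added by the Sampling layer.
vae = models.Model(enc_in, vae_decoder(z))
vae.compile(optimizer="adam", loss="binary_crossentropy")
vae.fit(x_train_flat, x_train_flat, epochs=10, batch_size=256)

# Generate new digits by decoding samples from the standard normal prior.
samples = vae_decoder.predict(tf.random.normal(shape=(8, latent_dim)))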



Output:

[images]

Output:

[image]

As we can see, these images generated by the Variational Autoencoder look better than the ones generated through a regular autoencoder.





