W-GAN

Deep learning has undergone a revolution thanks to Generative Adversarial Networks (GANs), which make it possible to produce realistic synthetic data. While conventional GANs have been remarkably successful, they can suffer from unstable training and generate low-quality samples. Wasserstein Generative Adversarial Networks (WGANs) were proposed as a solution to these problems. Compared to conventional GANs, WGANs offer a number of benefits, including higher sample quality, better training dynamics, and greater stability.

The basis of Wasserstein Generative Adversarial Networks is the Wasserstein distance between probability distributions, often referred to as the Earth Mover's distance. Whereas conventional GANs quantify the difference between distributions using the Jensen-Shannon or Kullback-Leibler divergences, WGANs minimize the Wasserstein distance. In addition to producing more stable training dynamics, the Wasserstein distance offers a more meaningful way to quantify distribution dissimilarity.
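By the Kantorovich-Rubinstein duality, the Wasserstein-1 distance between the real distribution P_r and the generated distribution P_g can be written as

W(P_r, P_g) = \sup_{\|f\|_L \le 1} \mathbb{E}_{x \sim P_r}[f(x)] - \mathbb{E}_{x \sim P_g}[f(x)]

where the supremum ranges over all 1-Lipschitz functions f; the WGAN critic is trained to approximate this f.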

Components of W-GAN

  • Generator and Discriminator: Like conventional GANs, WGANs are made up of a generator and a discriminator. The generator creates synthetic data samples, while the discriminator assesses how realistic the generated samples are relative to actual data.
  • Wasserstein Distance: The primary novelty of WGANs is using the Wasserstein distance as the objective function rather than more conventional divergence metrics. The Wasserstein distance, which measures the amount of "work" needed to transform one distribution into another, is a more accurate indicator of distribution dissimilarity.
  • Gradient Penalty: WGANs incorporate a gradient penalty term to make the discriminator satisfy the Lipschitz constraint. This penalty improves training stability and promotes smoothness in the discriminator's output (see the formula after this list).
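Concretely, the gradient penalty adds the following term to the critic loss, where \hat{x} is sampled along straight lines between real and generated samples, D is the critic, and \lambda is a penalty weight (commonly set to 10):

\lambda \, \mathbb{E}_{\hat{x}}\big[(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1)^2\big]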

Now, for the implementation part, we will implement a W-GAN with gradient penalty for MNIST data augmentation.

Code:

Importing Libraries
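The snippets below are a minimal sketch of such an implementation in TensorFlow/Keras (the framework choice and all names are assumptions). First, the imports:

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras import layers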

Utilities

Now we will provide the utilities that are required.
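As one example of such a utility, here is a small helper that plots a grid of generated digits (the function name and grid size are assumptions):

def plot_images(images, n_cols=8):
    """Plot a grid of images, rescaling from the generator's [-1, 1] range to [0, 1]."""
    images = (np.asarray(images) + 1.0) / 2.0
    n_rows = int(np.ceil(len(images) / n_cols))
    plt.figure(figsize=(n_cols, n_rows))
    for i, image in enumerate(images):
        plt.subplot(n_rows, n_cols, i + 1)
        plt.imshow(image.squeeze(), cmap="gray")
        plt.axis("off")
    plt.show()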

Now we will create a function that will help in the preparation of data for model training.
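A sketch of that preparation function, assuming MNIST pixels are rescaled to [-1, 1] to match the generator's tanh output:

def prepare_mnist_dataset(batch_size=128):
    """Load MNIST, scale pixels to [-1, 1], and return a shuffled, batched dataset."""
    (x_train, _), _ = tf.keras.datasets.mnist.load_data()
    x_train = x_train.astype("float32")
    x_train = (x_train - 127.5) / 127.5           # [0, 255] -> [-1, 1]
    x_train = np.expand_dims(x_train, axis=-1)    # shape (60000, 28, 28, 1)
    dataset = tf.data.Dataset.from_tensor_slices(x_train)
    return dataset.shuffle(60000).batch(batch_size, drop_remainder=True)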

Building the Model

By using the Wasserstein distance, the original Wasserstein GAN obtains a value function with better theoretical properties than the one used in the initial GAN publication. For WGAN to work, the discriminator (also known as the critic) must lie within the space of 1-Lipschitz functions. The authors suggested weight clipping to enforce this constraint. While weight clipping works, it is a troublesome way to impose the 1-Lipschitz constraint and can lead to undesirable behavior; for example, a very deep WGAN critic frequently fails to converge.

The WGAN-GP approach suggests an alternative to weight clipping to guarantee smooth training. Instead of clipping the weights, the authors proposed a "gradient penalty": an additional loss term that keeps the L2 norm of the discriminator's gradients close to 1.
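A sketch of that penalty term in TensorFlow, interpolating between real and generated images as in the WGAN-GP paper (variable names are assumptions):

def gradient_penalty(critic, real_images, fake_images):
    """Penalty pushing the critic's gradient norm toward 1 at interpolated points."""
    batch_size = tf.shape(real_images)[0]
    # Sample random points on straight lines between real and fake images.
    alpha = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    interpolated = alpha * real_images + (1.0 - alpha) * fake_images
    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        scores = critic(interpolated, training=True)
    grads = tape.gradient(scores, interpolated)
    # Per-image L2 norm of the gradients.
    norms = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]))
    return tf.reduce_mean(tf.square(norms - 1.0))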

Generator

We first feed random noise into the generator and then shape it into the MNIST image format. The general steps are as follows:

  • Feed the input noise into a dense layer.
  • Reshape the result into three dimensions, representing (width, height, number of filters).
  • Use Conv2DTranspose to perform a deconvolution with a stride of 2 and half the number of filters.
  • The final layer upsamples the features to the training image size, in this case 28 x 28 x 1.

It's important to note that batch normalization is applied to all layers except the final deconvolution layer. Using selu as the activation for the intermediate deconvolutions and tanh for the output is good practice.
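Putting those steps together, a minimal generator might look like this (the noise dimension and filter counts are assumptions):

def build_generator(noise_dim=128):
    return tf.keras.Sequential([
        # Project the noise and reshape it into a 7 x 7 x 256 volume.
        layers.Dense(7 * 7 * 256, input_shape=(noise_dim,)),
        layers.BatchNormalization(),
        layers.Activation("selu"),
        layers.Reshape((7, 7, 256)),
        # Deconvolution with stride 2 and half the filters: 7x7 -> 14x14.
        layers.Conv2DTranspose(128, kernel_size=5, strides=2, padding="same"),
        layers.BatchNormalization(),
        layers.Activation("selu"),
        # Final deconvolution to the image size, 28 x 28 x 1; tanh, no batch norm.
        layers.Conv2DTranspose(1, kernel_size=5, strides=2, padding="same",
                               activation="tanh"),
    ])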

Discriminator

The discriminator reduces the dimensionality of the input images using strided convolutions, activated with LeakyReLU as a best practice. The output features are then flattened and passed to a 1-unit dense layer with no activation, so the critic produces an unbounded score rather than a probability.
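A matching critic sketch (filter counts and kernel sizes are assumptions):

def build_critic():
    return tf.keras.Sequential([
        # Strided convolutions downsample the image: 28x28 -> 14x14 -> 7x7.
        layers.Conv2D(64, kernel_size=5, strides=2, padding="same",
                      input_shape=(28, 28, 1)),
        layers.LeakyReLU(0.2),
        layers.Conv2D(128, kernel_size=5, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        # Flatten and map to a single unbounded score (no activation).
        layers.Flatten(),
        layers.Dense(1),
    ])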


W-GAN in Action

Let's now see the W-GAN put to use.
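A sketch of the training loop under the WGAN-GP losses. The penalty weight of 10, five critic updates per generator update, and the Adam settings follow common practice for WGAN-GP but are assumptions here:

NOISE_DIM, GP_WEIGHT, CRITIC_STEPS = 128, 10.0, 5
generator = build_generator(NOISE_DIM)
critic = build_critic()
gen_opt = tf.keras.optimizers.Adam(1e-4, beta_1=0.5, beta_2=0.9)
critic_opt = tf.keras.optimizers.Adam(1e-4, beta_1=0.5, beta_2=0.9)

def train_step(real_images):
    batch_size = tf.shape(real_images)[0]
    # The critic is updated several times per generator update.
    for _ in range(CRITIC_STEPS):
        noise = tf.random.normal([batch_size, NOISE_DIM])
        with tf.GradientTape() as tape:
            fake_images = generator(noise, training=True)
            critic_loss = (tf.reduce_mean(critic(fake_images, training=True))
                           - tf.reduce_mean(critic(real_images, training=True))
                           + GP_WEIGHT * gradient_penalty(critic, real_images, fake_images))
        grads = tape.gradient(critic_loss, critic.trainable_variables)
        critic_opt.apply_gradients(zip(grads, critic.trainable_variables))
    # The generator tries to raise the critic's score on its samples.
    noise = tf.random.normal([batch_size, NOISE_DIM])
    with tf.GradientTape() as tape:
        gen_loss = -tf.reduce_mean(critic(generator(noise, training=True), training=True))
    grads = tape.gradient(gen_loss, generator.trainable_variables)
    gen_opt.apply_gradients(zip(grads, generator.trainable_variables))
    return critic_loss, gen_loss

dataset = prepare_mnist_dataset()
for epoch in range(20):
    for real_images in dataset:
        c_loss, g_loss = train_step(real_images)
    print(f"epoch {epoch}: critic loss {float(c_loss):.3f}, "
          f"generator loss {float(g_loss):.3f}")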


Generated Images

Let us have a look at the images that are generated by the W-GAN model.
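Sampling from the trained generator and reusing the plotting helper sketched in the Utilities section:

noise = tf.random.normal([32, NOISE_DIM])
generated = generator(noise, training=False).numpy()
plot_images(generated)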

Output:

(A grid of MNIST-style handwritten digit images produced by the trained generator.)

Evaluation

Now, we will use the Frechet distance to evaluate some generated data samples against real data. The Frechet distance is classically a measure of similarity between two curves or shapes; applied to the statistics of two sets of samples, it measures how closely the generated data distribution matches the real one.
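A simplified sketch that fits a Gaussian to each set of flattened samples and computes the Frechet distance between them (the standard FID metric instead uses Inception features; this pixel-space variant is an assumption):

from scipy import linalg

def frechet_distance(real, generated):
    """Frechet distance between Gaussians fitted to two sets of flattened samples."""
    real = real.reshape(len(real), -1)
    generated = generated.reshape(len(generated), -1)
    mu1, mu2 = real.mean(axis=0), generated.mean(axis=0)
    sigma1 = np.cov(real, rowvar=False)
    sigma2 = np.cov(generated, rowvar=False)
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real          # discard tiny imaginary parts from sqrtm
    return float(np.sum((mu1 - mu2) ** 2)
                 + np.trace(sigma1 + sigma2 - 2.0 * covmean))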

Output:

Frechet distances for the ten evaluated sets of generated samples: 49.98, 27.99, 20.37, 11.02, 21.36, 30.76, 17.90, 15.67, 21.52, 20.16

Here is what we can interpret from the above output:

  1. The first score, 49.98, is comparatively high in relation to the other scores. It implies a large deviation between the produced sample distribution in the first set and the distribution of actual data.
  2. The second score, 27.99, is lower than the first but still rather high.
  3. The third set's score of 20.37 shows further improvement in the resemblance between the produced and real data distributions compared to the prior two sets.
  4. At 11.02, the fourth score is substantially lower than the prior ones. This implies that the fourth set's produced sample distribution closely resembles the distribution of actual data.
  5. At 21.36, the fifth score is worse than the fourth but still fairly low compared to the first scores.
  6. The sixth score, 30.76, is higher than every score except the first, indicating a temporary regression in similarity.
  7. The produced and real data distributions fit quite well, as seen by the seventh score of 17.90.
  8. The eighth score of 15.67 is even lower than the seventh, indicating a further improvement in the distributions' similarity.
  9. With a score of 21.52, the ninth is a little worse than the eighth but still rather low compared with the starting scores.
  10. The produced and real data distributions show rather good agreement, with the tenth score of 20.16.




