Chapter - Generative Adversarial Networks (GANs)

In this section of the book I will cover Generative Adversarial Networks (GANs). Generative Adversarial Networks (GANs) (Goodfellow et al. 2014) are a first attempt at creating generative models. In the context of games, GANs are modeled as a two player adversarial games. One of the biggest challenges faced with supervised learning is annotating the data. We cannot annotate automatically and without annotations we cannot train our learning models. But what if we could substitute the annotation of the data for something else? For instance, what if we could model the annotation task as a game or use other previous knowledge about the world as labels. These ideas are one of the main motivations for GANs. GANs are deep neural networks that consist of a generator network connected to a discriminator network. The discriminator network has training data and the generator network only has random or noise data as input. GANs are essentially 2 player games where one player (the generator) creates synthetic data samples, while the second player (the discriminator) takes the generated sample and performs a classification. This classification is performed to determine if the synthetic sample is similar to the distribution of the discriminator's training data. Since both networks are connected, the deep neural network (GAN) can learn to generate better synthetic samples with the help of the discriminator’s feedback. Basically, the discriminator tells the generator how to adjust its weights to produce better synthetic samples. Generative Adversarial Networks are methods that use 2 deep neural networks to interact with each other and generate data. Its formulation is consistent with 2 player adversarial game frameworks. One of the 2 algorithms (or networks) tries to learn a data distribution and produce new samples similar to the samples in the real data (the generator). The second algorithm (the discriminator) is a classifier that tries to determine if the new samples generated by the generative algorithm are fake or real. These 2 algorithms work together to achieve an optimal outcome of producing better output samples from the Generator. The generator in a GAN is based on Auto-encoders. Therefore, before looking at GANs, we will look at the Auto-encoder

Copyright and License

Copyright © by Ricardo A. Calix.
All rights reserved. No part of this work may be reproduced or transmitted in any form or by any means, without written permission of the copyright owner.
MIT License.

FTC and Amazon Disclaimer

This post/page/article includes Amazon Affiliate links to products. This site receives income if you purchase through these links. This income helps support content such as this one.

Generative Adversarial Networks

In this section of the book I will cover Generative Adversarial Networks (GANs). Generative Adversarial Networks (GANs) (Goodfellow et al. 2014) are a first attempt at creating generative models. In the context of games, GANs are modeled as a two player adversarial games. One of the biggest challenges faced with supervised learning is annotating the data. We cannot annotate automatically and without annotations we cannot train our learning models. But what if we could substitute the annotation of the data for something else? For instance, what if we could model the annotation task as a game or use other previous knowledge about the world as labels. These ideas are one of the main motivations for GANs. GANs are deep neural networks that consist of a generator network connected to a discriminator network. The discriminator network has training data and the generator network only has random or noise data as input. GANs are essentially 2 player games where one player (the generator) creates synthetic data samples, while the second player (the discriminator) takes the generated sample and performs a classification. This classification is performed to determine if the synthetic sample is similar to the distribution of the discriminator's training data. Since both networks are connected, the deep neural network (GAN) can learn to generate better synthetic samples with the help of the discriminator’s feedback. Basically, the discriminator tells the generator how to adjust its weights to produce better synthetic samples.
Generative Adversarial Networks are methods that use 2 deep neural networks to interact with each other and generate data. Its formulation is consistent with 2 player adversarial game frameworks. One of the 2 algorithms (or networks) tries to learn a data distribution and produce new samples similar to the samples in the real data (the generator). The second algorithm (the discriminator) is a classifier that tries to determine if the new samples generated by the generative algorithm are fake or real. These 2 algorithms work together to achieve an optimal outcome of producing better output samples from the Generator.
The generator in a GAN is based on Auto-encoders. Therefore, before looking at GANs, we will look at the Auto-encoder

Autoencoders

Autoencoders are a type of compression method where a neural network learns how to represent a vector of size “m” into a vector of size “n” where m >> n. Here, the input and output vectors in the network are the original sample and the reproduced sample and the hidden layer of the network is the new compressed representation of the input vector. The objective function minimizes the difference/distance between the original input sample and the reproduced output sample.

Original GAN

The original GAN consists of a generator and a Discriminator and was proposed in a 2014 paper by Ian Goodfellow. The general architecture of the GAN can be seen in the figure below.

Generating MNIST digits with GANs

In this section, I will describe how to implement a GAN that can generate images. The algorithm will work with the MNIST data set. As always, the code can be downloaded from my GitHub.
First we import the libraries.

Next we can define the parameters as follows

We load the MNIST data in a similar way to what we did in previous chapters

It is a good idea to print the shapes of the tensors before creating the DataLoaders.

Now we are ready to define the GAN architectures. GANs have a Generator and a Discriminator In the following code segment we define the architecture for the Generator. Notice that this will be a neural network of size 100x256x784.

I also tried a deeper architecture for the GAN but was not able to get it to learn in 30 epochs as I did with the simple MLP GAN. I am including it here and leave it as a exercise for the reader.

The architecture for the Discriminator can be seen in the next code segment.

Notice that the architecture for this network is 784x100x50x1. Its input is an image vector (real or fake) and its output is one neuron with value 0 or 1. The following function can be used to generate seed random noise vectors for the Generator input. For training we want batches of noise vectors.

To generate individual images, we can use the following function to generate seed noise vectors

The Training function for the GAN is the most complicated we have seen so far. The full code can be seen seen below. As can be seen, we train the discriminator twice and the generator once. The Discriminator looks at a real image and it should predict that it is real (a one). Then the discriminator looks at a generated image (fake) and it should predict that it is a fake (a zero). The discriminator weights should be updated accordingly for these objectives. The final step is to update the weights of the Generator. Here, we want to trick the discriminator. So now the generated image (fake) is given to the Discriminator but we want it to say that it is real (a one). We do this using the loss of the Discriminator but adjust the weights of the Generator. So the Generator weights are updated in such a way that it generates images that trick the Discriminator into predicting that the fake images are true (a one). Notice that before training, we need to squeeze and reshape the input \textbf{xb} tensor from [batch, 1, 28, 28] to [batch, 784] . We can do that with the following statements.

and the GAN training function is

list_losses_real    = []
list_losses_fake    = []
list_losses_tricked = []
def training_loop(  N_Epochs, G_model, D_model, D_loss_fn, G_opt, D_opt   ):
    for epoch in range(N_Epochs):
        for xb, yb in train_dl:              ## xb = [batch, 1, 28, 28]
            xb = torch.squeeze(xb, dim=1)
            xb = xb.reshape((-1, 784))
            #################################################
            ## G_model.eval()     ## No G training
            ## gen_img = G_model( random_G_vector_input() )
            gen_img = G_model( random_G_batch_vector_input() ).detach()
            ## Train D with real data
            D_real_y_pred = D_model(  xb  )
            D_real_loss   = D_loss_fn( D_real_y_pred, torch.ones((batch_size, 1)) )
            D_opt.zero_grad()
            D_real_loss.backward()
            D_opt.step()
            ## Train D with fake data
            D_fake_y_pred = D_model(  gen_img  )
            D_fake_loss   = D_loss_fn( D_fake_y_pred, torch.zeros((batch_size, 1)))
            D_opt.zero_grad()
            D_fake_loss.backward()
            D_opt.step()
            ## G_model.train()    ## yes G training
            #################################################
            ## D_model.eval()     ## No D training
            ## gen_img = G_model( random_G_vector_input() )
            gen_img = G_model( random_G_batch_vector_input() )
            ## Train G with D_loss (need to trick D)
            D_tricked_y_pred = D_model(  gen_img  )
            D_tricked_loss   = D_loss_fn( D_tricked_y_pred, torch.ones((batch_size, 1)) )
            G_opt.zero_grad()
            D_tricked_loss.backward()
            G_opt.step()
            ## D_model.train()    ## yes D training
        if epoch % 1 == 0:
            print("******************************")
            print(epoch, "D_real_loss=", D_real_loss)
            print(epoch, "D_fake_loss=", D_fake_loss)
            print(epoch, "D_tricked_loss=", D_tricked_loss)
            list_losses_real.append(        D_real_loss.detach().numpy()  )
            list_losses_fake.append(        D_fake_loss.detach().numpy()  )
            list_losses_tricked.append(  D_tricked_loss.detach().numpy()  )

Finally, we can call the core functions and print the losses during training.

We can seen below that the losses are going down for both the Generator and Discriminator.

******************************
0 D_real_loss= tensor(0.1157, grad_fn=<BinaryCrossEntropyBackward0>)
0 D_fake_loss= tensor(0.0114, grad_fn=<BinaryCrossEntropyBackward0>)
0 D_tricked_loss= tensor(7.8405, grad_fn=<BinaryCrossEntropyBackward0>)
******************************
1 D_real_loss= tensor(0.2492, grad_fn=<BinaryCrossEntropyBackward0>)
1 D_fake_loss= tensor(0.0604, grad_fn=<BinaryCrossEntropyBackward0>)
1 D_tricked_loss= tensor(5.4396, grad_fn=<BinaryCrossEntropyBackward0>)
******************************
2 D_real_loss= tensor(0.1128, grad_fn=<BinaryCrossEntropyBackward0>)
2 D_fake_loss= tensor(0.0804, grad_fn=<BinaryCrossEntropyBackward0>)
2 D_tricked_loss= tensor(4.2457, grad_fn=<BinaryCrossEntropyBackward0>)

...

******************************
27 D_real_loss= tensor(0.6145, grad_fn=<BinaryCrossEntropyBackward0>)
27 D_fake_loss= tensor(0.3502, grad_fn=<BinaryCrossEntropyBackward0>)
27 D_tricked_loss= tensor(1.2594, grad_fn=<BinaryCrossEntropyBackward0>)
******************************
28 D_real_loss= tensor(0.4972, grad_fn=<BinaryCrossEntropyBackward0>)
28 D_fake_loss= tensor(0.3535, grad_fn=<BinaryCrossEntropyBackward0>)
28 D_tricked_loss= tensor(1.2053, grad_fn=<BinaryCrossEntropyBackward0>)
******************************
29 D_real_loss= tensor(0.5411, grad_fn=<BinaryCrossEntropyBackward0>)
29 D_fake_loss= tensor(0.5512, grad_fn=<BinaryCrossEntropyBackward0>)
29 D_tricked_loss= tensor(1.2962, grad_fn=<BinaryCrossEntropyBackward0>)

Using the following function, we can plot the losses.

We can see that the losses of the Generator did not go below the discriminator loss. A good objective would be for them to be equal.

And that is it. The GAN is now trained. We can now proceed to test it and generate a few images. We can do that with the following code segment.

I did this several times and the results of the generated images are as follows. This first image looks like a bad zero.

This next image looks like a better version of the zero.

And I believe this looks like a five.

Conditional GANs

The conditional GAN (CGAN) can generate more than random images from a distribution. Instead, in the case on MNIST, for example, it can generate the image of an image given the corresponding label for the image. The architecture for the CGAN can be seen below.

Summary

In this chapter, a description of Generative Adversarial Networks was provided. Some sample code was addressed as well as some applications of GANs.