Building a GAN with PyTorch
Realistic Images Out of Thin Air?
Generative Adversarial Networks (GANs), proposed by Goodfellow et al. in 2014, revolutionized the domain of image generation in computer vision: no one could believe that such stunning, lively images were actually generated purely by machines. In fact, people used to consider generation an impossible task and were surprised by the power of GANs, because traditionally there simply is no ground truth we can compare our generated images to.
This article introduces the intuition behind GANs, followed by an implementation of a convolutional GAN in PyTorch and its training procedure.
Intuition Behind a GAN
Unlike traditional classification, where network predictions can be directly compared to the ground truth, the ‘correctness’ of a generated image is hard to define and measure. In their original paper, Generative Adversarial Networks, Goodfellow et al. proposed an interesting idea: use a very well-trained classifier to distinguish between generated and real images. If such a classifier exists, we can create and train a generator network until it outputs images that completely fool the classifier.

A GAN is the product of this procedure: it contains a generator that generates images resembling a given dataset, and a discriminator (classifier) that distinguishes whether an image is real or generated. The detailed pipeline of a GAN can be seen in Figure 1.
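To make the two players concrete, here is a minimal sketch of the pair of networks in PyTorch. It assumes 28×28 grayscale images (e.g. MNIST) and a 100-dimensional noise vector; the layer sizes are illustrative, not the article's prescribed architecture.

```python
import torch
import torch.nn as nn

latent_dim = 100  # assumed size of the noise vector z

class Generator(nn.Module):
    """Maps a noise vector z to a 28x28 image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # (latent_dim, 1, 1) -> (128, 7, 7)
            nn.ConvTranspose2d(latent_dim, 128, kernel_size=7, stride=1, padding=0),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            # (128, 7, 7) -> (64, 14, 14)
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            # (64, 14, 14) -> (1, 28, 28)
            nn.ConvTranspose2d(64, 1, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),  # pixel values in [-1, 1]
        )

    def forward(self, z):
        return self.net(z)

class Discriminator(nn.Module):
    """Binary classifier: outputs the probability that an image is real."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # (1, 28, 28) -> (64, 14, 14)
            nn.Conv2d(1, 64, kernel_size=4, stride=2, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            # (64, 14, 14) -> (128, 7, 7)
            nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2, inplace=True),
            # (128, 7, 7) -> (1, 1, 1)
            nn.Conv2d(128, 1, kernel_size=7, stride=1, padding=0),
            nn.Sigmoid(),  # D(x): probability of x being real
        )

    def forward(self, x):
        return self.net(x).view(-1, 1)
```

Once trained, sampling is a single forward pass: `Generator()(torch.randn(16, latent_dim, 1, 1))` yields a batch of 16 generated images.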
Loss Function
Optimizing both the generator and the discriminator is difficult because, as you may imagine, the two networks have completely opposite goals: the generator wants to create images as realistic as possible, while the discriminator wants to tell generated images apart from real ones.
To illustrate this, let D(x) be the output of the discriminator, i.e. the probability of x being a real image, and let G(z) be the output of our generator for a noise vector z. The discriminator is analogous to a binary classifier, so the goal of the discriminator would be to maximise the function:

$$\mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]$$
which is essentially the binary cross-entropy loss without the negative sign at the beginning.
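In code, the discriminator's side of this objective maps directly onto PyTorch's `nn.BCELoss`. The sketch below assumes the `Generator` and `Discriminator` classes from earlier; the helper `discriminator_loss` is hypothetical, written only to make the correspondence explicit.

```python
import torch
import torch.nn as nn

criterion = nn.BCELoss()  # -[y*log(p) + (1-y)*log(1-p)]

def discriminator_loss(D, G, real_images):
    # Hypothetical helper: maximising E[log D(x)] + E[log(1 - D(G(z)))]
    # is equivalent to minimising binary cross-entropy with target 1
    # for real images and target 0 for generated ones.
    batch_size = real_images.size(0)
    real_labels = torch.ones(batch_size, 1)
    fake_labels = torch.zeros(batch_size, 1)

    # -log D(x), averaged over the batch of real images
    real_loss = criterion(D(real_images), real_labels)

    # -log(1 - D(G(z))); detach() stops gradients from flowing into
    # the generator during the discriminator's update
    z = torch.randn(batch_size, latent_dim, 1, 1)
    fake_loss = criterion(D(G(z).detach()), fake_labels)

    return real_loss + fake_loss
```

On the other hand, the goal of…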