The Conditional Generative Adversarial Network (cGAN) is a deep learning model; deep learning is itself a branch of machine learning. It enables more precise generation and discrimination of images, training machines to learn on their own. The concept of the cGAN was first published in 2014 by Mehdi Mirza and Simon Osindero.
To understand what a cGAN is, you first need to become familiar with deep learning. This process involves feeding a computer program thousands of data points so that it can learn to recognize them. The Generative Adversarial Network (GAN) is the original architecture on which the cGAN builds. It sets up a dialogue between two networks: the generator and the discriminator.
On one side, the generator creates fake images that are supposed to be as realistic as possible, with the aim of deceiving the opposing network: the discriminator.
On the other side, the discriminator observes images coming from both the generator and a database. It must determine which images come from the database (and label them as real) and which images are generated by the generator (and are therefore fake).
If the discriminator correctly classifies fakes as fakes and real images as real, it receives positive feedback; if it fails in its task, it receives negative feedback. Gradually, thanks to the gradient descent algorithm, it learns the features that allow it to recognize a real image, learns from its mistakes, and improves. The generator, trained against this ever-stricter judge, progressively enhances its ability to create more realistic images.
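This adversarial feedback loop can be sketched on a toy problem. The example below is a minimal, hypothetical illustration (not a production GAN): the "real" data are numbers drawn around 4.0, the generator is a linear map g(z) = w·z + b, and the discriminator is a logistic regressor. Both are updated by hand-derived gradient descent steps, alternating exactly as described above.

```python
import numpy as np

# Toy 1-D GAN sketch (assumed setup): real data ~ N(4.0, 0.5),
# generator g(z) = w*z + b, discriminator D(x) = sigmoid(a*x + c).
rng = np.random.default_rng(0)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

w, b = 1.0, 0.0          # generator parameters (fakes start around 0)
a, c = 0.0, 0.0          # discriminator parameters
lr, steps, batch = 0.05, 2000, 64

for _ in range(steps):
    real = rng.normal(4.0, 0.5, batch)
    z = rng.normal(0.0, 1.0, batch)
    fake = w * z + b

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    d_real, d_fake = sigmoid(a * real + c), sigmoid(a * fake + c)
    a -= lr * np.mean(-(1 - d_real) * real + d_fake * fake)
    c -= lr * np.mean(-(1 - d_real) + d_fake)

    # Generator step: push D(fake) toward 1 (non-saturating loss),
    # i.e. learn to fool the freshly updated discriminator.
    d_fake = sigmoid(a * fake + c)
    w -= lr * np.mean(-(1 - d_fake) * a * z)
    b -= lr * np.mean(-(1 - d_fake) * a)

# After training, fake samples should have drifted toward the real mean (4.0).
fake_mean = float(np.mean(w * rng.normal(0.0, 1.0, 1000) + b))
print(round(fake_mean, 2))
```

The generator never sees the real data directly; it improves only through the discriminator's gradient signal, which is the core idea of adversarial training.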
The cGAN, or how to maximize the performance of the generator and the discriminator
With a conditional GAN, it’s possible to send more precise information, called class labels, to both the generator and the discriminator to guide them. These labels tell the generator what kind of data to produce and tell the discriminator what it should expect, allowing both networks to arrive at the desired results more quickly.
The labels guide the generator’s production to generate more specific information. For example, instead of producing images of clothing in general, it will produce images of pants, jackets, or socks based on the provided label.
On the discriminator’s side, the labels help the network better distinguish between real images and the fake images provided by the generator, making it more efficient.
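In practice, this conditioning is often implemented by one-hot encoding the class label and concatenating it to each network's input. The snippet below is a minimal sketch with made-up toy dimensions (3 clothing classes, a 16-dimensional noise vector, a flattened 64-value image); it only demonstrates the input wiring, not a trained model.

```python
import numpy as np

# Hypothetical toy dimensions: e.g. classes 0/1/2 = pants / jacket / sock.
num_classes, noise_dim, image_dim = 3, 16, 64

def one_hot(label, n=num_classes):
    """Encode a class index as a one-hot vector."""
    v = np.zeros(n)
    v[label] = 1.0
    return v

rng = np.random.default_rng(0)

# Generator input: random noise z plus the label saying WHAT to generate.
z = rng.normal(size=noise_dim)
gen_input = np.concatenate([z, one_hot(1)])        # condition on "jacket"
print(gen_input.shape)                             # (19,)

# Discriminator input: an image plus the label it should match, so it can
# judge "real AND consistent with this label" rather than just "real".
image = rng.normal(size=image_dim)                 # stand-in for a real image
disc_input = np.concatenate([image, one_hot(1)])
print(disc_input.shape)                            # (67,)
```

The rest of the architecture is unchanged from a plain GAN; the label simply becomes part of what each network observes.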
The cGAN and its multiple uses in the field of machine learning
If you’re having trouble picturing the use of the conditional GAN, here are some examples that might help:
1. Image-to-image translation
cGANs, in particular, can transform images by conditioning generation on additional information, such as labels or other images. They enabled the development of the Pix2Pix method, whose applications include reconstructing objects from edge sketches, synthesizing photos from label maps, and colorizing images.
2. Creating images from text
Thanks to cGANs, it’s possible to create high-quality photos based on text. Using text and the richness of its vocabulary enables the generation of much more precise synthetic images.
3. Video generation
In video, cGANs can also predict the future frames of a sequence based on a selection of previous frames.
4. Face generation
cGANs can be used to generate images of faces with specific attributes, such as hair or eye color.
cGANs represent a notable advance over plain GANs. They enable deep learning systems to gain precision and efficiency, marking a small revolution in the field of machine learning. This advancement propelled the careers of its two inventors, Mehdi Mirza and Simon Osindero, who went on to work at DeepMind, a leading company in the sector.