A Guide to GANs (Generative Adversarial Networks)

Aug 10, 2020

Generative adversarial networks (GANs) are a machine learning approach in which two neural networks compete: one generates new data instances that resemble the training set, and the other tries to tell them apart from real data. The two networks are often, though not necessarily, convolutional. Ian Goodfellow and his colleagues first proposed GANs in a 2014 paper presented at NeurIPS.

A GAN has two parts: the generator, which learns to produce plausible data, and the discriminator, which learns to distinguish the generator's fake data from real data. The discriminator penalizes the generator for producing implausible results.
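The interplay between the two parts can be sketched with a deliberately tiny example. The code below is a minimal illustration, not a practical GAN: it assumes one-dimensional data, a linear generator, and a logistic-regression discriminator, and it uses the common non-saturating generator loss. The names `generator`, `discriminator`, and `train_step`, the learning rate, and the target distribution are all assumptions made for this sketch.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def generator(z, a, b):
    """Toy generator: turns a noise value z into a sample."""
    return a * z + b


def discriminator(x, w, c):
    """Toy discriminator: estimated probability that x is real."""
    return sigmoid(w * x + c)


def train_step(params, real_x, z, lr=0.05):
    """One discriminator update followed by one generator update."""
    a, b, w, c = params
    fake_x = generator(z, a, b)

    # Discriminator: minimize -[log D(real) + log(1 - D(fake))].
    d_real = discriminator(real_x, w, c)
    d_fake = discriminator(fake_x, w, c)
    grad_w = -(1.0 - d_real) * real_x + d_fake * fake_x
    grad_c = -(1.0 - d_real) + d_fake
    w -= lr * grad_w
    c -= lr * grad_c

    # Generator: minimize -log D(fake), so it is penalized
    # whenever the discriminator finds its sample implausible.
    d_fake = discriminator(fake_x, w, c)
    grad_fake_x = -(1.0 - d_fake) * w
    a -= lr * grad_fake_x * z
    b -= lr * grad_fake_x
    return (a, b, w, c)


# Assumed toy setup: "real" data is drawn from a normal distribution around 4.
random.seed(0)
params = (1.0, 0.0, 0.0, 0.0)  # a, b, w, c
for _ in range(500):
    params = train_step(params, random.gauss(4.0, 1.0), random.gauss(0.0, 1.0))
```

Each step first improves the discriminator on one real and one generated sample, then moves the generator in the direction that makes its sample look more real to the updated discriminator. A setup this small is only meant to show the flow of the two updates and should not be expected to converge reliably.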

Use cases of GANs

GANs have both positive and negative uses, because in principle they can learn to imitate any data distribution. They are widely used in video, image, and voice generation. In a sense, GANs are robotic artists: they can be taught to create strikingly realistic output in almost any domain, including images, speech, prose, and music. GANs can also be used to generate fake media content, and they are the technology behind deepfakes.

1. Image-to-image translation. GANs are well suited to image-to-image translation problems. They learn the mapping from input image to output image and, along with it, a loss function for training that mapping. Conditional GANs are used to generate photorealistic images from sketches or semantic label images. Many image-to-image tasks use the pix2pix approach, such as translating Google Maps images to satellite photographs, sketches to color photographs, and semantic images of cityscapes to photographs.

2. Text-to-image generation. Automatically generating realistic images from text is both useful and interesting. For instance, StackGAN produces images that match textual descriptions of objects such as flowers and birds.

3. Increasing the resolution of images. The Super-Resolution Generative Adversarial Network (SRGAN) produces high-resolution output images from lower-resolution inputs, for subjects such as street scenes, human faces, and other objects.

4. Predicting the next video frame. Models built with deep neural networks use a loss-based approach to predict the future frames of synthetic video sequences, using a CNN-LSTM-deCNN framework. This approach works for the static elements of video scenes, which can be predicted up to a second ahead.
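For the image-to-image case in item 1, the generator's training objective in pix2pix-style models combines an adversarial term with a pixel-wise L1 distance to the target image. The function below is a rough sketch of that combination under simplifying assumptions: images are flat lists of pixel values, the adversarial term uses the non-saturating form, and the function name and argument names are invented for this example. The default weight of 100 follows the value reported for pix2pix.

```python
import math


def generator_objective(d_fake, generated, target, l1_weight=100.0):
    """Adversarial term plus weighted L1 distance to the target image.

    d_fake: discriminator's probability that the generated image is real.
    generated, target: flat lists of pixel values in the same order.
    """
    # Clamp to avoid log(0) when the discriminator is fully confident.
    adversarial = -math.log(max(d_fake, 1e-12))
    l1 = sum(abs(g - t) for g, t in zip(generated, target)) / len(target)
    return adversarial + l1_weight * l1
```

The adversarial term rewards outputs the discriminator accepts as real, while the L1 term keeps the output close to the known target, which is what ties the translated image to its input.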

Christie’s AI Artwork Selling At $432,500

Christie’s became the first auction house to sell an AI-generated portrait: a portrait of Edmond Belamy, generated by a GAN, which sold for $432,500 (more than 40 times its high estimate). The GAN was based on open-source code written by Robbie Barrat of Stanford. What makes the artwork unique is that it was created through artificial intelligence by three French students, who used code from Barrat, then a 19-year-old programmer. The portrait marked a major step for the AI community.

Image generation with GANs is a multistep process. First you collect training data (for example with a scraper) for your network to replicate, and then you construct the generative algorithm. After that you run the algorithm and sort through the outputs to pick the best of the many generated results. The Belamy story, together with its creators' motto, suggests that “creativity is not only for humans”: art can also be created through artificial intelligence.

Advantages and Disadvantages of Using GANs

The Advantages of GANs include:

· GANs need little or no supervision. They do not usually need labeled data, because they learn an internal representation of the data from unlabeled examples. This is more efficient than acquiring labeled data manually, which is time-consuming.

· GANs model data distributions well and tend to generate clearer, sharper images.

· GANs can train almost any kind of generator network, whereas other frameworks often require the generator to have a specific functional form.

· They can produce data that resembles real data, so they are widely used to generate images, audio, video, and text. Images produced by GANs are mostly used in marketing, advertising, e-commerce, and games.

· They are also useful alongside other machine learning approaches, because they can render data in various versions by capturing its details. Combining GANs with ML makes it easier to recognize people, cars, trees, and streets, and to estimate the distance between objects.
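The first advantage above, training without hand-labeled data, comes from the fact that the only labels the discriminator needs are "real" and "generated", and those can be assigned automatically. The snippet below is a small sketch of that idea; `make_discriminator_batch` and the stand-in generator are hypothetical names used only for illustration.

```python
import random


def make_discriminator_batch(real_samples, generate, rng):
    """Build a labeled batch without human annotation.

    Real samples get label 1 and generated samples get label 0.
    """
    fake_samples = [generate(rng) for _ in real_samples]
    batch = [(x, 1) for x in real_samples] + [(x, 0) for x in fake_samples]
    rng.shuffle(batch)
    return batch


rng = random.Random(0)
unlabeled = [4.1, 3.8, 4.3]  # raw data with no annotations
# Stand-in generator: in a real GAN this would be the generator network.
batch = make_discriminator_batch(unlabeled, lambda r: r.gauss(0.0, 1.0), rng)
```

Nobody had to annotate the three raw values: the labels follow directly from where each sample came from.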

Disadvantages of GANs

· GANs are hard to train. Training needs to be fed varied data repeatedly and checked to confirm that the model is working accurately.

· Generating discrete data, such as text, is a very complex process.

· GANs are generally trained to do a single thing, which is to generate all the pixels of an image at once. This makes it difficult to do things like guess the value of one pixel given another pixel. A BiGAN can address this, because it lets you guess missing pixels by using Gibbs sampling.
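One way to see why discrete data is awkward: the generator learns from small changes in its output, but picking a discrete token hides small changes. The sketch below, with made-up logits and helper names, nudges one score slightly and compares the continuous output (softmax probabilities) with the discrete one (the chosen token index).

```python
import math


def softmax(logits):
    """Convert scores to probabilities (continuous output)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_token(logits):
    """Choose the highest-scoring token index (discrete output)."""
    return max(range(len(logits)), key=lambda i: logits[i])


logits = [2.0, 0.5, -1.0]
nudged = [2.0, 0.5 + 1e-3, -1.0]  # tiny change to the second score

token_changed = pick_token(logits) != pick_token(nudged)
prob_shift = softmax(nudged)[1] - softmax(logits)[1]
```

Here `prob_shift` is small but nonzero, while `token_changed` is false: the discrete choice gives the generator no signal about the nudge, which is the kind of feedback gradient-based training depends on.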

Other Potential Uses of GANs:

· GANs are used in image editing. For instance, changing the appearance of an older person, for example by changing their hairstyle, is hard to accomplish with ordinary image editing tools alone. With GANs, you can reconstruct and drastically change the appearance of the image.

· GANs can be applied to security. Cyber threats are a major concern in the AI world, where even deep neural networks are prone to attack. GANs are at the front line of addressing adversarial attacks, which use a variety of methods to fool deep learning architectures. GANs strengthen existing deep learning models against those methods by creating similar fake examples and training the models to detect them.

· GANs are used in clothing translation, because they can produce photographs of similar clothes as they appear in online stores.

· They can also be used in photo inpainting (hole filling). This involves repairing an image by filling a missing or removed section with generated content.

· GANs can generate 3D objects such as cars and chairs from 2D pictures. Game designers also use GANs to generate 3D backgrounds and avatars with realistic appearances.
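For the inpainting use above, the final step is typically to keep the known pixels and take only the missing ones from the generated content. The sketch below shows that compositing step on a tiny grid; the function name, the mask convention (1 for known, 0 for missing), and the hard-coded "generated" values are assumptions for illustration, standing in for a real generator's output.

```python
def composite(original, generated, mask):
    """Keep known pixels and fill missing ones from generated content.

    mask[r][c] is 1 where the original pixel is known and 0 where it is missing.
    """
    return [
        [o if m == 1 else g for o, g, m in zip(orow, grow, mrow)]
        for orow, grow, mrow in zip(original, generated, mask)
    ]


original = [[10, 20], [30, 0]]    # bottom-right pixel is missing
mask = [[1, 1], [1, 0]]
generated = [[11, 19], [29, 40]]  # stand-in for a generator's output
filled = composite(original, generated, mask)
```

Only the masked-out pixel is replaced, so the rest of the photograph is left exactly as it was.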