User:Aitantv/GAN: Difference between revisions

From XPUB & Lens-Based wiki
 


GANs: Generative Adversarial Networks
* First introduced in 2014
* [https://thispersondoesnotexist.com thispersondoesnotexist.com], generated with Nvidia's StyleGAN
* Generator v Discriminator: The generator tries to create synthetic outputs from random input (for instance, images of faces), while the discriminator tries to tell these apart from real outputs (say, a database of celebrities). The hope is that as the two networks face off, they'll both get better and better, with the end result being a generator network that produces realistic outputs. Neither network should get the upper hand during training.


Deep Learning Software
* pix2pix
* StyleGAN


== Method ==
* Deep Nostalgia
* Tokkingheads - combine a still with a video to make it talk  
 
 
StyleGAN2
* Create a mega data set of at least 1000 images. You can use the Fatkun plugin for Chrome or DownThemAll to bulk-download images from Google.
* An algorithm can then auto-crop the data set so it's ready for the machine.
* Through 'transfer learning' you can first train the machine on one data set (e.g. umbrellas), then continue training on a second data set (e.g. clouds), so that what it learned from the first carries over onto the second.
* Expect a final image/moving image of 512x512 or 1024x1024. You can always rescale the result to a higher or lower resolution afterwards.
 
Voice
* You can also teach a machine a voice. Once it has learned the voice, you can make it speak any text you type.
 
== StyleGAN ==
 
Text to Images
* Can create new bird species
* Incredibly detailed hi res image generation
* Shapeshifting
* 'Latent space': the space of input vectors that the generator turns into images; a grey area where people don't really understand what's going on
 
Clips
 
{{youtube|G3anJ03BPas}}
 
== StyleGAN: Playing with Latent Space ==
 
{{youtube|dCKbRCUyop8}}
 
Progressive Growing
* Training starts with low-res images and progresses to higher resolutions. It can take up to 10 days to get a convincing result.

Latest revision as of 23:03, 7 December 2021

Intro

Machine Learning Mastery

GANs: Generative Adversarial Networks

  • First introduced in 2014
  • thispersondoesnotexist.com, generated with Nvidia's StyleGAN
  • Generator v Discriminator: The generator tries to create synthetic outputs from random input (for instance, images of faces), while the discriminator tries to tell these apart from real outputs (say, a database of celebrities). The hope is that as the two networks face off, they'll both get better and better, with the end result being a generator network that produces realistic outputs. Neither network should get the upper hand during training.

Deep Learning Software

  • pix2pix
  • StyleGAN

Method

Generative Adversarial Networks (GAN)

  • thispersondoesnotexist.com
  • requires deep learning and neural network experience + coding experience
  • A GAN is made up of two neural networks: Generator + Discriminator
  • Introduced in 2014
  • Generator - creates data that is meant to be perceived as real. It receives random input (noise) and generates realistic-looking images from it.
  • Discriminator - decides whether each image it sees is real (from the data set) or fake (created by the Generator).
  • StyleGAN / RunwayML (web software/app) very easy to use / BigBiGAN
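
The face-off between the two networks can be made concrete by looking at how each one is scored. The sketch below is a minimal illustration in plain Python, not the code of any particular GAN library: it assumes the discriminator outputs a probability that an image is real, and uses the standard binary cross-entropy losses. All the score values are hypothetical numbers chosen for the example.

```python
import math

def discriminator_loss(real_scores, fake_scores):
    """Low when real images score near 1 and generated images score near 0."""
    real_term = -sum(math.log(s) for s in real_scores) / len(real_scores)
    fake_term = -sum(math.log(1.0 - s) for s in fake_scores) / len(fake_scores)
    return real_term + fake_term

def generator_loss(fake_scores):
    """Low when the discriminator scores generated images near 1 (it is fooled)."""
    return -sum(math.log(s) for s in fake_scores) / len(fake_scores)

# Hypothetical discriminator outputs: probability that an image is real.
confident = discriminator_loss([0.9, 0.8], [0.1, 0.2])  # discriminator is winning
balanced = discriminator_loss([0.5, 0.5], [0.5, 0.5])   # discriminator is guessing
fooled = generator_loss([0.9, 0.8])                     # generator is winning
caught = generator_loss([0.1, 0.2])                     # generator is losing

print(f"discriminator loss, confident: {confident:.3f}, balanced: {balanced:.3f}")
print(f"generator loss, fooled: {fooled:.3f}, caught: {caught:.3f}")
```

When the discriminator can only guess (every score is 0.5), neither side has the upper hand, which is the balance the notes above describe.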

DeepFake

  • Deep Nostalgia
  • Tokkingheads - combine a still with a video to make it talk


StyleGAN2

  • Create a mega data set of at least 1000 images. You can use the Fatkun plugin for Chrome or DownThemAll to bulk-download images from Google.
  • An algorithm can then auto-crop the data set so it's ready for the machine.
  • Through 'transfer learning' you can first train the machine on one data set (e.g. umbrellas), then continue training on a second data set (e.g. clouds), so that what it learned from the first carries over onto the second.
  • Expect a final image/moving image of 512x512 or 1024x1024. You can always rescale the result to a higher or lower resolution afterwards.
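
The auto-crop step can be sketched as follows. This is only an assumption about what such a step might look like, not the tool used in these notes: it computes the largest centred square inside an image, which could then be resized to 512x512 or 1024x1024. The function name `center_crop_box` is made up for this example. The returned (left, top, right, bottom) tuple uses the same ordering that Pillow's `Image.crop` expects, in case the actual cropping is done with that library.

```python
def center_crop_box(width, height):
    """Return (left, top, right, bottom) of the largest centred square."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return (left, top, left + side, top + side)

# Hypothetical image sizes from a scraped data set.
landscape_box = center_crop_box(1920, 1080)
portrait_box = center_crop_box(600, 800)

print(landscape_box)
print(portrait_box)
```

A centred square is the simplest choice; for faces or other subjects that are often off-centre, a detector-based crop would likely give a better data set.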

Voice

  • You can also teach a machine a voice. Once it has learned the voice, you can make it speak any text you type.

StyleGAN

Text to Images

  • Can create new bird species
  • Incredibly detailed hi res image generation
  • Shapeshifting
  • 'Latent space': the space of input vectors that the generator turns into images; a grey area where people don't really understand what's going on
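
The 'shapeshifting' effect usually comes from moving through latent space in small steps. The sketch below shows the idea in plain Python, assuming a trained generator exists elsewhere: it picks two random latent vectors and blends between them. The helper names are hypothetical, and the vector length of 512 matches the latent size StyleGAN uses. Feeding each blended vector to the generator would give one frame of a morphing video.

```python
import random

def random_latent(dim, rng):
    """Draw a latent vector from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def interpolate(z_start, z_end, steps):
    """Linearly blend from z_start to z_end in `steps` frames (steps >= 2)."""
    frames = []
    for i in range(steps):
        t = i / (steps - 1)  # goes from 0.0 to 1.0
        frames.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return frames

rng = random.Random(0)           # fixed seed so the walk is repeatable
z_a = random_latent(512, rng)    # e.g. the latent behind one face
z_b = random_latent(512, rng)    # e.g. the latent behind another face
frames = interpolate(z_a, z_b, steps=5)

print(len(frames), "latent vectors, each of length", len(frames[0]))
```

Straight-line blending is the simplest option; spherical interpolation is a common alternative for normally distributed latents.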

Clips

StyleGAN: Playing with Latent Space

Progressive Growing

  • Training starts with low-res images and progresses to higher resolutions. It can take up to 10 days to get a convincing result.
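
A rough sketch of the resolution schedule behind progressive growing is shown below. It assumes training starts at 4x4 pixels and doubles the resolution at each stage, which is how the technique is usually described. The function name and defaults are made up for this illustration, and the sketch does not train anything.

```python
def resolution_schedule(start=4, final=1024):
    """List the image resolutions visited when doubling from start to final."""
    if start <= 0 or final < start:
        raise ValueError("need 0 < start <= final")
    schedule = [start]
    while schedule[-1] < final:
        schedule.append(schedule[-1] * 2)
    if schedule[-1] != final:
        raise ValueError("final must be start times a power of two")
    return schedule

stages = resolution_schedule(4, 1024)
print(stages)
```

Each stage adds layers to both networks, so the later, high-resolution stages take most of the training time.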