bhy06/GAN_Anime

GAN_Anime

This is a PyTorch implementation of GANs, focusing on generating anime faces.

To do:

  • Build anime-faces dataset
  • Implement GANs
  • Implement StyleGANs
  • Implement Conditional GANs

Anime-faces Dataset

All anime-face images were collected and preprocessed by myself. Anime-style images for 45 tags (tags.txt) were collected from danbooru.donmai.us using the crawler tool gallery-dl. After deleting unrelated images that contain no anime faces, the remaining images were processed with the anime face detector lbpcascade_animeface in build_animeface_dataset.py. After cropping, meaningless images were deleted manually; the resulting dataset contains about 100,000 anime faces in total. For conditional GANs, anime-face images of 20 tags (tags_20.txt), about 50,000 images, are used for training. For StyleGANs, after cropping and filtering out low-quality images, around 60,000 images (512x512) are used for training.
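
The detect-and-crop step can be sketched roughly as follows. This is not the code in build_animeface_dataset.py; it is a minimal illustration that assumes OpenCV is installed and that the lbpcascade_animeface XML file has been downloaded locally. The helper names (`padded_box`, `crop_faces`), the 25% margin, and the 64x64 output size are assumptions made for the example.

```python
from pathlib import Path


def padded_box(x, y, w, h, img_w, img_h, margin=0.25):
    """Grow a detection box by `margin` on every side, clamped to the image.

    The margin is an illustrative choice, not a value taken from the repo.
    """
    pad_w, pad_h = int(w * margin), int(h * margin)
    left = max(x - pad_w, 0)
    top = max(y - pad_h, 0)
    right = min(x + w + pad_w, img_w)
    bottom = min(y + h + pad_h, img_h)
    return left, top, right, bottom


def crop_faces(image_path, cascade_path, out_dir, size=64):
    """Detect anime faces in one image and save each crop; returns the count."""
    import cv2  # imported lazily so padded_box works without OpenCV

    cascade = cv2.CascadeClassifier(str(cascade_path))
    if cascade.empty():
        raise FileNotFoundError(f"could not load cascade: {cascade_path}")

    image = cv2.imread(str(image_path), cv2.IMREAD_COLOR)
    if image is None:  # unreadable or non-image file
        return 0

    # Grayscale + histogram equalization, as commonly done before cascades.
    gray = cv2.equalizeHist(cv2.cvtColor(image, cv2.COLOR_BGR2GRAY))
    faces = cascade.detectMultiScale(
        gray, scaleFactor=1.1, minNeighbors=5, minSize=(24, 24)
    )

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    img_h, img_w = image.shape[:2]
    for i, (x, y, w, h) in enumerate(faces):
        l, t, r, b = padded_box(int(x), int(y), int(w), int(h), img_w, img_h)
        face = cv2.resize(image[t:b, l:r], (size, size),
                          interpolation=cv2.INTER_AREA)
        cv2.imwrite(str(out_dir / f"{Path(image_path).stem}_{i}.jpg"), face)
    return len(faces)
```

Crops produced this way still need the manual pass described above, since a cascade detector yields false positives and partial faces.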

Dataset is here.

Dataset for StyleGAN is here.

GPU

A single NVIDIA RTX 2080 Ti GPU was used for all training. For StyleGAN and StyleGAN2, I spent 8 days training the models separately, and they might not have fully converged.

Usage

To train a model (default: DCGAN):

python train.py --dataRoot path_to_dataset --cuda

Models

train.py supports multiple GANs, selected with the --model option:

  • GAN: use --model 0 to run models/gan.py
  • DCGAN: use --model 1 to run models/dcgan.py
  • W-DCGAN: use --model 2 to run models/wdcgan.py
  • W-DCGAN_GP: use --model 3 to run models/wdcgan_gp.py
  • W-ResGAN_GP: use --model 4 to run models/wresgan_gp.py
  • CGAN: use --model 5 to run models/cdcgan.py
  • ACGAN: use --model 6 to run models/acgan_resnet.py
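
As a rough picture of how the index-to-file mapping above could be wired up, here is a minimal argparse sketch. It is not the actual train.py; the `MODEL_FILES` dictionary and the `resolve_model` helper are hypothetical names. The flags `--dataRoot`, `--cuda`, and `--model`, the file paths, and the DCGAN default come from this README.

```python
import argparse

# Index -> model file, copied from the list above.
MODEL_FILES = {
    0: "models/gan.py",
    1: "models/dcgan.py",
    2: "models/wdcgan.py",
    3: "models/wdcgan_gp.py",
    4: "models/wresgan_gp.py",
    5: "models/cdcgan.py",
    6: "models/acgan_resnet.py",
}


def build_parser():
    parser = argparse.ArgumentParser(description="Train a GAN on anime faces")
    parser.add_argument("--dataRoot", required=True, help="path to the dataset")
    parser.add_argument("--cuda", action="store_true", help="train on the GPU")
    # DCGAN (index 1) is the default, matching the Usage section.
    parser.add_argument("--model", type=int, default=1,
                        choices=sorted(MODEL_FILES), help="model index")
    return parser


def resolve_model(argv):
    """Return the model file selected by the given command-line arguments."""
    args = build_parser().parse_args(argv)
    return MODEL_FILES[args.model]
```

For example, `resolve_model(["--dataRoot", "data", "--model", "3"])` selects models/wdcgan_gp.py, and omitting `--model` falls back to models/dcgan.py.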

Results

1. GAN

Training for 100 epochs (.gif) Generated 64x64 samples (.jpg)

2. DCGAN

Training for 100 epochs (.gif) Generated 64x64 samples (.jpg)

3. W-DCGAN

Training for 100 epochs (.gif) Generated 64x64 samples (.jpg)

4. W-DCGAN_GP

Training for 100 epochs (.gif) Generated 64x64 samples (.jpg)

5. W-ResGAN_GP

Training for 100 epochs (.gif) Generated 64x64 samples (.jpg)

6. CGAN

Generated samples are based on the following category order, where the images of each category are shown in each row.

From top to bottom: green_hair, orange_hair, purple_hair, silver_hair, blue_eyes, green_eyes, pink_eyes, red_eyes

Training for 100 epochs (.gif) Generated 64x64 samples (.jpg)

7. ACGAN

Generated samples are based on the following category order, where the images of each category are shown in each row.

From top to bottom: green_hair, orange_hair, purple_hair, silver_hair, blue_eyes, green_eyes, pink_eyes, red_eyes

Training for 100 epochs (.gif) Generated 64x64 samples (.jpg)

8. StyleGAN

The video of training progression (12,000 iterations) is here.

Here are the generated 512x512 samples (.jpg).

More generated samples can be found here.

9. StyleGAN2

The video of training progression (3,000 iterations) is here.

Here are the generated 512x512 samples (.jpg).

More generated samples can be found here.

Here are the style mixing examples.

Things I've learned

  • GAN is really hard to train because it is difficult to balance D and G.
  • DCGAN generally works better than GAN and can generate clearer, more detailed images.
  • WGAN trains more stably, provides a loss metric that tracks convergence during training, and helps avoid the mode collapse problem.
  • WGAN-GP, which uses a gradient penalty, performs better and generates better images than WGAN with weight clipping.
  • CGAN is also hard to train and easily runs into mode collapse.
  • ACGAN seems more stable and more powerful at generating conditional images.
  • The blobs that appear during training are part of how StyleGAN 'creates' new features and are in fact doing something useful.
  • StyleGAN2 revises the AdaIN normalization (replacing it with weight demodulation) to eliminate the blob artifacts and improve overall quality.
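
To make the weight clipping versus gradient penalty comparison concrete, here is a minimal sketch of the WGAN-GP penalty term in PyTorch. It is not copied from models/wdcgan_gp.py; the function name `gradient_penalty` and the assumption of 4D image batches (N, C, H, W) are mine.

```python
import torch
from torch import nn


def gradient_penalty(critic, real, fake):
    """WGAN-GP term: mean of (||grad_x critic(x)||_2 - 1)^2 on mixed samples."""
    batch = real.size(0)
    # One interpolation coefficient per sample, broadcast over C, H, W.
    eps = torch.rand(batch, 1, 1, 1, device=real.device)
    mixed = (eps * real.detach() + (1 - eps) * fake.detach()).requires_grad_(True)

    scores = critic(mixed)
    # create_graph=True keeps the graph so the penalty itself can be
    # backpropagated into the critic's parameters.
    (grads,) = torch.autograd.grad(
        outputs=scores,
        inputs=mixed,
        grad_outputs=torch.ones_like(scores),
        create_graph=True,
    )
    grad_norm = grads.reshape(batch, -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()
```

In a critic update this term would typically be added as `loss_d = fake_scores.mean() - real_scores.mean() + lam * gradient_penalty(D, real, fake)`, where `lam = 10` is a commonly used weight and not necessarily the value used in this repo. With weight clipping, by contrast, the critic's weights are clamped to a small range after each step instead.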

Tips based on personal experience

  • The most important thing when training GANs is learning to balance D and G.
  • Adding noise to D's inputs and labels helps stabilize training.
  • Adam is a reliable default, but an exponentially decaying learning rate does not seem very helpful and makes no significant difference.
  • Training D several times per G update sometimes seems helpful (WGAN), but it easily makes D too strong and upsets the existing balance.
  • Giving D a higher learning rate than G seems to lead to better results.
  • D should be a little more powerful than G so that it can lead G to generate better images.
  • The learning rate is one of the most critical hyperparameters.
  • One of the more powerful ways to improve performance is data cleaning/augmentation.
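
A few of these tips can be sketched in PyTorch as follows. The helper names and every numeric value here (noise scale, label smoothing range, flip probability, learning rates, betas) are illustrative assumptions, not the settings used in this repo.

```python
import torch
from torch import nn


def noisy_inputs(images, std=0.1):
    """Add Gaussian noise to the images fed to D (both real and generated)."""
    return images + std * torch.randn_like(images)


def noisy_labels(batch_size, real, smooth=0.1, flip_prob=0.05, device="cpu"):
    """Soft targets for D: near 1 for real, near 0 for fake, rarely flipped."""
    base = torch.rand(batch_size, device=device) * smooth  # values in [0, smooth)
    labels = 1.0 - base if real else base
    flip = torch.rand(batch_size, device=device) < flip_prob
    return torch.where(flip, 1.0 - labels, labels)


def make_optimizers(generator, discriminator, lr_g=1e-4, lr_d=4e-4,
                    betas=(0.5, 0.999)):
    """Adam for both networks, with a higher learning rate for D than for G."""
    opt_g = torch.optim.Adam(generator.parameters(), lr=lr_g, betas=betas)
    opt_d = torch.optim.Adam(discriminator.parameters(), lr=lr_d, betas=betas)
    return opt_g, opt_d
```

In a discriminator step, `noisy_inputs` would wrap both the real batch and the generated batch, and `noisy_labels` would replace the hard 0/1 targets passed to the binary cross-entropy loss.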

Acknowledgements
