Difficulty in GAN training

I am attempting to train a GAN to learn the distribution of a number of features in an event. The discriminator and generator both reach a low loss during training, but the distributions of the generated events are shaped differently from the real ones, and I am unsure why.

I define the GAN as follows:

# Assumed imports for the snippets below (or tensorflow.keras, depending on the setup);
# noise_dim, variables and optimizer are defined elsewhere in my script.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, LeakyReLU

def create_generator():

    generator = Sequential()

    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))    
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(5))
    generator.add(LeakyReLU(0.2))
    generator.add(Dense(len(variables), activation='tanh'))

    return generator


def create_descriminator():
    discriminator = Sequential()

    discriminator.add(Dense(4, input_dim=len(variables)))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(4))
    discriminator.add(LeakyReLU(0.2))
    discriminator.add(Dense(1, activation='sigmoid'))   
    discriminator.compile(loss='binary_crossentropy', optimizer=optimizer)
    return discriminator


discriminator = create_descriminator()
generator = create_generator()

def define_gan(generator, discriminator):
    # make weights in the discriminator not trainable
    discriminator.trainable = False
    model = Sequential()
    model.add(generator)
    model.add(discriminator)
    model.compile(loss = 'binary_crossentropy', optimizer=optimizer)
    return model

gan = define_gan(generator, discriminator)
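
One Keras detail worth flagging here (an observation about the framework, not something stated in the question): the trainable flag is read when a model is compiled, so freezing the discriminator inside define_gan only affects the combined gan model, while the separately compiled discriminator keeps updating its own weights. A rough sanity check, assuming the models defined above:

# Right after define_gan(): the combined model only exposes the generator's weights,
# because discriminator.trainable was set to False before gan was compiled.
print(len(generator.trainable_weights))      # e.g. 8 tensors (4 Dense layers, kernel + bias)
print(len(discriminator.trainable_weights))  # 0 while the trainable flag is False
print(len(gan.trainable_weights))            # matches the generator's count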

And I train the GAN using this loop:

for epoch in range(epochs):
    for batch in range(steps_per_epoch):
        noise = np.random.normal(0, 1, size=(batch_size, noise_dim))
        fake_x = generator.predict(noise)

        real_x = x_train[np.random.randint(0, x_train.shape[0], size=batch_size)]

        x = np.concatenate((real_x, fake_x))
        # Real events have label 1, fake events have label 0
        disc_y = np.zeros(2*batch_size)
        disc_y[:batch_size] = 1

        # Update the discriminator on the combined real + fake batch
        discriminator.trainable = True
        d_loss = discriminator.train_on_batch(x, disc_y)

        # Update the generator through the combined model, with the discriminator frozen,
        # training it to make the fakes be classified as real
        discriminator.trainable = False
        y_gen = np.ones(batch_size)
        g_loss = gan.train_on_batch(noise, y_gen)

My real events are scaled with scikit-learn's StandardScaler:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)

Generating events:

X_noise = np.random.normal(0, 1, size=(n_events, GAN_noise_size))
X_generated = generator.predict(X_noise)
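
The unscaling mentioned in the next paragraph is presumably the inverse of the StandardScaler fitted above; a minimal sketch of that step, assuming the same fitted scaler object is still available:

# Map the generated events back to the original feature units
# using the scaler that was fitted on x_train.
X_generated_unscaled = scaler.inverse_transform(X_generated)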

When I then use the trained GAN, after training for a few hundred to a few thousand epochs, to generate new events and unscale them, I get distributions that look like this:

[image: histograms of the generated feature distributions]

And plotting two of the features against each other for the real and fake events gives: [image: scatter plot of two features, real vs. generated events]

This looks similar to mode collapse, but I don't see how that could produce these extreme values, with everything cut off beyond those points.

Lederer asked 12/3, 2020 at 16:26

Mode collapse results in the generator finding a few values, or a small range of values, that do the best job of fooling the discriminator. Since your range of generated values is fairly narrow, I believe you are experiencing mode collapse. You can train for different durations and plot the results to see when the collapse occurs. Sometimes, if you train long enough, it will fix itself and start learning again.

There are a billion recommendations on how to train GANs; I collected a bunch and then brute-forced my way through them for each GAN. You could try training the discriminator only every other cycle, to give the generator a chance to learn. Several people also recommend not training the discriminator on real and fake data in the same batch (I haven't done that, so I can't say what the impact is, if any). You might also want to try adding some batch normalization layers; a rough sketch of these last two ideas follows below. Jason Brownlee has a bunch of good articles on training GANs, so you may want to start there.
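
A minimal sketch of those two suggestions, assuming the same models, variables and training loop as in the question (the name create_generator_bn is illustrative, not from the original code):

from keras.models import Sequential
from keras.layers import Dense, LeakyReLU, BatchNormalization

def create_generator_bn():
    # Same generator as in the question, with batch normalization between layers
    # (middle layers abbreviated).
    generator = Sequential()
    generator.add(Dense(50, input_dim=noise_dim))
    generator.add(LeakyReLU(0.2))
    generator.add(BatchNormalization())
    generator.add(Dense(25))
    generator.add(LeakyReLU(0.2))
    generator.add(BatchNormalization())
    generator.add(Dense(len(variables), activation='tanh'))
    return generator

# Inside the training loop: update the discriminator on real and fake
# events in two separate batches instead of one concatenated batch.
d_loss_real = discriminator.train_on_batch(real_x, np.ones(batch_size))
d_loss_fake = discriminator.train_on_batch(fake_x, np.zeros(batch_size))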

Omni answered 27/3, 2020 at 3:23
Thanks for the tips. On closer inspection, by creating the plots every certain number of epochs, I realised the GAN output is stuck between -1 and 1, but I'm not entirely sure what in my code is causing the output to be stuck in this range. – Lederer
What is your original x_train dataset? The range between -1 and 1 is caused by the tanh activation on your generator's output layer. To understand why you are getting that distribution, I would need to see your original x_train dataset, or at least a description of it. – Omni
Ah yes, the original dataset contains variables representing quantities such as the mass and momentum of different particles in a collision, and the aim of the generator is to produce more events of that type. These variables just contain the raw values, so in that case would something like a linear output activation make more sense? The original dataset is passed through sklearn's StandardScaler, however, which I know normalises the dataset. – Lederer
But maybe I need to do some preprocessing to scale all the variables between -1 and 1 beforehand if I am to use a tanh activation. – Lederer
The mode collapse problem in your situation is most likely because of your data distribution; you should gaussianise your data before training the GAN, since it is easier for the model to learn a Gaussian distribution than a skewed one (see the sketch after these comments). And use a Wasserstein GAN implementation, which uses the Wasserstein loss and gives the best results. – Frye
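
A minimal sketch of the two preprocessing options raised in these comments, assuming the same x_train as in the question: rescaling the features into the tanh output range with MinMaxScaler, or gaussianising them with QuantileTransformer. Either transform has to be inverted on the generated events afterwards.

from sklearn.preprocessing import MinMaxScaler, QuantileTransformer

# Option 1: match the generator's tanh output range of [-1, 1].
scaler = MinMaxScaler(feature_range=(-1, 1))
x_train_scaled = scaler.fit_transform(x_train)

# Option 2: map each feature to an approximately normal distribution,
# which is usually easier for the generator to learn than a skewed one.
gaussianiser = QuantileTransformer(output_distribution='normal')
x_train_gauss = gaussianiser.fit_transform(x_train)

# After generation, map the fake events back to the original units.
fake_events = scaler.inverse_transform(generator.predict(noise))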
