GaussianDropout vs. Dropout vs. GaussianNoise in Keras

Can anyone explain the difference between the different dropout styles? From the documentation, I assumed that instead of dropping some units to zero (dropout), GaussianDropout multiplies those units by values drawn from a distribution. However, when I test it in practice, all units are touched; the result looks more like the classic GaussianNoise.

import numpy as np
import tensorflow as tf

tf.random.set_seed(0)
layer = tf.keras.layers.GaussianDropout(.05, input_shape=(2,))
data = np.arange(10).reshape(5, 2).astype(np.float32)
print(data)

outputs = layer(data, training=True)
print(outputs)

results:

[[0. 1.]
 [2. 3.]
 [4. 5.]
 [6. 7.]
 [8. 9.]]
tf.Tensor(
[[0.    1.399]
 [1.771 2.533]
 [4.759 3.973]
 [5.562 5.94 ]
 [8.882 9.891]], shape=(5, 2), dtype=float32)

edit:

Apparently, this is what I wanted all along:

from tensorflow.keras import backend as K

def RealGaussianDropout(x, rate, stddev):

    # keep each element untouched with probability 1 - rate ...
    random_tensor = tf.random.uniform(tf.shape(x))
    keep_mask = tf.cast(random_tensor >= rate, tf.float32)
    # ... and add Gaussian noise only to the elements that are not kept
    noised = x + K.random_normal(tf.shape(x), mean=.0, stddev=stddev)
    ret = tf.multiply(x, keep_mask) + tf.multiply(noised, (1 - keep_mask))

    return ret


outputs = RealGaussianDropout(data, 0.2, 0.1)
print(outputs)
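
A quick sanity check (my own addition, not from the original post): with rate=0.2, roughly 20% of the elements should receive noise while the rest pass through unchanged.

x = tf.ones((1000, 1000))
y = RealGaussianDropout(x, 0.2, 0.1)
# fraction of elements changed by the noise, should be close to 0.2
print(tf.reduce_mean(tf.cast(tf.not_equal(y, x), tf.float32)).numpy())
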
Tercet answered 30/12, 2020 at 4:05

You are right: GaussianDropout and GaussianNoise are very similar. You can verify the similarities by reproducing them yourself:

import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K

def dropout(x, rate):

    keep_prob = 1 - rate
    scale = 1 / keep_prob              # surviving elements are scaled up by 1/keep_prob
    ret = tf.multiply(x, scale)
    random_tensor = tf.random.uniform(tf.shape(x))
    keep_mask = random_tensor >= rate  # drop each element with probability `rate`
    ret = tf.multiply(ret, tf.cast(keep_mask, tf.float32))

    return ret

def gaussian_dropout(x, rate):

    # multiplicative Gaussian noise with mean 1
    stddev = np.sqrt(rate / (1.0 - rate))
    ret = x * K.random_normal(tf.shape(x), mean=1.0, stddev=stddev)

    return ret

def gaussian_noise(x, stddev):

    # additive Gaussian noise with mean 0
    ret = x + K.random_normal(tf.shape(x), mean=.0, stddev=stddev)

    return ret

Gaussian noise simply adds random normal values with mean 0, while Gaussian dropout multiplies the input by random normal values with mean 1 (Keras picks stddev = sqrt(rate / (1 - rate)), so the noise variance matches that of classic dropout). Both operations touch every element of the input. Classic dropout, instead, sets some input elements to 0 and scales up the remaining ones.
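
One extra check I added (not part of the original answer): all three transforms leave the expected value of the input unchanged, which is exactly why Dropout rescales the survivors by 1 / (1 - rate) and GaussianDropout multiplies by noise with mean 1. Averaging many samples recovers the input:

x = tf.ones((1, 4)) * 10.0
# each average should be close to the original input [[10. 10. 10. 10.]]
print(tf.reduce_mean(tf.stack([dropout(x, .4) for _ in range(10000)]), axis=0).numpy())
print(tf.reduce_mean(tf.stack([gaussian_dropout(x, .4) for _ in range(10000)]), axis=0).numpy())
print(tf.reduce_mean(tf.stack([gaussian_noise(x, .3) for _ in range(10000)]), axis=0).numpy())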

DROPOUT

data = np.arange(10).reshape(5, 2).astype(np.float32)

set_seed(0)
layer = tf.keras.layers.Dropout(.4)
out1 = layer(data, training=True)
set_seed(0)
out2 = dropout(data, .4)
print(tf.reduce_all(out1 == out2).numpy()) # TRUE

GAUSSIANDROPOUT

data = np.arange(10).reshape(5, 2).astype(np.float32)

set_seed(0)
layer = tf.keras.layers.GaussianDropout(.05)
out1 = layer(data, training=True)
set_seed(0)
out2 = gaussian_dropout(data, .05)
print(tf.reduce_all(out1 == out2).numpy()) # TRUE

GAUSSIANNOISE

data = np.arange(10).reshape(5, 2).astype(np.float32)

set_seed(0)
layer = tf.keras.layers.GaussianNoise(.3)
out1 = layer(data, training=True)
set_seed(0)
out2 = gaussian_noise(data, .3)
print(tf.reduce_all(out1 == out2).numpy()) # TRUE

To ensure reproducibility we used (TF2):

import os
import random

def set_seed(seed):

    tf.random.set_seed(seed)
    os.environ['PYTHONHASHSEED'] = str(seed)
    np.random.seed(seed)
    random.seed(seed)
Baranowski answered 30/12, 2020 at 9:35
Thank you for the detailed answer. It's strange that they made the effort to define two such similar functions. I'm still interested in a layer that works like dropout but randomly adds noise to, say, 20% of the data. Has no one tried this yet? – Tercet
Yes, I don't know precisely why... However, you can build your own dropout function/layer that mixes the masking ability of classic dropout with the noise addition of the Gaussian method, if you find it valuable (see the sketch below). – Baranowski
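
Following up on that comment, here is a minimal sketch of such a layer (my own illustration, not part of the original answer; the name MaskedGaussianNoise and its default parameters are made up). It leaves most units untouched and adds Gaussian noise to a random `rate` fraction of them, only at training time:

class MaskedGaussianNoise(tf.keras.layers.Layer):
    """Adds N(0, stddev) noise to a random `rate` fraction of the inputs (training only)."""

    def __init__(self, rate=0.2, stddev=0.1, **kwargs):
        super().__init__(**kwargs)
        self.rate = rate
        self.stddev = stddev

    def call(self, inputs, training=None):
        if not training:
            return inputs
        # 1 where noise is added, 0 where the input passes through unchanged
        noise_mask = tf.cast(tf.random.uniform(tf.shape(inputs)) < self.rate, inputs.dtype)
        noise = tf.random.normal(tf.shape(inputs), mean=0.0, stddev=self.stddev)
        return inputs + noise_mask * noise

    def get_config(self):
        return {**super().get_config(), "rate": self.rate, "stddev": self.stddev}


layer = MaskedGaussianNoise(rate=0.2, stddev=0.1)
print(layer(data, training=True))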
