When setting up a neural network in Keras you can use either the Sequential model or the Functional API. My understanding is that the former is easy to set up and manage, operating as a linear stack of layers, while the functional approach is useful for more complex architectures, particularly those that involve sharing the output of an internal layer. I personally prefer the Functional API for its versatility; however, I am having difficulties with advanced activation layers such as LeakyReLU. When using standard activations, in the Sequential model one can write:
from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Similarly, in the functional API one can write the above as:
from keras.models import Model
from keras.layers import Input, Dense

inpt = Input(shape=(100,))
dense_1 = Dense(32, activation='relu')(inpt)
out = Dense(10, activation='softmax')(dense_1)
model = Model(inpt, out)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
However, when using advanced activations like LeakyReLU and PReLU, in the Sequential model we write them as separate layers. For example:
from keras.layers import LeakyReLU

model = Sequential()
model.add(Dense(32, input_dim=100))
model.add(LeakyReLU(alpha=0.1))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
Now, I'm assuming one does the equivalent in the functional API approach:
inpt = Input(shape=(100,))
dense_1 = Dense(32)(inpt)
LR = LeakyReLU(alpha=0.1)(dense_1)
out = Dense(10, activation='softmax')(LR)
model = Model(inpt, out)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
My questions are:
- Is this the correct syntax in the functional approach?
- Why does Keras require a new layer for these advanced activation functions rather than allowing us to simply replace 'relu'?
- Is there something fundamentally different about creating a new layer for the activation function, rather than assigning it to an existing layer definition (as in the first examples, where we wrote 'relu')? I realise you could always write your activation functions, including standard ones, as new layers (a sketch of what I mean follows this list), although I have read that this should be avoided.
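To be concrete about that last point, here is a minimal sketch of what I mean by writing a standard activation as its own layer, using Keras's Activation layer; model_a and model_b are just my own names, and my assumption (which is part of the question) is that the two forms build the same computation:

from keras.models import Sequential
from keras.layers import Dense, Activation

# Standard activation passed as an argument to the layer:
model_a = Sequential()
model_a.add(Dense(32, activation='relu', input_dim=100))
model_a.add(Dense(10, activation='softmax'))

# The same activation written as a separate Activation layer:
model_b = Sequential()
model_b.add(Dense(32, input_dim=100))
model_b.add(Activation('relu'))
model_b.add(Dense(10, activation='softmax'))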