I am trying to figure out how to match activation=sigmoid and activation=softmax with the correct model.compile() loss parameters, specifically those associated with binary_crossentropy.
I have researched related topics and read the docs. I have also built a model and got it working with sigmoid but not with softmax, and I cannot get it working properly with the "from_logits" parameter.
Specifically, here it says:
Args:
from_logits: Whether output is expected to be a logits tensor. By default, we consider that output encodes a probability distribution.
This says to me that if you use a sigmoid activation you want "from_logits=True", and for a softmax activation you want "from_logits=False", which is the default. Here I am assuming that sigmoid provides logits and softmax provides a probability distribution.
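One way to check the assumption above is to compute both activations by hand. The sketch below is plain Python rather than Keras code, and the helper names and input scores are made up for illustration. If it is right, both sigmoid and softmax turn raw scores (logits) into probabilities, which would mean that neither activation outputs logits; the logits would be the values before the activation is applied.

```python
import math

def sigmoid(z):
    """Map one raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Map a list of raw scores (logits) to probabilities that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [-2.0, 0.0, 3.0]  # hypothetical raw scores
sigmoid_out = [sigmoid(z) for z in logits]  # each value lies in (0, 1)
softmax_out = softmax(logits)               # values sum to 1
```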
Next is some code:
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

# n_timesteps and n_features are defined elsewhere
model = Sequential()
model.add(LSTM(units=128,
               input_shape=(n_timesteps, n_features),
               return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(units=64, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(units=32))
model.add(Dropout(0.3))
model.add(Dense(16, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))
Notice the last line is using the sigmoid activation. Then:
model.compile(optimizer=optimizer,
              loss='binary_crossentropy',
              metrics=['accuracy'])
This works fine, but it is running with the default "from_logits=False", which expects a probability distribution.
If I do the following, it fails:
model.compile(optimizer=optimizer,
              loss='binary_crossentropy',
              metrics=['accuracy'],
              from_logits=True)  # For 'sigmoid' in above Dense
with this error message:
ValueError: Invalid argument "from_logits" passed to K.function with TensorFlow backend
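The error message suggests that compile() forwards the unknown keyword to K.function instead of to the loss. In the docs quoted above, from_logits is an argument of the loss function itself, so it would have to be supplied to the loss rather than to compile(). The following is a pure-Python stand-in for such a loss, not the Keras implementation; the helper names and the score are hypothetical. It sketches what the flag changes: with from_logits=True the loss applies the sigmoid itself, so the value passed in must be a raw score, not a sigmoid output.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, y_pred, from_logits=False):
    """Hypothetical helper: loss for one example with label y_true in {0, 1}."""
    # With from_logits=True, y_pred is a raw score and sigmoid is applied here.
    p = sigmoid(y_pred) if from_logits else y_pred
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

z = 1.5  # hypothetical raw score from a final layer with no activation

# Probability in, default flag: the activation was already applied.
loss_from_prob = binary_crossentropy(1, sigmoid(z))

# Raw score in, from_logits=True: the loss applies the sigmoid.
loss_from_logit = binary_crossentropy(1, z, from_logits=True)
```

Under these assumptions the two losses agree, which implies that from_logits=True would pair with a final layer that has no activation, not with a sigmoid layer.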
If I try using the softmax activation as:
model.add(Dense(1, activation='softmax'))
It runs, but I get 50% accuracy. With sigmoid I am getting 99%+ accuracy. (I am using a very contrived data set to debug my models and would expect very high accuracy. It is also a very small data set and will overfit, but that is OK for now.)
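A possible explanation for the 50% figure is that softmax normalizes across the units of a layer, so with a single unit the output is 1.0 regardless of the input. The sketch below uses plain Python with made-up scores and labels, and it assumes the data set is roughly balanced between the two classes, which is not stated above.

```python
import math

def softmax(zs):
    """Normalize a list of raw scores into probabilities that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Softmax over a single unit: whatever the raw score, the output is 1.0.
single_unit = [softmax([z])[0] for z in (-5.0, 0.0, 5.0)]

# Hypothetical balanced labels; a constant output of 1.0 always predicts class 1.
labels = [0, 1, 0, 1]
preds = [1 if p > 0.5 else 0 for p in [1.0] * len(labels)]
accuracy = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
```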
So I expect that I should be able to use the "from_logits" parameter in the compile function, but it does not recognize that parameter.
Also, I would like to know why it works with the sigmoid activation and not the softmax activation, and how I can get it working with the softmax activation.
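For softmax to carry information in a two-class problem, the layer presumably needs one unit per class. The following is a minimal sketch, assuming a two-unit output such as Dense(2, activation='softmax') paired with a categorical loss, neither of which appears in the code above. In plain Python, with hypothetical helper names and scores, a two-unit softmax over the logits [0, z] gives the same probability as sigmoid(z), so the two setups should be able to express the same model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Normalize a list of raw scores into probabilities that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

# Compare the class-1 probability of a two-unit softmax with sigmoid(z).
scores = (-3.0, 0.5, 2.0)  # hypothetical raw scores
diffs = [abs(softmax([0.0, z])[1] - sigmoid(z)) for z in scores]
```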
Thank you,
Jon.
keras.__version__, tf.__version__? The docs that you're citing are for tf.__version__ == '1.13.1'. – Antihero