How can I get around Keras pad_sequences() rounding float values to zero?

I have a text classification model built with Keras. I've been trying to pad my varying-length sequences, but the Keras function pad_sequences() just returns zeros.

I've figured out that with a list of integer sequences like the one below, it works just fine. But once the elements are floats, as in the second example, the output is all zeros.

x = [[1, 2], [3, 4, 5], [4], [7, 8, 9, 10]]
print(pad_sequences(x, padding='post'))

outputs:

[[ 1  2  0  0]
 [ 3  4  5  0]
 [ 4  0  0  0]
 [ 7  8  9 10]]

But

x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]
print(pad_sequences(x, padding='post'))

outputs:

[[ 0  0  0  0]
 [ 0  0  0  0]
 [ 0  0  0  0]
 [ 0  0  0  0]]

And this:

x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]
print(pad_sequences(x, padding='post', value=99))

outputs:

[[ 0  0 99 99]
 [ 0  0  0 99]
 [ 0 99 99 99]
 [ 0  0  0  0]]

So I guess this function just ignores floats/decimals. Is there a way I can get around this?

Atone answered 3/1, 2019 at 23:21

This happens because pad_sequences uses dtype='int32' by default. All values are therefore cast to integers, and since every value here is between 0 and 1, each one is truncated to zero. To resolve this, pass the dtype='float32' argument:

pad_sequences(x, padding='post', value=99, dtype='float32')
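
To see why the dtype matters, here is a minimal NumPy sketch of post-padding. The helper name pad_post is hypothetical and this is not Keras's actual implementation; it only assumes that the output array is allocated with the requested dtype, which is what makes float inputs truncate under int32.

```python
import numpy as np

def pad_post(sequences, value=0.0, dtype="int32"):
    """Hypothetical helper: right-pad ragged sequences into a 2-D array."""
    maxlen = max(len(seq) for seq in sequences)
    # Pre-fill with the padding value; the array dtype decides how entries are stored.
    out = np.full((len(sequences), maxlen), value, dtype=dtype)
    for row, seq in enumerate(sequences):
        # Converting floats to an integer dtype truncates them toward zero.
        out[row, :len(seq)] = np.asarray(seq, dtype=dtype)
    return out

x = [[.1, .2], [.3, .4, .5], [.4], [.7, .8, .9, .010]]
as_int = pad_post(x, value=99, dtype="int32")      # fractions collapse to 0
as_float = pad_post(x, value=99, dtype="float32")  # fractions are preserved
print(as_int)
print(as_float)
```

With int32 the sketch reproduces the zeros and 99s from the question; with float32 the original values survive next to the padding.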
Tumefy answered 3/1, 2019 at 23:41 Comment(1)
I get "TypeError: 'float' object cannot be interpreted as an integer". Does Keras handle floats, or is this function only for integers? (Widener)
