How to use the Keras Embedding layer when there is more than one text feature

I understand how to use the Keras Embedding layer when there is a single text feature, as in IMDB review classification. However, I am confused about how to use Embedding layers in a classification problem with more than one text feature. For example, I have a dataset with two text features, Diagnosis Text and Requested Procedure, and the label is binary (1 for approved, 0 for not approved). In the example below, x_train has two columns, Diagnosis and Procedure, unlike the IMDB dataset. Do I need to create two Embedding layers, one for Diagnosis and one for Procedure? If so, what code changes would be required?

from keras import preprocessing
from keras.models import Sequential
from keras.layers import Embedding, Flatten, Dense

x_train = preprocessing.sequence.pad_sequences(x_train, maxlen=20)
x_test = preprocessing.sequence.pad_sequences(x_test, maxlen=20)
model = Sequential()
model.add(Embedding(10000, 8, input_length=20))
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
model.fit(x_train, y_train, epochs=10, batch_size=32, validation_split=0.2)
Indre answered 2/4, 2018 at 5:28 Comment(0)

You have a couple of choices. You could concatenate the two features into one sequence per sample and learn a single embedding for both of them. Here is the logic:

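from keras.preprocessing.sequence import pad_sequences

# assumes X['diag'] and X['proc'] hold per-sample lists of token ids
# drawn from one shared vocabulary
all_features = [d + p for d, p in zip(X['diag'], X['proc'])]
X = pad_sequences(all_features, maxlen=max_len)
# build the model as usual; as you can see, only a single embedding layer is
# needed.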
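To make the data flow of this first option concrete, here is a minimal sketch in plain Python. The helpers `build_vocab`, `encode` and `pad` are hypothetical stand-ins for a Keras tokenizer and `pad_sequences` (padding on the left, as `pad_sequences` does by default), and the sample texts are made up for illustration. The point it tries to show is that both columns share one vocabulary, so a single Embedding layer can cover them.

```python
def build_vocab(texts):
    """Map each distinct lowercase word to an integer id; 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab) + 1)
    return vocab


def encode(text, vocab):
    """Turn a text into a list of word ids."""
    return [vocab[word] for word in text.lower().split()]


def pad(seq, maxlen):
    """Keep at most the last maxlen ids and left-pad with zeros."""
    seq = seq[-maxlen:]
    return [0] * (maxlen - len(seq)) + seq


# Made-up example rows: one diagnosis and one procedure per sample.
diagnoses = ["acute knee pain", "chronic back pain"]
procedures = ["knee mri", "spinal x-ray"]

# One vocabulary built over both columns.
vocab = build_vocab(diagnoses + procedures)

# Join the two id lists of each sample, then pad to a common length.
combined = [encode(d, vocab) + encode(p, vocab)
            for d, p in zip(diagnoses, procedures)]
x = [pad(seq, maxlen=8) for seq in combined]

print(x[0])  # [0, 0, 0, 1, 2, 3, 2, 6]
```

The resulting `x` has one row per sample and can be fed to a single-input model like the one in the question, provided the Embedding layer's vocabulary size is at least `len(vocab) + 1`.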

Alternatively, you can use the functional API and build a multiple-input model:

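from keras.layers import Input, Embedding, Flatten, Concatenate, Dense
from keras.models import Model

# sequence length 20 and vocabulary size 10000 are borrowed from the question;
# 512 is assumed to be the embedding size. Adjust all three per feature.
diag_inp = Input(shape=(20,))
diag_emb = Flatten()(Embedding(10000, 512)(diag_inp))
proc_inp = Input(shape=(20,))
proc_emb = Flatten()(Embedding(10000, 512)(proc_inp))

# concatenate them to make a single vector per sample
merged = Concatenate()([diag_emb, proc_emb])
# a single sigmoid unit matches the binary label in the question
out = Dense(1, activation='sigmoid')(merged)
model = Model(inputs=[diag_inp, proc_inp], outputs=[out])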

That is, you can either learn one embedding over the concatenated text, or learn a separate embedding per feature and concatenate their outputs inside the model.
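For the multiple-input option, the part that tends to be unclear is how the two features are passed at training time. The following is a rough sketch rather than a definitive implementation: it assumes TensorFlow's bundled Keras, reuses the sizes from the question (vocabulary 10000, length 20, embedding size 8), and trains on random stand-in data instead of real tokenized text. The names `build_model`, `x_diag` and `x_proc` are made up for this example.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

VOCAB_SIZE = 10000  # assumed, taken from the question
MAX_LEN = 20        # assumed padded length, taken from the question
EMB_DIM = 8         # assumed embedding size, taken from the question


def build_model():
    # One input and one Embedding layer per text feature.
    diag_inp = keras.Input(shape=(MAX_LEN,), name="diag")
    proc_inp = keras.Input(shape=(MAX_LEN,), name="proc")
    diag_emb = layers.Flatten()(layers.Embedding(VOCAB_SIZE, EMB_DIM)(diag_inp))
    proc_emb = layers.Flatten()(layers.Embedding(VOCAB_SIZE, EMB_DIM)(proc_inp))

    # Join the two flattened embeddings into one vector per sample.
    merged = layers.Concatenate()([diag_emb, proc_emb])
    out = layers.Dense(1, activation="sigmoid")(merged)

    model = keras.Model(inputs=[diag_inp, proc_inp], outputs=out)
    model.compile(optimizer="rmsprop", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model


# Random stand-in data: 64 samples, each feature already padded to MAX_LEN ids.
rng = np.random.default_rng(0)
x_diag = rng.integers(1, VOCAB_SIZE, size=(64, MAX_LEN))
x_proc = rng.integers(1, VOCAB_SIZE, size=(64, MAX_LEN))
y = rng.integers(0, 2, size=(64,))

model = build_model()
# The inputs are passed as a list, in the same order as Model(inputs=[...]).
model.fit([x_diag, x_proc], y, epochs=1, batch_size=32,
          validation_split=0.2, verbose=0)
preds = model.predict([x_diag, x_proc], verbose=0)
```

With real data, each feature would be tokenized and padded separately, so the two features can have different vocabulary sizes and sequence lengths if that suits them better.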

Unkenned answered 2/4, 2018 at 5:49 Comment(4)
Where is merged being used? - Trepang
@Trepang in the concatenation step - Unkenned
In the concat step, merged is created. Shouldn't it be something like out = Dense(2, activation='sigmoid')(merged)? - Trepang
@Trepang yep! fixed! - Unkenned
