I would like to use some pre-trained word embeddings in a Keras NN model, which have been published by Google in a very well known article. They have provided the code to train a new model, as well as the embeddings here.
However, it is not clear from the documentation how to retrieve an embedding vector from a given string of characters (word) from a simple python function call. Much of the documentation seems to center on dumping vectors to a file for an entire sentence presumably for sentimental analysis.
So far, I have seen that you can feed in pretrained embeddings with the following syntax:
embedding_layer = Embedding(number_of_words??,
out_dim=128??,
weights=[pre_trained_matrix_here],
input_length=60??,
trainable=False)
However, converting the different files and their structures to pre_trained_matrix_here
is not quite clear to me.
They have several softmax outputs, so I am uncertain which one would belong - and furthermore how to align the words in my input to the dictionary of words for which they have.
Is there a simple manner to use these word/char embeddings in keras and/or to construct the character/word embedding portion of the model in keras such that further layers may be added for other NLP tasks?