word-embedding Questions

1

Solved

I have been learning about NLP models and came across word embeddings, and saw examples in which it is possible to see relations between words by calculating their dot products and such. What I...
Obduliaobdurate asked 4/1, 2020 at 13:10
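
The relation-by-dot-product idea can be tried directly on toy vectors; the numbers below are made up for illustration, not taken from any real embedding:

```python
import numpy as np

# Hypothetical 3-d "word vectors" (real embeddings have 100-300 dimensions)
king  = np.array([0.8, 0.3, 0.1])
queen = np.array([0.7, 0.4, 0.2])
apple = np.array([0.1, 0.9, 0.8])

print(np.dot(king, queen))  # 0.70 - larger: related words point the same way
print(np.dot(king, apple))  # 0.43 - smaller: unrelated words diverge
```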

6

I want to understand what is meant by "dimensionality" in word embeddings. When I embed a word in the form of a matrix for NLP tasks, what role does dimensionality play? Is there a visual example ...
Calendra asked 29/7, 2017 at 23:24
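
Concretely, the "dimensionality" is the length of each word's vector, i.e. the number of columns in the embedding matrix; a toy sketch with assumed sizes:

```python
import numpy as np

vocab_size, embedding_dim = 10000, 300         # assumed sizes
E = np.random.rand(vocab_size, embedding_dim)  # one row per vocabulary word

word_vector = E[42]        # one word = one point in 300-dimensional space
print(word_vector.shape)   # (300,)
```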

1

I have this code that works for English but does not work for Persian: from gensim.models import Word2Vec as wv for sentence in sentences: tokens = sentence.strip().lower().split...
Ola asked 23/7, 2018 at 19:57
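
For what it's worth, gensim's Word2Vec is language-agnostic and only needs lists of tokens; the usual culprits with Persian are file encoding and the needless .lower(). A hedged sketch (file name is hypothetical):

```python
from gensim.models import Word2Vec

# Read the corpus as UTF-8; Persian has no letter case, so .lower() is unnecessary
with open("corpus_fa.txt", encoding="utf-8") as f:  # hypothetical file
    sentences = [line.strip().split() for line in f if line.strip()]

# vector_size in gensim 4.x (the parameter was called size in 3.x)
model = Word2Vec(sentences=sentences, vector_size=100, min_count=1)
```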

3

Solved

I have my own corpus and I train several Word2Vec models on it. What is the best way to evaluate them against each other and choose the best one? (Not manually obviously - I am looking for var...
Afterthought asked 4/10, 2018 at 11:22
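
One automated option is gensim's built-in analogy evaluation; a sketch assuming two saved models and the standard questions-words.txt analogy file:

```python
from gensim.models import Word2Vec

for path in ("model_a.w2v", "model_b.w2v"):  # hypothetical paths
    model = Word2Vec.load(path)
    # Fraction of analogy questions answered correctly (gensim >= 3.4)
    score, sections = model.wv.evaluate_word_analogies("questions-words.txt")
    print(path, score)
```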

1

Solved

I want to develop an NER model that uses word-embedding features to train a CRF model. The code works perfectly without word-embedding features, but when I insert embeddings as features for the CRF...
Immunology asked 6/11, 2019 at 18:36
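
A common stumbling block here is the feature format: sklearn-crfsuite expects each token as a dict of name -> string/bool/number, so embedding dimensions must be unpacked into individually named float features. A sketch (names are illustrative):

```python
def word_features(word, embeddings):
    """Turn one token into a CRF feature dict, embedding dims included."""
    features = {"word.lower()": word.lower()}
    if word in embeddings:  # e.g. a gensim KeyedVectors (assumption)
        for i, v in enumerate(embeddings[word]):
            features[f"embed_{i}"] = float(v)  # plain float, not a numpy scalar
    return features
```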

1

I have used Keras with pre-trained word embeddings, but I am not quite sure how to do it with a scikit-learn model. I need to do this in sklearn as well because I am using vecstack to ensemble both k...
Pinette asked 16/3, 2019 at 16:6
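
In sklearn there is no Embedding layer, so the usual workaround is a transformer that averages each document's word vectors into one fixed-length feature vector; a sketch assuming a gensim-style KeyedVectors lookup:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class MeanEmbeddingVectorizer(BaseEstimator, TransformerMixin):
    """Map each text to the mean of its words' pre-trained vectors."""
    def __init__(self, word_vectors, dim):
        self.word_vectors = word_vectors  # supports `word in wv` and `wv[word]`
        self.dim = dim

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.array([
            np.mean([self.word_vectors[w] for w in doc.split()
                     if w in self.word_vectors] or [np.zeros(self.dim)], axis=0)
            for doc in X
        ])
```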

2

I don't understand the Embedding layer of Keras. Although there are lots of articles explaining it, I am still confused. For example, the code below is from IMDB sentiment analysis: top_words = 500...
Interurban asked 12/8, 2017 at 11:6
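
The layer is easiest to see as a trainable lookup table from integer word ids to dense vectors; a minimal sketch with assumed sizes:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding

model = Sequential([Embedding(input_dim=5000, output_dim=32)])  # 5000-word vocab
ids = np.array([[4, 20, 11]])    # one "sentence" of three word ids
print(model.predict(ids).shape)  # (1, 3, 32): one 32-d vector per id
```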

6

Solved

I want to load a pre-trained word2vec embedding with gensim into a PyTorch embedding layer. How do I get the embedding weights loaded by gensim into the PyTorch embedding layer?
Conscript asked 7/4, 2018 at 18:21
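
A sketch of the usual route (file path is hypothetical): pull the weight matrix out of gensim and hand it to nn.Embedding.from_pretrained:

```python
import torch
import torch.nn as nn
from gensim.models import KeyedVectors

kv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)  # hypothetical
weights = torch.FloatTensor(kv.vectors)  # kv.syn0 in older gensim versions
emb = nn.Embedding.from_pretrained(weights, freeze=True)

# Look ids up via gensim's vocabulary, e.g. kv.key_to_index["word"] in gensim 4.x
```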

3

Solved

I have trained word2vec in gensim. In Keras, I want to use it to make a matrix for each sentence using that word embedding, since storing the matrices of all the sentences is very space- and memory-inefficient. ...
Thromboembolism asked 1/9, 2018 at 8:53
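
Rather than materializing a matrix per sentence, the gensim weights can initialize a frozen Keras Embedding layer, which performs the lookup batch by batch; a sketch with a hypothetical model path:

```python
from gensim.models import Word2Vec
from tensorflow.keras.layers import Embedding
from tensorflow.keras.initializers import Constant

w2v = Word2Vec.load("my_model.w2v")  # hypothetical path
weights = w2v.wv.vectors             # vocab_size x dim matrix

# The layer looks vectors up on the fly, so no sentence matrices are stored
emb = Embedding(weights.shape[0], weights.shape[1],
                embeddings_initializer=Constant(weights), trainable=False)
```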

0

I'm trying to solve a time series problem. In short, for each customer and material (SKU code), I have different orders placed in the past. I need to build a model that predicts the number of days b...
Bodycheck asked 16/7, 2019 at 8:16

1

I've been thinking about 0-padding of word sequences and how that 0-padding is then handled by the Embedding layer. At first glance, one would think that you want to keep the embeddings = 0.0 as w...
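
Keras sidesteps the question with masking: with mask_zero=True, downstream layers skip the padded positions entirely, so it matters less what the padding row's embedding drifts to. A one-line sketch:

```python
from tensorflow.keras.layers import Embedding

# Reserve id 0 for padding; LSTM/GRU layers after this will ignore those steps
emb = Embedding(input_dim=10001, output_dim=64, mask_zero=True)
```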

1

Solved

I have a seq2seq model which is working fine. I want to add an embedding layer to this network, but I ran into an error. This is my architecture using pretrained word embeddings, which is working...
Hagiography asked 3/6, 2019 at 20:16

1

What is the difference between tokenize.fit_on_text, tokenize.text_to_sequence, and word embeddings? I tried searching on various platforms but didn't get a suitable answer.
Gwenni asked 5/6, 2019 at 18:56
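
In short: the Tokenizer turns text into integer ids, and embeddings later turn those ids into vectors. A sketch using the actual (plural) Keras method names, fit_on_texts and texts_to_sequences:

```python
from tensorflow.keras.preprocessing.text import Tokenizer

texts = ["the cat sat", "the dog ran"]
tok = Tokenizer()
tok.fit_on_texts(texts)               # builds the word -> integer-id vocabulary
seqs = tok.texts_to_sequences(texts)  # maps each text to a list of those ids

print(tok.word_index)  # e.g. {'the': 1, 'cat': 2, ...}
print(seqs)            # integer ids only - no embeddings involved yet
```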

2

I have the following sequential model that works with variable-length inputs: m = Sequential() m.add(Embedding(len(chars), 4, name="embedding")) m.add(Bidirectional(LSTM(16, unit_forget_bias=True,...
Kroo asked 2/8, 2017 at 11:57
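
For reference, a completed sketch of such a model (the character inventory and output head are assumptions); omitting input_length on the Embedding is what lets batches have any sequence length:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

chars = list("abcdefgh")  # hypothetical character inventory

m = Sequential()
m.add(Embedding(len(chars), 4, name="embedding"))  # no fixed input length
m.add(Bidirectional(LSTM(16, unit_forget_bias=True)))
m.add(Dense(1, activation="sigmoid"))              # assumed output head
```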

1

I need to count the frequency of each word in a word2vec model's training vocabulary. I want the output to look like this: term count apple 123004 country 4432180 runs 620102 ... Is it possible to do...
Pressurize asked 12/4, 2019 at 17:42
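
Gensim keeps the training counts in the model's vocabulary; a sketch covering both API generations (model path is hypothetical):

```python
from gensim.models import Word2Vec

model = Word2Vec.load("model.w2v")  # hypothetical path

# gensim 4.x
for word in model.wv.index_to_key:
    print(word, model.wv.get_vecattr(word, "count"))

# gensim 3.x equivalent: model.wv.vocab[word].count
```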

2

Solved

I found "unk" token in the glove vector file glove.6B.50d.txt downloaded from https://nlp.stanford.edu/projects/glove/. Its value is as follows: unk -0.79149 0.86617 0.11998 0.00092287 0.2776 -0.4...
Warrenwarrener asked 12/3, 2018 at 16:20

2

I would like to use some pre-trained word embeddings in a Keras NN model, which were published by Google in a very well-known article. They have provided the code to train a new model, as well...
Saponin asked 31/5, 2017 at 1:19
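
The usual recipe is to load the GoogleNews vectors with gensim, copy them into a matrix aligned with your own tokenizer's indices, and initialize a frozen Embedding layer; a sketch (the tiny word_index stands in for your tokenizer's):

```python
import numpy as np
from gensim.models import KeyedVectors
from tensorflow.keras.layers import Embedding
from tensorflow.keras.initializers import Constant

kv = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)

word_index = {"paris": 1, "france": 2}  # stand-in for your tokenizer's word_index
vocab_size, dim = len(word_index) + 1, 300

matrix = np.zeros((vocab_size, dim))  # row 0 stays zero for padding
for word, i in word_index.items():
    if word in kv:
        matrix[i] = kv[word]

emb = Embedding(vocab_size, dim,
                embeddings_initializer=Constant(matrix), trainable=False)
```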

2

When using, for example, gensim's word2vec or a similar method to train your embedding vectors, I was wondering what a good ratio is, or whether there is a preferred ratio, between the embedding dimension and the vo...
Doradorado asked 27/1, 2018 at 19:50
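
There is no agreed-upon law, but one rule of thumb sometimes quoted (e.g. in TensorFlow's feature-column docs) is the fourth root of the vocabulary size:

```python
# Heuristic only, not a law: dim ≈ vocab_size ** 0.25
vocab_size = 50_000
print(round(vocab_size ** 0.25))  # ~15
```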

1

Solved

When using GloVe embedding in NLP tasks, some words from the dataset might not exist in GloVe. Therefore, we instantiate random weights for these unknown words. Would it be possible to freeze weig...
Lengthen asked 28/2, 2019 at 11:23
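
In PyTorch, one way is a gradient hook that zeroes the updates for the pre-trained (GloVe) rows while letting the randomly initialized rows learn; a sketch with assumed sizes:

```python
import torch
import torch.nn as nn

num_words, dim = 100, 50
emb = nn.Embedding(num_words, dim)

known = torch.zeros(num_words, dtype=torch.bool)
known[:80] = True  # pretend rows 0-79 came from GloVe (assumption)

def freeze_known_rows(grad):
    grad = grad.clone()
    grad[known] = 0.0  # no updates for GloVe rows; unknown rows still train
    return grad

emb.weight.register_hook(freeze_known_rows)
```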

1

According to several posts I found on stackoverflow (for instance this Why does word2Vec use cosine similarity?), it's common practice to calculate the cosine similarity between two word vectors af...
Convoke asked 28/1, 2019 at 22:10
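
For completeness, cosine similarity is just the dot product after normalizing away vector length, which is why it is preferred over the raw dot product for comparing directions:

```python
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

v1 = np.array([1.0, 2.0, 3.0])
v2 = np.array([2.0, 4.0, 6.0])    # same direction, twice the length
print(cosine_similarity(v1, v2))  # 1.0 - magnitude is ignored
```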

2

I have a set of pre-trained word2vec word vectors and a corpus. I want to use the word vectors to represent words in the corpus. The corpus has some words in it that I don't have trained word vecto...
Indirection asked 9/2, 2018 at 1:51
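
A common fallback for out-of-vocabulary words is a small random vector (averaging all known vectors is another option); a sketch assuming a gensim-style lookup:

```python
import numpy as np

rng = np.random.default_rng(0)

def vector_for(word, kv, dim=300):
    """Pre-trained vector if known, otherwise a small random OOV vector."""
    if word in kv:  # kv: e.g. a gensim KeyedVectors
        return kv[word]
    return rng.uniform(-0.25, 0.25, dim)
```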

1

Solved

I know that in gensim's KeyedVectors model, one can access the embedding matrix via the attribute model.syn0. There is also a syn0norm, which doesn't seem to work for the GloVe model I recently loade...
Maryjomaryl asked 14/11, 2018 at 13:56
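
In gensim 3.x the normalized matrix only exists after calling init_sims(); gensim 4.x replaced the attribute with a method. A sketch (model path hypothetical):

```python
from gensim.models import KeyedVectors

kv = KeyedVectors.load("glove_model.kv")  # hypothetical path

# gensim 3.x: syn0norm / vectors_norm is None until init_sims() runs
kv.init_sims()
normed = kv.vectors_norm

# gensim 4.x equivalent: normed = kv.get_normed_vectors()
```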

2

Solved

I want to use spacy to tokenize sentences to get a sequence of integer token-ids that I can use for downstream tasks. I expect to use it something like the code below. Please fill in ??? import spacy # Loa...
Heptameter asked 8/11, 2018 at 16:45
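
One way to fill in the blanks: every spaCy token carries an integer hash id, token.orth, which round-trips through the StringStore. Note these are large hash values, not compact 0..N indices ready for an Embedding layer:

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("This is a sentence.")

ids = [token.orth for token in doc]          # integer hash id per token
words = [nlp.vocab.strings[i] for i in ids]  # ids map back to the text
print(ids, words)
```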

1

I am using the Gensim wrapper to obtain WordRank embeddings (I am following their tutorial to do this) as follows. from gensim.models.wrappers import Wordrank model = Wordrank.train(wr_path = "models...
Hoard asked 27/10, 2017 at 8:11

2

Solved

According to https://code.google.com/archive/p/word2vec/: It was recently shown that the word vectors capture many linguistic regularities, for example vector operations vector('Paris') - vec...
Azaria asked 17/9, 2018 at 9:30
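
Gensim exposes exactly this arithmetic through most_similar; a sketch with a hypothetical vector file:

```python
from gensim.models import KeyedVectors

kv = KeyedVectors.load_word2vec_format("vectors.bin", binary=True)  # hypothetical

# vector('Paris') - vector('France') + vector('Italy') ≈ vector('Rome')
print(kv.most_similar(positive=["Paris", "Italy"], negative=["France"], topn=1))
```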
