Doc2vec: Only 10 docvecs in gensim doc2vec model?

I used gensim to fit a Doc2Vec model, with tagged documents (more than 10 of them) as training data. The goal is to get the doc vectors of all training docs, but only 10 vectors can be found in model.docvecs.

An example of the training data (more than 10 documents):

docs = ['This is a sentence', 'This is another sentence', ....]

With some preprocessing:

from gensim.models.doc2vec import TaggedDocument

doc_ = [d.strip().split(" ") for d in docs]
doc_tagged = []
for i in range(len(doc_)):
  tagd = TaggedDocument(doc_[i], str(i))
  doc_tagged.append(tagd)

A resulting tagged document looks like this:

TaggedDocument(words=array(['a', 'b', 'c', ..., ],
  dtype='<U32'), tags='117')

Fitting the Doc2Vec model:

from gensim.models import Doc2Vec

model = Doc2Vec(min_count=1, window=10, size=100, sample=1e-4, negative=5, workers=8)
model.build_vocab(doc_tagged)
model.train(doc_tagged, total_examples=model.corpus_count, epochs=model.iter)

Then I check the final model:

len(model.docvecs)

The result is 10.

I tried other datasets (with more than 100 and more than 1000 documents) and got the same result for len(model.docvecs). So my question is: how can I use model.docvecs to get the full set of vectors (without using model.infer_vector)? Is model.docvecs designed to provide all training docvecs?
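
One possible explanation, offered as a sketch and not a confirmed diagnosis: gensim expects the tags argument of TaggedDocument to be a list of tags, and the code above passes a plain string (str(i)). Iterating over a string yields its individual characters, so a tag like '117' would be read as the tags '1', '1', '7', and the only distinct tags across all documents would be the ten digits. The snippet below illustrates that counting effect in plain Python. TaggedDoc is a hypothetical stand-in for gensim's TaggedDocument, and distinct_tags is an illustrative helper, not a gensim function.

```python
from collections import namedtuple

# Hypothetical stand-in with the same (words, tags) shape as gensim's
# TaggedDocument, so this sketch runs without gensim installed.
TaggedDoc = namedtuple("TaggedDoc", ["words", "tags"])

# Made-up corpus of 150 short documents, tokenized as in the question.
docs = ["doc number %d" % i for i in range(150)]
tokenized = [d.strip().split(" ") for d in docs]


def distinct_tags(tagged_docs):
    """Collect distinct tags by iterating over each document's `tags`."""
    seen = set()
    for td in tagged_docs:
        for tag in td.tags:  # a plain string is iterated character by character
            seen.add(tag)
    return seen


# Tags given as a plain string, as in the question.
string_tagged = [TaggedDoc(words, str(i)) for i, words in enumerate(tokenized)]
# Tags given as a one-element list.
list_tagged = [TaggedDoc(words, [str(i)]) for i, words in enumerate(tokenized)]

n_string = len(distinct_tags(string_tagged))
n_list = len(distinct_tags(list_tagged))
print(n_string, n_list)
```

If this is indeed the cause, building the documents with TaggedDocument(doc_[i], [str(i)]) should give one entry in model.docvecs per training document.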
