CalledProcessError: Returned non-zero exit status 1

When I try to run:

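import gensim
from gensim import corpora
from gensim.utils import simple_preprocess

# stop_words, bigram_mod1 and data_words1 are defined earlier in my script

def remove_stopwords(texts):
    return [[word for word in simple_preprocess(str(doc)) if word not in stop_words] for doc in texts]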

def make_bigrams(texts):
    return [bigram_mod1[doc] for doc in texts]

# Remove Stop Words
data_words_nostops1 = remove_stopwords(data_words1)

# Form Bigrams
data_words_bigrams1 = make_bigrams(data_words_nostops1)

# Create Dictionary
id2word1 = corpora.Dictionary(data_words_bigrams1)

# Create Corpus
texts1 = data_words_bigrams1

# Term Document Frequency
corpus1 = [id2word1.doc2bow(text) for text in texts1]

mallet_path = 'T:/Python/Mallet/mallet-2.0.8/bin/mallet'

ldamallet = gensim.models.wrappers.LdaMallet(mallet_path, corpus=corpus1, num_topics=15, id2word=id2word1)

I get the following error:

CalledProcessError: Command 'T:/Python/Mallet/mallet-2.0.8/bin/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input C:\Users\E26E5~1.RIJ\AppData\Local\Temp\3\a66fc0_corpus.txt --output C:\Users\E26E5~1.RIJ\AppData\Local\Temp\3\a66fc0_corpus.mallet' returned non-zero exit status 1.

What can I do in my code specifically to make it work?

This error has been asked about several times before, but each answer seems so specific to a particular case that I cannot tell what to change in my own code. Can someone explain what this error actually means?

Cockburn answered 15/5, 2019 at 11:44 Comment(1)
Unfortunately I think that happens for any kind of Java error. Do you have access to the Java stacktrace? - Cambium
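One way to get at that stack trace is to re-run the failing command yourself and capture what it writes to stderr, since the CalledProcessError only reports the exit status. The following is a minimal sketch, not part of gensim: `run_and_capture` is a hypothetical helper name, and the demo command is a stand-in that simply fails with exit status 1. In practice you would pass the command string copied from your own error message, assuming the temporary input file it names still exists.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run a command and return (exit_code, stdout, stderr) as text.

    Hypothetical helper for debugging; not part of gensim or Mallet.
    """
    result = subprocess.run(
        cmd,
        shell=isinstance(cmd, str),  # a plain string is handed to the shell
        capture_output=True,         # collect stdout and stderr instead of printing
        text=True,                   # decode bytes to str
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    # Stand-in for the Mallet command: a child process that writes to
    # stderr and exits with status 1. Replace it with the command string
    # copied from your own CalledProcessError message.
    demo_cmd = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('boom'); sys.exit(1)",
    ]
    code, out, err = run_and_capture(demo_cmd)
    print("exit status:", code)
    print("stderr:", err)
```

If Java or Mallet is the culprit, the captured stderr should contain the actual message (for example a missing class or an unreadable file) that the exit status alone hides.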

Make sure you have:

  • The Java Development Kit (JDK) downloaded and installed
  • Mallet downloaded and unzipped

Also make sure the MALLET_HOME environment variable points to the Mallet root folder (the one that contains bin); otherwise set it, e.g.:

import os
os.environ.update({'MALLET_HOME': r'T:/Python/Mallet/mallet-2.0.8'})
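The checklist above can be turned into a quick preflight check before calling the wrapper. This is only a sketch under a few assumptions: `check_mallet_setup` is a hypothetical helper, MALLET_HOME is taken to be the variable that Mallet's Windows launcher script reads (it matters mainly on Windows), and the launcher is assumed to be either `mallet` or `mallet.bat` inside `bin`.

```python
import os
import shutil


def check_mallet_setup(mallet_path, env=None):
    """Return a list of human-readable problems with the local Mallet setup.

    Hypothetical helper; an empty list means none of the checks failed.
    """
    env = os.environ if env is None else env
    problems = []

    # Mallet is a Java program, so a java executable must be reachable.
    if shutil.which("java") is None:
        problems.append("java was not found on PATH")

    # On Windows the launcher file is mallet.bat, so accept either name.
    if not (os.path.isfile(mallet_path) or os.path.isfile(mallet_path + ".bat")):
        problems.append(f"no Mallet launcher at {mallet_path}")

    # MALLET_HOME should be the Mallet root folder, i.e. the parent of bin/.
    home = env.get("MALLET_HOME")
    if not home:
        problems.append("MALLET_HOME is not set")
    elif not os.path.isdir(os.path.join(home, "bin")):
        problems.append(f"MALLET_HOME={home} has no bin subfolder")

    return problems


if __name__ == "__main__":
    # Path taken from the question; adjust to your own installation.
    for problem in check_mallet_setup("T:/Python/Mallet/mallet-2.0.8/bin/mallet"):
        print("problem:", problem)
```

Running it before `LdaMallet(...)` narrows down whether the exit status 1 comes from a missing Java install, a wrong `mallet_path`, or an environment variable that points at `bin` instead of the Mallet root.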
Cuboid answered 13/4, 2020 at 5:10 Comment(0)

To add some detail to the answer above:

Install gensim 3.8.3 to use the wrapper class (gensim.models.wrappers was removed in gensim 4.0):

!pip install gensim==3.8.3

Download Mallet from http://mallet.cs.umass.edu/dist/mallet-2.0.8.zip

Unzip it, put the mallet folder on any drive or in your documents folder, and copy its path.

Let's say:

path = '/Users/name/Document/mallet/mallet-2.0.8/'  # for macOS and Linux

path = 'C:/mallet/mallet-2.0.8/'  # for Windows

import os

from gensim.models.wrappers import LdaMallet

# MALLET_HOME must point to the Mallet root folder (the one that contains bin/)
os.environ.update({'MALLET_HOME': path})

mallet_path = path + 'bin/mallet'

## Train LDA with Mallet

ldamallet = LdaMallet(mallet_path, corpus=corpus, num_topics=20, id2word=dictionary)


Beutler answered 6/6, 2022 at 8:10 Comment(0)
