To download a particular dataset or model, use the nltk.download() function, e.g. if you are looking to download the punkt sentence tokenizer, use:
$ python3
>>> import nltk
>>> nltk.download('punkt')
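Once punkt is downloaded, you can sanity-check it with the sentence tokenizer it backs; a minimal sketch (the example sentence is arbitrary):

>>> from nltk.tokenize import sent_tokenize
>>> sent_tokenize("NLTK is installed. The punkt model works.")
['NLTK is installed.', 'The punkt model works.']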
If you're unsure which data/model you need, you can start out with the basic list of data and models:
>>> import nltk
>>> nltk.download('popular')
It will download a list of "popular" resources.
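To verify that a particular resource actually landed in your nltk_data directory, you can ask NLTK to locate it; nltk.data.find() returns the path if the resource is installed and raises a LookupError otherwise:

>>> import nltk
>>> nltk.data.find('tokenizers/punkt')    # path to the installed resource
>>> nltk.data.find('corpora/stopwords')   # raises LookupError if missing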
Ensure that you have the latest version of NLTK, because it is always improving and constantly maintained:
$ pip install --upgrade nltk
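To confirm which version you ended up with after upgrading, a quick one-liner:

$ python3 -c "import nltk; print(nltk.__version__)"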
EDITED
If you want to avoid errors when downloading larger datasets (e.g. panlex_lite) with nltk, from https://mcmap.net/q/263475/-nltk-panlex_lite-giving-me-error:
$ rm /Users/<your_username>/nltk_data/corpora/panlex_lite.zip
$ rm -r /Users/<your_username>/nltk_data/corpora/panlex_lite
$ python
>>> import nltk
>>> dler = nltk.downloader.Downloader()
>>> dler._update_index()
>>> dler._status_cache['panlex_lite'] = 'installed' # Trick the index into treating panlex_lite as if it's already installed.
>>> dler.download('popular')
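If you'd rather do the cleanup from Python instead of the shell (so the home directory expands for whoever runs it), a rough equivalent of the two rm commands above, assuming the default nltk_data location under your home directory:

>>> import os, shutil
>>> corpora = os.path.expanduser('~/nltk_data/corpora')
>>> os.remove(os.path.join(corpora, 'panlex_lite.zip'))   # same as the rm above
>>> shutil.rmtree(os.path.join(corpora, 'panlex_lite'))   # same as the rm -r above

Then continue with the same Downloader() steps shown above.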
And if anyone wants to find the nltk_data directory, see https://mcmap.net/q/211871/-nltk-doesn-39-t-add-nltk_data-to-search-path
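You can also inspect the directories NLTK searches straight from Python; nltk.data.path is a plain list, checked in order:

>>> import nltk
>>> print(nltk.data.path)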
And to configure the nltk_data path, see https://mcmap.net/q/209709/-how-to-config-nltk-data-directory-from-code
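For a quick in-code tweak (the paths here are placeholders), you can either add a directory to the search path or point the downloader at a specific location with download_dir:

>>> import nltk
>>> nltk.data.path.append('/path/to/nltk_data')                # extra search location
>>> nltk.download('punkt', download_dir='/path/to/nltk_data')  # download to a specific directory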
You can install nltk however you want, but you then use nltk.download() to download the corpus data after you've installed it. – Supernaturalism