In addition to the command-line option already mentioned, you can install NLTK data programmatically from your Python script by passing an argument to the download() function.
See the help(nltk.download) text, specifically:
Individual packages can be downloaded by calling the ``download()``
function with a single argument, giving the package identifier for the
package that should be downloaded:
>>> download('treebank') # doctest: +SKIP
[nltk_data] Downloading package 'treebank'...
[nltk_data] Unzipping corpora/treebank.zip.
I can confirm that this works both for downloading one package at a time and when passed a list or tuple of package identifiers.
>>> import nltk
>>> nltk.download('wordnet')
[nltk_data] Downloading package 'wordnet' to
[nltk_data] C:\Users\_my-username_\AppData\Roaming\nltk_data...
[nltk_data] Unzipping corpora\wordnet.zip.
True
You can also call download() for a package that is already installed without any problems:
>>> nltk.download('wordnet')
[nltk_data] Downloading package 'wordnet' to
[nltk_data] C:\Users\_my-username_\AppData\Roaming\nltk_data...
[nltk_data] Package wordnet is already up-to-date!
True
Also, it appears the function returns a boolean value that you can use to check whether the download succeeded:
>>> nltk.download('not-a-real-name')
[nltk_data] Error loading not-a-real-name: Package 'not-a-real-name'
[nltk_data] not found in index
False
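
Building on that return value, here is a minimal sketch of how a script could check several packages and report the ones that failed. The helper name `ensure_nltk_packages` and its `downloader` parameter are my own inventions for illustration, not part of NLTK; the only NLTK call used is `nltk.download(name)`, and the sketch assumes it returns True or False as in the sessions above.

```python
def ensure_nltk_packages(packages, downloader=None):
    """Try to download each package and return the names that failed.

    `downloader` is a hypothetical hook (not an NLTK feature) so the
    helper can be exercised with a stub instead of hitting the network.
    """
    if downloader is None:
        # Imported lazily so the function can be defined without NLTK installed.
        import nltk
        downloader = nltk.download

    failed = []
    for name in packages:
        # Assumes the boolean return value observed above:
        # True on success (or already up-to-date), False on error.
        if not downloader(name):
            failed.append(name)
    return failed
```

In a script you might call `ensure_nltk_packages(['wordnet', 'treebank'])` at startup and abort with a clear message if the returned list is not empty. Downloading one package per call keeps it obvious which identifier caused a failure.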