Saving nltk drawn parse tree to image file

Is there any way to save the draw image from tree.draw() to an image file programmatically? I tried looking through the documentation, but I couldn't find anything.

Guarantor answered 2/5, 2014 at 13:16 Comment(1)
Maybe print_to_file, for CanvasFrames.Onagraceous
M
13

I had exactly the same need, and looking into the source code of nltk.draw.tree I found a solution:

from nltk import Tree
from nltk.draw.util import CanvasFrame
from nltk.draw import TreeWidget

cf = CanvasFrame()
t = Tree.fromstring('(S (NP this tree) (VP (V is) (AdjP pretty)))')
tc = TreeWidget(cf.canvas(),t)
cf.add_widget(tc,10,10) # (10,10) offsets
cf.print_to_file('tree.ps')
cf.destroy()

The output file is PostScript, which you can convert to an image file with ImageMagick from a terminal:

$ convert tree.ps tree.png

I think this is a quick and dirty solution; it could be inefficient in that it displays the canvas and destroys it later (perhaps there is an option to disable display, which I couldn't find). Please let me know if there is any better way.
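The steps above can be wrapped in a small helper so the window is always closed, even when drawing or writing fails. This is only a sketch: the name save_tree_postscript and its offset parameter are made up for illustration, and it still opens a Tk window, so it needs a display.

```python
def save_tree_postscript(tree, path, offset=(10, 10)):
    """Draw an NLTK tree on a Tk canvas and write it to a PostScript file.

    Hypothetical helper, not part of NLTK.
    """
    # Imported lazily because nltk.draw needs Tkinter and a display.
    from nltk.draw.util import CanvasFrame
    from nltk.draw import TreeWidget

    frame = CanvasFrame()
    try:
        widget = TreeWidget(frame.canvas(), tree)
        frame.add_widget(widget, *offset)  # (x, y) offsets on the canvas
        frame.print_to_file(path)
    finally:
        # Close the window even if drawing or writing raised an error.
        frame.destroy()
    return path

# Usage (requires nltk and a display):
# from nltk import Tree
# t = Tree.fromstring('(S (NP this tree) (VP (V is) (AdjP pretty)))')
# save_tree_postscript(t, 'tree.ps')
```

The try/finally block is the only real change from the snippet above: it keeps a failed export from leaving a stray window open.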

Masse answered 15/7, 2014 at 1:10 Comment(3)
Nice. I think you need to use Tree.fromstring() to build the tree from a string.Leviathan
Yes: In NLTK 3, the Tree constructor no longer accepts a tree in string form. Updated.Belden
Could you give me a hint on how to create a Tree object from a list like [Tree('kerb_NN', ['Dropped_VBN', Tree('provide_VB', ['to_TO', Tree('access_NN', [Tree('to_IN', [Tree('hardstanding_VBG', ['new_JJ', 'permeable_JJ'])])]), Tree('for_IN', [Tree('vehicle_NN', ['one_CD', 'domestic_JJ'])])])])] please? I'd like to draw it. How to cast it?Swahili
I
17

Using the nltk.draw.tree.TreeView object to create the canvas frame automatically:

>>> from nltk.tree import Tree
>>> from nltk.draw.tree import TreeView
>>> t = Tree.fromstring('(S (NP this tree) (VP (V is) (AdjP pretty)))')
>>> TreeView(t)._cframe.print_to_file('output.ps')

Then:

>>> import os
>>> os.system('convert output.ps output.png')

[output.png]:

enter image description here

Illuminance answered 31/1, 2016 at 20:2 Comment(1)
Nice! Is there any way to convert a .ps file to .png in Python that is OS-agnostic?Neap
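On the OS-agnostic conversion asked about in the comment, one possible approach is to call Ghostscript directly through subprocess instead of going through a shell. The sketch below assumes Ghostscript is installed and on PATH (usually gs on Linux/macOS, gswin64c or gswin32c on Windows). The function names are made up for illustration. The -dEPSCrop flag assumes the file carries an EPS bounding box, which Tk's PostScript output normally does.

```python
import shutil
import subprocess

# Common Ghostscript executable names across platforms (an assumption;
# adjust if your installation uses a different name).
GHOSTSCRIPT_NAMES = ("gs", "gswin64c", "gswin32c")


def find_ghostscript():
    """Return the path of the first Ghostscript executable on PATH, or None."""
    for name in GHOSTSCRIPT_NAMES:
        found = shutil.which(name)
        if found:
            return found
    return None


def ghostscript_command(ps_path, png_path, executable="gs", dpi=150):
    """Build the argument list that renders ps_path to a PNG at png_path."""
    return [
        executable,
        "-dSAFER", "-dBATCH", "-dNOPAUSE",  # non-interactive, sandboxed run
        "-dEPSCrop",                         # crop the page to the EPS bounding box
        "-sDEVICE=png16m",                   # 24-bit colour PNG output
        f"-r{dpi}",                          # resolution in dots per inch
        f"-sOutputFile={png_path}",
        ps_path,
    ]


def ps_to_png(ps_path, png_path, dpi=150):
    """Convert a PostScript file to PNG; raises if Ghostscript is missing or fails."""
    executable = find_ghostscript()
    if executable is None:
        raise RuntimeError("Ghostscript was not found on PATH")
    subprocess.run(ghostscript_command(ps_path, png_path, executable, dpi), check=True)
    return png_path

# Usage (requires Ghostscript):
# ps_to_png('output.ps', 'output.png')
```

Passing the arguments as a list avoids shell quoting differences between platforms, and check=True surfaces a failed conversion as an exception rather than a silent non-zero exit code.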
F
7

To add to Minjoon's answer, you can change the fonts and colours of the tree to look more like the NLTK .draw() version as follows (tc is the TreeWidget created in that answer):

tc['node_font'] = 'arial 14 bold'
tc['leaf_font'] = 'arial 14'
tc['node_color'] = '#005990'
tc['leaf_color'] = '#3F8F57'
tc['line_color'] = '#175252'

Before (left) and after (right):

before after
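If you style several trees, the same settings can be kept in one dictionary and applied in a loop. This is a minimal sketch: DRAW_STYLE and apply_style are hypothetical names, and the only thing assumed about the widget is that it accepts the item assignment used above (as tc does).

```python
# Style options copied from the assignments above.
DRAW_STYLE = {
    "node_font": "arial 14 bold",
    "leaf_font": "arial 14",
    "node_color": "#005990",
    "leaf_color": "#3F8F57",
    "line_color": "#175252",
}


def apply_style(widget, style=DRAW_STYLE):
    """Set each option on the widget via item assignment and return the widget."""
    for option, value in style.items():
        widget[option] = value
    return widget

# Usage with the TreeWidget from the earlier answer:
# apply_style(tc)
# cf.add_widget(tc, 10, 10)
```

Keeping the options in a dictionary also makes it easy to define a second style and switch between the two.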

Firebug answered 3/12, 2015 at 7:38 Comment(2)
This is helpful, not only for setting the style to match the draw() version but also for showing how it can be customized in general.Belden
Yeah, the documentation is slim and you need to look really hard through the source code to figure out what options are available. I was amazed when it actually worked.Firebug
N
1

To save a given NLTK tree to an image file (OS-agnostic), I recommend the Constituent-Treelib library, which builds on benepar, spaCy and NLTK. First, install it with pip install constituent-treelib.

Then, perform the following steps:

from nltk import Tree
from constituent_treelib import ConstituentTree

# Define your sentence that should be parsed and saved to a file
sentence = "At least nine tenths of the students passed."

# Rather than a raw string you can also provide an already constructed NLTK tree
sentence = Tree('S', [Tree('NP', [Tree('NP', [Tree('QP', [Tree('ADVP', [Tree('RB', ['At']), Tree('RBS', ['least'])]), Tree('CD', ['nine'])]), Tree('NNS', ['tenths'])]), Tree('PP', [Tree('IN', ['of']), Tree('NP', [Tree('DT', ['the']), Tree('NNS', ['students'])])])]), Tree('VP', [Tree('VBD', ['passed'])]), Tree('.', ['.'])])

# Define the language that should be considered with respect to the underlying benepar and spaCy models 
language = ConstituentTree.Language.English

# You can also specify the desired model for the language ("Small" is selected by default)
spacy_model_size = ConstituentTree.SpacyModelSize.Large

# Create the necessary NLP pipeline (required to instantiate a ConstituentTree object)
nlp = ConstituentTree.create_pipeline(language, spacy_model_size) 

# In case you haven't downloaded the required benepar and spaCy models, you can tell the method to do it automatically for you
# nlp = ConstituentTree.create_pipeline(language, spacy_model_size, download_models=True) 

# Instantiate a ConstituentTree object and pass it the sentence as well as the NLP pipeline
tree = ConstituentTree(sentence, nlp)

# Now you can export the tree to a file (e.g., a PDF)  
tree.export_tree("NLTK_parse_tree.pdf", verbose=True)

>>> PDF-file successfully saved to: NLTK_parse_tree.pdf

Result... enter image description here

Narine answered 24/1, 2023 at 20:40 Comment(2)
Would be nice, but it has too many dependencies and I can't install constituent_treelib with Python 3.11.3Contorted
I've updated the Constituent-Treelib [1] library. The previous language detector (fasttext) caused several strange problems and unnecessarily inflated the dependency chain. Now it's more compact and also works with the python versions: 2.7 - 3.12 (tested via github action workflow). [1]: github.com/Halvani/Constituent-TreelibNarine
