Is there a Python equivalent to the PHP function htmlspecialchars()?
B

8

20

Is there a similar or equivalent function in Python to the PHP function htmlspecialchars()? The closest thing I've found so far is the htmlentitydefs.entitydefs dictionary.

Bandung answered 31/5, 2009 at 5:58 Comment(1)
It seems that there is more than one obvious way to do it! O noes! - Benisch
W
12

Closest thing I know about is cgi.escape.

Waxwing answered 31/5, 2009 at 6:12 Comment(1)
cgi.escape has been deprecated since Python 3.2 and was removed in Python 3.8; html.escape() is its replacement. - Epiphenomenalism
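On Python 3.2 and later, the standard library's html.escape is the documented replacement for cgi.escape. Here is a minimal sketch of how it lines up with htmlspecialchars(); the sample string is only an illustration. Note that html.escape writes a single quote as &#x27;, whereas PHP writes &#039;.

```python
import html

sample = '<div class="q">Q & A</div>'  # illustrative input, not from the question

# quote=True (the default) escapes & < > and both quote characters
escaped_all = html.escape(sample)

# quote=False leaves quotes alone, roughly like PHP's ENT_NOQUOTES
escaped_noquotes = html.escape(sample, quote=False)

print(escaped_all)
print(escaped_noquotes)
```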
D
7
# Requires Django to be installed
from django.utils.html import escape
# print() call syntax works on both Python 2 and Python 3
print(escape('<div class="q">Q & A</div>'))
Disenthral answered 25/3, 2010 at 4:32 Comment(1)
I'm voting for this because I don't want to parse anything, as some of the other answers do, or even do a search and replace; I want a single function that does it all for me. - Elapse
S
6

Building on @garlon4's answer, you can define your own htmlspecialchars(text):

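def htmlspecialchars(text):
    # "&" is replaced first so that the ampersands introduced by the
    # later replacements are not escaped a second time.
    return (
        text.replace("&", "&amp;")
        .replace('"', "&quot;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
    )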
Sludgy answered 1/3, 2016 at 17:48 Comment(5)
I think Python has a string method, str.translate, that you could use to make this even shorter. - Roesch
Too lazy to write it out right now, but see programiz.com/python-programming/methods/string/translate - Roesch
Helpful answer; however, you're passing the parameters to replace() in the wrong order. It should be: replace("string to find", "string to replace") - Sonority
@Sonority No, the function works as expected (it escapes the "HTML special chars"). It looks for the character to escape and replaces it with the HTML escape sequence for that character. Maybe you wanted to un-escape instead? - Sludgy
My mistake! @Sludgy, you are spot on. - Sonority
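The str.translate idea from the comments can be sketched as follows. This is one possible variant and not the answerer's code; the names _HTML_ESCAPES and htmlspecialchars_translate are made up for illustration. Because translate makes a single pass over the string, the order of the replacements no longer matters.

```python
# Hypothetical names; the mapping mirrors the four replace() calls above.
_HTML_ESCAPES = str.maketrans({
    "&": "&amp;",
    '"': "&quot;",
    "<": "&lt;",
    ">": "&gt;",
})


def htmlspecialchars_translate(text):
    # Each character is looked up once, so "&" inside "&lt;" is never re-escaped.
    return text.translate(_HTML_ESCAPES)


print(htmlspecialchars_translate('<a href="x">Q & A</a>'))
```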
C
3

You probably want xml.sax.saxutils.escape:

from xml.sax.saxutils import escape

unsafe = '<a href="x">Q & A</a>'  # sample input
escape(unsafe, {'"': '&quot;'})  # like ENT_COMPAT
escape(unsafe, {'"': '&quot;', "'": '&#039;'})  # like ENT_QUOTES
escape(unsafe)  # like ENT_NOQUOTES

Also have a look at xml.sax.saxutils.quoteattr; it might be more useful for you.

Colloquium answered 31/5, 2009 at 6:31 Comment(0)
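As a rough illustration of when quoteattr helps: it escapes the value and also adds the surrounding quotes, so it suits building attribute values. The title string and the span markup below are made-up examples.

```python
from xml.sax.saxutils import quoteattr

title = 'Q & A <draft>'  # illustrative attribute value

# quoteattr returns the escaped value already wrapped in quotes
tag = "<span title=%s>...</span>" % quoteattr(title)
print(tag)

# If the value contains double quotes but no single quotes,
# quoteattr wraps it in single quotes instead of escaping them.
quoted = quoteattr('say "hi"')
print(quoted)
```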
D
3

I think the simplest way is just to use replace:

text.replace("&", "&amp;").replace('"', "&quot;").replace("<", "&lt;").replace(">", "&gt;")

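With the default ENT_COMPAT flag, PHP's htmlspecialchars escapes only those four characters. Note that if you have ENT_QUOTES set in PHP, single quotes are escaped as well, so you also need to replace ' with &#039; (double quotes still become &quot;). PHP 8.1 changed the default to ENT_QUOTES.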

Durr answered 30/3, 2011 at 15:59 Comment(0)
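The flag behaviour described above could be sketched like this. The quotes parameter and its string values are invented for this example and are not part of PHP or of any Python library; they only stand in for ENT_COMPAT, ENT_QUOTES and ENT_NOQUOTES.

```python
def htmlspecialchars(text, quotes="compat"):
    """Escape HTML special characters.

    quotes (hypothetical parameter):
      "compat"   - escape double quotes only (like ENT_COMPAT)
      "quotes"   - escape double and single quotes (like ENT_QUOTES)
      "noquotes" - leave both kinds of quote alone (like ENT_NOQUOTES)
    """
    if quotes not in ("compat", "quotes", "noquotes"):
        raise ValueError("unknown quotes mode: %r" % (quotes,))
    # "&" must be handled first to avoid double-escaping.
    text = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if quotes in ("compat", "quotes"):
        text = text.replace('"', "&quot;")
    if quotes == "quotes":
        text = text.replace("'", "&#039;")
    return text


print(htmlspecialchars("<a title='x' href=\"y\">", quotes="quotes"))
```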
S
1

The html.entities module (htmlentitydefs in Python 2.x) contains a dictionary, codepoint2name, which maps code points to entity names and should do what you need.

>>> import html.entities
>>> html.entities.codepoint2name[ord("&")]
'amp'
>>> html.entities.codepoint2name[ord('"')]
'quot'
Sargent answered 31/5, 2009 at 8:27 Comment(0)
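codepoint2name only gives the entity name, so it still has to be wrapped into an escaping function. A minimal sketch, assuming only the four ENT_COMPAT characters are wanted; the function name escape_with_entities and its chars parameter are made up for this example. The apostrophe has no entry in codepoint2name, so it cannot be handled this way.

```python
from html.entities import codepoint2name


def escape_with_entities(text, chars='&<>"'):
    # Replace each character listed in `chars` with its named entity;
    # every other character is passed through unchanged.
    return "".join(
        "&%s;" % codepoint2name[ord(c)] if c in chars else c
        for c in text
    )


print(escape_with_entities('<b>"Q & A"</b>'))
```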
M
1

Only five characters need to be escaped, so you can use a simple one-line function:

def htmlspecialchars(content):
    return content.replace("&", "&amp;").replace('"', "&quot;").replace("'", "&#039;").replace("<", "&lt;").replace(">", "&gt;")
Musical answered 11/6, 2021 at 7:3 Comment(0)
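One detail the chained replace() answers rely on is that "&" is replaced first. A small sketch of what goes wrong otherwise; both function names are made up for the comparison.

```python
def escape_wrong_order(text):
    # "<" becomes "&lt;" and then its "&" is escaped again
    return text.replace("<", "&lt;").replace("&", "&amp;")


def escape_right_order(text):
    # "&" first, so entities added afterwards stay intact
    return text.replace("&", "&amp;").replace("<", "&lt;")


print(escape_wrong_order("a < b"))
print(escape_right_order("a < b"))
```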
W
-1

If you are using Django 1.0 or later, your template variables will already be escaped and ready for display. You can also use the safe filter, {{ var|safe }}, to turn escaping off for an individual variable without disabling it globally.

Werner answered 31/5, 2009 at 6:11 Comment(0)
