WMD markdown editor - HTML to Markdown conversion

I am using the WMD Markdown editor on a project and have a question.

When I post the form containing the Markdown textarea, it posts HTML to the server, as expected. However, if server-side validation fails and I need to send the user back to edit their entry, is there any way to refill the textarea with just the Markdown and not the HTML? As I have it set up, the server only has access to the POST data, which is HTML, so I can't think of a way to do this. Any ideas? Preferably a non-JavaScript solution.

Update: I found an HTML-to-Markdown converter called markdownify. I guess this might be the best solution for displaying the Markdown back to the user... any better alternatives are welcome!

Update 2: I found this post on SO, and apparently there is an option to send the data to the server as Markdown instead of HTML. Are there any downsides to simply storing the data as Markdown in the database? What about displaying it back to the user (outside of an editor)? Maybe it would be best to post both versions (HTML and Markdown) to the server...

SOLVED: I can simply use PHP Markdown to convert the Markdown to HTML server-side.

Bogie answered 28/7, 2009 at 20:53

I would suggest that you simply send and store the text as Markdown, which seems to be what you have settled on already. IMO, storing the text as Markdown is better because you can strip all HTML tags out without worrying about losing formatting. This makes your code safer, because it is harder to mount an XSS attack. (An attack may still be possible; I am only saying that this part will be safer.)

Rustproof answered 29/7, 2009 at 6:24 Comment(5)
Wouldn't stripping all HTML tags cause a problem if the text contained an HTML example? - Goer
Perhaps, but then it's just a matter of making sure that you don't strip any tags from inside a code block. - Rustproof
Just HTML encode it. I don't think any of the markdown characters are reserved html. - Gayomart
@MattDotson: Blockquotes (> some text) are a conflict and link URLs may be surrounded by angle brackets too (<http://example.com>). Separating input sanitization from Markdown handling is probably safer/saner. - Aeroplane
It should also be noted that as per the official documentation, code block contents will automatically be HTML-encoded by conformant parsers. - Aeroplane
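
The conflict described in the last two comments is easy to reproduce. The sketch below runs two Markdown snippets through PHP's built-in `htmlspecialchars()` before any Markdown parsing takes place; `encodeBeforeParsing()` is a hypothetical helper name used only for illustration.

```php
<?php
// Hypothetical helper: HTML-encode raw Markdown *before* it reaches the parser.
function encodeBeforeParsing(string $markdown): string {
    return htmlspecialchars($markdown, ENT_QUOTES, 'UTF-8');
}

// The blockquote marker is encoded, so a parser no longer sees a leading ">".
echo encodeBeforeParsing('> some text'), "\n";          // &gt; some text

// The autolink delimiters are encoded too, so this is no longer an autolink.
echo encodeBeforeParsing('<http://example.com>'), "\n"; // &lt;http://example.com&gt;
```

In both cases the Markdown syntax is gone by the time a parser would see it, which is why keeping sanitization separate from Markdown handling is likely the saner route.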

One thing to consider is that WMD handles some edge cases differently from some server-side Markdown implementations. I've definitely seen quirks in the previews here that rendered differently after submission (I believe one such case was attempting to escape a backtick surrounded by backticks). By sending the converted preview over the wire, you can ensure that the result matches the preview.

I'm not saying that should make your decision, but it's something to consider.

Invalidism answered 29/7, 2009 at 6:37

Try out Pandoc. It's a little more comprehensive and reliable than Markdownify.

Supposititious answered 20/5, 2010 at 9:7

The HTML you are seeing is just a preview, so it's not a good idea to store it in the database: you will run into issues when you try to edit it. It's also not a good idea to store both versions (Markdown and HTML), because the HTML is just an interpretation of the Markdown, and you will have the same editing problems plus the burden of keeping both versions in sync.

So the best idea is to store the Markdown in the database and convert it server-side before displaying it.

You can use PHP Markdown for this purpose. However, its output does not match what you see on the JavaScript side 100% and may need some tweaking.
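
A minimal sketch of that flow, under a few assumptions: the form posts the raw Markdown in a field named `body` (a hypothetical name), the validation rule is invented for the example, and the Markdown-to-HTML converter is passed in as a callable so the sketch does not depend on any particular library. `isValid()` and `handlePost()` are illustrative helpers, not part of WMD or PHP Markdown.

```php
<?php
// Hypothetical validation rule: the entry needs at least 10 characters.
function isValid(string $markdown): bool {
    return strlen(trim($markdown)) >= 10;
}

// Returns the HTML to send back: either the editor textarea refilled with
// the user's Markdown, or the rendered entry.
function handlePost(array $post, callable $toHtml): string {
    $markdown = isset($post['body']) ? (string) $post['body'] : '';

    if (!isValid($markdown)) {
        // Escape for the textarea context only; the browser decodes the
        // entities, so the user sees their original Markdown again.
        $safe = htmlspecialchars($markdown, ENT_QUOTES, 'UTF-8');
        return '<textarea name="body">' . $safe . '</textarea>';
    }

    // Valid: store $markdown as-is (storage omitted) and convert on output.
    return $toHtml($markdown);
}
```

With PHP Markdown loaded, the call might look like `handlePost($_POST, 'Markdown')`, assuming your copy of markdown.php exposes the `Markdown()` function.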

The version that the Stack Exchange network uses is a C# implementation, and there should be a Python implementation included in the WMD download you have.

The one thing I tweaked was the way newlines are rendered. I changed markdown.php to convert newlines within span-level text into <br>, starting from line 626 in the version I have:

var $span_gamut = array(
#
# These are all the transformations that occur *within* block-level
# tags like paragraphs, headers, and list items.
#
    # Process character escapes, code spans, and inline HTML
    # in one shot.
    "parseSpan"           => -30,

    # Process anchor and image tags. Images must come first,
    # because ![foo][f] looks like an anchor.
    "doImages"            =>  10,
    "doAnchors"           =>  20,
    
    # Make links out of things like `<http://example.com/>`
    # Must come after doAnchors, because you can use < and >
    # delimiters in inline links like [this](<url>).
    "doAutoLinks"         =>  30,
    "encodeAmpsAndAngles" =>  40,

    "doItalicsAndBold"    =>  50,
    "doHardBreaks"        =>  60,
    "doNewLines"          =>  70,
    );

function runSpanGamut($text) {
#
# Run span gamut transformations.
#
    foreach ($this->span_gamut as $method => $priority) {
        $text = $this->$method($text);
    }

    return $text;
}

function doNewLines($text) {
#
# Custom addition: insert a <br /> before every newline left in the span.
#
    return nl2br($text);
}
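
To see the mechanism in isolation, here is a stripped-down, standalone sketch of a priority-ordered transformation table with an added newline step. `MiniGamut` and `encodeAngles` are made-up names for illustration and this is not PHP Markdown's code; the explicit `asort()` is simply how this sketch guarantees that steps run in priority order.

```php
<?php
// Standalone illustration; MiniGamut is a made-up class, not PHP Markdown.
class MiniGamut {
    // Method name => priority; lower numbers run first.
    private $gamut = array(
        'doNewLines'   => 70,
        'encodeAngles' => 40,
    );

    public function run(string $text): string {
        asort($this->gamut); // order by priority, keeping method names as keys
        foreach ($this->gamut as $method => $priority) {
            $text = $this->$method($text);
        }
        return $text;
    }

    private function encodeAngles(string $text): string {
        return htmlspecialchars($text, ENT_NOQUOTES, 'UTF-8');
    }

    private function doNewLines(string $text): string {
        // nl2br() inserts "<br />" before each newline and keeps the newline.
        return nl2br($text);
    }
}
```

The ordering matters: if the newline step ran before the encoding step, the inserted `<br />` tags would themselves be encoded, which is a reason to give a step like this the highest priority number.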
Memphis answered 9/5, 2011 at 11:50
