Edit: I recently learned about a project called CommonMark (http://commonmark.org/), which correctly identifies and deals with the ambiguities in the original Markdown specification. It has great C# library support.
You can find the syntax here.
The source that comes with the download is written in Perl, which I have no intention of using. It is riddled with regular expressions, and it relies on MD5 hashes to escape certain characters. Something is just wrong about that!
I'm about to hand-code a parser for Markdown. What is your experience with this?
If you don't have anything meaningful to say about the actual parsing of Markdown, please don't answer. (This might sound harsh, but I'm looking for insight, not a ready-made solution such as a third-party library.)
To help a bit with the answers: regular expressions are meant to identify patterns, NOT to parse an entire grammar. That people consider doing so is FUBAR.
- If you think about Markdown, it's fundamentally based around the concept of paragraphs.
- As such, a reasonable approach might be to split the input into paragraphs.
- There are many kinds of paragraphs, for example, heading, text, list, blockquote, and code.
- The challenge is thus to identify these paragraphs and in what context they occur.
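The idea in the list above can be sketched roughly as follows. This is a minimal illustration in Python, not a finished parser: the function names `split_blocks`, `classify`, and `parse` are hypothetical, blocks are assumed to be separated by blank lines, and each block is classified by its first line only. Regular expressions are used here only to recognize local patterns, not to parse the grammar as a whole. Nesting (for example, a list inside a blockquote) and edge cases such as horizontal rules are deliberately left out.

```python
import re


def split_blocks(text):
    """Split Markdown source into blocks separated by one or more blank lines."""
    # Normalize line endings so the blank-line pattern only has to handle "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A separator is a newline followed by one or more whitespace-only lines.
    blocks = re.split(r"\n(?:[ \t]*\n)+", text.strip("\n"))
    # Leading indentation is kept, because it is what marks a code block.
    return [b for b in blocks if b.strip()]


def classify(block):
    """Guess the kind of a block from its first line (assumed heuristics)."""
    first = block.split("\n", 1)[0]
    if first.startswith("    ") or first.startswith("\t"):
        return "code"
    if re.match(r"#{1,6}(\s|$)", first):
        return "heading"
    if first.startswith(">"):
        return "blockquote"
    if re.match(r"([*+-]|\d+\.)\s", first):
        return "list"
    return "text"


def parse(text):
    """Return a list of (kind, block) pairs in document order."""
    return [(classify(b), b) for b in split_blocks(text)]


if __name__ == "__main__":
    sample = "# Title\n\nSome text\nmore text\n\n- a\n- b\n\n> quote\n\n    code here\n"
    for kind, block in parse(sample):
        print(kind, repr(block))
```

A real implementation would then process each block according to its kind, and would recurse into blockquotes and list items to handle the context in which a paragraph occurs.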
I'll be back with a solution once I have one that's worth sharing.