I'm currently working with the Telegram Bot API, and I need to validate Markdown syntax to prevent parse errors. Telegram's Markdown doesn't follow regular Markdown syntax, though, so I'm struggling to figure out how to do it. Is there a proper way to validate it, or is there a library I can use?
Probably not the answer OP was expecting, but still sharing so others may find this useful.
I've been trying to 'validate' Markdown messages to prevent the Bad Request: can't parse entities error returned by Telegram, for example the same issue this user encountered.
Unfortunately I was unable to parse this with 100% accuracy, probably because (as you already mentioned) Telegram doesn't use the default Markdown syntax.
My 'solution', which I've implemented in quite a few bots and which works decently:
After sending a message (I've created a custom function to prevent duplicate code), check whether Telegram responded with the Bad Request: can't parse entities error. If so, send the same message again, but this time with HTML as the parse_mode. As long as the text contains no unescaped <, > or & characters, the HTML parser has nothing to trip over, so the message goes through.
Not the cleanest solution, but it gets the message to the user, and that was my greatest concern.
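The retry flow described above could be sketched in TypeScript roughly as follows. The SendFn transport is a hypothetical stand-in for whatever function actually posts to sendMessage; ok, description, chat_id, text and parse_mode are Bot API fields. Escaping <, > and & before the HTML retry is an added assumption, so that the fallback itself cannot fail on those characters.

```typescript
// Relevant part of a Bot API response.
interface BotApiResponse {
  ok: boolean;
  description?: string;
}

// Subset of the sendMessage parameters used here.
interface SendMessagePayload {
  chat_id: number | string;
  text: string;
  parse_mode?: "Markdown" | "MarkdownV2" | "HTML";
}

// Hypothetical transport: any function that posts the payload to sendMessage.
type SendFn = (payload: SendMessagePayload) => Promise<BotApiResponse>;

// True when Telegram rejected the message because of its formatting.
function isParseEntitiesError(res: BotApiResponse): boolean {
  const description = res.description || "";
  return !res.ok && description.indexOf("can't parse entities") !== -1;
}

// Escape the three characters that are significant to Telegram's HTML parser.
function escapeHtml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Same message, but as literal text in HTML mode (assumption: losing the
// Markdown formatting is acceptable as long as the message is delivered).
function buildFallbackPayload(payload: SendMessagePayload): SendMessagePayload {
  return { chat_id: payload.chat_id, text: escapeHtml(payload.text), parse_mode: "HTML" };
}

// Try the original payload first; retry once with the fallback on a parse error.
async function sendWithFallback(send: SendFn, payload: SendMessagePayload): Promise<BotApiResponse> {
  const first = await send(payload);
  if (isParseEntitiesError(first)) {
    return send(buildFallbackPayload(payload));
  }
  return first;
}
```

Keeping the decision (isParseEntitiesError) and the payload rewrite (buildFallbackPayload) separate from the network call makes both easy to test without talking to Telegram.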
I have implemented this validation function in Go. It works pretty well for Telegram's Markdown:
import "strings"

// GetUnclosedTag scans the text and returns the tag that is still open at the
// end, or "" if every tag has been closed.
func GetUnclosedTag(markdown string) string {
	// Order is important: "```" must be checked before "`".
	tags := []string{
		"```",
		"`",
		"*",
		"_",
	}
	currentTag := ""
	markdownRunes := []rune(markdown)
	i := 0
outer:
	for i < len(markdownRunes) {
		// Skip escaped characters (only outside tags).
		if markdownRunes[i] == '\\' && currentTag == "" {
			i += 2
			continue
		}
		if currentTag != "" {
			if strings.HasPrefix(string(markdownRunes[i:]), currentTag) {
				// Turn a tag off. All tags are ASCII, so the byte length
				// equals the rune count.
				i += len(currentTag)
				currentTag = ""
				continue
			}
		} else {
			for _, tag := range tags {
				if strings.HasPrefix(string(markdownRunes[i:]), tag) {
					// Turn a tag on.
					currentTag = tag
					i += len(currentTag)
					continue outer
				}
			}
		}
		i++
	}
	return currentTag
}

// IsValid reports whether every tag in the text is closed.
func IsValid(markdown string) bool {
	return GetUnclosedTag(markdown) == ""
}

// FixMarkdown closes a dangling tag by appending it to the end of the text.
func FixMarkdown(markdown string) string {
	tag := GetUnclosedTag(markdown)
	if tag == "" {
		return markdown
	}
	return markdown + tag
}
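For bots written in TypeScript, the same scan can be ported almost line for line. The sketch below is a port of the Go functions above, not a separately published library; the names getUnclosedTag, fixMarkdown and hasTagAt are chosen here for illustration.

```typescript
// Order matters: "```" has to be tested before "`".
const TAGS: string[] = ["```", "`", "*", "_"];

// True if `tag` occurs in `text` exactly at position `index`.
function hasTagAt(text: string, index: number, tag: string): boolean {
  return text.slice(index, index + tag.length) === tag;
}

// Returns the tag that is still open at the end of the text, or "" if none.
function getUnclosedTag(markdown: string): string {
  let currentTag = "";
  let i = 0;
  while (i < markdown.length) {
    // Skip escaped characters (only outside tags), as in the Go version.
    if (markdown[i] === "\\" && currentTag === "") {
      i += 2;
      continue;
    }
    if (currentTag !== "") {
      if (hasTagAt(markdown, i, currentTag)) {
        i += currentTag.length; // closing tag found
        currentTag = "";
      } else {
        i++;
      }
      continue;
    }
    let opened = "";
    for (const tag of TAGS) {
      if (hasTagAt(markdown, i, tag)) {
        opened = tag;
        break;
      }
    }
    if (opened !== "") {
      currentTag = opened; // opening tag found
      i += opened.length;
    } else {
      i++;
    }
  }
  return currentTag;
}

// Closes a dangling tag by appending it to the end of the text.
function fixMarkdown(markdown: string): string {
  return markdown + getUnclosedTag(markdown);
}

console.log(getUnclosedTag("*bold* and `code`"));       // ""
console.log(getUnclosedTag("Calculation is: 15 * 10")); // "*"
console.log(getUnclosedTag("15 \\* 10"));               // "" (the asterisk is escaped)
console.log(fixMarkdown("_italic"));                    // "_italic_"
```

Note that for a text like "Calculation is: 15 * 10" the checker reports an unclosed * and fixMarkdown appends one. That balances the tags, but it also turns the text after the first asterisk into a formatted span, so for literal asterisks escaping them as \* is probably the better fix.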
A better way seems to be to convert the Markdown to HTML, because in my case the text contained something like "Calculation is: 15 * 10" and Telegram's parser failed on it. Here is a TypeScript function that works fine for now:
export const markdownToHtml = (markdown: string): string => {
  // Escape HTML special characters so plain text cannot be mistaken for tags
  markdown = markdown.replace(/[&<>"']/g, (match) => {
    switch (match) {
      case '&':
        return '&amp;'
      case '<':
        return '&lt;'
      case '>':
        return '&gt;'
      case '"':
        return '&quot;'
      case "'":
        return '&#39;'
      default:
        return match
    }
  })
  // Convert headers (# to ######) to bold, since Telegram has no heading tags
  markdown = markdown.replace(/^(#{1,6}) (.*)$/gim, (_, __, p2) => {
    return `<b>${p2}</b>`
  })
  // Convert code blocks; a language is expressed as <pre><code class="language-...">
  markdown = markdown.replace(/```(\w+)?\n([\s\S]*?)```/g, (_, p1, p2) => {
    if (p1) {
      return `<pre><code class="language-${p1}">${p2}</code></pre>`
    }
    return `<pre>${p2}</pre>`
  })
  // Convert inline code
  markdown = markdown.replace(/`([^`]+)`/g, '<code>$1</code>')
  // Convert bold, italic, and strikethrough (bold first, so ** is not read as two *)
  markdown = markdown.replace(/(\*\*|__)(.*?)\1/g, '<b>$2</b>') // Bold
  markdown = markdown.replace(/(\*|_)(.*?)\1/g, '<em>$2</em>') // Italic
  markdown = markdown.replace(/~~(.*?)~~/g, '<s>$1</s>') // Strikethrough
  // Convert images to plain links; this must run before the link rule,
  // which would otherwise consume the [alt](url) part and leave a stray "!"
  markdown = markdown.replace(/!\[([^\]]*)\]\(([^)]+)\)/g, '<a href="$2">$1</a>')
  // Convert links
  markdown = markdown.replace(/\[([^\]]+)\]\(([^)]+)\)/g, '<a href="$2">$1</a>')
  // Fall back to an invisible filler character when the result is empty,
  // since Telegram rejects empty message text
  return markdown.trim() || 'ㅤ'
}
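One thing to watch with a chain of regex replacements like this is that the emphasis rules also run over text that is already inside code, so something like `a * b * c` in inline code can pick up an unwanted <em> tag. A possible way around it, sketched below under the assumption that it would be applied to the raw Markdown before the code rules, is to transform only the text between code spans. The helper names transformOutsideCode and toBold are hypothetical.

```typescript
// Apply `transform` to everything except `inline code` and ```fenced``` spans,
// which are copied through unchanged.
function transformOutsideCode(markdown: string, transform: (text: string) => string): string {
  const codeSpan = /```[\s\S]*?```|`[^`\n]+`/g;
  let result = "";
  let last = 0;
  let match: RegExpExecArray | null;
  while ((match = codeSpan.exec(markdown)) !== null) {
    // Transform the plain text before this code span, then keep the span as-is.
    result += transform(markdown.slice(last, match.index)) + match[0];
    last = match.index + match[0].length;
  }
  // Transform whatever follows the last code span.
  return result + transform(markdown.slice(last));
}

// Example transform: the bold rule only.
const toBold = (text: string): string => text.replace(/\*\*(.*?)\*\*/g, "<b>$1</b>");

console.log(transformOutsideCode("**a** `**b**`", toBold)); // "<b>a</b> `**b**`"
```

The code spans are left as raw Markdown here, so they would still need their own conversion to <code> or <pre> afterwards.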