I'm using an API that sometimes truncates links in the text it returns, so instead of "longtexthere https://fancy.link" I get "longtexthere https://fa…".
I'm trying to match a link only if it's complete, in other words only if it does not contain the "…" character.
So far I am able to get links by using the following regex:
((?:https?:)?\/\/\S+\/?)
but obviously it returns every link including broken ones.
I've tried to do something like this:
((?:https?:)?\/\/(?:(?!…)\S)+\/?)
That did start to ignore the "…" character, but it still returned the link, just without the character. For "https://fa…" it returned "https://fa", whereas I simply want it to skip that broken link and move on.
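To make the behaviour reproducible, here is a minimal sketch of both attempts. It assumes Python's `re` module (the regex flavor is not stated above), and the sample text is made up from the example in this question.

```python
import re

# Hypothetical sample text, modeled on the example above.
text = "longtexthere https://fa… and a complete one https://fancy.link"

# First pattern: matches every link, truncated or not.
all_links = re.findall(r"((?:https?:)?\/\/\S+\/?)", text)
print(all_links)   # ['https://fa…', 'https://fancy.link']

# Second pattern: the match stops just before "…", so the truncated
# link is still returned, only without the ellipsis.
tempered = re.findall(r"((?:https?:)?\/\/(?:(?!…)\S)+\/?)", text)
print(tempered)    # ['https://fa', 'https://fancy.link']
```

The wanted result for this text would be only 'https://fancy.link'.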
Been fighting this for hours and just can't get my head around it. :(
Thanks for any help in advance.
(?:https?:)?\/\/[^\s…]++(?!…)\/?
– Insane

The \/? at the end is unnecessary, as it will never be matched. If your regex flavor is JavaScript or Python, try
(?!\S+…)(?:https?:)?\/\/\S+
– Insane

(?:https?:)?\/\/\S++(?<!…)
The possessive quantifier prevents backtracking if the lookbehind does not match.
– Zorine
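
A minimal sketch of the lookahead variant, assuming Python's `re` module. The helper name `complete_links` and the sample text are made up for illustration. The negative lookahead refuses to start a match at any position where the following run of non-whitespace characters contains "…", so a truncated link is skipped as a whole rather than shortened.

```python
import re

# Lookahead variant: do not start a match if the rest of the
# whitespace-delimited token contains "…" anywhere.
COMPLETE_LINK = re.compile(r"(?!\S+…)(?:https?:)?\/\/\S+")

def complete_links(text):
    """Return only the links that do not contain "…" (hypothetical helper)."""
    return COMPLETE_LINK.findall(text)

# Hypothetical sample text, modeled on the example in the question.
sample = "longtexthere https://fa… then https://fancy.link"
print(complete_links(sample))  # ['https://fancy.link']
```

The two possessive variants (`++`) rely on a flavor that supports possessive quantifiers, such as PCRE, Java, or Python 3.11 and later. JavaScript does not support them, which is one reason to prefer the lookahead form there.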