How to replace plain URLs with links?

Asked 1/9, 2008 at 9:58 Answered 19/4, 2021 at 15:57

492

I am using the function below to match URLs inside a given text and replace them with HTML links. The regular expression is working great, but currently I am only replacing the first match.

How can I replace all the URLs? I guess I should be using the exec command, but I have not really figured out how to do it.

function replaceURLWithHTMLLinks(text) {
    var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i;
    return text.replace(exp,"<a href='$1'>$1</a>"); 
}

Interfaith answered 1/9, 2008 at 9:58 Comment(0)

402

First off, rolling your own regexp to parse URLs is a terrible idea. You must imagine this is a common enough problem that someone has written, debugged and tested a library for it, according to the RFCs. URIs are complex - check out the code for URL parsing in Node.js and the Wikipedia page on URI schemes.

There are a ton of edge cases when it comes to parsing URLs: international domain names, actual (.museum) vs. nonexistent (.etc) TLDs, weird punctuation including parentheses, punctuation at the end of the URL, IPv6 hostnames, etc.
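To make two of these edge cases concrete, here is a small sketch that runs the pattern from the question (with a "g" flag added) over sample strings. The helper name findUrls and the sample texts are assumptions chosen for illustration; they are not part of any library.

```javascript
// Pattern from the question, with the "g" flag added so match() returns every hit.
var questionPattern = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;

// Hypothetical helper: list everything the pattern considers a URL.
function findUrls(text) {
  return text.match(questionPattern) || [];
}

// Parentheses are not in the character class, so the match stops before "(".
var withParens = findUrls('See https://en.wikipedia.org/wiki/URL_(disambiguation) for more');
console.log(withParens); // [ 'https://en.wikipedia.org/wiki/URL_' ]

// A "casual" URL without a scheme is not detected at all.
var noScheme = findUrls('Search on www.google.com');
console.log(noScheme); // []
```

Both results are silent failures: the first produces a link to the wrong page, the second produces no link.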

I've looked at a ton of libraries, and there are a few worth using despite some downsides:

Soapbox's linkify has seen some serious effort put into it, and a major refactor in June 2015 removed the jQuery dependency. It still has issues with IDNs.
AnchorMe is a newcomer that claims to be faster and leaner. Some IDN issues as well.
Autolinker.js lists features very specifically (e.g. "Will properly handle HTML input. The utility will not change the href attribute inside anchor (<a>) tags"). I'll throw some tests at it when a demo becomes available.

Libraries that I've disqualified quickly for this task:

Django's urlize didn't handle certain TLDs properly (here is the official list of valid TLDs). No demo.
autolink-js wouldn't detect "www.google.com" without http://, so it's not quite suitable for autolinking "casual URLs" (without a scheme/protocol) found in plain text.
Ben Alman's linkify hasn't been maintained since 2009.

If you insist on a regular expression, the most comprehensive is the URL regexp from Component, though a look at the pattern suggests it will falsely detect some non-existent two-letter TLDs.

Beeswing answered 21/2, 2014 at 4:46 Comment(10)

It's a pity the URL regexp from Component isn't commented, some explanation of what it is doing would be helpful. Autolinker.js is commented very well and has tests. The urlize.js library linked to in Vebjorn Ljosa's answer also looks featureful and well maintained, although it doesn't have tests. – Fixation 26/2, 2014 at 9:36

Regex101.com automatically "explains" the regexp, but good luck with that :) I've also quickly found a failure case with an invalid TLD (same link). – Beeswing 26/2, 2014 at 9:44

That explains what the regex is doing (which is useful) but doesn't explain what it's hoping to match in terms of URL structure, which is what I'd hope comments would document. – Fixation 26/2, 2014 at 10:47

@SamHasler: Autolinker needs to improve in the TLDs and IDNs area. Added some tests. – Beeswing 26/2, 2014 at 10:55

Curious that nobody mentioned John Gruber's efforts in maintaining a URL regex pattern. It's not the only/ideal solution to the problem, but in any case worth investigating, if you're rolling your own solution. Just wanted to add this as a reference. – Surtax 10/6, 2014 at 11:25

Despite the name, jQuery linkify is not closely integrated with jQuery; it just provides jQuery support for convenience. The linkified.js source file works well on its own: Linkified.linkify('text that might include a url') – Gerthagerti 23/12, 2014 at 19:36

@DanDascalescu Take a look at this markdown-it.github.io/linkify-it . This library is focused exactly on one task - detecting link patterns in text. But i hope, it does it well. For example, it has correct unicode support, including astral characters. And it supports international TLDs. – Malvern 12/2, 2015 at 19:2

Linkify is good. I used it. It may be an overkill, but it does the job very well, with some nice customization options. – Shelley 22/5, 2015 at 9:12

Plus 1 for Autolinker.js: easy to implement, and a quick solution if you're looking for just that. Thanks – Ann 21/3, 2016 at 14:1

For anyone reading this in 2017 and beyond, anchorme has resolved the IDN issues and it can properly handle URLs that are emojis or non-latin text. – Citrus 3/3, 2017 at 9:23

287

Replacing URLs with links (Answer to the General Problem)

The regular expression in the question misses a lot of edge cases. When detecting URLs, it's always better to use a specialized library that handles international domain names, new TLDs like .museum, parentheses and other punctuation within and at the end of the URL, and many other edge cases. See Jeff Atwood's blog post The Problem With URLs for an explanation of some of the other issues.

The best summary of URL matching libraries is in Dan Dascalescu's Answer
(as of Feb 2014)

"Make a regular expression replace more than one match" (Answer to the specific problem)

Add a "g" to the end of the regular expression to enable global matching:

/ig;

But that only fixes the problem in the question where the regular expression was only replacing the first match. Do not use that code.
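To see the effect of the flag in isolation, here is a minimal sketch that runs the question's pattern with and without "g". The variable names and the sample text are assumptions for illustration only; the pattern keeps all the limitations described above.

```javascript
// Same pattern as in the question; the only difference is the "g" flag.
var firstMatchOnly = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/i;
var everyMatch = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;

var sampleText = 'See http://example.com and https://example.org for details';

// Without "g", String.prototype.replace stops after the first match.
var replacedOnce = sampleText.replace(firstMatchOnly, "<a href='$1'>$1</a>");

// With "g", every match is replaced.
var replacedAll = sampleText.replace(everyMatch, "<a href='$1'>$1</a>");

console.log(replacedOnce);
console.log(replacedAll);
```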

Fixation answered 1/9, 2008 at 10:0 Comment(0)

178

I've made some small modifications to Travis's code (just to avoid any unnecessary redeclaration - but it's working great for my needs, so nice job!):

function linkify(inputText) {
    var replacedText, replacePattern1, replacePattern2, replacePattern3;

    //URLs starting with http://, https://, or ftp://
    replacePattern1 = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
    replacedText = inputText.replace(replacePattern1, '<a href="$1" target="_blank">$1</a>');

    //URLs starting with "www." (without // before it, or it'd re-link the ones done above).
    replacePattern2 = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
    replacedText = replacedText.replace(replacePattern2, '$1<a href="http://$2" target="_blank">$2</a>');

    //Change email addresses to mailto: links.
    replacePattern3 = /(([a-zA-Z0-9\-\_\.])+@[a-zA-Z\_]+?(\.[a-zA-Z]{2,6})+)/gim;
    replacedText = replacedText.replace(replacePattern3, '<a href="mailto:$1">$1</a>');

    return replacedText;
}

Gilbertogilbertson answered 8/10, 2010 at 11:50 Comment(13)

How do I edit this code so that it does not harm embedded objects and iframes (YouTube embedded objects and iframes)? – Spastic 10/12, 2010 at 20:54

There's a bug in the code that matches email addresses here. [a-zA-Z]{2,6} should read something along the lines of (?:[a-zA-Z]{2,6})+ in order to match more complicated domain names, i.e. [email protected]. – Prognostic 19/8, 2011 at 15:7

I've run into some problems; first just http:// or http:// www (without space www even SO parses this wrong apparently) will create a link. And links with http:// www . domain . com (without spaces) will create one empty link and then one with an attached anchor closing tag in the href field. – Saint 18/10, 2011 at 21:36

What about URLs without http:// or www? Will this work for those kind of URLs? – Uxmal 1/12, 2011 at 19:41

Great code! There are some slight problems. Like Roshambo mentioned it cannot handle .co.uk in mailto links, also a <br /> just before www link (without the http://) will confuse it. It will insert the br tag inside the link for some reason. My regex skills are not enough to fix it, luckily the second issue isn't really a problem in my use-case and I don't really need the mailto :) – Heterosporous 21/5, 2013 at 9:19

I tried to edit the original post to fix the mailto problem, but I have to add at least 6 characters to make an edit. But if you change this line: replacePattern3 = /(\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,6})/gim; with this replacePattern3 = /(\w+@[a-zA-Z_]+?(\.[a-zA-Z]{2,6})+)/gim; that fixes the mailto problem :) – Bridgetbridgetown 14/6, 2013 at 18:17

This answer has been updated since @yourdeveloperfriend's comment and now includes a valid email regex pattern. – Hemicellulose 5/8, 2014 at 16:14

Runs into problems with links containing emails IE: http://[email protected] – Newmark 18/12, 2014 at 23:31

This one doesn't work if the link is preceded or followed by a <br/> tag. How could it be solved? – Beatnik 5/12, 2015 at 18:0

@cloud8421, love this, but found an issue with urls like [www.google.com] , which was working fine in the replacePattern1, but not in replacePattern2, so there is an upgrade to the script - whoever wants to check it out: jsfiddle.net/9zc8yq04 – Morganite 20/4, 2016 at 14:4

I think the regular expression doesn't work when there is a * in the URL which I believe is allowed. It can be fixed by adding \*. – Nipa 27/5, 2016 at 4:30

Sorry to downvote, but this does not work for urls like youtube.com/watch?v=MBPdKxlazD0 – Wisteria 24/7, 2018 at 13:22

Your function treats this Microsoft URL as an email address. Would be good to avoid processing as an email if one of the prior http or www checks has already been applied. outlook.office365.com/owa/calendar/… – Seema 10/2, 2021 at 3:42

Made some optimizations to Travis' Linkify() code above. I also fixed a bug where email addresses with subdomain type formats would not be matched (i.e. [email protected]).

In addition, I changed the implementation to prototype the String class so that items can be matched like so:

var text = '[email protected]';
text.linkify();

'http://stackoverflow.com/'.linkify();

Anyway, here's the script:

if (!String.prototype.linkify) {
    String.prototype.linkify = function() {

        // http://, https://, ftp://
        var urlPattern = /\b(?:https?|ftp):\/\/[a-z0-9-+&@#\/%?=~_|!:,.;]*[a-z0-9-+&@#\/%=~_|]/gim;

        // www. sans http:// or https://
        var pseudoUrlPattern = /(^|[^\/])(www\.[\S]+(\b|$))/gim;

        // Email addresses
        var emailAddressPattern = /[\w.]+@[a-zA-Z_-]+?(?:\.[a-zA-Z]{2,6})+/gim;

        return this
            .replace(urlPattern, '<a href="$&">$&</a>')
            .replace(pseudoUrlPattern, '$1<a href="http://$2">$2</a>')
            .replace(emailAddressPattern, '<a href="mailto:$&">$&</a>');
    };
}

Prognostic answered 19/8, 2011 at 15:3 Comment(5)

The best in my opinion, as Prototype functions make things so much cleaner :) – Silveira 25/1, 2014 at 15:35

it seems it doesn't work with such email addresses: [email protected] [email protected] etc.. – Albaalbacete 7/10, 2014 at 11:24

This doesn't work for the string "git clone [email protected]/ooo/bbb-cc-dd.git". It broke the string into chunks and created multiple anchors like this "git clone <a href="https://<a href="mailto:[email protected]">[email protected]</a>/ooo/bbb-cc-dd.git">https://<a href="mailto:[email protected]">[email protected]</a>/ooo/bbb-cc-dd.git</a>" – Rhapsodize 29/10, 2015 at 7:51

It doesn't work with + in email usernames, such as [email protected]. I fixed it with email pattern /[\w.+]+@[a-zA-Z_-]+?(?:\.[a-zA-Z]{2,6})+/gim (note the + in the first brackets), but I don't know if that breaks something else. – Passifloraceous 7/1, 2016 at 6:30
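The difference between the two email patterns discussed in the comments can be checked with a short sketch. The sample address and variable names are hypothetical, chosen only to exercise the "+" case.

```javascript
// Email pattern from the answer: "+" is not allowed in the local part.
var originalEmailPattern = /[\w.]+@[a-zA-Z_-]+?(?:\.[a-zA-Z]{2,6})+/gim;

// Variant from the comment: "+" added to the first character class.
var plusEmailPattern = /[\w.+]+@[a-zA-Z_-]+?(?:\.[a-zA-Z]{2,6})+/gim;

// Hypothetical sample text with a "+" in the user name.
var sampleEmailText = 'Write to first+tag@example.com today';

var originalMatches = sampleEmailText.match(originalEmailPattern);
var plusMatches = sampleEmailText.match(plusEmailPattern);

console.log(originalMatches); // [ 'tag@example.com' ]       (address is cut at the "+")
console.log(plusMatches);     // [ 'first+tag@example.com' ]
```

With the original pattern the generated mailto: link would silently point at the wrong mailbox, which is arguably worse than not linking at all.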

Thanks, this was very helpful. I also wanted something that would link things that looked like a URL -- as a basic requirement, it'd link something like www.yahoo.com, even if the http:// protocol prefix was not present. So basically, if "www." is present, it'll link it and assume it's http://. I also wanted emails to turn into mailto: links. EXAMPLE: www.yahoo.com would be converted to <a href="http://www.yahoo.com" target="_blank">www.yahoo.com</a>

Here's the code I ended up with (combination of code from this page and other stuff I found online, and other stuff I did on my own):

function Linkify(inputText) {
    //URLs starting with http://, https://, or ftp://
    var replacePattern1 = /(\b(https?|ftp):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gim;
    var replacedText = inputText.replace(replacePattern1, '<a href="$1" target="_blank">$1</a>');

    //URLs starting with www. (without // before it, or it'd re-link the ones done above)
    var replacePattern2 = /(^|[^\/])(www\.[\S]+(\b|$))/gim;
    var replacedText = replacedText.replace(replacePattern2, '$1<a href="http://$2" target="_blank">$2</a>');

    //Change email addresses to mailto: links
    var replacePattern3 = /(\w+@[a-zA-Z_]+?\.[a-zA-Z]{2,6})/gim;
    var replacedText = replacedText.replace(replacePattern3, '<a href="mailto:$1">$1</a>');

    return replacedText;
}

In the 2nd replace, the (^|[^/]) part is only replacing www.whatever.com if it's not already prefixed by // -- to avoid double-linking if a URL was already linked in the first replace. Also, it's possible that www.whatever.com might be at the beginning of the string, which is the first "or" condition in that part of the regex.
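The behaviour of that second pattern can be checked on its own. The sketch below applies only replacePattern2 to a hypothetical sample string containing one bare www. address and one address that already has a scheme.

```javascript
// Second pattern from the function above: "www." not directly preceded by "/".
var wwwPattern = /(^|[^\/])(www\.[\S]+(\b|$))/gim;

// Hypothetical input: one bare www. address, one with an explicit scheme.
var wwwSample = 'www.example.com and http://www.example.org';

var wwwResult = wwwSample.replace(wwwPattern, '$1<a href="http://$2" target="_blank">$2</a>');

// Only the bare address is linked; the one after "//" is left for the first pattern.
console.log(wwwResult);
// <a href="http://www.example.com" target="_blank">www.example.com</a> and http://www.example.org
```

Note that the guard only looks at the single character before "www.", so it does not protect against addresses embedded elsewhere in an already linked URL.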

This could be integrated as a jQuery plugin as Jesse P illustrated above -- but I specifically wanted a regular function that wasn't acting on an existing DOM element, because I'm taking text I have and then adding it to the DOM, and I want the text to be "linkified" before I add it, so I pass the text through this function. Works great.

Lineup answered 29/1, 2010 at 23:55 Comment(3)

There's a problem with the 2nd pattern, which matches plain "www.domain.com" all by itself. The problem exists when url has some sort of referrer in it, like: &location=http%3A%2F%2Fwww.amazon.com%2FNeil-Young%2Fe%2FB000APYJWA%3Fqid%3D1280679945%26sr%3D8-2-ent&tag=tra0c7-20&linkCode=ur2&camp=1789&creative=9325 - in which case the link auto linked again. A quick fix is to add the character "f" after the negated list that contains "/". So the expression is: replacePattern2 = /(^|[^\/f])(www\.[\S]+(\b|$))/gim – Sikkim 19/11, 2012 at 4:39

The code above will fail a lot of tests for edge cases. When detecting URLs, it's better to rely on a specialized library. Here's why. – Beeswing 21/2, 2014 at 11:15

I just ran it on a string where some of the web links do already have a href links on them. In this case it fails messing up the existing working links. – Katherinkatherina 9/4, 2014 at 15:2

Identifying URLs is tricky because they are often surrounded by punctuation marks and because users frequently do not use the full form of the URL. Many JavaScript functions exist for replacing URLs with hyperlinks, but I was unable to find one that works as well as the urlize filter in the Python-based web framework Django. I therefore ported Django's urlize function to JavaScript:

https://github.com/ljosa/urlize.js

An example:

urlize('Go to SO (stackoverflow.com) and ask. <grin>', 
       {nofollow: true, autoescape: true})
=> "Go to SO (<a href="http://stackoverflow.com" rel="nofollow">stackoverflow.com</a>) and ask. &lt;grin&gt;"

The nofollow option, if true, causes rel="nofollow" to be inserted. The autoescape option, if true, escapes characters that have special meaning in HTML. See the README file.

Plaza answered 8/5, 2012 at 12:2 Comment(4)

Also works with html source like: www.web.com < a href = " https :// github . com " > url < / a > some text – Smooth 25/5, 2012 at 14:50

@Paulius: if you set the option django_compatible to false, it will handle that use case a little better. – Plaza 26/5, 2012 at 11:29

Django's urlize doesn't support TLDs properly (at least not the JS port on GitHub). A library that handles TLDs properly is Ben Alman's JavaScript Linkify. – Beeswing 21/2, 2014 at 2:18

Support for detecting URLs with additional top-level domains even when the URL does not start with "http" or "www" has been added. – Plaza 21/2, 2014 at 14:34

I searched on google for anything newer and ran across this one:

$('p').each(function(){
   $(this).html( $(this).html().replace(/((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/g, '<a href="$1">$1</a> ') );
});

demo: http://jsfiddle.net/kachibito/hEgvc/1/

Works really well for normal links.

Licking answered 24/3, 2016 at 14:19 Comment(3)

What is "Normal links" here? Look at fork of your demo here: jsfiddle.net/hEgvc/27 People would cover uncovered and would make this in easy way. URI is not easy thing as per RFC3986 and if you would like to cover "Normal links" only, I suggest to follow this regexp at least: ^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))? – Measurable 25/3, 2016 at 8:31

I meant anything in the format http://example.com/folder/folder/folder/ or https://example.org/blah etc - just your typical non-crazy URL format that will match 95-99% of use cases out there. I am using this for an internal administrative area, so I don't need anything fancy to catch edge-cases or hashlinks. – Licking 25/3, 2016 at 18:6

Thanks yours finally helped me with what I needed! I just had to amend it a bit: /(?:^|[^"'>])((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/gi – Afferent 23/4, 2021 at 9:31
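The regular expression quoted in the first comment above is the generic URI-splitting pattern from RFC 3986, Appendix B. It does not validate or detect URLs in running text; it only breaks a string that is already a URI into components. A minimal sketch, with the function name splitUri being an assumption for illustration:

```javascript
// RFC 3986 Appendix B pattern, with "/" escaped for a JavaScript regex literal.
var uriParts = /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Hypothetical helper: map the RFC's capture groups to named components.
function splitUri(uri) {
  var m = uriParts.exec(uri);
  return {
    scheme: m[2],    // undefined when the component is absent
    authority: m[4],
    path: m[5],
    query: m[7],
    fragment: m[9]
  };
}

// Sample URI taken from the RFC's own example.
var parts = splitUri('http://www.ics.uci.edu/pub/ietf/uri/#Related');
console.log(parts.scheme, parts.authority, parts.path, parts.query, parts.fragment);
// http www.ics.uci.edu /pub/ietf/uri/ undefined Related
```

Because every group is optional, the pattern matches any string at all, which is why it cannot be used by itself to find URLs inside free text.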

I changed the emailAddressPattern in Roshambo's String.linkify() to recognize [email protected] addresses

if (!String.prototype.linkify) {
    String.prototype.linkify = function() {

        // http://, https://, ftp://
        var urlPattern = /\b(?:https?|ftp):\/\/[a-z0-9-+&@#\/%?=~_|!:,.;]*[a-z0-9-+&@#\/%=~_|]/gim;

        // www. sans http:// or https://
        var pseudoUrlPattern = /(^|[^\/])(www\.[\S]+(\b|$))/gim;

        // Email addresses *** here I've changed the expression ***
        var emailAddressPattern = /(([a-zA-Z0-9_\-\.]+)@[a-zA-Z_]+?(?:\.[a-zA-Z]{2,6}))+/gim;

        return this
            .replace(urlPattern, '<a target="_blank" href="$&">$&</a>')
            .replace(pseudoUrlPattern, '$1<a target="_blank" href="http://$2">$2</a>')
            .replace(emailAddressPattern, '<a target="_blank" href="mailto:$1">$1</a>');
    };
}

Winifredwinikka answered 21/8, 2011 at 14:15 Comment(1)

The code above will fail a lot of tests for edge cases. When detecting URLs, it's better to rely on a specialized library. Here's why. – Beeswing 21/2, 2014 at 11:16

/**
 * Convert URLs in a string to anchor buttons
 * @param {!string} string
 * @returns {!string}
 */

function URLify(string){
  var urls = string.match(/(((ftp|https?):\/\/)[\-\w@:%_\+.~#?,&\/\/=]+)/g);
  if (urls) {
    urls.forEach(function (url) {
      string = string.replace(url, '<a target="_blank" href="' + url + '">' + url + "</a>");
    });
  }
  return string.replace("(", "<br/>(");
}

simple example

Dietary answered 8/4, 2019 at 14:21 Comment(0)

The best script to do this: http://benalman.com/projects/javascript-linkify-process-lin/

Cofferdam answered 25/6, 2010 at 5:18 Comment(1)

Too bad the author hasn't maintained it since 2009. I'm summarizing URL parsing alternatives. – Beeswing 21/2, 2014 at 5:43

This solution works like many of the others, and in fact uses the same regex as one of them. However, instead of returning an HTML string, it returns a document fragment containing the A element and any applicable text nodes.

function make_link(string) {
    var words = string.split(' '),
        ret = document.createDocumentFragment();
    for (var i = 0, l = words.length; i < l; i++) {
        if (words[i].match(/[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi)) {
            var elm = document.createElement('a');
            elm.href = words[i];
            elm.textContent = words[i];
            if (ret.childNodes.length > 0) {
                ret.lastChild.textContent += ' ';
            }
            ret.appendChild(elm);
        } else {
            if (ret.lastChild && ret.lastChild.nodeType === 3) {
                ret.lastChild.textContent += ' ' + words[i];
            } else {
                ret.appendChild(document.createTextNode(' ' + words[i]));
            }
        }
    }
    return ret;
}

There are some caveats, namely with older IE and textContent support.

here is a demo.

Osteoma answered 22/11, 2012 at 19:3 Comment(5)

@DanDascalescu Instead of blanket downvoting the lot maybe provide your said edge cases. – Osteoma 21/2, 2014 at 11:58

so there are edge cases. wonderful. these answers still may be useful to others and blanket downvoting them seems like overkill. The other answers you've commented on and seemingly downvoted do contain useful information (as well as your answer). not everyone will come against said cases, and not everyone will want to use a library. – Osteoma 21/2, 2014 at 12:5

Exactly. Those who don't understand the limitations of regexps are those who will happily skim the first regexp from the most upvoted answer and run with it. Those are the people who should use libraries the most. – Beeswing 21/2, 2014 at 12:8

But how is that justification to down vote every answer with non-your-prefered-solutions regexp? – Osteoma 21/2, 2014 at 12:11

So that an actually useful answer bubbles up towards the top. People's attention span is short, and the paradox of choice indicates that they'll stop looking into answer beyond the Nth. – Beeswing 21/2, 2014 at 12:17

If you need to show a shorter link (only the domain) but keep the same long URL, you can try my modification of Sam Hasler's code posted above

function replaceURLWithHTMLLinks(text) {
    var exp = /(\b(https?|ftp|file):\/\/([-A-Z0-9+&@#%?=~_|!:,.;]*)([-A-Z0-9+&@#%?\/=~_|!:,.;]*)[-A-Z0-9+&@#\/%=~_|])/ig;
    return text.replace(exp, "<a href='$1' target='_blank'>$3</a>");
}

Gagliano answered 9/12, 2011 at 8:42 Comment(0)

Reg Ex: /(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*)/ig

function UriphiMe(text) {
      var exp = /(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]*)/ig; 
      return text.replace(exp,"<a href='$1'>$1</a>");
}

Below are some tested strings:

Find me on to www.google.com
www
Find me on to www.http://www.com
Follow me on : http://www.nishantwork.wordpress.com
http://www.nishantwork.wordpress.com
Follow me on : http://www.nishantwork.wordpress.com
https://stackoverflow.com/users/430803/nishant

Note: If you don't want to treat a bare www as valid, use the regex below instead: /(\b((https?|ftp|file):\/\/|(www))[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig

Naoma answered 30/1, 2014 at 11:58 Comment(1)

The code above will fail a lot of tests for edge cases. When detecting URLs, it's ALWAYS better to rely on a specialized library. Here's why. – Beeswing 21/2, 2014 at 5:31

The warnings about URI complexity should be noted, but the simple answer to your question is:
To replace every match you need to add the /g flag to the end of the RegEx:
/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/gi

Buccal answered 2/5, 2016 at 18:11 Comment(0)

Try the below function :

function anchorify(text){
  var exp = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
  var text1=text.replace(exp, "<a href='$1'>$1</a>");
  var exp2 =/(^|[^\/])(www\.[\S]+(\b|$))/gim;
  return text1.replace(exp2, '$1<a target="_blank" href="http://$2">$2</a>');
}

alert(anchorify("Hola amigo! https://www.sharda.ac.in/academics/"));

Wiatt answered 12/3, 2019 at 4:51 Comment(1)

Works great with https:// https://www. http:// http://www. www. – Preestablish 25/5, 2021 at 12:59

Keep it simple! Say what you cannot have, rather than what you can have :)

As mentioned above, URLs can be quite complex, especially after the '?', and not all of them start with a 'www.' e.g. maps.bing.com/something?key=!"£$%^*()&lat=65&lon&lon=20

So, rather than have a complex regex that won't meet all edge cases and will be hard to maintain, how about this much simpler one, which works well for me in practice.

Match

http(s):// (anything but a space)+

www. (anything but a space)+

Where 'anything' is [^'"<>\s] ... basically a greedy match, carrying on until you meet a space, quote, angle bracket, or end of line

Also:

Remember to check that it is not already in URL format, e.g. the text contains href="..." or src="..."

Add rel=nofollow (if appropriate)

This solution isn't as "good" as the libraries mentioned above, but is much simpler, and works well in practice.

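// Wrapper name is illustrative; the original snippet was a bare function body.
function linkifySimple(html) {
    // Text already contains a hyperlink or embedded resource: leave it alone
    if (html.match(/(href)|(src)/i)) {
        return html;
    }

    // http:// and https:// URLs
    html = html.replace(
        /\b(https?:\/\/[^\s\(\)\'\"\<\>]+)/ig,
        "<a rel='nofollow' href='$1'>$1</a>"
    );

    // www. URLs preceded by whitespace ($1 keeps that whitespace)
    html = html.replace(
        /(\s)(www\.[^\s\(\)\'\"\<\>]+)/ig,
        "$1<a rel='nofollow' href='http://$2'>$2</a>"
    );

    // www. URLs at the very start of the text
    html = html.replace(
        /^(www\.[^\s\(\)\'\"\<\>]+)/ig,
        "<a rel='nofollow' href='http://$1'>$1</a>"
    );

    return html;
}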

Max answered 27/5, 2014 at 10:58 Comment(0)

Correct URL detection with support for international domains and astral characters is not a trivial thing. The linkify-it library builds its regex from many conditions, and the final size is about 6 kilobytes :) . It's more accurate than all the libs currently referenced in the accepted answer.

See the linkify-it demo to check all edge cases live and test your own.

If you need to linkify HTML source, you should parse it first, and iterate each text token separately.
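As a rough illustration of that idea, the sketch below splits HTML into tag and text tokens and linkifies only text that is not already inside an anchor. Both function names are hypothetical, the regex split is a naive stand-in for a real HTML parser (it ignores comments, scripts, and ">" inside attribute values), and simpleLinkify is a placeholder for whichever detection library is actually used.

```javascript
// Hypothetical helper: apply linkifyText only to text tokens outside <a>...</a>.
function linkifyHtmlSource(html, linkifyText) {
  // The capturing group keeps the tags in the split result.
  var tokens = html.split(/(<[^>]+>)/);
  var anchorDepth = 0;
  return tokens.map(function (token) {
    var isTag = token.charAt(0) === '<' && token.charAt(token.length - 1) === '>';
    if (isTag) {
      if (/^<a[\s>]/i.test(token)) {
        anchorDepth++;                               // entering an existing link
      } else if (/^<\/a\s*>/i.test(token)) {
        anchorDepth = Math.max(0, anchorDepth - 1);  // leaving it
      }
      return token;                                  // tags are never modified
    }
    return anchorDepth > 0 ? token : linkifyText(token);
  }).join('');
}

// Placeholder detector (assumption): links only http(s):// runs of non-space characters.
function simpleLinkify(text) {
  return text.replace(/\bhttps?:\/\/[^\s<>"']+/gi, function (url) {
    return '<a href="' + url + '">' + url + '</a>';
  });
}

var htmlInput = '<p>See http://example.com and <a href="http://example.org">http://example.org</a></p>';
var htmlOutput = linkifyHtmlSource(htmlInput, simpleLinkify);
console.log(htmlOutput);
```

The existing anchor and its href attribute pass through untouched, while the bare URL in the paragraph text gets wrapped.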

Malvern answered 16/5, 2015 at 19:50 Comment(0)

I've written yet another JavaScript library. It might be better for you, since it's very sensitive with the fewest possible false positives, fast, and small in size. I'm currently actively maintaining it, so please do test it on the demo page and see how it would work for you.

link: https://github.com/alexcorvi/anchorme.js

Citrus answered 2/3, 2016 at 21:26 Comment(1)

What's the name of the npm package @Alex C – Fussbudget 22/9, 2022 at 21:50

I had to do the opposite, and make html links into just the URL, but I modified your regex and it works like a charm, thanks :)

var exp = /<a\s.*href=['"](\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])['"].*>.*<\/a>/ig;

source = source.replace(exp,"$1");

Creodont answered 27/4, 2009 at 3:20 Comment(2)

I don't see the point of your regex. It matches everything replacing everything with everything. In effect your code does nothing. – Zamia 27/4, 2009 at 3:24

I guess I should wait to comment to allow for people to finish editing. sorry. – Zamia 27/4, 2009 at 3:27

The e-mail detection in Travitron's answer above did not work for me, so I extended/replaced it with the following (C# code).

// Change e-mail addresses to mailto: links.
const RegexOptions o = RegexOptions.Multiline | RegexOptions.IgnoreCase;
const string pat3 = @"([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,6})";
const string rep3 = @"<a href=""mailto:$1@$2.$3"">$1@$2.$3</a>";
text = Regex.Replace(text, pat3, rep3, o);

This allows for e-mail addresses like "[email protected]".

Summation answered 12/2, 2010 at 8:2 Comment(2)

The code above will fail a lot of tests for edge cases. When detecting URLs, it's ALWAYS better to rely on a specialized library. Here's why. – Beeswing 21/2, 2014 at 5:32

Thanks, @DanDascalescu Usually, it is always better to over-generalize. – Summation 21/2, 2014 at 5:58

After input from several sources I've now a solution that works well. It had to do with writing your own replacement code.

Answer.

Fiddle.

function replaceURLWithHTMLLinks(text) {
    var re = /(\(.*?)?\b((?:https?|ftp|file):\/\/[-a-z0-9+&@#\/%?=~_()|!:,.;]*[-a-z0-9+&@#\/%=~_()|])/ig;
    return text.replace(re, function(match, lParens, url) {
        var rParens = '';
        lParens = lParens || '';

        // Try to strip the same number of right parens from url
        // as there are left parens.  Here, lParenCounter must be
        // a RegExp object.  You cannot use a literal
        //     while (/\(/g.exec(lParens)) { ... }
        // because an object is needed to store the lastIndex state.
        var lParenCounter = /\(/g;
        while (lParenCounter.exec(lParens)) {
            var m;
            // We want m[1] to be greedy, unless a period precedes the
            // right parenthesis.  These tests cannot be simplified as
            //     /(.*)(\.?\).*)/.exec(url)
            // because if (.*) is greedy then \.? never gets a chance.
            if (m = /(.*)(\.\).*)/.exec(url) ||
                    /(.*)(\).*)/.exec(url)) {
                url = m[1];
                rParens = m[2] + rParens;
            }
        }
        return lParens + "<a href='" + url + "'>" + url + "</a>" + rParens;
    });
}

Pile answered 4/11, 2013 at 16:59 Comment(2)

The code above (and most regular expressions in general) will fail a lot of tests for edge cases. When detecting URLs, it's better to rely on a specialized library. Here's why. – Beeswing 21/2, 2014 at 11:17

Dan, Is there such a library? Though in this case we'd still be matching the above regex so that the code can never output garbage when something garbage like(even if another library certifies the garbage as a valid URL/URI) is used as input. – Pile 12/1, 2015 at 9:33

Here's my solution:

var content = "Visit https://wwww.google.com or watch this video: https://www.youtube.com/watch?v=0T4DQYgsazo and news at http://www.bbc.com";
content = replaceUrlsWithLinks(content, "http://");
content = replaceUrlsWithLinks(content, "https://");

function replaceUrlsWithLinks(content, protocol) {
    var startPos = 0;
    var s = 0;

    while (s < content.length) {
        startPos = content.indexOf(protocol, s);

        if (startPos < 0)
            return content;

        let endPos = content.indexOf(" ", startPos + 1);

        if (endPos < 0)
            endPos = content.length;

        let url = content.substr(startPos, endPos - startPos);

        if (url.endsWith(".") || url.endsWith("?") || url.endsWith(",")) {
            url = url.substr(0, url.length - 1);
            endPos--;
        }

        if (validUrl(url)) {
            let link = "<a href='" + url + "'>" + url + "</a>";
            content = content.substr(0, startPos) + link + content.substr(endPos);
            s = startPos + link.length;
        } else {
            s = endPos + 1;
        }
    }

    return content;
}

function validUrl(url) {
    try {
        new URL(url);
        return true;
    } catch (e) {
        return false;
    }
}

Wisteria answered 24/7, 2018 at 13:29 Comment(0)

Try Below Solution

function replaceLinkClickableLink(url = '') {
let pattern = new RegExp('^(https?:\\/\\/)?'+
        '((([a-z\\d]([a-z\\d-]*[a-z\\d])*)\\.?)+[a-z]{2,}|'+
        '((\\d{1,3}\\.){3}\\d{1,3}))'+
        '(\\:\\d+)?(\\/[-a-z\\d%_.~+]*)*'+
        '(\\?[;&a-z\\d%_.~+=-]*)?'+
        '(\\#[-a-z\\d_]*)?$','i');

let isUrl = pattern.test(url);
if (isUrl) {
    return `<a href="${url}" target="_blank">${url}</a>`;
}
return url;
}

Theriot answered 8/4, 2019 at 9:37 Comment(0)

Replace URLs in text with HTML links, ignore the URLs within a href/pre tag. https://github.com/JimLiu/auto-link

Cruiser answered 11/6, 2015 at 21:31 Comment(0)

worked for me :

var urlRegex =/(\b((https?|ftp|file):\/\/)?((([a-z\d]([a-z\d-]*[a-z\d])*)\.)+[a-z]{2,}|((\d{1,3}\.){3}\d{1,3}))(\:\d+)?(\/[-a-z\d%_.~+]*)*(\?[;&a-z\d%_.~+=-]*)?(\#[-a-z\d_]*)?)/ig;

return text.replace(urlRegex, function(url) {
    var newUrl = url.indexOf("http") === -1 ? "http://" + url : url;
    return '<a href="' + newUrl + '">' + url + '</a>';
});

Garlan answered 19/4, 2021 at 15:57 Comment(1)

this one does not work if url has = sign in it – Julius 14/10, 2022 at 6:10
