URL Regular Expression & JavaScript Link Shortener

Detecting URLs in text is something you will probably need sooner or later when parsing text in your applications, and shortening them can also be quite handy (as Twitter does, for example).

Matching URLs is not so difficult, but matching them and grouping all URL fragments is, especially if you want to do that in one match. So, here is one URL Regular Expression to rule them all.


A regular expression (or RegEx) is a small pattern language whose purpose is to match particular words, patterns or characters within a text string.

Someone once said RegEx is like a language of its own. At first it may look like a bunch of random characters, but it’s actually a pretty useful technique for both simple and complex matching of whatever you need within a text string. It is supported by all major programming languages (PHP, Perl, JavaScript, Java, .NET, etc.).


We all know what a URL looks like. Here you can see URL anatomy (along with good SEO practices), or just search the web for it.

URL Regular Expression

In order to match and group all the fragments of a URL, across all the complex variations URLs come in, our RegEx is a little bit longer (324 chars), but it captures and groups all parts nicely (tested in PHP and JavaScript).

/* URL-RegEx 1.2 */


It captures 9 groups (plus group zero, which contains the entire URL). If a fragment doesn’t exist in the URL, its group comes back empty.

  0. Entire URL – the URL being parsed
  1. Protocol – http, https, ftp
  2. Userinfo – username:password
  3. Domain – www.mydomain.com, mydomain.com, localhost…
  4. Port – 80
  5. Path / Folders – /folder/dir/
  6. Page / Filename – e.g. index
  7. File extension – .html, .php…
  8. Query – item=value&item2=value2
  9. Anchor – #home
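
To make the group order concrete, here is a minimal sketch that maps the groups onto named properties. It assumes the pattern shown later in this post; `parseUrl` and the property names are hypothetical, chosen only for this illustration.

```javascript
// Assumption: the pattern from this post, used here without the g flag
// so that exec() always starts searching at index 0.
var urlRegex = /\(?(?:(http|https|ftp):\/\/)?(?:((?:[^\W\s]|\.|-|[:]{1})+)@{1})?((?:www.)?(?:[^\W\s]|\.|-)+[\.][^\W\s]{2,4}|localhost(?=\/)|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::(\d*))?([\/]?[^\s\?]*[\/]{1})*(?:\/?([^\s\n\?\[\]\{\}\#]*(?:(?=\.)){1}|[^\s\n\?\[\]\{\}\.\#]*)?([\.]{1}[^\s\?\#]*)?)?(?:\?{1}([^\s\n\#\[\]]*))?([\#][^\s\n]*)?\)?/i;

// parseUrl is a hypothetical helper, not part of the original post.
// Fragments that are missing from the URL come back as undefined.
function parseUrl(str) {
    var m = urlRegex.exec(str);
    if (!m) {
        return null;
    }
    return {
        url: m[0],
        protocol: m[1],
        userinfo: m[2],
        host: m[3],
        port: m[4],
        path: m[5],
        filename: m[6],
        extension: m[7],
        query: m[8],
        fragment: m[9]
    };
}

var parts = parseUrl('http://www.youtube.com/watch?v=hVW9eH_PUi8');
console.log(parts.host, parts.filename, parts.query);
// www.youtube.com watch v=hVW9eH_PUi8
```

Dropping the `g` flag here is a deliberate choice: a global regex remembers `lastIndex` between `exec()` calls, which can give surprising results when the same pattern object is reused on different strings.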

You can see it in action, and also test your own URLs:
Test RegEx

There is just one little catch. Because brackets () are allowed in URLs, imagine someone puts a URL inside brackets that are not part of the URL:

Our blog (http://someweblog.com) is awesome, isn't it?

So, in order not to mix up a bracket that is not part of the URL, we capture the brackets before and after the URL as well. All you have to do after matching is check whether both brackets exist and remove them. Something like this (in JavaScript):

if ( str.charAt(0) == '(' && str.charAt( str.length-1 ) == ')' ) {
    str = str.slice(1,-1);
}
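
Put together with the pattern from this post, the bracket check could be applied to every match in a piece of text. This is only a sketch: `extractUrls` is a hypothetical helper name, and it assumes the regex shown further down.

```javascript
// Assumption: the pattern from this post, with the g flag so match() returns every URL.
var urlRegex = /\(?(?:(http|https|ftp):\/\/)?(?:((?:[^\W\s]|\.|-|[:]{1})+)@{1})?((?:www.)?(?:[^\W\s]|\.|-)+[\.][^\W\s]{2,4}|localhost(?=\/)|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::(\d*))?([\/]?[^\s\?]*[\/]{1})*(?:\/?([^\s\n\?\[\]\{\}\#]*(?:(?=\.)){1}|[^\s\n\?\[\]\{\}\.\#]*)?([\.]{1}[^\s\?\#]*)?)?(?:\?{1}([^\s\n\#\[\]]*))?([\#][^\s\n]*)?\)?/gi;

// extractUrls is a hypothetical helper used only for this sketch.
function extractUrls(text) {
    // match() returns null when nothing is found, so fall back to an empty list
    var matches = text.match(urlRegex) || [];
    return matches.map(function (str) {
        // strip the brackets only when both the opening and closing one were captured
        if (str.charAt(0) == '(' && str.charAt(str.length - 1) == ')') {
            str = str.slice(1, -1);
        }
        return str;
    });
}

var urls = extractUrls("Our blog (http://someweblog.com) is awesome, isn't it?");
console.log(urls);
// [ 'http://someweblog.com' ]
```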

How does it work?

If you really want to know how, or have some uncertainties, just write below in comments. I just couldn’t get myself to write this right now, but I will if someone is interested.
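
As a starting point, one way to read the pattern is to split it into the fragments listed earlier and glue them back together. The sketch below is only an interpretation of the regex from this post: the piece boundaries and the comments are assumptions, not the author's own explanation.

```javascript
// Each entry is one slice of the pattern from this post, in order.
var pieces = [
    // optional opening bracket around the URL
    /\(?/,
    // group 1: protocol, followed by ://
    /(?:(http|https|ftp):\/\/)?/,
    // group 2: userinfo (word characters, dots, dashes, colons), followed by @
    /(?:((?:[^\W\s]|\.|-|[:]{1})+)@{1})?/,
    // group 3: a dotted domain name, localhost followed by a slash, or an IPv4-style address
    /((?:www.)?(?:[^\W\s]|\.|-)+[\.][^\W\s]{2,4}|localhost(?=\/)|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/,
    // group 4: port digits after a colon
    /(?::(\d*))?/,
    // group 5: folders, each ending in a slash (keeps the last repetition)
    /([\/]?[^\s\?]*[\/]{1})*/,
    // group 6: filename, group 7: extension starting with a dot
    /(?:\/?([^\s\n\?\[\]\{\}\#]*(?:(?=\.)){1}|[^\s\n\?\[\]\{\}\.\#]*)?([\.]{1}[^\s\?\#]*)?)?/,
    // group 8: query string after ?
    /(?:\?{1}([^\s\n\#\[\]]*))?/,
    // group 9: anchor, including the leading #
    /([\#][^\s\n]*)?/,
    // optional closing bracket
    /\)?/
];

// Join the sources back into a single global, case-insensitive pattern.
var urlRegex = new RegExp(pieces.map(function (p) { return p.source; }).join(''), 'gi');

console.log('http://www.youtube.com/watch?v=hVW9eH_PUi8'.match(urlRegex));
```

Only the domain piece is mandatory; every other piece is optional, which is why a bare domain such as a host name with a dot can match on its own.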

JavaScript link shortener

Now here comes the fun part: once we’ve matched a URL, we can do whatever we like with it. For example, you no longer have to worry about long links messing up your text. Here is a Twitter-like way of doing it:

var shortenUrl = function(url,protocol,userinfo,host,port,path,filename,ext,query,fragment) {
    // set url length limit
    var limit = 20,
        show_www = false;
    // remove brackets if URL inside them
    if ( url.charAt(0) == '(' && url.charAt( url.length-1 ) == ')' ) {
        url = url.slice(1,-1);
    }
    // add protocol if doesn't exist
    if ( !protocol ) {
        url = 'http://' + url;
    }
    // create new url to show
    var domain = show_www ? host : host.replace(/www\./gi, '');
    var visibleUrl = domain + (path || '/') + (filename || '') + (ext || '') + (query ? '?'+query : '') + (fragment || '');
    // shorten URL if bigger than limit
    if ( visibleUrl.length > limit && domain.length < limit ) {
        visibleUrl = visibleUrl.slice(0, limit) + '...';
    }
    // link the shortened text to the full URL
    return '<a href="' + url + '">' + visibleUrl + '</a>';
};

// our URL RegEx
var urlRegex = /\(?(?:(http|https|ftp):\/\/)?(?:((?:[^\W\s]|\.|-|[:]{1})+)@{1})?((?:www.)?(?:[^\W\s]|\.|-)+[\.][^\W\s]{2,4}|localhost(?=\/)|\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})(?::(\d*))?([\/]?[^\s\?]*[\/]{1})*(?:\/?([^\s\n\?\[\]\{\}\#]*(?:(?=\.)){1}|[^\s\n\?\[\]\{\}\.\#]*)?([\.]{1}[^\s\?\#]*)?)?(?:\?{1}([^\s\n\#\[\]]*))?([\#][^\s\n]*)?\)?/gi;

// some text with link
var text = 'Awesome tune, check it out! http://www.youtube.com/watch?v=hVW9eH_PUi8';

// magic
text = text.replace(urlRegex, shortenUrl);

The result: Awesome tune, check it out! youtube.com/watch?v=...
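
The shortening rule itself can be looked at in isolation. The following is a small sketch under the same 20-character limit; `truncateVisible` is a hypothetical helper name, and the example domains are made up for illustration.

```javascript
// truncateVisible is a hypothetical helper that isolates the shortening rule:
// cut the visible text at `limit` characters, but only when the domain itself fits.
function truncateVisible(domain, rest, limit) {
    var visible = domain + rest;
    if (visible.length > limit && domain.length < limit) {
        return visible.slice(0, limit) + '...';
    }
    return visible;
}

console.log(truncateVisible('youtube.com', '/watch?v=hVW9eH_PUi8', 20));
// youtube.com/watch?v=...
console.log(truncateVisible('example.com', '/', 20));
// example.com/
console.log(truncateVisible('a-very-long-domain-name.example.org', '/page', 20));
// a-very-long-domain-name.example.org/page
```

The second condition means a domain that is already longer than the limit is never cut, so the reader can always see the full host name.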

We hope that you find this technique useful. If you notice any bugs, or have suggestions about the RegEx, please feel free to write below in the comments.


  1. Scott Feinstein

    Very nice! I like that you provide a variety of test cases. I notice the regex doesn’t match:

    Without protocol, without slash but with query string params
    www test. test.com?foo=bar

    1. Some Web Guy

      Yeah, haven’t tried that case. I will fix it as soon as possible.
      EDIT: Fixed ;)

    1. Chris

      ^ it removed the bracket. Anyway so it accept the bracket: “[ID]”;

  3. Alexander Griffioen

    Ha! Many, many thanks! Was hoping to find a reg ex for URL detection, but you gave me *exactly* what I needed it for :)

  4. David

    a couple of cases I found that don’t seem to work as expected:

    thrivehive.com does not match anything (trailing slash is required)
    http://thrivehive.com does not match anything (trailing slash is required)
    http://thrivehive.co.uk returns .uk as the file extension (unless you have a trailing slash)

    1. Some Web Guy

      Works fine for me, try it in “test regex” link, you’ll see they work.

      1. David

        It works in the “test regex” link, but I think that’s because they’re surrounded by empty space. When I pass a url to match as a string it says that “.uk” is the file extension.

        For example, in javascript, this works: ” http://example.com “.match(pattern), but this doesn’t: “http://example.com”.match(pattern)

          1. David

            Awesome, thank you so much. This is an awesome regex, really!

            Have you seen John Gruber’s “An Improved Liberal, Accurate Regex Pattern for Matching URLs”? It’s pretty good. I did a write up comparing yours to his, here: http://bit.ly/15xStKW One thing missing from your regex is support for special characters, like ✪. Would you consider including this?

          2. David

            Also, I don’t know if this is a use case you care about, but links that provide a username and password (e.g. for ftp servers) fail to match. For example, “ftp://user:pass@perlide.org/pub/Makefile.PL”

          3. Some Web Guy

            Nice article :)

            Yeah, I’ll definitely update it to support special characters, I’ve missed that.

            Also, I’ll improve ftp match, thanks for pointing it out.

            And those brackets (), have to check why it didn’t pick them up.

            So expect a new version in a day or two ;)

  6. Jaya

    Good One! Just failing this case:
    var string='<a href="http://www.google.com">test</a>';
    I want to get only the URL from this string, but it’s giving me the wrong match: http://www.google.com">test


  10. sonu dhakar

    This is awesome, and working for me.
