Validate IPv4, IPv6 and hostname

I'm working on a .NET project that connects to different machines using an IP address that the user inputs.
I'm trying to validate the entered IP address using a regular expression. I've searched the internet for some time now, and I cannot find a proper regex.

I wrote a little program to test the regex, see here (the IP addresses were generated randomly; I'm sorry if some of them belong to someone).

Can you help me find a viable solution for validating the user input on the client side? (It can be an IPv4 address, an IPv6 address, or a hostname; the port is not included in the address.)

Thanks.

Shouse answered 9/2, 2012 at 10:5 Comment(5)
The regexp for validating an IPv4 address can be found in this question. - Mercenary
The updated test case is wrong, 1.2.3.4 should definitely pass as it's a valid IPv4 address, isn't it? - Impetus
@MikulasDite Yes, you are right, I've been copy-pasting test cases :D - Shouse
I like that the OP pointed out initially that he is working in .NET, and all the answers people posted were in JavaScript :D - Blockhouse
Please use my answer from this post: #23484355, it's the most accurate one so far. Bear in mind it won't validate hostnames, but you could add that or modify it if you want to. - Lothair
S
38

I managed to put together a regex that matches every IPv6 address, IPv4 address and hostname that I can think of. Unfortunately, some strings that look like invalid IP addresses are valid hostnames, but I guess that is OK.
So here's the regex :) The test program can be found here.

(^\s*((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))\s*$)|(^\s*((?=.{1,255}$)[0-9A-Za-z](?:(?:[0-9A-Za-z]|\b-){0,61}[0-9A-Za-z])?(?:\.[0-9A-Za-z](?:(?:[0-9A-Za-z]|\b-){0,61}[0-9A-Za-z])?)*\.?)\s*$)|(^\s*((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?\s*$)


 (
   ^ 
    \s*( //IPv4
        (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
    )\s* 
   $
 )
 |
 (
   ^
    \s*( //Hostname RFC 1123
         (?=.{1,255}$)[0-9A-Za-z](?:(?:[0-9A-Za-z]|\b-){0,61}[0-9A-Za-z])?(?:\.[0-9A-Za-z](?:(?:[0-9A-Za-z]|\b-){0,61}[0-9A-Za-z])?)*\.?
    )\s* 
   $
 )
 |
 (
   ^
    \s*( //IPv6
      (([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))
    )(%.+)?\s*
   $
 )
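
The overlap between "invalid IP address" and "valid hostname" mentioned above can be seen by testing the IPv4 and hostname parts separately. The following is only a sketch: the names `octet`, `label`, `ipv4` and `hostname` are introduced here for illustration, and the patterns are copied from the IPv4 and hostname branches of the regex above.

```javascript
// Sketch: the IPv4 and hostname branches from the regex above, built from
// smaller pieces. Variable names are illustrative, not part of the answer.
const octet = '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)';
const ipv4 = new RegExp(`^\\s*(${octet}\\.${octet}\\.${octet}\\.${octet})\\s*$`);

// One hostname label: starts and ends with a letter or digit, with
// single hyphens allowed in between (the \b- trick), 63 characters at most.
const label = '[0-9A-Za-z](?:(?:[0-9A-Za-z]|\\b-){0,61}[0-9A-Za-z])?';
const hostname = new RegExp(`^\\s*((?=.{1,255}$)${label}(?:\\.${label})*\\.?)\\s*$`);

console.log(ipv4.test('192.168.1.1'));     // true
console.log(ipv4.test('256.1.1.1'));       // false: 256 is out of range
console.log(hostname.test('256.1.1.1'));   // true: syntactically a hostname
console.log(hostname.test('example.com')); // true
```

Because the combined regex is an alternation of all three branches, an input such as 256.1.1.1 is accepted through the hostname branch even though it fails the IPv4 branch.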

see also:
Regular expression to match DNS hostname or IP Address?
RFC 1123
IPv6 Validator

Shouse answered 10/2, 2012 at 0:26 Comment(2)
According to tools.ietf.org/html/rfc3696#section-2, top-level domain names must not be all-numeric. - Faultfinding
Since you have already written a JS function, you could create an npm module. - Dudeen
I
35

I've nailed it: http://jsfiddle.net/AJEzQ/

^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$|^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$|^(?:(?:(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):){6})(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):(?:(?:[0-9a-fA-F]{1,4})))|(?:(?:(?:(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9]))\.){3}(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9])))))))|(?:(?:::(?:(?:(?:[0-9a-fA-F]{1,4})):){5})(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):(?:(?:[0-9a-fA-F]{1,4})))|(?:(?:(?:(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9]))\.){3}(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9])))))))|(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})))?::(?:(?:(?:[0-9a-fA-F]{1,4})):){4})(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):(?:(?:[0-9a-fA-F]{1,4})))|(?:(?:(?:(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9]))\.){3}(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9])))))))|(?:(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):){0,1}(?:(?:[0-9a-fA-F]{1,4})))?::(?:(?:(?:[0-9a-fA-F]{1,4})):){3})(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):(?:(?:[0-9a-fA-F]{1,4})))|(?:(?:(?:(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9]))\.){3}(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9])))))))|(?:(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):){0,2}(?:(?:[0-9a-fA-F]{1,4})))?::(?:(?:(?:[0-9a-fA-F]{1,4})):){2})(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):(?:(?:[0-9a-fA-F]{1,4})))|(?:(?:(?:(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9]))\.){3}(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9])))))))|(?:(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):){0,3}(?:(?:[0-9a-fA-F]{1,4})))?::(?:(?:[0-9a-fA-F]{1,4})):)(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):(?:(?:[0-9a-fA-F]{1,4})))|(?:(?:(?:(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9]))\.){3}(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9])))))))|(?:(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):){0,4}(?:(?:[0-9a-fA-F]{1,4})))?::)(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):(?:(?:[0-9a-fA-F]{1,4})))|(?:(?:(?:(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9]))\.){3}(?:(?:25[0-5]|(?:[1-9]|1[0-9]|2[0-4])?[0-9])))))))|(?:(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):){0,5}(?:(?:[0-9a-fA-F]{1,4}))
)?::)(?:(?:[0-9a-fA-F]{1,4})))|(?:(?:(?:(?:(?:(?:[0-9a-fA-F]{1,4})):){0,6}(?:(?:[0-9a-fA-F]{1,4})))?::))))$
Impetus answered 9/2, 2012 at 11:5 Comment(10)
That's a good way to guarantee job security! (I'm joking, but I really think that if another developer were to come along and want to change this validation at a later date, they would have a tough time or would have to start from scratch. Would you consider using three separate regexes? One for IPv4, one for IPv6 and another for hostnames.) - Trencherman
@Trencherman Oh, that's for sure, but nevertheless, even three regexes with comments are just not the best solution. It's never readable when regexes are used. - Impetus
@Assert That is actually a valid local URL, isn't it? - Impetus
@MikulasDite Thanks! I just figured this out myself as well, was just about to post this. It is valid :) - Assert
As @yuri has said, this is not accurate, which is why a regex is not the ideal solution for this problem if you actually want 100% accuracy. - Serigraph
@mikulasDite This is awesome. I have changed it to match just IPv6: jsfiddle.net/usmanajmal/AJEzQ/108. It seems to work for all tests. Not sure if the second-to-last test (fe80::4413:c8ae:2821:5852%10) is a valid IPv6 address. Is it? - Mirador
@MikulasDite regex.test("PO2018SS0043-15") = true - Sheedy
@Trencherman I know, 8 years later, but here you go: https://mcmap.net/q/245065/-validate-ipv4-ipv6-and-hostname (regex constructed programmatically and thus modular and comprehensible) - Blockhouse
@MikulasDite I don't agree with "It's never readable when regexes are used" ;P Although I agree this is a quite realistic assessment of most code where regexes are used, it doesn't have to be like that :D - Blockhouse
Failed with embedded IPv4: 2001:db8:3333:4444:5555:6666:7777:1.2.3.4 - Uphroe
R
6

In Node.js you can use the built-in net module, which provides net.isIP(ip), net.isIPv4(ip) and net.isIPv6(ip).

https://nodejs.org/api/net.html
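
The net functions cover only IP addresses, so the hostname case from the question still needs its own check. Below is a minimal sketch of one way to combine them. `classifyHost` and `label` are hypothetical names; the 255-character limit mirrors the regex in the earlier answer, and the rejection of all-numeric top-level labels follows the RFC 3696 comment above. This is an assumption-laden illustration, not a complete hostname validator.

```javascript
const net = require('net');

// One hostname label: letters, digits and inner hyphens, 1 to 63 characters.
const label = /^[0-9A-Za-z](?:[0-9A-Za-z-]{0,61}[0-9A-Za-z])?$/;

// Hypothetical helper: returns 'ipv4', 'ipv6', 'hostname' or 'invalid'.
function classifyHost(input) {
  const text = String(input).trim();

  // net.isIP returns 4, 6, or 0 when the string is not an IP address.
  const version = net.isIP(text);
  if (version === 4) return 'ipv4';
  if (version === 6) return 'ipv6';

  // Allow a single trailing dot (fully qualified form).
  const name = text.endsWith('.') ? text.slice(0, -1) : text;
  if (name.length === 0 || name.length > 255) return 'invalid';

  const labels = name.split('.');
  if (!labels.every((part) => label.test(part))) return 'invalid';

  // Reject an all-numeric last label, so that strings like 256.1.1.1
  // are not accepted as hostnames.
  if (/^[0-9]+$/.test(labels[labels.length - 1])) return 'invalid';

  return 'hostname';
}

console.log(classifyHost('192.168.0.1')); // ipv4
console.log(classifyHost('::1'));         // ipv6
console.log(classifyHost('example.com')); // hostname
console.log(classifyHost('256.1.1.1'));   // invalid
```

Doing the IP check first keeps the hostname rule simple: anything that reaches the label test is already known not to be a well-formed IP address.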

Rawboned answered 25/11, 2020 at 14:7 Comment(0)
B
2

Construct validation regex programmatically

As found in the ipaddr.js library [2]:

https://github.com/whitequark/ipaddr.js/blob/8c18488416e20f2d624ab6f727638673018a2a46/lib/ipaddr.js#L6-L30

This is a comprehensible JS listing (as opposed to the regex battlefield of the older answers :)) that constructs the regular expressions programmatically in a modular way.

This breaks the complexity of these regexes down into smaller parts that are easier to grasp. It also lets you save on code size :)

Note: This validates only IP addresses, versions 4 and 6 (not hostnames or other URI-related RFC stuff):

    // A list of regular expressions that match arbitrary IPv4 addresses,
    // for which a number of weird notations exist.
    // Note that an address like 0010.0xa5.1.1 is considered legal.
    const ipv4Part = '(0?\\d+|0x[a-f0-9]+)';
    const ipv4Regexes = {
        fourOctet: new RegExp(`^${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}$`, 'i'),
        threeOctet: new RegExp(`^${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}$`, 'i'),
        twoOctet: new RegExp(`^${ipv4Part}\\.${ipv4Part}$`, 'i'),
        longValue: new RegExp(`^${ipv4Part}$`, 'i')
    };

    // Regular Expression for checking Octal numbers
    const octalRegex = new RegExp(`^0[0-7]+$`, 'i');
    const hexRegex = new RegExp(`^0x[a-f0-9]+$`, 'i');

    const zoneIndex = '%[0-9a-z]{1,}';

    // IPv6-matching regular expressions.
    // For IPv6, the task is simpler: it is enough to match the colon-delimited
    // hexadecimal IPv6 and a transitional variant with dotted-decimal IPv4 at
    // the end.
    const ipv6Part = '(?:[0-9a-f]+::?)+';
    const ipv6Regexes = {
        zoneIndex: new RegExp(zoneIndex, 'i'),
        'native': new RegExp(`^(::)?(${ipv6Part})?([0-9a-f]+)?(::)?(${zoneIndex})?$`, 'i'),
        deprecatedTransitional: new RegExp(`^(?:::)(${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}(${zoneIndex})?)$`, 'i'),
        transitional: new RegExp(`^((?:${ipv6Part})|(?:::)(?:${ipv6Part})?)${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}(${zoneIndex})?$`, 'i')
    };

Cost of simplicity, need for more parsing logic

This tidiness of the regex part comes at a price: the parsing logic that goes with it needs more branching ("forky" :)).

Please check the respective parsing methods here:

IPv4.parser: https://github.com/whitequark/ipaddr.js/blob/8c18488416e20f2d624ab6f727638673018a2a46/lib/ipaddr.js#L405

IPv6.parser: https://github.com/whitequark/ipaddr.js/blob/8c18488416e20f2d624ab6f727638673018a2a46/lib/ipaddr.js#L799
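
To give a rough idea of the extra logic involved, here is a simplified sketch for the four-octet IPv4 case. It is not the library's actual parser: `parsePart` and `parseFourOctet` are hypothetical names, and only the `ipv4Part` pattern is taken from the listing above. The regex accepts any run of digits, so the range check has to happen in code.

```javascript
// Pattern copied from the listing above; everything else is illustrative.
const ipv4Part = '(0?\\d+|0x[a-f0-9]+)';
const fourOctet = new RegExp(
  `^${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}\\.${ipv4Part}$`, 'i');

// Convert one matched part to a number: hex (0x..), octal (leading 0)
// or decimal. Returns NaN for a malformed octal such as "08".
function parsePart(part) {
  if (/^0x[a-f0-9]+$/i.test(part)) return parseInt(part, 16);
  if (/^0[0-7]+$/.test(part)) return parseInt(part, 8);
  if (/^0\d+$/.test(part)) return NaN;
  return parseInt(part, 10);
}

// Returns the four octets as numbers, or null if the input is not a
// valid four-part IPv4 address.
function parseFourOctet(input) {
  const match = fourOctet.exec(input);
  if (!match) return null;
  const octets = match.slice(1).map(parsePart);
  const inRange = octets.every((n) => Number.isInteger(n) && n >= 0 && n <= 255);
  return inRange ? octets : null;
}

console.log(parseFourOctet('192.168.0.1'));   // [ 192, 168, 0, 1 ]
console.log(parseFourOctet('0010.0xa5.1.1')); // [ 8, 165, 1, 1 ]
console.log(parseFourOctet('256.1.1.1'));     // null
```

The regex only decides which notation the input uses; whether the values are legal is settled by the branching code afterwards.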

Sufficient vs necessary conditions

The regular expressions combined with the parsing logic above test a sufficient condition for either address type (as does straightforward matching against the mega-regexes of the previous answers).

On the other hand, each address type also has a number of necessary conditions. We can use these to assert the opposite, that an input is not of a given type: an input without a : character is definitely not IPv6. This comes in handy when you simply want to categorize inputs by kind in an efficient way. Running the whole IPv6 regex on an input that doesn't contain a colon in the first place would be wasted work.

The aforementioned library also exploits the difference between sufficient and necessary conditions when validating IPv6 (or, especially, when differentiating between inputs of both address kinds) [1]:

    ipaddr.IPv6.isValid = function (string) {

        // Since IPv6.isValid is always called first, this shortcut
        // provides a substantial performance gain.
        if (typeof string === 'string' && string.indexOf(':') === -1) {
            return false;
        }

        try {
            const addr = this.parser(string);
            new this(addr.parts, addr.zoneId);
            return true;
        } catch (e) {
            return false;
        }
    };

Differentiating between v4 and v6 using ipaddr.js:

function getIpVersionNum(addr) {
  try {
    const parse_addr = ipaddr.parse(addr);
    const kind = parse_addr.kind();

    if (kind === 'ipv4') {
      return 4; //IPv4
    } else if (kind === 'ipv6') {
      return 6; //IPv6
    } else {
      throw new Error('unexpected return value');
    }

  // parse() will throw an error when address passes neither validation
  } catch (err) { 
    return 0; //not 4 or 6
  }
}

[1] https://github.com/whitequark/ipaddr.js/blob/master/lib/ipaddr.js#L750-L765
[2] https://www.npmjs.com/package/ipaddr.js

Blockhouse answered 6/10, 2020 at 22:4 Comment(0)
