Regular expression to match DNS hostname or IP Address?
I

23

411

Does anyone have a regular expression handy that will match any legal DNS hostname or IP address?

It's easy to write one that works 95% of the time, but I'm hoping to get something that's well tested to exactly match the latest RFC specs for DNS hostnames.

Ivy answered 19/9, 2008 at 22:38 Comment(1)
Be aware: It's possible to find out if a string is a valid IPv4 address and to find out if it's a valid hostname. But: It's not possible to find out if a string is either a valid IPv4 address or a valid hostname. The reason: Any string that is matched as a valid IPv4 address would also be a valid hostname that could be resolved to a different IP address by the DNS server.Contractual
L
592

You can use the following regular expressions separately or by combining them in a joint OR expression.

ValidIpAddressRegex = "^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$";

ValidHostnameRegex = "^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$";

ValidIpAddressRegex matches valid IP addresses and ValidHostnameRegex matches valid host names. Depending on the language you use, \ may have to be escaped as \\.
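A minimal Python sketch of the "joint OR" idea, assuming the two patterns above are used unchanged. The helper name classify is hypothetical and not part of the answer. The IP pattern is tried first because any dotted-quad string also satisfies the hostname pattern.

```python
import re

# The two patterns from the answer, written as raw strings so that
# backslashes need no extra escaping.
VALID_IP_ADDRESS_REGEX = re.compile(
    r"^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}"
    r"([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$"
)
VALID_HOSTNAME_REGEX = re.compile(
    r"^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*"
    r"([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9])$"
)


def classify(value):
    """Return "ip", "hostname" or "invalid" (hypothetical helper)."""
    # fullmatch also rejects a trailing newline, which "$" alone tolerates.
    if VALID_IP_ADDRESS_REGEX.fullmatch(value):
        return "ip"
    if VALID_HOSTNAME_REGEX.fullmatch(value):
        return "hostname"
    return "invalid"
```

With these patterns, classify("256.1.1.1") returns "hostname" rather than "invalid", because every label of that string is a legal RFC 1123 label.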


ValidHostnameRegex is valid as per RFC 1123. Originally, RFC 952 specified that hostname segments could not start with a digit.

http://en.wikipedia.org/wiki/Hostname

The original specification of hostnames in RFC 952, mandated that labels could not start with a digit or with a hyphen, and must not end with a hyphen. However, a subsequent specification (RFC 1123) permitted hostname labels to start with digits.

Valid952HostnameRegex = "^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$";
Livvie answered 19/9, 2008 at 22:45 Comment(37)
Your hostname regex is pretty good and looks like it matches everything. You should change your answer so it doesn't have the double escaping for periods and hyphens, and the sz which makes it look like some Microsoft language.Sharpe
Here: stackoverflow.com/questions/4645126/… - I explain that names that start with a digit are considered as valid as well. Also, only one dot is questionable issue. Would be great to have more feedback on that.Michelinamicheline
You might want to add IPv6. The OP didn't specify what type of address. (By the way, it can be found here)Accolade
could you please provide a single regular expressions to test both the conditions i.e. hostname and ip?Lemma
@ZainShaikh You can put them together as (<expr1>)|(<expr2>). That's what he says at the top: "by combining them in a joint OR expression".Millimicron
At least in Javascript, this regexp evaluates greedily and matches only the first number of the last octet if it's > 9. Reversing the order of the capture groups of the last segment allows it to properly match full range of IP's.Dogface
I've been using the ValidHostnameRegex to pull domains out of unstructured strings, and it seems that as written this regex in Python only captures the first character of the TLD. Adjusting it to this corrects the issue: ((([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\\-]*[a-zA-Z0-9])\\.)*([A-Za-z0-9][A-Za-z0-9\\-]*[A-Za-z0-9]))Barbed
Before people blindly use this in their code, note that it is not completely accurate. It ignores RFC2181: "The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. The length of any one label is limited to between 1 and 63 octets. A full domain name is limited to 255 octets (including the separators)."Calculous
And what about non-latin host names?Keli
I think there is something wrong with ValidIpAddressRegex . regexr.com?35830 since regular expression engines are eager at the end of the first match it sees 2 and thinks a match.So in the solution I did I am doing the reverse order regexr.com?35833 . ((((25[0-5])|(2[0-4]\d)|([01]?\d?\d)))\.){3}((((25[0-5])|(2[0-4]\d)|([01]?\d?\d))))Feleciafeledy
-1, because while it's goodish it doesn't adhere to the RFCs as it claims to be.Cant
@UserControl: Non-latin (Punycoded) hostnames must be converted to ASCII form first (éxämplè.com = xn--xmpl-loa1ab.com) and then validated.Cant
Your IP regex disallows leading 0's e.g. 127.000.000.001 (which I have seen though it's daft) or 127.0.0.0000001 (which is even more daft). Is this deliberate? Personally I would consider it valid (and ping on OS X does too).Gurkha
why so many up voted this answer I think this is bad Regex, it will only match if you have clean IP list.Choplogic
regarding ValidHostnameRegex: according to ietf.org/rfc/rfc1034.txt, section 3.1 page 7, trailing dots are valid (e.g.: "poneria.ISI.EDU." is a valid host name) - which is not accounted for in this regex. In fact this makes the regex even simpler: "^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\\-]*[a-zA-Z0-9])\\.?)+"Longinus
@AlixAxel do you have code to do the conversion of Non-latin hostnames?Counterpunch
@Shebuka: I would just use something like idn_to_ascii() in PHP.Cant
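For Python, a comparable sketch can rely on the standard library's built-in "idna" codec, which implements the older IDNA 2003 rules. The helper name to_ascii_hostname is hypothetical.

```python
def to_ascii_hostname(hostname):
    """Convert a Unicode hostname to its ASCII (Punycode) form.

    Uses Python's built-in "idna" codec; it raises UnicodeError for
    labels that cannot be encoded.
    """
    return hostname.encode("idna").decode("ascii")


ascii_name = to_ascii_hostname("éxämplè.com")
print(ascii_name)  # an "xn--" prefixed label followed by ".com"
```

The resulting ASCII string is what should then be passed to the hostname regex.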
Maybe you should match FQDN too. Please add an optional period to the end of all domain name regexes.Agrostology
@Partly Cloudy: Leading zeroes are allowed but are interpreted differently. If there is a leading zero in a component, that component is interpreted as octal notation. This is unexpected by most users.Haviland
To support trailing dots, \.? could be added at the end; there is an absolute representation used by DNS, described in RFC 1034, see dns-sd.org/TrailingDotsInDomainNames.htmlBroomfield
Also perhaps consider single-letter hostnames: serverfault.com/questions/162038/…Knew
@JonTrauntvein, I have seen many places where leading zeros are admitted in IP addresses in dot-decimal notation, but with a plain decimal meaning rather than octal, as in 192.168.000.028 being equivalent to 192.168.0.28. Why write a regexp for IPv4 addresses/hostnames when the internet is migrating to IPv6?Happygolucky
Your hostname expression is matching some invalid values: I tried 123.456.789.0 and it says it's a valid hostname.Rattletrap
Seems that your suggested solution accepts IP addresses that starts with zero. I suggest to refactor your solution to: (([1-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){1}(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){2}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])Puleo
There is a small mistake in Valid952HostnameRegex, I have corrected it: "^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)+([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$"Dauntless
@Rattletrap see Alban's answer below for a realistic regexBlowup
As suggested in other answer. this is working as well as expected: ^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$Lali
underscore should be a valid character but I don't think this solution accounts for it.Cresset
That IP regex isn't very good tbh, might want to use mine instead:((1?\d\d?|2[0-4]\d|25[0-5])\.){3}(1?\d\d?|2[0-4]\d|25[0-5])Rightness
@MilosGavrilov You're the best! Thanks for fixing it! Combined both (IP and Hostname) into a single regexp. See: regex101.com/r/0WMysi/2Disgraceful
I don't know why it shoud be [a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-] rather than [a-zA-Z0-9][a-zA-Z0-9\-].Headspring
a is a valid hostname?Zoezoeller
but hostname not accepting @ because you just want the top level domain name. Thank you if any help.Inartistic
@Disgraceful your combined expression does not match localhostStila
@Zoezoeller I thought the same. Here's my suggestion: '^(([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9]{1,63})\.)+([A-Za-z0-9]|[A-Za-z0-9][A-Za-z0-9\-]*[A-Za-z0-9]){2,6}$'Transitive
@Stila Thnx, fixed it! See: regex101.com/r/0WMysi/19Disgraceful
why it matches 0.1.2.3? is not valid! check this demo.Dainedainty
M
76

The hostname regex of smink does not observe the limitation on the length of individual labels within a hostname. Each label within a valid hostname may be no more than 63 octets long.

ValidHostnameRegex="^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])\
(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$"

Note that the backslash at the end of the first line (above) is Unix shell syntax for splitting the long line. It's not a part of the regular expression itself.

Here's just the regular expression alone on a single line:

^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$

You should also check separately that the total length of the hostname does not exceed 255 characters. For more information, please consult RFC-952 and RFC-1123.
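As a rough sketch of that two-step check in Python (the function name is hypothetical, and the 255 limit is applied to the string exactly as given):

```python
import re

# Single-line form of the regex above: each label is 1 to 63 characters,
# alphanumeric at both ends, with hyphens allowed in between.
HOSTNAME_RE = re.compile(
    r"^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])"
    r"(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))*$"
)


def is_valid_hostname(hostname):
    # The regex cannot express the overall limit, so check it separately.
    if len(hostname) > 255:
        return False
    return HOSTNAME_RE.fullmatch(hostname) is not None
```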

Maltz answered 29/9, 2010 at 17:9 Comment(2)
Excellent host pattern. It probably depends on one's language's regex implementation, but for JS it can be adjusted slightly to be briefer without losing anything: /^[a-z\d]([a-z\d\-]{0,61}[a-z\d])?(\.[a-z\d]([a-z\d\-]{0,61}[a-z\d])?)*$/iTranshumance
This is what i want but the "@" symbol to allow only this special character for root hostname? i am new in dns and regex :(Inartistic
B
35

To match a valid IP address use the following regex:

(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}

instead of:

([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])(\.([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])){3}

Explanation

Many regex engines take the first alternative in an OR sequence that lets the match succeed. For instance, try both regexes on the following address:

10.48.0.200

Test

Test the difference between good vs bad
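The effect can be reproduced in Python, whose re module also takes the first alternative that lets the overall match succeed. A minimal sketch, using the two unanchored patterns above on the sample address (the variable names are illustrative):

```python
import re

GOOD = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}"
BAD = r"([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])(\.([01]?[0-9][0-9]?|2[0-4][0-9]|25[0-5])){3}"

text = "10.48.0.200"

# Unanchored search: the pattern may stop as soon as it is satisfied.
good_match = re.search(GOOD, text).group(0)
bad_match = re.search(BAD, text).group(0)

print(good_match)  # 10.48.0.200
print(bad_match)   # 10.48.0.20 (the last octet is cut short)
```

When the pattern is anchored (with ^ and $, or with re.fullmatch), the engine is forced to backtrack into the later alternatives, so the ordering only matters for searching inside longer text.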

Brawley answered 22/1, 2013 at 7:37 Comment(5)
Do not forget start ^ and end $ or something like 0.0.0.999 or 999.0.0.0 will match too. ;)Outwear
yes to valid a string start ^ and end $ are required, but if you are searching an IP into a text do not use it.Brawley
The unintended 'non-greedyness' that you identify applies to the other host name solutions as well. It would be worth adding this to your answer as the others will not match the full hostname. e.g. ([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])(\.([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]))* versus ([a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9]|[a-zA-Z0-9])(\.([a-zA-Z0-9][a-zA-Z0-9\-]{0,61}[a-zA-Z0-9])|[a-zA-Z0-9]))*Banal
EDIT: In the above, use + at the end instead of * to see the failure.Banal
The first regex works, but has some bad behavior, you can check this demo.Dainedainty
I
6

I don't seem to be able to edit the top post, so I'll add my answer here.

For IP addresses, the easy answer is the egrep example here -- http://www.linuxinsight.com/how_to_grep_for_ip_addresses_using_the_gnu_egrep_utility.html

egrep '([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}'

Though the case doesn't account for values like 0 in the first octet, and values greater than 254 (IP address) or 255 (netmask). Maybe an additional if statement would help.

As for legal DNS hostnames, provided that you are checking for internet hostnames only (and not intranet), I wrote the following snippet, a mix of shell/PHP, but it should be applicable as any regular expression.

First go to the IANA website, download and parse a list of legal level 1 domain names:

tld=$(curl -s http://data.iana.org/TLD/tlds-alpha-by-domain.txt |  sed 1d  | cut -f1 -d'-' | tr '\n' '|' | sed 's/\(.*\)./\1/')
echo "($tld)"

That should give you a nice piece of re code that checks for legality of top domain name, like .com .org or .ca
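The same alternation can be built in Python. This is a sketch under assumptions: the sample text is a hypothetical stand-in for the downloaded file (a comment line followed by one TLD per line), and build_tld_group is an illustrative name.

```python
import re

# Hypothetical excerpt in the same layout as tlds-alpha-by-domain.txt:
# a leading comment line, then one upper-case TLD per line.
SAMPLE = """# Version 0000000000, sample data for illustration only
AC
COM
MUSEUM
ORG
"""


def build_tld_group(text):
    """Return a regex group such as "(AC|COM|ORG)" from the list text."""
    tlds = [
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.startswith("#")
    ]
    return "(" + "|".join(re.escape(tld) for tld in tlds) + ")"


tld_group = build_tld_group(SAMPLE)
print(tld_group)  # (AC|COM|MUSEUM|ORG)
```

Unlike the shell pipeline, whose cut -f1 -d'-' step truncates internationalized XN--... entries to a bare XN, this sketch keeps each name whole.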

Then add the first part of the expression according to the guidelines found here -- http://www.domainit.com/support/faq.mhtml?category=Domain_FAQ&question=9 (any alphanumeric combination and the '-' symbol; the dash should not be at the beginning or end of a label).

(([a-z0-9]+|([a-z0-9]+[-]+[a-z0-9]+))[.])+

Then put it all together (PHP preg_match example):

$pattern = '/^(([a-z0-9]+|([a-z0-9]+[-]+[a-z0-9]+))[.])+(AC|AD|AE|AERO|AF|AG|AI|AL|AM|AN|AO|AQ|AR|ARPA|AS|ASIA|AT|AU|AW|AX|AZ|BA|BB|BD|BE|BF|BG|BH|BI|BIZ|BJ|BM|BN|BO|BR|BS|BT|BV|BW|BY|BZ|CA|CAT|CC|CD|CF|CG|CH|CI|CK|CL|CM|CN|CO|COM|COOP|CR|CU|CV|CX|CY|CZ|DE|DJ|DK|DM|DO|DZ|EC|EDU|EE|EG|ER|ES|ET|EU|FI|FJ|FK|FM|FO|FR|GA|GB|GD|GE|GF|GG|GH|GI|GL|GM|GN|GOV|GP|GQ|GR|GS|GT|GU|GW|GY|HK|HM|HN|HR|HT|HU|ID|IE|IL|IM|IN|INFO|INT|IO|IQ|IR|IS|IT|JE|JM|JO|JOBS|JP|KE|KG|KH|KI|KM|KN|KP|KR|KW|KY|KZ|LA|LB|LC|LI|LK|LR|LS|LT|LU|LV|LY|MA|MC|MD|ME|MG|MH|MIL|MK|ML|MM|MN|MO|MOBI|MP|MQ|MR|MS|MT|MU|MUSEUM|MV|MW|MX|MY|MZ|NA|NAME|NC|NE|NET|NF|NG|NI|NL|NO|NP|NR|NU|NZ|OM|ORG|PA|PE|PF|PG|PH|PK|PL|PM|PN|PR|PRO|PS|PT|PW|PY|QA|RE|RO|RS|RU|RW|SA|SB|SC|SD|SE|SG|SH|SI|SJ|SK|SL|SM|SN|SO|SR|ST|SU|SV|SY|SZ|TC|TD|TEL|TF|TG|TH|TJ|TK|TL|TM|TN|TO|TP|TR|TRAVEL|TT|TV|TW|TZ|UA|UG|UK|US|UY|UZ|VA|VC|VE|VG|VI|VN|VU|WF|WS|XN|XN|XN|XN|XN|XN|XN|XN|XN|XN|XN|YE|YT|YU|ZA|ZM|ZW)[.]?$/i';

    if (preg_match($pattern, $matching_string)) {
        // ... do stuff
    }

You may also want to add an if statement to check that the string you are checking is shorter than 256 characters -- http://www.ops.ietf.org/lists/namedroppers/namedroppers.2003/msg00964.html

Ironic answered 3/3, 2010 at 21:23 Comment(3)
-1 because this matches bogus IP addresses like “999.999.999.999”.Immingle
"Though the case doesn't account for values like 0 in the first octet, and values greater than 254 (IP address) or 255 (netmask)."Ironic
I saw that you qualified your answer, yes. I downvoted because that part of your answer is still not useful.Immingle
N
4

It's worth noting that there are libraries for most languages that do this for you, often built into the standard library. And those libraries are likely to get updated a lot more often than code that you copied off a Stack Overflow answer four years ago and forgot about. And of course they'll also generally parse the address into some usable form, rather than just giving you a match with a bunch of groups.

For example, detecting and parsing IPv4 in (POSIX) C:

#include <arpa/inet.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  for (int i=1; i!=argc; ++i) {
    struct in_addr addr = {0};
    printf("%s: ", argv[i]);
    if (inet_pton(AF_INET, argv[i], &addr) != 1)
      printf("invalid\n");
    else
      printf("%u\n", addr.s_addr);
  }
  return 0;
}

Obviously, such functions won't work if you're trying to, e.g., find all valid addresses in a chat message—but even there, it may be easier to use a simple but overzealous regex to find potential matches, and then use the library to parse them.

For example, in Python:

>>> import ipaddress
>>> import re
>>> msg = "My address is 192.168.0.42; 192.168.0.420 is not an address"
>>> for maybeip in re.findall(r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}', msg):
...     try:
...         print(ipaddress.ip_address(maybeip))
...     except ValueError:
...         pass
Noway answered 5/5, 2018 at 18:42 Comment(0)
C
2
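import re

def isValidHostname(hostname):
    # The full name, including separators, is limited to 255 characters.
    if len(hostname) > 255:
        return False
    if hostname[-1:] == ".":
        hostname = hostname[:-1]   # strip exactly one dot from the right,
                                   #  if present
    # Each label: 1 to 63 letters, digits or hyphens, with no hyphen at
    # either end.
    allowed = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)
    return all(allowed.match(x) for x in hostname.split("."))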
Caren answered 14/6, 2011 at 10:56 Comment(2)
Could you explain this regex? Exactly, what do (?!-), (?<!-) mean?Lucier
@Scit, those make sure it does not start or end with a "-" character if your regex engine allow their use. For example, from Python or from Perl.Eyesore
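A small Python illustration of those two lookarounds, using the label pattern from the answer; the sample labels are made up for the demonstration.

```python
import re

# (?!-)  : negative lookahead, the label must not start with "-"
# (?<!-) : negative lookbehind, the label must not end with "-"
label = re.compile(r"(?!-)[A-Z\d-]{1,63}(?<!-)$", re.IGNORECASE)

results = {s: bool(label.match(s)) for s in ["abc", "a-b", "-abc", "abc-"]}
print(results)
# {'abc': True, 'a-b': True, '-abc': False, 'abc-': False}
```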
S
2

I think this is the best IP validation regex. Please check it:

^(([01]?[0-9]?[0-9]|2([0-4][0-9]|5[0-5]))\.){3}([01]?[0-9]?[0-9]|2([0-4][0-9]|5[0-5]))$
Siegbahn answered 12/2, 2012 at 17:21 Comment(0)
H
1
/^(?:[a-zA-Z0-9]+|[a-zA-Z0-9][-a-zA-Z0-9]+[a-zA-Z0-9])(?:\.[a-zA-Z0-9]+|[a-zA-Z0-9][-a-zA-Z0-9]+[a-zA-Z0-9])?$/
Hyperphysical answered 21/4, 2013 at 18:37 Comment(0)
E
1
"^((\\d{1,2}|1\\d{2}|2[0-4]\\d|25[0-5])\\.){3}(\\d{1,2}|1\\d{2}|2[0-4]\\d|25[0-5])$"
Exhalant answered 3/3, 2014 at 2:38 Comment(0)
C
1

This works for valid IP addresses:

regex = '^([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])[.]([0-9]|[1-9][0-9]|[1][0-9][0-9]|2[0-4][0-9]|25[0-5])$'
Chairmanship answered 30/1, 2015 at 6:17 Comment(0)
U
1
>>> import re
>>> my_hostname = "testhostn.ame"
>>> print(bool(re.match(r"^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$", my_hostname)))
True
>>> my_hostname = "testhostn....ame"
>>> print(bool(re.match(r"^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$", my_hostname)))
False
>>> my_hostname = "testhostn.A.ame"
>>> print(bool(re.match(r"^(([a-zA-Z]|[a-zA-Z][a-zA-Z0-9\-]*[a-zA-Z0-9])\.)*([A-Za-z]|[A-Za-z][A-Za-z0-9\-]*[A-Za-z0-9])$", my_hostname)))
True
Unbeknown answered 12/4, 2016 at 5:46 Comment(0)
A
1

The new Network framework has failable initializers for struct IPv4Address and struct IPv6Address which handle the IP address portion very easily. Doing this in IPv6 with a regex is tough with all the shortening rules.

Unfortunately I don't have an elegant answer for hostname.

Note that Network framework is recent, so it may force you to compile for recent OS versions.

import Network
let tests = ["192.168.4.4","fkjhwojfw","192.168.4.4.4","2620:3","2620::33"]

for test in tests {
    if let _ = IPv4Address(test) {
        debugPrint("\(test) is valid ipv4 address")
    } else if let _ = IPv6Address(test) {
        debugPrint("\(test) is valid ipv6 address")
    } else {
        debugPrint("\(test) is not a valid IP address")
    }
}

output:
"192.168.4.4 is valid ipv4 address"
"fkjhwojfw is not a valid IP address"
"192.168.4.4.4 is not a valid IP address"
"2620:3 is not a valid IP address"
"2620::33 is valid ipv6 address"
Aldosterone answered 2/3, 2019 at 1:50 Comment(0)
S
0

Here is a regex that I used in Ant to obtain a proxy host IP or hostname out of ANT_OPTS. This was used to obtain the proxy IP so that I could run an Ant "isreachable" test before configuring a proxy for a forked JVM.

^.*-Dhttp\.proxyHost=(\w{1,}\.\w{1,}\.\w{1,}\.*\w{0,})\s.*$
Schnitzel answered 19/2, 2010 at 14:19 Comment(1)
That's a \w right there, it won't capture IP, only hostname at certain situations.Sidsida
B
0

I found this works pretty well for IP addresses. It validates like the top answer but it also makes sure the ip is isolated so no text or more numbers/decimals are after or before the ip.

(?<!\S)(?:(?:\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\b|.\b){7}(?!\S)

Blackstock answered 24/9, 2013 at 21:48 Comment(1)
I tried a lot but I could not understand 2 things here. 1. \b specifies word boundary Why are we using \b ? which is the boundary? and 2. Why does it work only for {7} From what I understood, I think it should be {4} but, it is not working. Optionally, you could tell about why are you using a non-capturing blocks.Outgoings
G
0
AddressRegex = "^(ftp|http|https):\/\/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}:[0-9]{1,5})$";

HostnameRegex =  /^(ftp|http|https):\/\/([a-z0-9]+\.)?[a-z0-9][a-z0-9-]*((\.[a-z]{2,6})|(\.[a-z]{2,6})(\.[a-z]{2,6}))$/i

These regexes are intended only for this type of validation.

They match URLs such as http://www.kk.com http://www.kk.co.in

They do not match:

http://www.kk.com/ http://www.kk.co.in.kk

http://www.kk.com/dfas http://www.kk.co.in/

Gaynellegayner answered 25/6, 2014 at 7:12 Comment(0)
R
0

try this:

((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)

it works in my case.

Roderich answered 27/5, 2015 at 2:58 Comment(0)
C
0

Regarding IP addresses, it appears that there is some debate on whether to include leading zeros. It was once the common practice and is generally accepted, so I would argue that they should be flagged as valid regardless of the current preference. There is also some ambiguity over whether text before and after the string should be validated and, again, I think it should. 1.2.3.4 is a valid IP but 1.2.3.4.5 is not and neither the 1.2.3.4 portion nor the 2.3.4.5 portion should result in a match. Some of the concerns can be handled with this expression:

grep -E '(^|[^[:alnum:]])(([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])\.){3}([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])([^[:alnum:]]|$)'

The unfortunate part here is that the regex portion that validates an octet is repeated, as is true in many of the offered solutions. Although this is better than four instances of the pattern, the repetition can be eliminated entirely if subroutines are supported in the regex flavor being used. The next example enables them with the -P switch of grep and also takes advantage of lookahead and lookbehind functionality. (The subroutine name I selected is 'o' for octet. I could have used 'octet' as the name but wanted to be terse.)

grep -P '(?<![\d\w\.])(?<o>([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(\.\g<o>){3}(?![\d\w\.])'

The handling of the dot might actually create false negatives if IP addresses are in a file with text in the form of sentences, since a period could follow an address without being part of the dotted notation. A variant of the above would fix that:

grep -P '(?<![\d\w\.])(?<x>([0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(\.\g<x>){3}(?!([\d\w]|\.\d))'
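Python's built-in re module has no subroutine calls like \g<o>, but the repetition can still be avoided by defining the octet once as a string and interpolating it. A sketch of the last variant under that assumption (the names OCTET and ISOLATED_IP and the sample sentence are illustrative); the longer alternatives are listed first here.

```python
import re

# One definition of an octet, reused below instead of being repeated.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})"

# Not preceded by a word character or dot; not followed by a word
# character or by a dot that starts another numeric component.
ISOLATED_IP = re.compile(
    rf"(?<![\w.]){OCTET}(?:\.{OCTET}){{3}}(?!\w|\.\d)"
)

text = "Server 1.2.3.4. Then 1.2.3.4.5 and 10.0.0.256 x"
found = ISOLATED_IP.findall(text)
print(found)  # ['1.2.3.4']
```

The address followed by a sentence-ending period is found, while neither part of 1.2.3.4.5 nor the out-of-range 10.0.0.256 produces a match.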
Closet answered 24/7, 2015 at 1:57 Comment(0)
N
0

There's a further nuance here that's missing.

It's true that a HOSTNAME should match, basically, what's been given above.

What's missing is that a REFERENCE TO a hostname can be the same, plus an optional period on the end.

For example, with a trailing period, ping foo.bar.svc.cluster.local. will ping that hostname only, and not attempt any DNS search options in resolv.conf.

tldr - If you provide an input box to receive a hostname, what's entered does not actually need to be a valid hostname.
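One way to handle that in an input box, sketched in Python: accept a single optional trailing dot, remember that the name was absolute, and validate the remainder as a hostname. The function name and return shape are hypothetical.

```python
def split_reference(reference):
    """Split a hostname reference into (hostname, is_absolute).

    A single trailing dot marks the name as absolute (no search-domain
    expansion); it is removed before the hostname itself is validated.
    """
    if reference.endswith("."):
        return reference[:-1], True
    return reference, False


print(split_reference("foo.bar.svc.cluster.local."))
# ('foo.bar.svc.cluster.local', True)
```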

Negate answered 15/2, 2023 at 12:14 Comment(0)
D
0

Here is a brief regex to match a valid hostname: /^[a-z\d]([a-z\d\-]{0,61}[a-z\d])?(\.[a-z\d]([a-z\d\-]{0,61}[a-z\d])?)*$/i. It works in most regex engines.

Don't forget the i flag so that matching is case-insensitive:

JavaScript:

const pattern = /^[a-z\d]([a-z\d\-]{0,61}[a-z\d])?(\.[a-z\d]([a-z\d\-]{0,61}[a-z\d])?)*$/i

Python:

pattern = re.compile(r"^[a-z\d]([a-z\d\-]{0,61}[a-z\d])?(\.[a-z\d]([a-z\d\-]{0,61}[a-z\d])?)*$", re.IGNORECASE)

PHP:

$pattern = '/^[a-z\d]([a-z\d\-]{0,61}[a-z\d])?(\.[a-z\d]([a-z\d\-]{0,61}[a-z\d])?)*$/i';

Regex DEMO.

Dainedainty answered 26/1 at 15:29 Comment(0)
R
-1

how about this?

([0-9]{1,3}\.){3}[0-9]{1,3}
Ribbonfish answered 8/4, 2013 at 19:24 Comment(2)
And so is 9999999999.0.0.9999999999 :) But for most programmers, this short approach will suffice.Outwear
-1 because this matches nonsense IP addresses (as @Counterpunch notes).Immingle
J
-1

In PHP: filter_var(gethostbyname($dns), FILTER_VALIDATE_IP) == true ? 'ip' : 'not ip'

Jeep answered 11/1, 2016 at 12:17 Comment(2)
While this code may answer the question, generally explanation alongside code makes an answer much more useful. Please edit your answer and provide some context and explanation.Ineducation
And, unless I'm mistaken, FILTER_VALIDATE_IP is a PHP only value.Ivy
M
-2

Checking for host names like... mywebsite.co.in, thangaraj.name, 18thangaraj.in, thangaraj106.in etc.,

[a-z\d+].*?\\.\w{2,4}$
Microminiaturization answered 11/1, 2012 at 11:14 Comment(3)
-1. The OP asked for something “well tested to exactly match the latest RFC specs”, but this does not match e.g. *.museum, while it will match *.foo. Here’s a list of valid TLDs.Immingle
I'm not sure it's a good idea to put the plus inside the character class (square brackets), furthermore, there are TLDs with 5 letters (.expert for example).Sidsida
Best way to accomplish with RFC is to use the system/language functions. inet_aton is good enough.Essex
A
-2

I thought about this simple regex matching pattern for IP address matching \d+[.]\d+[.]\d+[.]\d+

Agglutinative answered 10/11, 2015 at 6:28 Comment(2)
1111.1.1.1 is not a valid IP. There's no way to really test an IP format if you don't take subnets into account. You should at least limit the number of digits with something like ^\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3}, and of course that will still not be the correct way. If you have a language to write scripts in, you will surely have access to its network functions. The best way to check a REAL IP is to tell the system to convert the IP to its proper format and then check for true/false. In Python I use socket.inet_aton(ip). In PHP you need inet_aton($ip).Essex
Python users can take a look here: gist.github.com/erm3nda/f25439bba66931d3ca9699b2816e796cEssex
