How to validate an email address in PHP
U

14

295

I have this function to validate email addresses:

function validateEMAIL($EMAIL) {
    $v = "/[a-zA-Z0-9_-.+]+@[a-zA-Z0-9-]+.[a-zA-Z]+/";

    return (bool)preg_match($v, $EMAIL);
}

Is this okay for checking if the email address is valid or not?

Unsearchable answered 19/8, 2012 at 13:29 Comment(10)
If it works it works. You can't really make it better, it's too small. Only thing that's not good is style. validateEmail would be correct, as well as passing $email, not $EMAIL.Whitman
Just wanted to make sure I didn't have any major problems in the code that's all :)Unsearchable
See also #201823 for more about how and how not to use regular expressions to validate email addresses.Schmaltzy
That would fail to validate many valid email addresses. For example *@example.com or '@example.com or me@[127.0.0.1] or you@[ipv6:08B0:1123:AAAA::1234]Unheardof
https://mcmap.net/q/102014/-how-consistent-is-filter_validate_emailDorcas
@jcoder, not that I'm recommending that regex, but at least we can hope anyone using such addresses for sign up etc wouldn't complain when it fails :)Triumvir
#28026560Ki
See complete answer here: for PHP 5.x #19522592Tew
If you really want to use that regex you should move the "-" to the beginning of the character set to avoid errors: [-a-zA-Z0-9_.+] (if it's not at the beginning the "-" is interpreted as range).Luggage
It fails even on [email protected]; I strongly recommend not using this!Buncombe
B
710

The easiest and safest way to check whether an email address is well-formed is to use the filter_var() function:

if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
    // invalid email address
}

Additionally you can check whether the domain defines an MX record:

if (!checkdnsrr($domain, 'MX')) {
    // domain is not valid
}

But this still doesn't guarantee that the mail exists. The only way to find that out is by sending a confirmation mail.
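
The two checks above can be combined into one helper. This is only a minimal sketch: it assumes the domain is everything after the last "@", and both the function name hasPlausibleMailDomain and the optional $dnsCheck parameter are made up for illustration (the latter only exists so the DNS lookup can be swapped out, for example in tests).

```php
<?php
// Sketch: syntax check via filter_var(), then an MX lookup on the domain part.
// $dnsCheck is a hypothetical hook; when omitted, PHP's checkdnsrr() is used.
function hasPlausibleMailDomain(string $email, ?callable $dnsCheck = null): bool
{
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        return false; // not well-formed
    }

    // Everything after the last "@" is treated as the domain.
    $domain = substr($email, strrpos($email, '@') + 1);

    $dnsCheck = $dnsCheck ?? 'checkdnsrr';

    return (bool) $dnsCheck($domain, 'MX');
}
```

Called as hasPlausibleMailDomain($email), it performs a real DNS query, so the result depends on the network the script runs on.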


Now that you have your easy answer feel free to read on about email address validation if you care to learn or otherwise just use the fast answer and move on. No hard feelings.

Trying to validate an email address using a regex is an "impossible" task. I would go as far as to say that the regex you have made is useless. There are three RFCs regarding email addresses, and writing a regex that catches wrong email addresses without producing false positives is something no mortal can do. Check out this list for tests (both failed and succeeded) of the regex used by PHP's filter_var() function.

Even the built-in PHP functions, email clients or servers don't get it right. Still in most cases filter_var is the best option.

If you want to know which regex pattern PHP (currently) uses to validate email addresses see the PHP source.

If you want to learn more about email addresses I suggest you start by reading the specs, but I have to warn you it is not an easy read by any stretch:

Bacciform answered 19/8, 2012 at 13:32 Comment(16)
Good answer, but according to this link: haacked.com/archive/2007/08/21/… the user name or local part can be a quoted-string, but FILTER_VALIDATE_EMAIL does not accept it.Aerostat
It does not work for all email addresses, as stated. Also see the list of failed tests in my answer to see that some quoted strings do work and others do not.Bacciform
@PeeHaa, probably, it would be better to use regex that would check for non-standard, UTF-8 accepted email addresses: https://mcmap.net/q/102017/-how-to-validate-non-english-utf-8-encoded-email-address-in-javascript-and-phpBeaner
Nope, too many failed tests on that pattern emailtester.pieterhordijk.com/test-pattern/MTAz :-)Bacciform
This pattern is extremely complex in case you need to use it with function like "preg_match_all" over big text string with emails inside. If any of you has simpler please share. I mean if you want to: preg_match_all($pattern, $text_string, $matches); then this complex pattern will overload the server if you need to parse really big text.Trend
filter_var will not validate emails like; уникум@из.рф even if they are correct!Math
This will validate: $email = '"><script>alert(1);</script>"@test.com'; echo filter_var($email, FILTER_VALIDATE_EMAIL)Nigrify
What is your point @sergio? Again as stated @Nigrify there are several "failures".Bacciform
You need to take into account that it can cause XSS.Nigrify
XSS prevention has nothing to do with this answer.Bacciform
filter_var is not really an option in "modern" applications - it does not support UTF8, which has been allowed for some time in the local part of email addresses.Leftwich
@Leftwich do you have a list of said "modern" mail servers that do support utf8 email addresses by any chance?Bacciform
@PeeHaa: Postfix 3.0 supports it for almost two years now: postfix.org/SMTPUTF8_README.html , and it is included in Ubuntu 16.04 and will be included in the next Debian release, for example. Exim has experimental support. Webmail providers like Gmail have also added support for sending/receiving such emails, although you cannot yet create unicode accounts. Widespread use and support is within reach, and filter_var will lag behind by quite some time, even if they change it right now (I have posted a bug report).Leftwich
@JaydeepChauhan that is just nonsense 3v4l.org/p1MP5Bacciform
An MX entry is not necessary to receive emails. If none is present, the A entry will be used. See serverfault.com/questions/470649/…Neuro
$email_domain = "dsfsdfdsfdsfds2xx.nothing"; if (checkdnsrr($email_domain, 'MX')) { echo "$email_domain valid MX DNS record";} Returns: dsfsdfdsfdsfds2xx.nothing valid DNS record dig:Adiaphorous
A
53

You can use filter_var for this.

<?php
   function validateEmail($email) {
      return filter_var($email, FILTER_VALIDATE_EMAIL);
   }
?>
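
Note that filter_var() with FILTER_VALIDATE_EMAIL returns the validated string on success and false on failure, so the function above does not return a boolean. If a strict bool is wanted, a small variation could look like this (the name isWellFormedEmail is just an illustrative choice):

```php
<?php
// Sketch: wrap filter_var() so callers always get a strict bool.
function isWellFormedEmail(string $email): bool
{
    return filter_var($email, FILTER_VALIDATE_EMAIL) !== false;
}

var_dump(isWellFormedEmail('user@example.com')); // bool(true)
var_dump(isWellFormedEmail('not an email'));     // bool(false)
```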
Affected answered 19/8, 2012 at 13:33 Comment(7)
stop adding this function as this does not validate domains. if you are adding some@address this is valid. and it's not!Alviani
What's with all the one line functions containing one line functions? I am seeing them everywhere. When did this become a "thing"? (rhetorical). This needs to stop.Limpkin
@Limpkin I think it makes sense, if you, one year later with 100 usages of it in your project and you want to improve the way you validate the emails.... then it's going to be faster to edit 1 function than a hundred places.Personification
@HerrNentu' whats wrong with some@address? It is a perfectly valid email address. Like root@localhost is one. You are just doing the wrong thing. You are syntactically validating the form of the email address, and some@address is valid according to the RFC. But what you want to do is validate that an address is reachable. some@address is only reachable if the host address is known in your network. To validate reachability, you can check the DNS (check the host exists) or use SMTP (check the mailbox exists).Neuro
@ChristopherK. the problem is that it validates the email address without a domain.Alviani
@HerrNentu' actually it's debatable, here are some validation examples 3v4l.org/Vo8K1 so intranet + localhost domains were not accepted, which is a bit unexpected, as both of them are valid domains for intranet systemsClassy
Since 2012, you can use international characters above U+007F, encoded as UTF-8. filter_var() filters unicode characters out. See: https://mcmap.net/q/102018/-are-email-addresses-allowed-to-contain-non-alphanumeric-characters and php.net/manual/en/filter.filters.sanitize.phpAsseverate
R
19

In my experience, regex solutions have too many false positives and filter_var() solutions have false negatives (especially with all of the newer TLDs).

Instead, it's better to make sure the address has all of the required parts of an email address (user, "@" symbol, and domain), then verify that the domain itself exists.

There is no way to determine (server side) if an email user exists for an external domain.

This is a method I created in a Utility class:

public static function validateEmail(string $email): bool {

    // SET INITIAL RETURN VARIABLE
    // ENSURE -> EMAIL ISN'T EMPTY | AN @ SYMBOL IS PRESENT 

        $emailIsValid = FALSE;

        if (
            !empty($email) &&
            strpos($email, '@') !== FALSE
        ) {

            // GET EMAIL PARTS

                $email  = explode('@', $email);
                $user   = $email[0];
                $domain = $email[1];

            // VALIDATE EMAIL ADDRESS

                if (
                    count($email) === 2 &&
                    !empty($user) &&
                    !empty($domain) &&
                    checkdnsrr($domain)
                ) {
                    $emailIsValid = TRUE;
                }
        }

    // RETURN RESULT

        return $emailIsValid;
}
Realty answered 4/2, 2017 at 7:3 Comment(7)
Neverbounce claims their API is able to validate to 97% delivery. As long as you don't mind handing over your contacts database, of course.Des
stristr will fail to get the domain if there are multiple @ signs. Better to explode('@',$email) and check that sizeof($array)==2Dissuade
@AaronGillion While you are correct as far as a better way to get domain parts, the method would still return false as checkdnsrr() would return false if there were an @ sign in the domain.Realty
filter_var checks for valid DNS name parts and does not allowlist TLDs, so it should be robust (nowadays). What false positives were you seeing? Did the PHP project fix them when this was reported? - I would rather trust a centralised check that everyone uses and reports issues against than roll my own. Until seeing security bugs like CVE-2023-7028, I also used a custom (liberal) solution like in this answer, but this will accept the value [email protected], [email protected] as valid and mail() will email both, potentially allowing an attacker to access/manage the user's data.Kenspeckle
@Kenspeckle If those are false email addresses (which wouldn't be associated with a valid account), how would an attacker use them to access a user's data? Additionally, FILTER_VALIDATE_EMAIL and FILTER_VALIDATE_DOMAIN only check syntax. They do not check if the domain name actually exists as my solution does.Realty
Domain checking (and perhaps other checks) can still be done additionally indeed, I'm just saying that filter_var is not bad to use :). As for how the exploit worked exactly, I haven't read the details of that CVE but my understanding is that it involved entering multiple email addresses in one field because the mail server parses that string and emails everyone. I've noticed an impact on my own websites as well: I had set a limit of one registration email per IP address per X time (with a small burst), but an attacker could enter 20 email addresses and thus (per IP address) send 20× the limitKenspeckle
@Kenspeckle If an attacker enters 20 email addresses into a single email field, it should fail validation using filter_var(). It for sure will fail validation with the method I posted above, as both count($email) === 2 and checkdnsrr($domain) will equal FALSE. I will agree with one of your points though... filter_var() isn't "bad" per se. It's just not the most thorough way to validate email addresses.Realty
S
13

I think you might be better off using PHP's inbuilt filters - in this particular case, filter_var().

When supplied with the FILTER_VALIDATE_EMAIL param, it returns the address if it is valid, or false otherwise.

Soup answered 19/8, 2012 at 13:32 Comment(0)
T
12

This will not only validate your email, but also sanitize it for unexpected characters:

$email  = $_POST['email'];
$emailB = filter_var($email, FILTER_SANITIZE_EMAIL);

if (filter_var($emailB, FILTER_VALIDATE_EMAIL) === false ||
    $emailB != $email
) {
    echo "This email address isn't valid!";
    exit(0);
}
Tenderloin answered 23/3, 2017 at 7:26 Comment(2)
It considered error`@gmail.com as valid email. Note that it contains `.Exponential
Since 2012, you can use international characters above U+007F, encoded as UTF-8. filter_var() filters unicode characters out. See: https://mcmap.net/q/102018/-are-email-addresses-allowed-to-contain-non-alphanumeric-characters and php.net/manual/en/filter.filters.sanitize.phpAsseverate
D
7

After reading the answers here, this is what I ended up with:

public static function isValidEmail(string $email) : bool
{
    if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
        return false;
    }

    //Get host name from email and check if it is valid
    $email_host = array_slice(explode("@", $email), -1)[0];

    // Check if valid IP (v4 or v6). If it is we can't do a DNS lookup
    if (!filter_var($email_host,FILTER_VALIDATE_IP, [
        'flags' => FILTER_FLAG_NO_PRIV_RANGE | FILTER_FLAG_NO_RES_RANGE,
    ])) {
        //Add a dot to the end of the host name to make a fully qualified domain name
        // and get last array element because an escaped @ is allowed in the local part (RFC 5322)
        // Then convert to ascii (http://us.php.net/manual/en/function.idn-to-ascii.php)
        $email_host = idn_to_ascii($email_host.'.');

        //Check for MX pointers in DNS (if there are no MX pointers the domain cannot receive emails)
        if (!checkdnsrr($email_host, "MX")) {
            return false;
        }
    }

    return true;
}
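
The MX-only check above rejects domains that accept mail on their address record: when a domain has no MX record, SMTP falls back to its A/AAAA record (RFC 5321). A hedged sketch of a lookup with that fallback follows; the function name domainCanReceiveMail and the $lookup parameter are hypothetical, the latter only being there so the DNS call can be replaced in tests.

```php
<?php
// Sketch: accept a domain if it has an MX record or, failing that,
// an A or AAAA record. When $lookup is omitted, checkdnsrr() is used.
function domainCanReceiveMail(string $domain, ?callable $lookup = null): bool
{
    $lookup = $lookup ?? 'checkdnsrr';

    foreach (['MX', 'A', 'AAAA'] as $type) {
        if ($lookup($domain, $type)) {
            return true;
        }
    }

    return false;
}
```

Whether the fallback is desirable is a judgment call: it is closer to how mail is actually routed, but it also accepts any domain that merely hosts a website.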
Dispirited answered 26/4, 2019 at 8:46 Comment(5)
Is there any reason for the array_slice? Why don't you just use explode("@", $email)[1]? Can @ characters appear in the user part of the email address?Gait
@Gait I think it was for backwards compatibility. Accessing the return type directly like that is not supported before PHP 5.4 (I think). However, that is a pretty old and unsupported version by now so I would probably do as you suggest.Dispirited
I just tested it, and you are actually right. From the perspective of someone who started coding a couple of years ago, it's unbelievable what programmers had to deal with to achieve the simplest things.Gait
An MX entry is not necessary to receive emails. If none is present, the A entry will be used. See serverfault.com/questions/470649/…Neuro
@ChristopherK. Oh, that was interesting. I have used a check like this in various projects and have probably validated over a million email addresses, and this has never been a problem. I think it's a pretty good check to make sure the domain is actually pointed somewhere. Maybe a fallback check for an A pointer could be used, but that might do more harm than good even if it seems like a more correct check.Dispirited
V
7

Use the code below:

// Variable to check
$email = "[email protected]";

// Remove all illegal characters from email
$email = filter_var($email, FILTER_SANITIZE_EMAIL);


// Validate e-mail
if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
  echo("Email is a valid email address");
}
Vega answered 2/10, 2020 at 17:9 Comment(1)
In most cases, you probably don't want to strip illegal characters like that when validating. If you check an email address with illegal characters, that should not validate.Dispirited
S
5

I answered this in the 'top question' about email verification: https://mcmap.net/q/18715/-how-can-i-validate-an-email-address-using-a-regular-expression

For me the right way for checking emails is:

  1. Check that symbol @ exists, and before and after it there are some non-@ symbols: /^[^@]+@[^@]+$/
  2. Try to send an email to this address with some "activation code".
  3. When the user "activated" his email address, we will see that all is right.

Of course, you can show a warning or tooltip in the front-end when the user types a "strange" email, to help them avoid common mistakes such as no dot in the domain part or spaces in the name without quoting. But you must accept the address "hello@world" if the user really wants it.

Also, remember that the email address standard has evolved and can evolve again, so you can't just write a "standard-valid" regexp once and for all. And remember that some specific mail servers may deviate from details of the common standard and in fact work with their own "modified standard".

So, just check for @, hint the user on the front-end, and send a verification email to the given address.
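
Steps 1 and 2 could be sketched roughly as follows. The function names are illustrative, and how the code is stored and mailed is left open because it depends on the application.

```php
<?php
// Sketch of steps 1 and 2: a deliberately loose format check, then a
// random activation code to be mailed to the address.
function looksLikeEmail(string $email): bool
{
    // Exactly one "@", with at least one character on each side.
    return preg_match('/^[^@]+@[^@]+$/', $email) === 1;
}

function newActivationCode(): string
{
    // 16 random bytes, hex-encoded: a 32-character code.
    return bin2hex(random_bytes(16));
}

$email = 'hello@world';
if (looksLikeEmail($email)) {
    $code = newActivationCode();
    // Store $code alongside $email, then send it to the user, e.g. with mail().
}
```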

Sculpin answered 13/12, 2016 at 20:21 Comment(7)
Your regex does check for @, but it doesn't really check that it's valid per any of the RFCs that govern email. It also doesn't work as written. I ran it through regex101.com and it failed to match valid addressesDoodle
Did you read only the regex or the whole answer? I fully disagree with you. Please tell me, according to which RFC does the gmail.com server assume that [email protected] and [email protected] are the same address? There are lots of servers which do not work by the standards, or not by the FRESH standards. But they serve emails of their users. If you write a regexp once and validate only by that, you have no guarantee that it will stay right in the future and that your future users will not fail with their "new-way" emails. So, my position is the same: the main point, if you want to verify an email address, is to just send an activation email.Sculpin
@Doodle but thanks for bugreport in regexp, i fixed it from /^[^@]+@[^@+]$/ to /^[^@]+@[^@]+$/Sculpin
Props to you for fixing the regex, but how does that improve over the filter_var method? It doesn't fix the problem of it accepting badly formatted addresses either. Your regex will happily accept joe@domain as a valid email address, when it's notDoodle
@Machavity, well, for example, there's a specific version of PHP on your server and you can't update it to the newest. For example, you have PHP 5.5.15. In 2018 the standard of valid emails was extended. It will be implemented in PHP 7.3.10 soon. And there'll be a well-working function filter_var($email, FILTER_VALIDATE_EMAIL, $newOptions). But you have the old function on the server, and in some cases you can't update. And you will lose clients with some new valid emails. Also, once more I note that not all email-serving servers work strictly according to the common and modern standard of email addresses.Sculpin
And the other - admin@localhost is valid email address. What about "little internets" - intranets? There can be addresses like joe@matesnet.Sculpin
Maybe [^@]+?@[^@]+?\.[^@]+Yestreen
B
4

If you want to check whether the domain of the provided email address is valid, use something like:

/*
* Check for valid MX record for given email domain
*/
if(!function_exists('check_email_domain')){
    function check_email_domain($email) {
        //Get host name from email and check if it is valid
        $email_host = explode("@", $email);     
        //Add a dot to the end of the host name to make a fully qualified domain name and get last array element because an escaped @ is allowed in the local part (RFC 5322)
        $host = end($email_host) . "."; 
        //Convert to ascii (http://us.php.net/manual/en/function.idn-to-ascii.php)
        return checkdnsrr(idn_to_ascii($host), "MX"); //(bool)       
    }
}

This is a handy way to filter out a lot of invalid email addresses, along with standard email validation, because a valid email format does not mean a valid email.

Note that the idn_to_ascii() function (or its sister function idn_to_utf8()) may not be available in your PHP installation; it requires extensions PECL intl >= 1.0.2 and PECL idn >= 0.1.

Also keep in mind that IPv4 or IPv6 as domain part in email (for example user@[IPv6:2001:db8::1]) cannot be validated, only named hosts can.

See more here.

Bosley answered 2/5, 2018 at 8:44 Comment(2)
I don't think it will work if the host portion of the email address is an IP address in IPv6 formatRegurgitate
An MX entry is not necessary to receive emails. If none is present, the A entry will be used. See serverfault.com/questions/470649/…Neuro
W
2

If you're just looking for an actual regex that allows for various dots, underscores and dashes, it is as follows: [a-zA-Z0-9._-]+@[a-zA-Z0-9._-]+\.[a-zA-Z]+. That will allow a fairly stupid looking email like tom_anderson.1-neo@my-mail_matrix.com to be validated.

Whitelaw answered 28/12, 2016 at 18:19 Comment(0)
R
2
/(?![[:alnum:]]|@|-|_|\.)./

Nowadays, if you use an HTML5 form with type=email then you're already 80% safe, since browser engines have their own validator. To complement it, add this regex to your preg_match_all() and negate it:

if (!preg_match_all("/(?![[:alnum:]]|@|-|_|\.)./",$email)) { .. }

Find the regex used by HTML5 forms for validation
https://regex101.com/r/mPEKmy/1

Referee answered 14/8, 2017 at 12:32 Comment(3)
I hate downvotes too w/o explanation. Well I guess he might say: Browser email check (client side) is not secure at all. Anyone can send anything to a server by changing the code. So it's obvious and the most secure way to do the check (again) server side. The question here is based on PHP, so its obvious Cameron was looking for a server solution and not for a client solution.Erlindaerline
This answer may not be fully PHP related, but its HTML suggestion covers the "standard" user using just a phone/PC. Also the user gets an info directly in "his" browser while using the site. Real checks on server side are not covered with this, sure. Btw, @Referee mentioned a PHP change, so his comment is related IMHO.Southerly
It probably received down votes due to the assumption that you're "80% safe since browser engines have their own validator". There are many other ways to send http requests than through a browser, so you can't assume that any request is safe...even if you check the browser agent.Realty
N
1

There is a better regex built into FILTER_VALIDATE_EMAIL, but any regex can give bad results.

For example..

// "not an email" is invalid, so it's false.
php > var_export(filter_var("not an email", FILTER_VALIDATE_EMAIL));
false
// "[email protected]" looks like an email, so it passes even though it's not real.
php > var_export(filter_var("[email protected]", FILTER_VALIDATE_EMAIL));
'[email protected]'
// "[email protected]" passes, gmail is a valid email server,
//  but gmail requires more than 3 letters for the address.
php > var_export(filter_var("[email protected]", FILTER_VALIDATE_EMAIL));
'[email protected]'

You might want to consider using an API like Real Email, which can do in-depth mailbox inspections to check if the email is real.

A bit like ..

$email = "[email protected]";
$api_key = "???"; // placeholder: put your API key here

$request_context = stream_context_create(array(
    'http' => array(
        'header'  => "Authorization: Bearer " . $api_key
    )
));

// URL-encode the address so characters such as "+" survive the query string
$result_json = file_get_contents("https://isitarealemail.com/api/email/validate?email=" . urlencode($email), false, $request_context);

// Decode the response once and branch on its status field
$status = json_decode($result_json, true)['status'];

if ($status == "valid") {
    echo("email is valid");
} else if ($status == "invalid") {
    echo("email is invalid");
} else {
  echo("email was unknown");
}
Novak answered 13/9, 2021 at 9:7 Comment(0)
R
1

There are three RFCs that lay down the foundation for the "Internet Message Format".

  1. RFC 822
  2. RFC 2822 (Supersedes RFC 822)
  3. RFC 5322 (Supersedes RFC 2822)

RFC 5322, however, defines e-mail IDs and their naming structure in the most technical manner. That suits its purpose of laying the foundation for an Internet Standard that is liberal enough to allow all the use-cases, yet conservative enough to bind them in some formalism.

However, the e-mail validation requirement from the software developer community, has the following needs -

  • to stave off unwanted spammers
  • to ensure the user does not make inadvertent mistakes
  • to ensure that the e-mail ID belongs to the actual person inputting it

They are not exactly interested in implementing a technically all-encompassing definition that allows every form of e-mail ID (IP addresses, port IDs and all). The solution suitable for their use-case is expected solely to ensure that all legitimate e-mail holders can get through. The definition of "legitimate" differs vastly between the technical standpoint (the RFC 5322 way) and the usability standpoint (this solution). The usability aspect of validation aims to ensure that all the e-mail IDs accepted by the validation mechanism belong to actual people who use them to communicate. This introduces another angle to the validation process: ensuring an actually "in-use" e-mail ID, a requirement for which the RFC 5322 definition is clearly not sufficient.

Thus, on practical grounds, the actual requirements boil down to this -

  1. To ensure some very basic validation checks
  2. To ensure that the inputted e-mail is in use

The second requirement typically involves sending a standard response-seeking e-mail to the inputted e-mail ID and authenticating the user based on the action delineated in the response mechanism. This is the most widely used mechanism to ensure the second requirement of validating an "in use" e-mail ID. It does involve a round trip through the back-end server implementation and is not a straightforward single-screen implementation; however, one cannot do away with it.

The first requirement stems from the fact that developers do not want totally "non e-mail like" strings to pass as an e-mail. This typically means blanks, strings without an "@" sign, or strings without a domain name. Given the punycode representations of domain names, anyone who needs domain validation has to engage in a full-fledged implementation that ensures a valid domain name. Thus, given the basic nature of the requirement in this regard, validating for "<something>@<something>.<something>" is the only apt way of satisfying the requirement.

A typical regex that can satisfy this requirement is: ^[^@\s]+@[^@\s]+\.[^@\s.]+$ This regex follows the Perl regular-expression syntax, which is widely followed by the majority of programming languages. The validation statement is: <anything except whitespace and the "@" sign>@<anything except whitespace and the "@" sign>.<anything except whitespace, the "@" sign and dot>
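
A minimal PHP sketch of the "<something>@<something>.<something>" check with preg_match() might look like this. The function name hasBasicEmailShape is illustrative, and the pattern in the code escapes the dot so that it only matches a literal "." before the last label.

```php
<?php
// Sketch: basic shape check only, no RFC 5322 validation.
function hasBasicEmailShape(string $email): bool
{
    // local part: no whitespace or "@"
    // domain:     no whitespace or "@", followed by a literal dot
    // last label: no whitespace, "@" or dot
    return preg_match('/^[^@\s]+@[^@\s]+\.[^@\s.]+$/', $email) === 1;
}

var_dump(hasBasicEmailShape('user@mail.example.com')); // bool(true)
var_dump(hasBasicEmailShape('user@localhost'));        // bool(false)
```

Addresses such as user@localhost are rejected by design here, which matches the stated requirement but not the RFC.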

Those who want to go one step deeper into more relevant implementations can follow this validation methodology: <e-mail local part>@<domain name>

For <e-mail local part>, follow the guidelines by the "Universal Acceptance Steering Group": UASG-026. For <domain name>, you can follow any domain validation methodology using standard libraries, depending on your programming language. For the recent studies on the subject, follow the document UASG-018A.

Those who are interested in the overall process, challenges and issues one may come across while implementing the Internationalized Email Solution can also go through the following RFCs:

  • RFC 6530 (Overview and Framework for Internationalized Email)
  • RFC 6531 (SMTP Extension for Internationalized Email)
  • RFC 6532 (Internationalized Email Headers)
  • RFC 6533 (Internationalized Delivery Status and Disposition Notifications)
  • RFC 6855 (IMAP Support for UTF-8)
  • RFC 6856 (Post Office Protocol Version 3 (POP3) Support for UTF-8)
  • RFC 6857 (Post-Delivery Message Downgrading for Internationalized Email Messages)
  • RFC 6858 (Simplified POP and IMAP Downgrading for Internationalized Email)

Rust answered 21/3, 2022 at 17:20 Comment(0)
E
0

I have prepared a function that checks email validity:

function isValidEmail($email)
{
    $re = '/([\w\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)/m';
    preg_match_all($re, $email, $matches, PREG_SET_ORDER, 0);
    if(count($matches) > 0) return $matches[0][0] === $email;
    return false;
}

The problem with FILTER_VALIDATE_EMAIL is that it considers even invalid emails as valid.

Here are some examples:

if(isValidEmail("[email protected]")) echo "valid";
if(!isValidEmail("fo^[email protected]")) echo "invalid";
Exponential answered 17/7, 2022 at 16:49 Comment(0)
