This week, our developer Jason from the Pentest People team is here to talk to you about email address validation.
There are many ways to validate an email address, so let’s have a look and talk about which method is best.
As with all user input, a user’s email address needs to be validated, not only for security but to make sure the email address is correct.
Email addresses are often used as the user’s username for an application, so if an incorrect email address is stored, the user will have problems at a later date.
So how do you validate an email address to make sure it is correct? Well, there are a few options you can use.
First of all, an email address needs to be unique in your system, so it goes without saying: do not allow duplicate email addresses to be stored against different users.
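One way to enforce this is at the storage layer, so that a duplicate cannot slip in even if application code forgets to check. The sketch below is only an illustration using Python’s built-in sqlite3 module. The `users` table, the `create_schema` and `register` helpers, and the decision to compare addresses case-insensitively are all assumptions made for the example.

```python
import sqlite3


def create_schema(conn):
    # COLLATE NOCASE makes the UNIQUE constraint ignore ASCII case
    # (an assumption for this sketch; your system may choose differently).
    conn.execute(
        "CREATE TABLE users ("
        " id INTEGER PRIMARY KEY,"
        " email TEXT NOT NULL COLLATE NOCASE UNIQUE)"
    )


def register(conn, email):
    """Store the address; return False if another user already has it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO users (email) VALUES (?)", (email.strip(),))
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
create_schema(conn)
print(register(conn, "j.test@gmail.com"))  # first registration
print(register(conn, "J.Test@gmail.com"))  # same address, different case
```

With this schema the second call is rejected by the database itself, rather than relying on a separate "does this address exist?" query.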
A quick way to check that an email address is correct is to use the type attribute on the input field, like so:
<input type="email">
This field is great as it also tells a mobile keyboard to show the ‘@’ symbol when the keyboard first loads. By definition, this input type should only allow valid email addresses to be entered.
However, this is not completely true: the email address ‘.test@test.com’ will also pass this validation, even though we can see it is not a valid email address. The same goes for ‘ j.test@gmail.com’: the space at the start of the string will not be caught, allowing an invalid email address to be entered.
Take the address ‘a@s.ff’. It is technically valid in form, so it passes the validation provided by the email input type, and it looks correct, but its Top-Level Domain (TLD) is not a valid one.
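To see why these addresses get through, it helps to look at roughly what the browser checks. The HTML Standard describes a valid address for this input type with a deliberately permissive pattern, and the sketch below reproduces that pattern in Python as an approximation. The helper name `browser_would_accept` is made up for this example, real browsers may differ in detail, and the sketch assumes (as the standard describes) that surrounding whitespace is stripped before the check.

```python
import re

# Approximation of the pattern the HTML Standard gives for type="email".
HTML_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"
    r"@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)


def browser_would_accept(raw):
    # Assumed sanitisation step: strip leading and trailing ASCII whitespace.
    value = raw.strip(" \t\n\f\r")
    return HTML_EMAIL.fullmatch(value) is not None


for candidate in [".test@test.com", " j.test@gmail.com", "a@s.ff", "no-at-sign"]:
    print(repr(candidate), browser_would_accept(candidate))
```

Under these assumptions the three addresses discussed above are all accepted, because the pattern allows a dot anywhere in the part before the ‘@’ and never checks whether the TLD exists.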
An alternative method is to use a regular expression to validate the email address. The following regular expression:
^((?!\.)[\w.-]*[^.])(@\w+)(\.\w+(\.\w+)?[^.\W])$
will not match ‘.test@test.com’ or an email address with a space at the start of the string. But the invalid TLD will still be a problem.
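As a quick check of that behaviour, here is a small Python sketch that runs the expression against the addresses mentioned so far. The function name `passes_simple_regex` is just for illustration. The hyphen is written at the end of the character class because Python’s `re` module rejects `\w-_` as a range; the set of characters matched is the same.

```python
import re

# The article's expression, with the hyphen at the end of the character class.
SIMPLE_EMAIL = re.compile(r"^((?!\.)[\w.-]*[^.])(@\w+)(\.\w+(\.\w+)?[^.\W])$")


def passes_simple_regex(address):
    # fullmatch ensures nothing (such as a trailing newline) is left over.
    return SIMPLE_EMAIL.fullmatch(address) is not None


for candidate in [".test@test.com", " j.test@gmail.com", "j.test@gmail.com", "a@s.ff"]:
    print(repr(candidate), passes_simple_regex(candidate))
```

The leading dot and the leading space are both rejected, while ‘a@s.ff’ still passes, which is the remaining TLD problem.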
So what can we do here? Simple, the expression will need to be updated to match only valid TLDs like so:
^((?!\.)[\w.-]*[^.])(@\w+)(\.co\.uk|\.com|\.net)$
This would work, but can you see the problem? There are a lot of TLDs: as of June 2019 there are around 1,500 of them. I do not recommend writing a regular expression to match all the possibilities individually.
If you’re looking for more information then this is a great website to learn about regular expressions.
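To make the maintenance problem concrete, the sketch below runs the TLD-limited expression in Python (again with the hyphen at the end of the character class so that Python’s `re` module accepts it). The function name `passes_tld_limited_regex` and the sample addresses are only for illustration.

```python
import re

# Only .co.uk, .com and .net are accepted as endings.
TLD_LIMITED_EMAIL = re.compile(r"^((?!\.)[\w.-]*[^.])(@\w+)(\.co\.uk|\.com|\.net)$")


def passes_tld_limited_regex(address):
    return TLD_LIMITED_EMAIL.fullmatch(address) is not None


for candidate in ["j.test@gmail.com", "j.test@example.co.uk", "a@s.ff", "j.test@example.org"]:
    print(repr(candidate), passes_tld_limited_regex(candidate))
```

The made-up ‘.ff’ ending is now rejected, but so is a perfectly ordinary ‘.org’ address, and every other TLD missing from the list would be turned away in the same way.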
We can go one further: let the user submit an address, then check the MX record for its domain. But this is only going to check that the domain is valid. It helps when something like ‘j.test@gnail.com’ is entered (assuming gnail.com is not a domain), because we can reject the email address. It will not stop a user making a typo in the part before the ‘@’ at a valid domain: ‘j.tst@gmail.com’ will still pass this type of validation.
As this method will not guarantee the email address is valid, I would not recommend implementing it. Instead, I would just ask the user if the email address is correct.
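The limitation is easy to see in a small sketch. Python’s standard library has no MX lookup, so the example below takes the lookup as a parameter: `lookup_mx` is a hypothetical callable that returns the MX hostnames for a domain, or an empty list if there are none. In a real system it might wrap a DNS library (for instance dnspython’s `dns.resolver.resolve(domain, "MX")`). Here it is replaced by a stand-in dictionary, and the data in it is invented for the example, not a real DNS result.

```python
def domain_accepts_mail(address, lookup_mx):
    """Return True if the address's domain has at least one MX record.

    lookup_mx is a caller-supplied (hypothetical) function:
    domain name -> list of MX hostnames, empty if none exist.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False  # no '@', or nothing on one side of it
    return len(lookup_mx(domain.lower())) > 0


# Stand-in DNS data for the example only; not real lookup results.
FAKE_MX = {"gmail.com": ["mx.placeholder.invalid"]}


def fake_lookup(domain):
    return FAKE_MX.get(domain, [])


for candidate in ["j.test@gnail.com", "j.tst@gmail.com"]:
    print(candidate, domain_accepts_mail(candidate, fake_lookup))
```

The mistyped domain is rejected, but the mistyped name at a real domain is accepted, because the check never looks at the part before the ‘@’.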
The simplest way to validate that an email address is correct is to simply ask the user.
Once a user has entered their email address, a simple confirmation email sent to that address will suffice. By acting on a one-time confirmation email, the user confirms that their email address is correct and that they have in fact signed up to the application or service in question.
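The token side of such a confirmation email can be sketched in a few lines. This is a minimal illustration, not a complete implementation: the `PendingConfirmations` class, the in-memory dictionary, the 24-hour lifetime and the example link are all assumptions, and actually sending the email is left out.

```python
import secrets
import time

TOKEN_LIFETIME_SECONDS = 24 * 60 * 60  # assumed expiry window


class PendingConfirmations:
    """In-memory store of one-time confirmation tokens (illustrative only)."""

    def __init__(self, clock=time.time):
        self._pending = {}  # token -> (email, expires_at)
        self._clock = clock

    def issue(self, email):
        # Unguessable random token to embed in the confirmation link.
        token = secrets.token_urlsafe(32)
        self._pending[token] = (email, self._clock() + TOKEN_LIFETIME_SECONDS)
        return token

    def confirm(self, token):
        # pop() removes the token, so each one can be used only once.
        entry = self._pending.pop(token, None)
        if entry is None:
            return None
        email, expires_at = entry
        if self._clock() > expires_at:
            return None
        return email


pending = PendingConfirmations()
token = pending.issue("j.test@gmail.com")
# Hypothetical link that would be placed in the confirmation email.
link = "https://app.example/confirm?token=" + token
print(pending.confirm(token))  # the confirmed address
print(pending.confirm(token))  # None: the token has already been used
```

Only someone who can read the mailbox can follow the link, so a successful confirmation shows both that the address exists and that it belongs to the person signing up.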
I am not saying do not use any validation. Using a combination of the input type attribute and the simple regex validation, alongside a user confirmation email, is the best solution for the problem.
By using a combination of methods, you reduce the chance of typos and make sure the input you accept is at least technically a valid email address, which is then confirmed by the user.
If a user has used an incorrect address they will let you know.