We have all encountered them: Password Requirements. They are a well-intended attempt to make sure that your personal information stays hidden, and only you can access your websites.
However, many sites still have very annoying password requirements, or try to visualize a “password strength” to help you choose a good password. I find that many of these are misguided, and in the worst cases make the user choose weaker passwords.
Let’s go through what makes a good password, and how you should relate to that. For this exercise we assume that:
- You use bcrypt or similar with key derivation. Not some crap salt+SHAx/MDx. We are in 2015.
- The worst has happened: Your database has been leaked.
The main advantage of passwords hashed with a key derivation function is that they are expensive to test. Testing a single candidate password takes CPU time, and we want to make it unreasonably expensive to recover a password. Let’s look at some example cost times of bcrypt (source):
| Cost | Iterations        | Duration    |
|------|-------------------|-------------|
| 10   | 1,024 iterations  | 152.4 ms    |
| 11   | 2,048 iterations  | 296.6 ms    |
| 12   | 4,096 iterations  | 594.3 ms    |
| 13   | 8,192 iterations  | 1,169.5 ms  |
| 14   | 16,384 iterations | 2,338.8 ms  |
I will not go into selecting the correct cost – for that you can see the source of the numbers.
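The pattern in the table (iterations are 2 to the power of the cost, and the duration roughly doubles per step) can be sketched in Go. This is only a back-of-the-envelope estimate: `estimateMillis` is a hypothetical helper that extrapolates from one measured row and assumes perfect doubling, which the measured numbers only match to within about five percent.

```go
package main

import (
	"fmt"
	"math"
)

// iterations returns the number of bcrypt rounds for a given cost (2^cost).
func iterations(cost uint) uint64 {
	return uint64(1) << cost
}

// estimateMillis extrapolates the hashing time for a cost from one measured
// data point, assuming the time doubles with every cost step (an assumption,
// not a measurement).
func estimateMillis(cost, baseCost int, baseMillis float64) float64 {
	return baseMillis * math.Pow(2, float64(cost-baseCost))
}

func main() {
	// 152.4 ms at cost 10 is the first row of the table above.
	for cost := 10; cost <= 14; cost++ {
		fmt.Printf("cost %d: %d iterations, about %.1f ms\n",
			cost, iterations(uint(cost)), estimateMillis(cost, 10, 152.4))
	}
}
```

On your own hardware you would measure at least one cost value yourself and not rely on these numbers.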
Let’s assume you choose a cost of 12. At 594.3 ms per hash, a single core can test roughly 4.4 million passwords in a month. If we have 1000 8-core machines (and assume that things scale linearly), that would enable us to test roughly 35 billion passwords. This is approximately 35 bits. With roughly 5 bits of entropy per character (‘a’ to ‘z’ and ‘0’ to ‘9’ give 36 values, or about 5.2 bits), that is close to a random password of 7 characters. For your reference, with current prices for AWS EC2 instances (1000 x m4.2xlarge), that would cost $368,930.00.
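If you want to redo this arithmetic with your own numbers, here is a minimal sketch. The function names are mine, and it assumes a 30-day month, linear scaling across cores, and a randomly chosen password.

```go
package main

import (
	"fmt"
	"math"
)

// guessesPerMonth returns how many hashes one core can compute in a
// 30-day month when each hash takes hashMillis milliseconds.
func guessesPerMonth(hashMillis float64) float64 {
	const millisPerMonth = 30 * 24 * 60 * 60 * 1000.0
	return millisPerMonth / hashMillis
}

// bitsCovered converts a number of guesses into bits of search space.
func bitsCovered(guesses float64) float64 {
	return math.Log2(guesses)
}

// charsCovered returns the length of a random password, drawn from an
// alphabet of the given size, whose search space equals the guesses.
func charsCovered(guesses float64, alphabetSize int) float64 {
	return math.Log(guesses) / math.Log(float64(alphabetSize))
}

func main() {
	perCore := guessesPerMonth(594.3) // cost 12 from the table
	fleet := perCore * 1000 * 8       // 1000 machines with 8 cores each

	fmt.Printf("per core: %.1f million guesses\n", perCore/1e6)
	fmt.Printf("fleet:    %.1f billion guesses\n", fleet/1e9)
	fmt.Printf("bits:     %.1f\n", bitsCovered(fleet))
	fmt.Printf("a-z0-9:   %.1f characters\n", charsCovered(fleet, 36))
}
```

Keep in mind that this is the cost of exhausting the whole search space. On average an attacker would find a random password after searching about half of it.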
That is the price of getting a single password. Of course you should always assume “the enemy” has better tools available, but even if you assume they can reduce the time/price by two orders of magnitude, a price of $3689.30 per password is still very high, and likely more than their expected ROI upon getting access to the account.
Enforced Passwords
From these numbers it would appear that simply assigning users a generated password is a solution. However, in my opinion that is a bad idea for the same reason that regular password changes are bad: Your users cannot remember the passwords, so they write them down in various places, or re-use other passwords, and you are actually worse off than you were before.
The only place I have found where enforced passwords have good usability is Gmail, which allows you to create generated passwords for accessing your email via POP3/IMAP. In this scenario you don’t use the generated password as your regular password, but enter it once.
Users need to be able to enter their own passwords, otherwise you will likely end up with worse security.
What is a good password?
The hackers are resourceful, so if they know that users have chosen their own passwords, they know that there are these bad password habits:
- Some users choose passwords that are too short
- Some users re-use passwords for different services
- Some users choose similar ‘bad’ passwords
- Some users choose personal information as passwords
So, how can they take advantage of these facts? The common approach is to use dictionary attacks instead of brute-force attacks: they select a dictionary of probable passwords and test these first. Many assume a dictionary is a list of English (and other) words, but in fact the best dictionary to use is a list of passwords from previous leaks. This works well for the attacker because it takes advantage of all the bad password habits above.
In a leak from a few years back, it was possible to retrieve 18.2% of all passwords. This skews the statistics and price per password a HUGE amount. If we know that we can get roughly 18% of all the passwords by only testing a dictionary with 65 million entries, our ROI improves significantly.
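To get a feeling for how much the ROI improves, here is a rough sketch that reuses the numbers from earlier in this post. It assumes that every account has its own salt (so the full dictionary is tested per account), the same 1000 x 8 core fleet, a 30-day month, and that the 18.2% hit rate carries over to your users. All of these are assumptions, and the function names are made up for this example.

```go
package main

import "fmt"

// dictionaryPassSeconds returns the wall-clock seconds needed to test every
// dictionary entry against one salted hash, spread evenly over the cores.
func dictionaryPassSeconds(entries, hashMillis float64, cores int) float64 {
	return entries * hashMillis / 1000 / float64(cores)
}

// costPerRecovered returns the expected spend per recovered password when
// one full dictionary pass costs passCost and succeeds with rate hitRate.
func costPerRecovered(passCost, hitRate float64) float64 {
	return passCost / hitRate
}

func main() {
	const (
		entries      = 65e6     // dictionary size from the text
		hashMillis   = 594.3    // bcrypt cost 12 from the table
		cores        = 1000 * 8 // same fleet as the brute-force estimate
		fleetMonthly = 368930.0 // USD for the fleet, from the text
		hitRate      = 0.182    // share of passwords found in the leak
		monthSeconds = 30.0 * 24 * 60 * 60
	)

	secs := dictionaryPassSeconds(entries, hashMillis, cores)
	passCost := fleetMonthly * secs / monthSeconds

	fmt.Printf("one dictionary pass: %.0f minutes\n", secs/60)
	fmt.Printf("cost per pass:       $%.2f\n", passCost)
	fmt.Printf("cost per recovered:  $%.2f\n", costPerRecovered(passCost, hitRate))
}
```

A real attacker would sort the dictionary by popularity and stop early, so the real price is likely lower than this sketch suggests.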
This is also why password requirements actually help the hackers. The hacker can easily find out your password requirements, and use this information to eliminate impossible passwords from the dictionary. If you require one number, one special character, etc., they can skip every dictionary entry that doesn’t meet those requirements.
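To illustrate how little work this is for an attacker, here is a small sketch of pruning a dictionary with a site's published rules. The `policy` type and the sample list are hypothetical and only there for the example.

```go
package main

import (
	"fmt"
	"unicode"
)

// policy is a hypothetical set of password requirements published by a site.
type policy struct {
	minLength      int
	requireDigit   bool
	requireSpecial bool
}

// allows reports whether a candidate password satisfies the policy.
func (p policy) allows(candidate string) bool {
	var hasDigit, hasSpecial bool
	length := 0
	for _, r := range candidate {
		length++
		switch {
		case unicode.IsDigit(r):
			hasDigit = true
		case !unicode.IsLetter(r):
			hasSpecial = true
		}
	}
	if length < p.minLength {
		return false
	}
	if p.requireDigit && !hasDigit {
		return false
	}
	if p.requireSpecial && !hasSpecial {
		return false
	}
	return true
}

// prune keeps only the dictionary entries the policy would have accepted.
func prune(dictionary []string, p policy) []string {
	var kept []string
	for _, word := range dictionary {
		if p.allows(word) {
			kept = append(kept, word)
		}
	}
	return kept
}

func main() {
	dictionary := []string{"password", "123456", "letmein", "ASDqwe!23", "qwerty1!"}
	rules := policy{minLength: 8, requireDigit: true, requireSpecial: true}
	fmt.Println(prune(dictionary, rules))
}
```

Every entry that gets dropped is a hash the attacker never has to compute.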
Notice that the length and complexity of a password don’t influence the cost of finding it. If it is in a dictionary, it will be found. A password like `ASDqwe!23` might tick all your requirement boxes, but look at your keyboard to see why it is actually a bad password, and why it is in the dictionary.
So by now it should be clear that the best possible way to guard against this is to check passwords against a dictionary, and that is actually my main recommendation. The main challenge in this is to explain WHY you reject a password that is perfectly valid anywhere else. Here is a proposal for what you could write.
Your password was rejected because it is in a list of common passwords. Since we don’t want other people to easily guess your password, please choose another one.
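A dictionary check like this can be quite small. The following is a minimal sketch using only the standard library, not the package mentioned at the end of this post. The type and function names are made up, the inline list stands in for a real dictionary file, and comparing in lowercase is a design choice of this sketch.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// rejectionMessage is the text proposed above.
const rejectionMessage = "Your password was rejected because it is in a list of common passwords. " +
	"Since we don't want other people to easily guess your password, please choose another one."

// commonPasswords is an in-memory set of known-bad passwords (lowercased).
type commonPasswords map[string]struct{}

// loadCommonPasswords reads one password per line from r, skipping empty lines.
func loadCommonPasswords(r io.Reader) (commonPasswords, error) {
	set := make(commonPasswords)
	scanner := bufio.NewScanner(r)
	for scanner.Scan() {
		if line := scanner.Text(); line != "" {
			set[strings.ToLower(line)] = struct{}{}
		}
	}
	return set, scanner.Err()
}

// contains reports whether the password, ignoring case, is in the set.
func (c commonPasswords) contains(password string) bool {
	_, found := c[strings.ToLower(password)]
	return found
}

func main() {
	// A tiny inline list stands in for a real leaked-password dictionary.
	list := "123456\npassword\nASDqwe!23\n"
	common, err := loadCommonPasswords(strings.NewReader(list))
	if err != nil {
		fmt.Println("could not load dictionary:", err)
		return
	}
	for _, candidate := range []string{"asdQWE!23", "correct horse battery staple"} {
		if common.contains(candidate) {
			fmt.Println(rejectionMessage)
		} else {
			fmt.Println("accepted")
		}
	}
}
```

A map is fine for a few million entries. For a dictionary with 65 million entries you would probably want something more compact, but that is outside the scope of this sketch.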
If you have a better proposal please put it in a comment below.
What else?
There are a lot of topics not covered here that you should research anyway. I have not covered which key derivation function you should use, or how costly it should be. Another important thing is protecting yourself against login DDoS attacks when using expensive password checks. Things like 3rd party login and two-factor authentication are also something you should look into. I may cover these in another article. If you would like that, let me know in the comments.
TL;DR
- Don’t have special character requirements for passwords
- Test passwords against a dictionary
- Use a key derivation function to hash passwords
- Have a minimum length that, combined with the cost, makes brute force too expensive
- Favor expensive key derivation over long passwords
- Test that the password isn’t something you know about the user (username, email, etc.)
- Assume that your opponent can check passwords 100 times more efficiently than you.
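The minimum length and user information checks from the list above could look something like this. It is only a sketch: `minLength` is a placeholder you should derive from your own cost calculation, and the three-character threshold for user facts is an arbitrary choice to avoid rejecting passwords because of very short usernames.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// minLength is a placeholder value, not a recommendation.
const minLength = 8

// emailLocalPart returns the part of an email address before the "@".
func emailLocalPart(email string) string {
	if i := strings.Index(email, "@"); i >= 0 {
		return email[:i]
	}
	return email
}

// containsUserInfo reports whether the password contains, ignoring case,
// any of the given facts about the user. Facts shorter than three bytes
// are skipped.
func containsUserInfo(password string, facts ...string) bool {
	lowered := strings.ToLower(password)
	for _, fact := range facts {
		fact = strings.ToLower(strings.TrimSpace(fact))
		if len(fact) >= 3 && strings.Contains(lowered, fact) {
			return true
		}
	}
	return false
}

// validate returns an empty string if the password passes both checks,
// or a reason that can be shown to the user.
func validate(password, username, email string) string {
	if utf8.RuneCountInString(password) < minLength {
		return fmt.Sprintf("password must be at least %d characters", minLength)
	}
	if containsUserInfo(password, username, emailLocalPart(email)) {
		return "password must not contain your username or email"
	}
	return ""
}

func main() {
	for _, pw := range []string{"short", "Alice2015!!", "correct horse battery"} {
		reason := validate(pw, "alice", "alice@example.com")
		if reason == "" {
			reason = "ok"
		}
		fmt.Printf("%q: %s\n", pw, reason)
	}
}
```

You would run this together with the dictionary check, not instead of it.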
Finally, you can check out this Go package I am currently working on, which will help you check passwords against a dictionary.
I hope this helps you build secure logins for your application!
Disclaimer: I am not a crypto-expert. I look at this from a pragmatic view after researching for a former project. If I made mistakes or false assumptions, feel free to comment.