I was approached recently by a friend (you can call him Jon) who had a security story to share with me. It was an interesting scenario that could have been dire for his team had they:
- Not discovered the problem
- Been breached
Wanting to improve the security of their application, Jon’s team decided to implement a password entropy feature to encourage their users to choose strong passwords. This is a useful feature that companies like Dropbox, Twitter, and eBay have implemented. Jon’s team decided to use the Zxcvbn library to implement their entropy generator. Again, nothing wrong here.
Where things went off the rails is when Jon’s team decided to store the entropy in the database. The rationale was that as technology progressed and password cracking became easier, users with weaker passwords could be contacted and asked to update them. The trouble is that this significantly weakens the protection a password hashing algorithm provides (Jon’s team was using BCrypt) and decreases the time it takes to brute-force the hashed passwords.
Thankfully, this story has a happy ending: Jon’s team discovered the issue during a security audit and yanked the feature immediately.
The rest of this post will examine how storing entropy completely destroys the security of your hashed passwords.
Using Zxcvbn to Calculate Entropy
Above you can see an image of Dropbox’s password meter. It’s a series of colored bars that show relative password strength. The underlying calculations are done with the Zxcvbn library, and the code to use it is quite simple.
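To give a rough feel for what "calculate an entropy value for a password" means, here is a minimal Python sketch. It does not call Zxcvbn: `naive_entropy_bits` is a hypothetical and much cruder stand-in that multiplies the password length by the bits per character of the character classes used. Zxcvbn's real estimate is pattern-based (dictionary words, sequences, and so on) and will produce different numbers.

```python
import math
import string

# Character classes considered by this toy estimator (an assumption for
# illustration; Zxcvbn does not work this way).
CHAR_CLASSES = (
    string.ascii_lowercase,
    string.ascii_uppercase,
    string.digits,
    string.punctuation,
)

def naive_entropy_bits(password: str) -> float:
    """Crude estimate: length * log2(size of the character pool in use)."""
    pool = sum(len(cls) for cls in CHAR_CLASSES
               if any(ch in cls for ch in password))
    if pool == 0:
        return 0.0
    # Round so that equal-strength passwords map to exactly the same value.
    return round(len(password) * math.log2(pool), 3)

for candidate in ("password", "Passw0rd!", "correct horse battery staple"):
    print(candidate, naive_entropy_bits(candidate))
```

The important property for the rest of this post is that the function is deterministic and cheap: the same password always yields the same value, and computing it costs almost nothing.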
For Jon’s application, when a user entered their password, the app would calculate the entropy value and store it along with the BCrypt hashed password. Below is a simplified version of the Users table.
If you don’t think about security on a regular basis, this probably looks normal. The next question to ask is: how can an adversary with the Users table exploit the entropy value to crack passwords?
Why Storing Entropy is a Terrible Idea!
You see, a stored entropy value is leaked information. When it comes to passwords (and secure systems in general) you want to leak as little information as possible. Otherwise an attacker has information they can use to their advantage. To leverage this information you need to understand a few things about hashing speeds.
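If you want to see the speed gap on your own machine, the following Python sketch times fast hashes against a deliberately slow one. One loud assumption: PBKDF2 from the standard library stands in for BCrypt here, purely so the example runs without extra packages. The iteration count, salt, and sample size are illustrative values, not the ones behind the figures in this post.

```python
import hashlib
import time

# Assumption: PBKDF2 stands in for BCrypt only because it ships with
# Python's standard library and is also deliberately slow. It is not BCrypt.
PBKDF2_ITERATIONS = 20_000
DEMO_SALT = b"fixed-demo-salt"  # hypothetical; real systems use a random per-user salt

def md5_hash(data: bytes) -> bytes:
    return hashlib.md5(data).digest()

def sha1_hash(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def slow_hash(data: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", data, DEMO_SALT, PBKDF2_ITERATIONS)

def time_hashes(hash_fn, passwords) -> float:
    """Return the seconds taken to hash every password once."""
    start = time.perf_counter()
    for password in passwords:
        hash_fn(password.encode("utf-8"))
    return time.perf_counter() - start

passwords = [f"password{i}" for i in range(20)]
timings = {
    "md5": time_hashes(md5_hash, passwords),
    "sha1": time_hashes(sha1_hash, passwords),
    "pbkdf2 (slow stand-in)": time_hashes(slow_hash, passwords),
}
for name, seconds in timings.items():
    print(f"{name:<24} {seconds:.6f}s for {len(passwords)} hashes")
```

The absolute numbers depend on your hardware; the point is the ratio between the fast hashes and the slow one.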
Above you can see that BCrypt takes a long time to do 10,000 password hashes compared to MD5/SHA1. BCrypt was designed to make brute forcing password hashes expensive, whereas MD5 and SHA1 weren’t designed with that consideration. Which leads us to ask: how long does it take to calculate Zxcvbn values?
You can see from the above image that Zxcvbn takes significantly less time to run 10,000 iterations than BCrypt: it is 124x faster. The implication is that you can run passwords through Zxcvbn first to narrow a wordlist down to a small subset of candidates, and only those candidates need to be hashed with BCrypt. The algorithm is going to look like this:
- Get a list of common passwords (the bigger the better)
- Run the common passwords through Zxcvbn to get an entropy value
- Use the entropy as a hash key, and store each password in an array under that key
- Sort the Users table from lowest to highest entropy
- Iterate the Users table and use the entropy column to index your hash
  - If a hash key is present:
    - The value array holds the candidate passwords
    - BCrypt the candidate passwords and compare each to the database hash
Writing a Cracker
Let’s write the algorithm described above, starting with the first 3 steps.
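A minimal Python sketch of those first three steps might look like the following. The assumptions are stated up front: the in-memory `common_passwords` list stands in for a real wordlist file, and `naive_entropy_bits` is a hypothetical stand-in for the Zxcvbn call, since any deterministic estimator demonstrates the same idea. In Python the "hash of arrays" becomes a dict of lists.

```python
import math
import string
from collections import defaultdict

CHAR_CLASSES = (string.ascii_lowercase, string.ascii_uppercase,
                string.digits, string.punctuation)

def naive_entropy_bits(password: str) -> float:
    """Hypothetical stand-in for Zxcvbn's entropy value (not its algorithm)."""
    pool = sum(len(cls) for cls in CHAR_CLASSES
               if any(ch in cls for ch in password))
    return round(len(password) * math.log2(pool), 3) if pool else 0.0

def build_entropy_index(passwords):
    """Map each entropy value to the list of passwords that produce it."""
    index = defaultdict(list)
    for password in passwords:                              # step 1: the wordlist
        index[naive_entropy_bits(password)].append(password)  # steps 2 and 3
    return dict(index)

# Step 1 stand-in: a tiny list instead of a large common-password file.
common_passwords = ["123456", "password", "qwerty", "letmein", "abc123", "654321"]
entropy_index = build_entropy_index(common_passwords)

for entropy, candidates in sorted(entropy_index.items()):
    print(entropy, candidates)
```

Passwords that happen to share an entropy value end up in the same list, which is why each value is an array rather than a single password.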
The result of the above is a hash of arrays: each key is an entropy value, and each value is the array of passwords that produce that entropy.
The next part of the algorithm loads the database, sorts it from lowest to highest entropy (easiest to hardest), and tries to crack each password.
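Here is a hedged, self-contained Python sketch of that cracking loop. Everything in it is illustrative: the `users` rows are made-up example data, `naive_entropy_bits` is a hypothetical stand-in for Zxcvbn, and PBKDF2 (`slow_hash`) stands in for BCrypt so the example needs only the standard library. Unlike BCrypt, which embeds the salt in its output, this sketch stores the salt in its own field.

```python
import hashlib
import hmac
import math
import os
import string
from collections import defaultdict

CHAR_CLASSES = (string.ascii_lowercase, string.ascii_uppercase,
                string.digits, string.punctuation)

def naive_entropy_bits(password: str) -> float:
    """Hypothetical stand-in for Zxcvbn's entropy value (not its algorithm)."""
    pool = sum(len(cls) for cls in CHAR_CLASSES
               if any(ch in cls for ch in password))
    return round(len(password) * math.log2(pool), 3) if pool else 0.0

def slow_hash(password: str, salt: bytes) -> bytes:
    """Assumption: PBKDF2 plays the role of BCrypt in this sketch."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 20_000)

def make_user(email: str, password: str) -> dict:
    """Build one row of a simplified, made-up Users table."""
    salt = os.urandom(16)
    return {"email": email, "salt": salt,
            "password_hash": slow_hash(password, salt),
            "entropy": naive_entropy_bits(password)}

def build_entropy_index(passwords) -> dict:
    index = defaultdict(list)
    for password in passwords:
        index[naive_entropy_bits(password)].append(password)
    return dict(index)

def crack(users, entropy_index) -> dict:
    """Return {email: password} for every user whose password is recovered."""
    cracked = {}
    # Lowest entropy first: the weakest passwords fall first.
    for user in sorted(users, key=lambda row: row["entropy"]):
        # Only wordlist entries with a matching entropy get the slow hash.
        for candidate in entropy_index.get(user["entropy"], []):
            attempt = slow_hash(candidate, user["salt"])
            if hmac.compare_digest(attempt, user["password_hash"]):
                cracked[user["email"]] = candidate
                break
    return cracked

users = [
    make_user("alice@example.com", "letmein"),
    make_user("bob@example.com", "123456"),
    make_user("carol@example.com", "Tr0ub4dor&3-horse"),
]
entropy_index = build_entropy_index(
    ["123456", "password", "qwerty", "letmein", "abc123", "654321"])

for email, password in crack(users, entropy_index).items():
    print(email, password)
```

Without the entropy column, an attacker would have to run the slow hash over the whole wordlist for every user. With it, each user costs only as many slow hashes as there are wordlist entries sharing that user's entropy value, and users whose entropy is not in the index are skipped without hashing anything.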
And that’s it. There’s nothing overly difficult in this algorithm. It’s basically 3 loops and a couple of hash functions, but what drops out is non-trivial. If this were a real database, you’d have email and password combinations flying at you. That is a real problem: it means you’re leaking your users’ information!
As I mentioned, this was a good news story for Jon and his team: they discovered the issue before their database was compromised and before they saw their names in the news. That was not the case for Ashley Madison. That’s right! Ashley Madison made a similar error when they stored the MD5 hash of their users’ passwords in their database alongside the BCrypted hashes. This led researchers to crack almost 1/3 of their 30 million password hashes!
While it’s nice to feel smug and laugh at Ashley Madison’s peril, there is a real possibility that you’ve done something equally compromising in your own database. I’ve posted the code and an example database on GitHub. Take a look at the code and see if you can write your own cracker.