Why should I always use salt in hashes

May 1, 2010 / Maciej Mensfeld

Most people have already heard this, but it is worth repeating (via Wikipedia):

A cryptographic hash function is a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string, the (cryptographic) hash value, such that an accidental or intentional change to the data will change the hash value. The data to be encoded is often called the "message," and the hash value is sometimes called the message digest or simply digest.

But what does it really mean? It's quite simple: we take an input string (or some other data), feed it into the algorithm, and get a new fixed-length digest as output. The operation is one-way: there is no practical way to compute the input back from the digest, other than guessing candidate inputs and comparing their hashes. If you use a SHA-2 hash function, the output looks similar to this:

4e2ecff8f8be5a7d4d8821266d956d844aa5b8eebd5983edbaaa6fa7fc9bc9e21
de42d443f50d8608a79f6507b7e95c6d4a913615c85710f86a40bc23cdc5d5d
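
A minimal Ruby sketch of the properties described above, assuming the standard-library `digest` module and SHA-512 as the SHA-2 variant. The `message` value is made up for illustration:

```ruby
require 'digest'

message = 'my secret password'

# Hashing is deterministic: the same input always gives the same digest.
digest_a = Digest::SHA512.hexdigest(message)
digest_b = Digest::SHA512.hexdigest(message)

# Changing a single character of the input produces a different digest.
digest_c = Digest::SHA512.hexdigest(message + '!')

puts digest_a
puts digest_a == digest_b   # true
puts digest_a == digest_c   # false
puts digest_a.length        # 128 hex characters = 512 bits
```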

When we store users' passwords in our systems (databases, files, etc.), they should be safe. If we get hacked and our database is stolen, the passwords should still be protected: no one should be able to read them. Most users have one password for all their web activities, so if this password is stolen, the cracker will probably be able to log in to the victim's Facebook, Twitter and other web accounts.

If we store not the plain password but its hash, then even if it is stolen, the cracker cannot use it directly to log in to any account.

When using a cryptographic hash function, we must remember some rules:

MD5 should not be used for critical functions such as hashing passwords
Any hash function with a publicly known algorithm can be attacked by brute force
Cracking many hashes at once can be sped up by using precomputed rainbow tables
Allowing users to create simple passwords is also not recommended

Keep these rules in mind and you will avoid the most common mistakes.

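First of all, let's select a hash function. MD5 is old (and weak), and SHA-1 also has known vulnerabilities. The most common safe hash function is SHA-2, and it is the one recommended here for hashing passwords. Because a single SHA-2 pass is fast to compute, it should be combined with a salt and, ideally, key stretching.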

But what about brute-force attacks? Every password should be validated before use: it should not be too short or too simple. We can do this with a regular expression like this one:

^(?=.*\d)(?=.*([a-z]|[A-Z]))([\x20-\x7E]){8,40}$

The regexp above ensures that a password has between 8 and 40 printable ASCII characters, at least one letter (lowercase or uppercase) and at least one digit. Using this kind of regular expression ensures that no user will have a password like "abc" or similar. But still, an attacker with rainbow tables and a lot of password hashes can recover at least some of the passwords. How do we protect ourselves against attacks based on rainbow tables? Use salt.
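
As a sketch of how such a check could look in Ruby: the helper name `acceptable_password?` is made up for this example, and the anchors are changed to `\A` and `\z` because in Ruby `^` and `$` match at line breaks rather than only at the ends of the string.

```ruby
# Hypothetical helper, used only for this illustration.
def acceptable_password?(password)
  # Same rule as the expression above: at least one digit, at least one letter,
  # 8 to 40 printable ASCII characters. \A and \z anchor the whole string.
  password.match?(/\A(?=.*\d)(?=.*[a-zA-Z])[\x20-\x7E]{8,40}\z/)
end

puts acceptable_password?('abc')         # false: too short, no digit
puts acceptable_password?('abcdefgh')    # false: no digit
puts acceptable_password?('12345678')    # false: no letter
puts acceptable_password?('123qwerty')   # true
```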

What is salt? Salt consists of random bits, creating one of the inputs to a one-way hash function. In a typical usage for password authentication, the salt is stored along with the output of the one-way function, sometimes along with the number of iterations to be used in generating the output (for key stretching). Once salt is mixed into the password, precomputed rainbow tables become useless, because they were built from unsalted inputs.
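
One possible way to store the salt and the iteration count together with the output, sketched in Ruby with PBKDF2 from the standard `openssl` library. PBKDF2, the helper name `stretch_password`, the `iterations$salt$hash` layout and the iteration count of 100,000 are all assumptions made for this example, not something the definition above prescribes:

```ruby
require 'openssl'
require 'securerandom'

# Hypothetical helper: returns "iterations$salt$hash", so everything needed
# to verify a password later is kept in one field.
def stretch_password(password, salt = SecureRandom.hex(16), iterations = 100_000)
  hash = OpenSSL::PKCS5.pbkdf2_hmac(
    password, salt, iterations, 32, OpenSSL::Digest.new('SHA256')
  )
  [iterations, salt, hash.unpack1('H*')].join('$')
end

record = stretch_password('123qwerty')
iterations, salt, _hash = record.split('$')

# Re-running with the stored salt and iteration count reproduces the record.
puts stretch_password('123qwerty', salt, iterations.to_i) == record   # true
```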

How to generate and use salt? The easiest way is to use one global salt. Example:

# only small letters and digits
Password: "123qwerty"
# small and big letters, special chars and digits
Salt: "%^&*(#@$@K:JKBJVCHKB@QRU)+{KMF  er23"
# password+salt
Hash: sha2

As you can see above, adding salt makes the hashed input far longer and more complex than the bare password. However, one global salt has a major disadvantage: if two users have the same password, they will also have the same output hash. So, if we have a lot of users and some of them share a hashed password, a cracker needs to figure out only one hash to gain access to every account with that hash. A cracker who learns the global salt can also generate a rainbow table dedicated to our hash function and salt.
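
The weakness of a single global salt can be seen in a few lines of Ruby. This is only a sketch: SHA-256 stands in for "SHA2", and the user names are made up.

```ruby
require 'digest'

# One global salt shared by every account (single quotes: no interpolation).
global_salt = '%^&*(#@$@K:JKBJVCHKB@QRU)+{KMF  er23'

# Two different users who happen to pick the same password.
alice_hash = Digest::SHA256.hexdigest('123qwerty' + global_salt)
bob_hash   = Digest::SHA256.hexdigest('123qwerty' + global_salt)

puts alice_hash == bob_hash   # true: cracking one hash exposes both accounts
```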

To protect against this we should use a unique salt for each user. How do we generate such a salt? Combine some per-user data with some random data. Example:

# to_s is required: a String cannot be concatenated with a Time or an Integer
salt = user.login + user.created_at.to_s + rand(10**5).to_s + '65241770q_  E9089u(&'
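
A simpler variant, shown here only as a sketch, takes the whole salt from the standard library's `SecureRandom` instead of combining user data with `rand`. That swap is an assumption of this example, not the recipe above. Either way, the point is that equal passwords no longer produce equal hashes:

```ruby
require 'digest'
require 'securerandom'

# One fresh random salt per user: 16 random bytes, hex-encoded (32 characters).
alice_salt = SecureRandom.hex(16)
bob_salt   = SecureRandom.hex(16)

alice_hash = Digest::SHA256.hexdigest('123qwerty' + alice_salt)
bob_hash   = Digest::SHA256.hexdigest('123qwerty' + bob_salt)

puts alice_salt.length        # 32
puts alice_hash == bob_hash   # false: same password, different stored hashes
```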

We store the salt with the password hash. Don't worry, it is safe: since each user has their own unique salt, no general rainbow table covers them all, and an attacker would have to build a separate table for every user. Mix the password with the dynamic and the static salt and you will be well protected. Furthermore, if you mix the salts and the password in a unique way, the cracker will not know how to generate rainbow tables until they steal both the database and the source code. Example:

require 'digest'
hashed_pass = Digest::SHA256.hexdigest(user.login + user.password + salt + static_salt)
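
Putting the pieces together, registration and login might look roughly like the sketch below. The helper names, the hash-based user record and the sample login are hypothetical, and SHA-256 is assumed as the SHA-2 variant:

```ruby
require 'digest'
require 'securerandom'

# Mixes the values in the same order as the formula above.
def hash_password(login, password, salt, static_salt)
  Digest::SHA256.hexdigest(login + password + salt + static_salt)
end

# Login check: recompute the hash from the submitted password and stored salt.
def password_matches?(stored, password, static_salt)
  candidate = hash_password(stored[:login], password, stored[:salt], static_salt)
  candidate == stored[:hashed_pass]
end

# Static salt kept in the application code, not in the database.
static_salt = '65241770q_  E9089u(&'

# Registration: generate a per-user salt and store it next to the hash.
salt = SecureRandom.hex(16)
stored = { login: 'maciej', salt: salt,
           hashed_pass: hash_password('maciej', '123qwerty', salt, static_salt) }

puts password_matches?(stored, '123qwerty', static_salt)   # true
puts password_matches?(stored, 'letmein', static_salt)     # false
```

Only the salt and the resulting hash are stored; the plain password never has to be written anywhere.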

Categories: Security

Tags: cryptography, hash, hashes, md5, safety, salt, sha1, sha2

2 Comments

Piotr
November 25, 2010 — 21:32

1. I don't recommend forcing users to choose a secure password; a strength indicator is enough. All the more so because considerably stronger passwords would fail the proposed test (the regular expression).

In general, data validation should be avoided when it is not essential, e.g. when the data will not be processed automatically.
If, for example, the application needs real email addresses, it is better to try sending an activation link to them than to try verifying their correctness yourself (especially since writing a parser is a non-trivial task). For the user's convenience (in case of a typo) it is enough to check that the address contains an @ sign and at least one dot.

2. I don't recommend getting clever with how the salt is mixed with the password, or with a static salt: security through obscurity is usually a waste of time (if the code leaks together with the data, obfuscating it will only have increased our costs).

The salt has to meet only two conditions: it should be as long as possible and unique for each password, e.g.:
salt = Hash( login creation_time )
(the advantage of this approach is that only the creation time has to be stored additionally, because we already have the login)

3. It is better to create digests by repeating the operation several times:

digest = Hash( salt password )
repeat 5..10 times: digest = Hash( digest password )

4. MD5 is still perfectly suitable for this purpose as long as proper salting is applied (it is not suitable, however, for verifying the authenticity of documents, because of the known algorithms for generating collisions).

Of course, if a function that produces longer digests is available, it should be used.
admin (Post author)
December 27, 2010 — 02:16

1. Basic validation should be in place: with a mere hint (e.g. in JS), someone with JS disabled can bypass it and create a password that is too simple. Sometimes that is not the safest option. As for e-mail, a regexp for basic validation of e-mail syntax is neither big nor heavy, and it avoids sending mail to invalid addresses (especially if many accounts are created by poorly written bots).
2. Let's separate responsibilities. Timestamps are not always needed in the user model. With a salt that is used only for salting, we get a column that serves one clearly defined purpose.
3. Collisions, collisions.
4. Ronald Rivest himself already advises against it, and evidently he has his reasons.