Tag: safety

Why should I always use salt in hashes

Most people have already heard this, but it's worth repeating (via wiki):

A cryptographic hash function is a deterministic procedure that takes an arbitrary block of data and returns a fixed-size bit string, the (cryptographic) hash value, such that an accidental or intentional change to the data will change the hash value. The data to be encoded is often called the "message," and the hash value is sometimes called the message digest or simply digest.

But what does it really mean? It's quite simple - we have an input string (or some other data), we "insert" it into our algorithm, and on output we get a new, fixed-length string: the digest. This operation is one-way, so you cannot reverse it (strictly speaking you can search for an input that produces the same digest, but that is really hard). If you use a SHA-2 hash function, the output looks similar to this:

4e2ecff8f8be5a7d4d8821266d956d844aa5b8eebd5983edbaaa6fa7fc9bc9e21
de42d443f50d8608a79f6507b7e95c6d4a913615c85710f86a40bc23cdc5d5d
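To make the definition concrete, here is a minimal Ruby sketch using the standard library's Digest module. The sample inputs are made up for illustration, and SHA-256 is just one member of the SHA-2 family picked for the demo.

```ruby
require "digest"

# Two made-up inputs that differ by a single character.
first  = Digest::SHA256.hexdigest("my secret password")
second = Digest::SHA256.hexdigest("my secret passwore")

# Deterministic: hashing the same input again gives the same digest.
same_again = Digest::SHA256.hexdigest("my secret password")

puts first
puts second
puts first == same_again   # true
puts first == second       # false
puts first.length          # 64 hex characters (256 bits), whatever the input size
```

A one-character change in the input produces a completely different digest, which is why a stored digest says nothing obvious about the original password.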

When we store users' passwords in our systems (databases, files, etc.), they should be safe. If we get hacked and our database gets stolen, the passwords should still be protected - no one should be able to read them. Most users have one password for all their web activities, so if this password gets stolen, the cracker will probably be able to log in to the victim's Facebook, Twitter and other web accounts.

If we store not the plain password but its hash - even if it gets stolen, the cracker will not be able to use it directly to log in to any account.

When using a cryptographic hash function, we must keep a few rules in mind:

  • MD5 should not be used for critical functions such as hashing passwords
  • Every hash function with a publicly known algorithm can be "broken" using a brute-force attack
  • Brute-force attacks on unsalted hashes can be sped up by using rainbow tables
  • Allowing users to create simple passwords is also not recommended

Keep these in mind and you will avoid the most common mistakes.

First of all, let's select a hash function. MD5 is old (and weak), and SHA-1 also has known vulnerabilities. The SHA-2 family is the most common choice without known practical attacks, so that is what we will use for hashing passwords.

But what about brute-force attacks? Every password should be validated before use: it should not be too short or too simple. We can check this with a regular expression like this one:

^(?=.*\d)(?=.*([a-z]|[A-Z]))([\x20-\x7E]){8,40}$

The regexp above ensures that a password has at least 8 (and at most 40) printable ASCII characters, at least one letter (lowercase or uppercase) and at least one digit. Using this kind of regular expression ensures that no user will have a password like "abc" or anything similar. But still, if we have rainbow tables and a lot of password hashes, we can recover at least some of the passwords. How do we protect ourselves against attacks based on rainbow tables? Use salt.
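Before moving on to salt, here is a rough Ruby sketch of how the validation pattern could be applied. The constant and helper names are hypothetical, and the pattern is slightly adapted for Ruby as described in the comments.

```ruby
# The article's pattern, with two Ruby-specific adjustments (assumptions of
# this sketch): \A and \z anchor the whole string, because ^ and $ match at
# every line break in Ruby, and ([a-z]|[A-Z]) is written as [a-zA-Z].
PASSWORD_FORMAT = /\A(?=.*\d)(?=.*[a-zA-Z])[\x20-\x7E]{8,40}\z/

# Hypothetical helper: true when the candidate satisfies the pattern.
def acceptable_password?(candidate)
  PASSWORD_FORMAT.match?(candidate)
end

puts acceptable_password?("abc")          # false: too short, no digit
puts acceptable_password?("12345678")     # false: no letter
puts acceptable_password?("qwertyuiop")   # false: no digit
puts acceptable_password?("123qwerty")    # true
```

Note that a password such as "123qwerty" passes the check even though it is easy to guess, so a format check like this is only a first line of defence.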

What is salt? A salt consists of random bits, creating one of the inputs to a one-way hash function. In a typical usage for password authentication, the salt is stored along with the output of the one-way function, sometimes along with the number of iterations to be used in generating the output (for key stretching). After mixing a salt into the password, any precomputed rainbow table becomes useless.

How to generate and use salt? The easiest way is to use one global salt. Example:
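The "salt plus number of iterations" idea from the definition can be sketched with PBKDF2 from Ruby's bundled openssl library. This is only an illustration: the iteration count, the key length and the "$"-separated storage format are assumptions of this sketch, not something the article prescribes.

```ruby
require "openssl"
require "securerandom"

password   = "123qwerty"                    # sample password
salt       = SecureRandom.random_bytes(16)  # random bits, generated once per password
iterations = 100_000                        # assumed work factor, tune for your hardware

# PBKDF2 mixes the salt into the password and repeats the hash `iterations` times.
derived = OpenSSL::PKCS5.pbkdf2_hmac(
  password, salt, iterations, 32, OpenSSL::Digest.new("SHA256")
)

# Store everything needed to repeat the computation at login time.
record = [iterations, salt.unpack1("H*"), derived.unpack1("H*")].join("$")
puts record
```

At login the stored iteration count and salt are read back, the same derivation is run on the submitted password, and the result is compared with the stored value.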

# only small letters and digits
Password: "123qwerty"
# small and big letters, special chars and digits
Salt: "%^&*(#@$@K:JKBJVCHKB@QRU)+{KMF  er23"
# password+salt
Hash: sha2

As you can see above - using a salt dramatically increases password strength. But one global salt has one really big disadvantage: if two users have the same password, they will also have the same output hash. So, if we have a lot of users and some of them share a hashed password, a cracker needs to figure out only one hash to get access to the accounts of all the other users with the same hash. A cracker who learns the salt can also generate a rainbow table dedicated to our hash function and salt.

To protect against this we should use a unique salt per user. How to generate such a salt? Combine some per-user data and some random stuff. Example:
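The weakness of a single global salt is easy to demonstrate. The following Ruby sketch reuses the salt and the "password+salt" order from the example; the helper name and the sample users are made up.

```ruby
require "digest"

# Salt from the example above (single quotes so Ruby does not interpolate "#").
GLOBAL_SALT = '%^&*(#@$@K:JKBJVCHKB@QRU)+{KMF  er23'

# Hypothetical helper following the "password+salt" order of the example.
def global_hash(password)
  Digest::SHA256.hexdigest(password + GLOBAL_SALT)
end

alice = global_hash("123qwerty")
bob   = global_hash("123qwerty")
carol = global_hash("another-pass1")

puts alice == bob     # true: same password, same stored hash
puts alice == carol   # false
```

Anyone looking at the stolen table can see immediately that alice and bob share a password, without cracking anything.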

salt = user.login + user.created_at.to_s + rand(10**5).to_s + '65241770q_  E9089u(&'
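One caveat with the line above: rand is not meant for security purposes. A variation, assuming Ruby's standard SecureRandom module for the random part, might look like this. The User struct and the helper name are hypothetical stand-ins for the application's own code.

```ruby
require "securerandom"

# Hypothetical stand-in for the application's user record.
User = Struct.new(:login, :created_at)

STATIC_PART = '65241770q_  E9089u(&' # static string from the example above

# Per-user data plus random bytes from a cryptographically secure source.
def build_salt(user)
  user.login + user.created_at.to_i.to_s + SecureRandom.hex(16) + STATIC_PART
end

user   = User.new("alice", Time.now)
first  = build_salt(user)
second = build_salt(user)

puts first
puts first == second   # false: the random part changes on every call
```

Because the random part changes on every call, the salt has to be generated once (for example at registration) and then stored, not rebuilt at login.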

We store the salt with the password hash. Don't worry - it is safe. Since each user has their own unique salt, no general rainbow table exists. Mix the password with the dynamic and static salts and you will be much safer. Furthermore, when salts and password are mixed in a unique way, the cracker will not know how to generate rainbow tables until he steals both the database and the source code. Example:

hashed_pass = SHA2(user.login+user.password+salt+static_salt)
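Put together, registration and login could look roughly like the sketch below. It follows the concatenation order of the line above and uses SHA-256 as the SHA-2 variant; the Account struct, the helper names and the use of SecureRandom for the per-user salt are assumptions made for illustration.

```ruby
require "digest"
require "securerandom"

STATIC_SALT = '65241770q_  E9089u(&' # static part, kept in the source code

# What would be stored in the database for each user.
Account = Struct.new(:login, :salt, :hashed_pass)

def hash_password(login, password, salt)
  Digest::SHA256.hexdigest(login + password + salt + STATIC_SALT)
end

def register(login, password)
  salt = SecureRandom.hex(16) # per-user salt, stored next to the hash
  Account.new(login, salt, hash_password(login, password, salt))
end

def valid_password?(account, candidate)
  hash_password(account.login, candidate, account.salt) == account.hashed_pass
end

alice = register("alice", "123qwerty")
bob   = register("bob", "123qwerty")

puts valid_password?(alice, "123qwerty")     # true
puts valid_password?(alice, "wrong-guess1")  # false
puts alice.hashed_pass == bob.hashed_pass    # false: same password, different hashes
```

The plain password is never stored; only the salt and the resulting hash are kept, and login simply repeats the computation.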

Basic incoming data verification in PHP

Why validate?

Incoming data validation is a "must have". Always assume that users will send you corrupted (or wrong) data. Sometimes by mistake, sometimes on purpose.

SQL injection and XSS (cross-site scripting) are possible because of missing data verification.
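To see why unchecked input is dangerous, consider this small sketch. It is written in Ruby purely for illustration (the same check could be done in PHP with preg_match), and the login rule, table name and helper name are all hypothetical.

```ruby
# Hypothetical rule: a login is 3 to 20 letters, digits or underscores.
LOGIN_FORMAT = /\A[a-zA-Z0-9_]{3,20}\z/

def valid_login?(input)
  LOGIN_FORMAT.match?(input)
end

malicious = "x' OR '1'='1"

# Naive string concatenation lets the input rewrite the query.
query = "SELECT * FROM users WHERE login = '#{malicious}'"
puts query
# SELECT * FROM users WHERE login = 'x' OR '1'='1'

puts valid_login?(malicious)   # false: rejected before it gets near the database
puts valid_login?("john_doe")  # true
```

The concatenated query now matches every row, because the quote inside the input closed the string early. Checking the input against an expected format catches this particular value before any query is built.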

What should I validate?

E-v-e-r-y-t-h-i-n-g. Every piece of incoming data.

While we can assume that an administrator may be allowed (and sometimes should be) to place HTML or JavaScript in message content, we can't say the same about an ordinary user posting a message in a guestbook.

What can we lose?

Data. Logins, passwords and other stuff. In commercial products that could be a disaster.

Let's mess things up!

I've created a simple test "site" (click), where you can play with content injection. Here you can download the source code.

Inject h1 tag and some content:

</pre>
<h1>Injected content</h1>
<pre>

It's not hard to guess what we will see after sending this message: the H1 tag is inserted into the page content. Not so bad. We can inject HTML, so why not CSS and JavaScript? Let's try:

<pre>
<div style="text-decoration: underline;">
message
</div>
<script>alert('tadam!')</script>
</pre>

It works! Strip_tags alone is not enough. PHP has some other nice functions: addslashes and stripslashes. They add (and remove) backslashes before "dangerous" characters like '"'. This gives some protection against SQL injection, because after escaping even "special" chars are treated as ordinary ones. If we send JavaScript and use strip_tags_addslashes, we will see in the page source that the script tag has been removed and the rest of the input is escaped.

Why do we need stripslashes?

We protect ourselves by adding slashes - but before we show the output to the user, we should remove them.

Buuu - inconvenient!

You think so? Yeah - you're right. It is a little bit inconvenient, but do not worry, there is another option. If you use a MySQL database, you can use mysql_real_escape_string. It escapes the dangerous characters for use inside an SQL statement, and because the database consumes that escaping, the stored data comes back without extra slashes - no stripslashes needed. Note that this function was removed in PHP 7.0; mysqli_real_escape_string is its replacement in the mysqli extension.

But what about HTML?

Hmm, but what about a situation where we would like to show HTML code as text? Using the previous methods we would remove all HTML from the content. There is one function that should help us - htmlspecialchars. It converts "living" HTML into its equivalent (but safe) entity version, so you can display it and it will not be interpreted.
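The entity conversion itself is easy to picture. The sketch below uses Ruby's ERB::Util.html_escape, which performs the same kind of conversion as PHP's htmlspecialchars; it is shown in Ruby only as an illustration, and the sample message is assembled from the payloads used earlier.

```ruby
require "erb"

message = "<h1>Injected content</h1><script>alert('tadam!')</script>"

# ERB::Util.html_escape replaces &, <, >, " and ' with HTML entities.
safe = ERB::Util.html_escape(message)
puts safe
# &lt;h1&gt;Injected content&lt;/h1&gt;&lt;script&gt;alert(&#39;tadam!&#39;)&lt;/script&gt;
```

The browser displays the escaped string as the literal text of the tags and does not render the heading or run the script.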

Conclusion

I've tried to show the "basics of basics" of data validation. Read, test, read, test and never feel safe :)

Copyright © 2024 Closer to Code
