dbTalk Databases Forums  

Re: Escaping the results of a function on insert?

comp.databases.mysql comp.databases.mysql


Discuss Re: Escaping the results of a function on insert? in the comp.databases.mysql forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Erick T. Barkhuis
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-09-2012 , 11:28 AM






Jonas Smith:

Quote:
Hello

I inherited a user database held passwords in clear text.

I'm trying to harden the system so I want to store hashes of
passwords instead....
Any pointers? Do I really have to do it with PHP?
You may want to use MySql's PASSWORD()-function


--
Erick

Reply With Quote
  #2  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-09-2012 , 01:56 PM






On 7/9/2012 11:45 AM, Jonas Smith wrote:
Quote:
Hello

I inherited a user database held passwords in clear text.

I'm trying to harden the system so I want to store hashes of passwords
instead. The user table has few million rows and running a PHP script to
do the hashing is the last option.

I have this update statement:

UPDATE `users` SET `hashedPass` = UNHEX(SHA2(CONCAT(`oldPassword`,
'somesalt'), 256))

I want to store the hashes in a binary column in order to save space
(the table could reach few tens of millions of rows).

However, the above query is not generating the desired results. After
running the query, the hasedPass column contains seemingly random
characters - usually few odd characters and sometime completely empty.

I think I need to escape the results of the UNHEX function. QUOTE()
didn't seem to work.

Doing it with php works.

$hashed = hash("SHA256", $row['oldPassword']."somesalt", TRUE);

$query = "UPDATE `users` SET `hashedPass` = '" . escapeStr($hashed) . "'
WHERE `user_id` = {$row['user_id']}";

escapeStr uses mysql_real_escape_string() to return a properly escaped
string.

and doing a SELECT UNHEX(SHA2(CONCAT(`oldPassword`, 'somesalt'), 256))

returns the same string as the $hashed contents.

Only on UPDATE it fails.

Any pointers? Do I really have to do it with PHP?

TIA

Jonas,

I wouldn't worry about UNHEX'ing the SHA2 string. You're trading disk
space for the additional CPU cycles required to add an additional
conversion on every access to the password.

Disk space is much cheaper than CPU cycles, even for a few tens of
millions of rows.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #3  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-09-2012 , 07:46 PM



On 7/9/2012 5:52 PM, Jonas Smith wrote:
Quote:
On Mon, 9 Jul 2012 14:56:52 -0400, Jerry Stuckle wrote
(in article <jtf9ho$okp$1 (AT) dont-email (DOT) me>):

I wouldn't worry about UNHEX'ing the SHA2 string. You're trading disk
space for the additional CPU cycles required to add an additional
conversion on every access to the password.

Disk space is much cheaper than CPU cycles, even for a few tens of
millions of rows.

When accessing the records it will be with PHP hashing. the hash
function is faster when returning binaries as it doesn't have to
re-encode them.

So the benefit of unhexing is twofold.

Incorrect. Whether PHP or MySQL does the hashing, it will take CPU
cycles. And the hash function will be SLOWER because you have to
convert the SHA2 to binary before returning the value.

The difference between MySQL (or PHP) returning the SHA2 value vs. the
binary (UNHEX'd) value is so small as to be immaterial.

You are only hurting yourself.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #4  
Old   
Denis McMahon
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-10-2012 , 08:01 PM



On Mon, 09 Jul 2012 11:45:34 -0400, Jonas Smith wrote:

Quote:
I have this update statement:

UPDATE `users` SET `hashedPass` = UNHEX(SHA2(CONCAT(`oldPassword`,
'somesalt'), 256))

I want to store the hashes in a binary column in order to save space
(the table could reach few tens of millions of rows).

However, the above query is not generating the desired results. After
running the query, the hasedPass column contains seemingly random
characters - usually few odd characters and sometime completely empty.
When I have this sort of issue, the first thing I try and do is reduce
the complexity of the calculations.

So, create a test table with maybe 10 entries. Give it columns oldpw
varchar 99, hash1 char 64, hash2 binary 32

Populate oldpw with some values, then run the following queries:

update `test` set `hash1` = sha2( concat( `oldpw`, 'salt' ), 256);
update `test` set `hash2` = unhex( `hash1` );

The problem you now have is that when you look at the data in hash2
directly, it will contain non printing characters in just about any
character set that you use, so unless you have a means of displaying it
in eg hex, you won't see it in a meaningful manner:

select hex( `hash2` ) from test;

In other words, the problem may not be the data in "hashedpass" after the
update runs, but rather the limitations of the mechanisms you are
subsequently using to inspect it!

Rgds

Denis McMahon

Reply With Quote
  #5  
Old   
Denis McMahon
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-11-2012 , 08:45 AM



On Mon, 09 Jul 2012 11:45:34 -0400, Jonas Smith wrote:

Quote:
The user table has few million rows and running a PHP script to
do the hashing is the last option.

I want to store the hashes in a binary column in order to save space
(the table could reach few tens of millions of rows).
Following up from Jerry's comments earlier about trading cpu cycles for
space. The following contains some very "back of the envelope" math and
some blatant assumptions:

You're suggesting a few tens of millions of users, there are 86400
seconds in a day, every ten million users logging in daily represents 115
user logins per second. Possibly 3 or 4 times that many. And quite
possibly not at a steady rate, but with some very busy periods and some
very quiet ones. In my experience peak transaction rates can be 2 orders
of magnitude higher than the calculated average. That suggests to me a
potential 40,000 logins in a peak traffic second.

I don't know what else the processor is going to be doing, but an extra
40,000 "hex(`hashedPass`)" conversions per second could have a
significant effect on system performance at peak load times.

Rgds

Denis McMahon

Reply With Quote
  #6  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-11-2012 , 03:11 PM



On 7/11/2012 3:17 PM, Jonas Smith wrote:
Quote:
On Tue, 10 Jul 2012 21:01:45 -0400, Denis McMahon wrote
(in article <jtij9p$9fn$1 (AT) dont-email (DOT) me>):

In other words, the problem may not be the data in "hashedpass" after the
update runs, but rather the limitations of the mechanisms you are
subsequently using to inspect it!

I know how to compare binary data.

Anyway, I've solved the problem.

The source of the problem turned out to be the character set used to
store the binary data. It was set to utf8_bin and that somehow screwed
things up. By changing it to BINARY ascii, the results are now
consistent.

The confusing thing was that when inserting an escaped binary hash
(generated by PHP) into the field it worked flawlessly. But when letting
Mysql generate the hash, it broke. So it's not utf8_bin's unsuitability
for the binary data that was the problem. Something is wrong with
mysql's handling of this issue.

For everybody else chiming in:

Binaries are always, always faster than hex. The only time that binary
data is not good is when a human is trying to read it visually. In all
other cases, binary data is better and faster (computers calculate in
binary, right?).

Remember, each time that the system logs in a user, the password data
needs to be compared. Which is faster? comparing two 64 char strings or
two 32 char strings? Repeat a minimum of a million times per day.

Try the hash() function in PHP. When it returns binary data it's faster.
Not by much (2%), but faster nontheless.

Which is faster:

1:
store long hex password hash
get password entry
generate long hex hash from entry
compare two long hex hashes

or
2:
store short binary password hash
get password entry
generate short binary hash from entry
compare two short binary hashes

?

All of you were focused on the 'UNHEX()' function in mysql because the
SHA2() function returns hex data only.

So to you it's an additional function. But, in reality, every hash
function produces a binary hash, and for some, then it hexes it. In PHP,
you can make the hash() function skip the hexing of the binary string
before it returns it, hence saving time.

Every comparison to be made later, whether I do it with PHP or with
Mysql will be faster if I'm comparing short binary strings than long hex
ones.

The only time that Mysql would be doing the hashing (and wasting time
hexing and then unhexing the binary hash) is on the initial transfer of
passwords from plain text to hashes. After that it's PHP all the way.

To show how schizophrenic mysql is about these things, the SHA2()
function returns hex in lower case ascii and the HEX() function returns
upper case ascii hex.


Comparing two 32 bit hashes are faster, obviously.

However, it takes longer for PHP to unhex the SHA2 hash than it takes to
compare 64 (or even 256) characters.

This is the point you are ignoring - and it is a very pertinent point.

And of course, SHA2() is quite free to return lower case ASCII and HEX()
is quite free to return uppercase ASCII. They are, after all, two
entirely different functions - with two entirely different purposes.

But then you're already the expert so why are you asking us for help?

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #7  
Old   
Denis McMahon
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-11-2012 , 04:06 PM



On Wed, 11 Jul 2012 15:17:17 -0400, Jonas Smith wrote:

Quote:
All of you were focused on the 'UNHEX()' function in mysql because the
SHA2() function returns hex data only.
Sorry, I focused on the unhex because you introduced it as an additional
step, and it creates a further additional step later.

If you store the sha2, when you compare the password given by someone
logging in, you only have to create the sha2 of the entered pw and
compare it to the stored hash.

If you store the unhexed value, you need to do one of two extra things
when people log in:

a) unhex the sha2'd supplied p/w for the login and do a 32 byte binary
compare with the stored unhexed sha2 of the original p/w; or

b) hex the stored unhexed sha2 and do a 64 byte string compare with the
sha2'd supplied p/w for the login.

Both of these use CPU cycles. You've given numbers that, as you describe
it, could easily lead to peaks of the order of 40,000 login transactions
per second.

At 40,000 login transactions per second, the extra cpu cycles involved in
handling an additional hex or unhex will be noticeably more expensive
than the extra time cost of fetching a 64 instead of a 32 byte value from
the disc, and disc space is a lot cheaper than processing effort.

In terms of fetching data from the disc, 32 and 64 bytes are *both* much
smaller than the minimum sector size on an hdd these days, so at the
hardware io level there's minimal extra overhead! A whole sector (at
least) will be read into memory anyway.

Rgds

Denis McMahon

Reply With Quote
  #8  
Old   
Jerry Stuckle
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-11-2012 , 11:04 PM



On 7/11/2012 8:38 PM, Jonas Smith wrote:
Quote:
On Wed, 11 Jul 2012 16:11:38 -0400, Jerry Stuckle wrote
(in article <jtkmlu$ahg$1 (AT) dont-email (DOT) me>):

However, it takes longer for PHP to unhex the SHA2 hash than it takes to
compare 64 (or even 256) characters.

Why would I have to unhex the sha2 hash in PHP? in PHP, you can give it
the 'TRUE' option and it would return a binary hash instead of a hexed
hash (obviates the need to unhex). When you use the binary return option
in the hash() function, in PHP, it is faster.

Because that's how you said you are storing it in your database. If you
want to compare the SHA2 string from your database (which is UNHEXed)
with one from PHP, you either need to UNHEX the PHP string or HEX the
database string. Both require numerous CPU cycles.

Quote:
This is the point you are ignoring - and it is a very pertinent point.

You don't seem to have read my post closely enough. I had said that the
hash() function in PHP is faster when returning a binary string. So the
point is simply wrong.

I read your post quite carefully. And it shows you have learned a
little - but have understood less.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex (AT) attglobal (DOT) net
==================

Reply With Quote
  #9  
Old   
Axel Schwenke
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-12-2012 , 04:23 AM



Jonas Smith <jonas.smith (AT) yahoo (DOT) com> wrote:

Quote:
Which is faster:

1:
store long hex password hash
get password entry
generate long hex hash from entry
compare two long hex hashes

or
2:
store short binary password hash
get password entry
generate short binary hash from entry
compare two short binary hashes
Of course 2. is faster because it

a) transfers fewer data
b) omits the "convert hash to hex string" step, and
c) compares fewer data

The questions is however: how much is it faster?
And the answer is: probably only insignificantly faster.

The reason is the hash function itself. For password hashing,
you need a cryptographically "hard" hash function. Those are
typically much slower than simple hash functions like those
used for hash tables in PHP or elsewhere.

But even for cryptographic hashes there is variance in speed.
I.e. MD4, MD5, SHA1, SHA2, SHA256 become computationally more
expensive from left to right.

Now for password hashing one *wants* to use a "slow" hash
function. The reason is, that a slow hash function raises the
cost for a brute force attack.

This is however a tradeoff, because the cost for legal login
attempts is also raised. So it depends on the estimated number
of legal password checks, how expensive you can make it.

As an example: LUKS (the disk excryption container for Linux)
does some 100000 hash rounds to transform the user key before
it checks if the key opens a slot. This takes ~1 second even
on potent hardware. But it is expected to be run at most once
a day, so it does not matter.

A MySQL server OTOH may have to check a few hundert logins
each second. Thus it uses a much leaner approach. And it
cheats by preloading auth tables into memory hash tables.


XL

Reply With Quote
  #10  
Old   
Denis McMahon
 
Posts: n/a

Default Re: Escaping the results of a function on insert? - 07-12-2012 , 11:48 AM



On Thu, 12 Jul 2012 10:49:39 -0400, Jerry Stuckle wrote:

Quote:
He is generating an SHA2 hash, then unhexing it before storing it in his
database. When a user signs in, he generates another SHA2 string.
He's not generating a *hex* SHA2 string when the user logs in. He's
passing the optional "raw output" flag to the hash function to return the
binary string instead.

http://us3.php.net/manual/en/function.hash.php

string hash ( string $algo , string $data [, bool $raw_output = false ] )

His assumption is that hash creates the binary string faster than it
creates the hex string.

My tests suggest his assumption may be wrong.

Quote:
But that is a hex string
No, it's a binary string, because he passes the "raw output" flag to the
hash function.

Quote:
I'd like to see the code you used to generate your test.
I included the php in my post, the input data files are a list of 100
words of between 3 and 12 letters (character names from a mud I play),
and a salt that is generated by copying 4k of /dev/random to a file and
then uuencoding it.

Rgds

Denis McMahon

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2013, Jelsoft Enterprises Ltd.