dbTalk Databases Forums  

Good checksum component, anyone?

microsoft.public.sqlserver.dts microsoft.public.sqlserver.dts


Discuss Good checksum component, anyone? in the microsoft.public.sqlserver.dts forum.



Reply
 
Thread Tools Display Modes
  #31  
Old   
Gorm Braarvig
 
Posts: n/a

Default Re: Good checksum component, anyone? - 12-25-2008 , 05:15 PM






Greetings,

thanks for the swift reply and the links, but I am not closer to a solution,

what konesans check sum component does is: you can put a component in your
data flow and map up columns, etc. and get out a nice checksum you can add
to you data, later you compare the checksum to your previous checksum to
evaluate if any of the fields you care about has changed. Good idea.

however...

- the component, while having the right interface has an algorithm that
gives
the same checksum to different data, because it (if I guess correctly) uses
crc-32.

- md5 is overkill for my use, so I don't want to store a binary field or two
bigint's.
I want to use an algorithm that gives me a 64-bit checksum and I thought
using md5 and then combining it to half the size was a nice strategy (no???)

- If I was to implement this myself, (which I am capable of, but can't do,
since the structure of the projects and politics keeps me away from it), I
would simply use .NET crypto routines in a custom component. If this was
infeasable, I would attempt to use Microsoft Base Cryptographic Provider,
which I have had exposure
to in a previous gig...

my task is unsolved: a delta detection mechanism that works "good enough"
(2^32 is not, 2^64 is) while not taking up unnecessary space.

I wonder: has anyone else solved this in the data flow (in an easy way
without using a script component)?

thanks,
Gorm


"Russell Fields" <russellfields (AT) nomail (DOT) com> wrote

Quote:
Gorm,

Sorry, I don't know what to say. It strikes me as odd to feel it
necessary to use MD5, then to XOR the first 8 bytes on the second 8 bytes.
I would think that that certainly has to do something negative to the
hash.

Maybe one of these will be useful to you. Google: MD5 Component

http://xstandard.com/en/documentation/xmd5/
http://www.download.com/MD5-Com-Comp...-10449255.html
http://www.traction-software.co.uk/devmd5/index.html

RLF

"Gorm Braarvig" <gorm_b (AT) hotmail (DOT) com> wrote in message
news:%236jgVdrZJHA.5272 (AT) TK2MSFTNGP04 (DOT) phx.gbl...
Hi Russel!

Can't use it, need to do it in the flow. I was thinking of merging the 16
bytes into 8 by xoring the two Int64 together, would be sufficient for
most (all?) applications.

Thanks,
Gorm

"Russell Fields" <russellfields (AT) nomail (DOT) com> wrote in message
news:#7P7HCIZJHA.5312 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Gorm,

Starting with SQL Server 2005 there is a HASHBYTES keyword that allows
you to choose from a set of methods and returns VARBINARY. It can be
cast into BIGINT, but it is not reversable.

The MD5 method returns 16 bytes, but BIGINT is only 8 bytes.

See the output of the following:

SELECT HASHBYTES('MD5','Gorm Braarvig') AS BINMD5,
CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT) AS BIGINTMD5,
CAST (CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT)
AS VARBINARY(8000)) AS BINAGAINMDF

Results:
0x647679B1F3F515655EC3B478951307CF 6828499890013931471
0x5EC3B478951307CF

Does that help?

RLF

"Gorm Braarvig" <gorm_b (AT) hotmail (DOT) com> wrote in message
news:C9564050-4CC3-45FC-B4C0-BBD263E9D924 (AT) microsoft (DOT) com...
Hi!

I currently use conesans checksum component, but I need a md5-checksum
pushed into a bigint (I believe conesans uses crc-32).
Has anyone solved this?

regards,
Gorm Braarvig



Reply With Quote
  #32  
Old   
Gorm Braarvig
 
Posts: n/a

Default Re: Good checksum component, anyone? - 12-25-2008 , 05:15 PM






Greetings,

thanks for the swift reply and the links, but I am not closer to a solution,

what konesans check sum component does is: you can put a component in your
data flow and map up columns, etc. and get out a nice checksum you can add
to you data, later you compare the checksum to your previous checksum to
evaluate if any of the fields you care about has changed. Good idea.

however...

- the component, while having the right interface has an algorithm that
gives
the same checksum to different data, because it (if I guess correctly) uses
crc-32.

- md5 is overkill for my use, so I don't want to store a binary field or two
bigint's.
I want to use an algorithm that gives me a 64-bit checksum and I thought
using md5 and then combining it to half the size was a nice strategy (no???)

- If I was to implement this myself, (which I am capable of, but can't do,
since the structure of the projects and politics keeps me away from it), I
would simply use .NET crypto routines in a custom component. If this was
infeasable, I would attempt to use Microsoft Base Cryptographic Provider,
which I have had exposure
to in a previous gig...

my task is unsolved: a delta detection mechanism that works "good enough"
(2^32 is not, 2^64 is) while not taking up unnecessary space.

I wonder: has anyone else solved this in the data flow (in an easy way
without using a script component)?

thanks,
Gorm


"Russell Fields" <russellfields (AT) nomail (DOT) com> wrote

Quote:
Gorm,

Sorry, I don't know what to say. It strikes me as odd to feel it
necessary to use MD5, then to XOR the first 8 bytes on the second 8 bytes.
I would think that that certainly has to do something negative to the
hash.

Maybe one of these will be useful to you. Google: MD5 Component

http://xstandard.com/en/documentation/xmd5/
http://www.download.com/MD5-Com-Comp...-10449255.html
http://www.traction-software.co.uk/devmd5/index.html

RLF

"Gorm Braarvig" <gorm_b (AT) hotmail (DOT) com> wrote in message
news:%236jgVdrZJHA.5272 (AT) TK2MSFTNGP04 (DOT) phx.gbl...
Hi Russel!

Can't use it, need to do it in the flow. I was thinking of merging the 16
bytes into 8 by xoring the two Int64 together, would be sufficient for
most (all?) applications.

Thanks,
Gorm

"Russell Fields" <russellfields (AT) nomail (DOT) com> wrote in message
news:#7P7HCIZJHA.5312 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Gorm,

Starting with SQL Server 2005 there is a HASHBYTES keyword that allows
you to choose from a set of methods and returns VARBINARY. It can be
cast into BIGINT, but it is not reversable.

The MD5 method returns 16 bytes, but BIGINT is only 8 bytes.

See the output of the following:

SELECT HASHBYTES('MD5','Gorm Braarvig') AS BINMD5,
CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT) AS BIGINTMD5,
CAST (CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT)
AS VARBINARY(8000)) AS BINAGAINMDF

Results:
0x647679B1F3F515655EC3B478951307CF 6828499890013931471
0x5EC3B478951307CF

Does that help?

RLF

"Gorm Braarvig" <gorm_b (AT) hotmail (DOT) com> wrote in message
news:C9564050-4CC3-45FC-B4C0-BBD263E9D924 (AT) microsoft (DOT) com...
Hi!

I currently use conesans checksum component, but I need a md5-checksum
pushed into a bigint (I believe conesans uses crc-32).
Has anyone solved this?

regards,
Gorm Braarvig



Reply With Quote
  #33  
Old   
Gorm Braarvig
 
Posts: n/a

Default Re: Good checksum component, anyone? - 12-25-2008 , 05:15 PM



Greetings,

thanks for the swift reply and the links, but I am not closer to a solution,

what konesans check sum component does is: you can put a component in your
data flow and map up columns, etc. and get out a nice checksum you can add
to you data, later you compare the checksum to your previous checksum to
evaluate if any of the fields you care about has changed. Good idea.

however...

- the component, while having the right interface has an algorithm that
gives
the same checksum to different data, because it (if I guess correctly) uses
crc-32.

- md5 is overkill for my use, so I don't want to store a binary field or two
bigint's.
I want to use an algorithm that gives me a 64-bit checksum and I thought
using md5 and then combining it to half the size was a nice strategy (no???)

- If I was to implement this myself, (which I am capable of, but can't do,
since the structure of the projects and politics keeps me away from it), I
would simply use .NET crypto routines in a custom component. If this was
infeasable, I would attempt to use Microsoft Base Cryptographic Provider,
which I have had exposure
to in a previous gig...

my task is unsolved: a delta detection mechanism that works "good enough"
(2^32 is not, 2^64 is) while not taking up unnecessary space.

I wonder: has anyone else solved this in the data flow (in an easy way
without using a script component)?

thanks,
Gorm


"Russell Fields" <russellfields (AT) nomail (DOT) com> wrote

Quote:
Gorm,

Sorry, I don't know what to say. It strikes me as odd to feel it
necessary to use MD5, then to XOR the first 8 bytes on the second 8 bytes.
I would think that that certainly has to do something negative to the
hash.

Maybe one of these will be useful to you. Google: MD5 Component

http://xstandard.com/en/documentation/xmd5/
http://www.download.com/MD5-Com-Comp...-10449255.html
http://www.traction-software.co.uk/devmd5/index.html

RLF

"Gorm Braarvig" <gorm_b (AT) hotmail (DOT) com> wrote in message
news:%236jgVdrZJHA.5272 (AT) TK2MSFTNGP04 (DOT) phx.gbl...
Hi Russel!

Can't use it, need to do it in the flow. I was thinking of merging the 16
bytes into 8 by xoring the two Int64 together, would be sufficient for
most (all?) applications.

Thanks,
Gorm

"Russell Fields" <russellfields (AT) nomail (DOT) com> wrote in message
news:#7P7HCIZJHA.5312 (AT) TK2MSFTNGP02 (DOT) phx.gbl...
Gorm,

Starting with SQL Server 2005 there is a HASHBYTES keyword that allows
you to choose from a set of methods and returns VARBINARY. It can be
cast into BIGINT, but it is not reversable.

The MD5 method returns 16 bytes, but BIGINT is only 8 bytes.

See the output of the following:

SELECT HASHBYTES('MD5','Gorm Braarvig') AS BINMD5,
CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT) AS BIGINTMD5,
CAST (CAST (HASHBYTES('MD5','Gorm Braarvig') AS BIGINT)
AS VARBINARY(8000)) AS BINAGAINMDF

Results:
0x647679B1F3F515655EC3B478951307CF 6828499890013931471
0x5EC3B478951307CF

Does that help?

RLF

"Gorm Braarvig" <gorm_b (AT) hotmail (DOT) com> wrote in message
news:C9564050-4CC3-45FC-B4C0-BBD263E9D924 (AT) microsoft (DOT) com...
Hi!

I currently use conesans checksum component, but I need a md5-checksum
pushed into a bigint (I believe conesans uses crc-32).
Has anyone solved this?

regards,
Gorm Braarvig



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.