dbTalk Databases Forums  

comparison tool

microsoft.public.sqlserver.tools microsoft.public.sqlserver.tools


Discuss comparison tool in the microsoft.public.sqlserver.tools forum.



Reply
 
Thread Tools Display Modes
  #31  
Old   
Dariusz Dziewialtowski
 
Posts: n/a

Default Re: comparison tool - 09-17-2008 , 08:53 PM






Quote:
Can you explain "row-by-rowcomparison"? *Do you mean you inspect the data
in both tables and match rows manually? *Or do you pick columns to "pretend"
they are the key? *Otherwise, without a key (whether PK or UQ), how could
the software even attempt to guess at matching rows on its own?
First of all - apologies for the long post; feel free to skip it - as
a short answer let me say that when compared tables have no key or
indexes (or: one of the tables has keys, and another - not), the user
can additionally specify "sorting" columns for the purpose of data
comparison. All options in SCT (SQL Server Comparison Tool) were
added to handle particular real life situations, and over the years
SCT "accumulated" such options like: "Columns to be ignored during
comparison" or "Comparison offset".

I'll try to explain this I short as possible, but at the beginning let
me say that over the years I've seen lot of databases and what's
always astonishing for me is how un-structured the data in many cases
is. There is surprising (for me at least) amount of tables with no
primary keys and no indexes whatsoever. I've seen tables with close
to hundred columns, no PK nor indexes, and *tons* of duplicate rows.
Everything has a reason, so I assume that business requirements
dictate such design.

It is not uncommon that people just dump the data out of the tables
into text files, remove duplicates using some kind of home-brew
programs, and compare data 'by hand'.

Over the years, I had to adjust SCT to handle user's data; lot of data
required extra fine-tuning before actual comparison.

While in "row-by-row comparison" mode, program "pulls" the data out of
compared tables in their "original" (or "neutral") sequence, using
equivalent of "SELECT * " statement. The user has additional option
of sorting data before comparing, where the user can specify sorting
columns (different sorting can be set up on source table and different
- on target table). This covers situations where for example the
source table has some indexes set, but target - none. This also
allows for comparing tables with different names of the columns.
Users have also options to ignore some columns, which allows for
comparing data in tables with different number of columns.

I had even users comparing data in numeric columns to data in text
columns.

All in all, it seems that current set of options for data comparison
covers pretty much all of "real life" data comparisons performed by
current users of SCT - for long time I didn't have any requests for
additional options.

Having said that, I have to admit that I never checked how data
comparison is handled by other programs. It is quite possible that
all the options could be implemented in simpler, more efficient way.

Uff - I promise in future not to post so long responses.


Reply With Quote
  #32  
Old   
Dariusz Dziewialtowski
 
Posts: n/a

Default Re: comparison tool - 09-17-2008 , 08:53 PM






Quote:
Can you explain "row-by-rowcomparison"? *Do you mean you inspect the data
in both tables and match rows manually? *Or do you pick columns to "pretend"
they are the key? *Otherwise, without a key (whether PK or UQ), how could
the software even attempt to guess at matching rows on its own?
First of all - apologies for the long post; feel free to skip it - as
a short answer let me say that when compared tables have no key or
indexes (or: one of the tables has keys, and another - not), the user
can additionally specify "sorting" columns for the purpose of data
comparison. All options in SCT (SQL Server Comparison Tool) were
added to handle particular real life situations, and over the years
SCT "accumulated" such options like: "Columns to be ignored during
comparison" or "Comparison offset".

I'll try to explain this I short as possible, but at the beginning let
me say that over the years I've seen lot of databases and what's
always astonishing for me is how un-structured the data in many cases
is. There is surprising (for me at least) amount of tables with no
primary keys and no indexes whatsoever. I've seen tables with close
to hundred columns, no PK nor indexes, and *tons* of duplicate rows.
Everything has a reason, so I assume that business requirements
dictate such design.

It is not uncommon that people just dump the data out of the tables
into text files, remove duplicates using some kind of home-brew
programs, and compare data 'by hand'.

Over the years, I had to adjust SCT to handle user's data; lot of data
required extra fine-tuning before actual comparison.

While in "row-by-row comparison" mode, program "pulls" the data out of
compared tables in their "original" (or "neutral") sequence, using
equivalent of "SELECT * " statement. The user has additional option
of sorting data before comparing, where the user can specify sorting
columns (different sorting can be set up on source table and different
- on target table). This covers situations where for example the
source table has some indexes set, but target - none. This also
allows for comparing tables with different names of the columns.
Users have also options to ignore some columns, which allows for
comparing data in tables with different number of columns.

I had even users comparing data in numeric columns to data in text
columns.

All in all, it seems that current set of options for data comparison
covers pretty much all of "real life" data comparisons performed by
current users of SCT - for long time I didn't have any requests for
additional options.

Having said that, I have to admit that I never checked how data
comparison is handled by other programs. It is quite possible that
all the options could be implemented in simpler, more efficient way.

Uff - I promise in future not to post so long responses.


Reply With Quote
  #33  
Old   
Dariusz Dziewialtowski
 
Posts: n/a

Default Re: comparison tool - 09-17-2008 , 08:53 PM



Quote:
Can you explain "row-by-rowcomparison"? *Do you mean you inspect the data
in both tables and match rows manually? *Or do you pick columns to "pretend"
they are the key? *Otherwise, without a key (whether PK or UQ), how could
the software even attempt to guess at matching rows on its own?
First of all - apologies for the long post; feel free to skip it - as
a short answer let me say that when compared tables have no key or
indexes (or: one of the tables has keys, and another - not), the user
can additionally specify "sorting" columns for the purpose of data
comparison. All options in SCT (SQL Server Comparison Tool) were
added to handle particular real life situations, and over the years
SCT "accumulated" such options like: "Columns to be ignored during
comparison" or "Comparison offset".

I'll try to explain this I short as possible, but at the beginning let
me say that over the years I've seen lot of databases and what's
always astonishing for me is how un-structured the data in many cases
is. There is surprising (for me at least) amount of tables with no
primary keys and no indexes whatsoever. I've seen tables with close
to hundred columns, no PK nor indexes, and *tons* of duplicate rows.
Everything has a reason, so I assume that business requirements
dictate such design.

It is not uncommon that people just dump the data out of the tables
into text files, remove duplicates using some kind of home-brew
programs, and compare data 'by hand'.

Over the years, I had to adjust SCT to handle user's data; lot of data
required extra fine-tuning before actual comparison.

While in "row-by-row comparison" mode, program "pulls" the data out of
compared tables in their "original" (or "neutral") sequence, using
equivalent of "SELECT * " statement. The user has additional option
of sorting data before comparing, where the user can specify sorting
columns (different sorting can be set up on source table and different
- on target table). This covers situations where for example the
source table has some indexes set, but target - none. This also
allows for comparing tables with different names of the columns.
Users have also options to ignore some columns, which allows for
comparing data in tables with different number of columns.

I had even users comparing data in numeric columns to data in text
columns.

All in all, it seems that current set of options for data comparison
covers pretty much all of "real life" data comparisons performed by
current users of SCT - for long time I didn't have any requests for
additional options.

Having said that, I have to admit that I never checked how data
comparison is handled by other programs. It is quite possible that
all the options could be implemented in simpler, more efficient way.

Uff - I promise in future not to post so long responses.


Reply With Quote
  #34  
Old   
Dariusz Dziewialtowski
 
Posts: n/a

Default Re: comparison tool - 09-17-2008 , 08:53 PM



Quote:
Can you explain "row-by-rowcomparison"? *Do you mean you inspect the data
in both tables and match rows manually? *Or do you pick columns to "pretend"
they are the key? *Otherwise, without a key (whether PK or UQ), how could
the software even attempt to guess at matching rows on its own?
First of all - apologies for the long post; feel free to skip it - as
a short answer let me say that when compared tables have no key or
indexes (or: one of the tables has keys, and another - not), the user
can additionally specify "sorting" columns for the purpose of data
comparison. All options in SCT (SQL Server Comparison Tool) were
added to handle particular real life situations, and over the years
SCT "accumulated" such options like: "Columns to be ignored during
comparison" or "Comparison offset".

I'll try to explain this I short as possible, but at the beginning let
me say that over the years I've seen lot of databases and what's
always astonishing for me is how un-structured the data in many cases
is. There is surprising (for me at least) amount of tables with no
primary keys and no indexes whatsoever. I've seen tables with close
to hundred columns, no PK nor indexes, and *tons* of duplicate rows.
Everything has a reason, so I assume that business requirements
dictate such design.

It is not uncommon that people just dump the data out of the tables
into text files, remove duplicates using some kind of home-brew
programs, and compare data 'by hand'.

Over the years, I had to adjust SCT to handle user's data; lot of data
required extra fine-tuning before actual comparison.

While in "row-by-row comparison" mode, program "pulls" the data out of
compared tables in their "original" (or "neutral") sequence, using
equivalent of "SELECT * " statement. The user has additional option
of sorting data before comparing, where the user can specify sorting
columns (different sorting can be set up on source table and different
- on target table). This covers situations where for example the
source table has some indexes set, but target - none. This also
allows for comparing tables with different names of the columns.
Users have also options to ignore some columns, which allows for
comparing data in tables with different number of columns.

I had even users comparing data in numeric columns to data in text
columns.

All in all, it seems that current set of options for data comparison
covers pretty much all of "real life" data comparisons performed by
current users of SCT - for long time I didn't have any requests for
additional options.

Having said that, I have to admit that I never checked how data
comparison is handled by other programs. It is quite possible that
all the options could be implemented in simpler, more efficient way.

Uff - I promise in future not to post so long responses.


Reply With Quote
  #35  
Old   
Dariusz Dziewialtowski
 
Posts: n/a

Default Re: comparison tool - 09-17-2008 , 08:53 PM



Quote:
Can you explain "row-by-rowcomparison"? *Do you mean you inspect the data
in both tables and match rows manually? *Or do you pick columns to "pretend"
they are the key? *Otherwise, without a key (whether PK or UQ), how could
the software even attempt to guess at matching rows on its own?
First of all - apologies for the long post; feel free to skip it - as
a short answer let me say that when compared tables have no key or
indexes (or: one of the tables has keys, and another - not), the user
can additionally specify "sorting" columns for the purpose of data
comparison. All options in SCT (SQL Server Comparison Tool) were
added to handle particular real life situations, and over the years
SCT "accumulated" such options like: "Columns to be ignored during
comparison" or "Comparison offset".

I'll try to explain this I short as possible, but at the beginning let
me say that over the years I've seen lot of databases and what's
always astonishing for me is how un-structured the data in many cases
is. There is surprising (for me at least) amount of tables with no
primary keys and no indexes whatsoever. I've seen tables with close
to hundred columns, no PK nor indexes, and *tons* of duplicate rows.
Everything has a reason, so I assume that business requirements
dictate such design.

It is not uncommon that people just dump the data out of the tables
into text files, remove duplicates using some kind of home-brew
programs, and compare data 'by hand'.

Over the years, I had to adjust SCT to handle user's data; lot of data
required extra fine-tuning before actual comparison.

While in "row-by-row comparison" mode, program "pulls" the data out of
compared tables in their "original" (or "neutral") sequence, using
equivalent of "SELECT * " statement. The user has additional option
of sorting data before comparing, where the user can specify sorting
columns (different sorting can be set up on source table and different
- on target table). This covers situations where for example the
source table has some indexes set, but target - none. This also
allows for comparing tables with different names of the columns.
Users have also options to ignore some columns, which allows for
comparing data in tables with different number of columns.

I had even users comparing data in numeric columns to data in text
columns.

All in all, it seems that current set of options for data comparison
covers pretty much all of "real life" data comparisons performed by
current users of SCT - for long time I didn't have any requests for
additional options.

Having said that, I have to admit that I never checked how data
comparison is handled by other programs. It is quite possible that
all the options could be implemented in simpler, more efficient way.

Uff - I promise in future not to post so long responses.


Reply With Quote
  #36  
Old   
Dariusz Dziewialtowski
 
Posts: n/a

Default Re: comparison tool - 09-17-2008 , 08:53 PM



Quote:
Can you explain "row-by-rowcomparison"? *Do you mean you inspect the data
in both tables and match rows manually? *Or do you pick columns to "pretend"
they are the key? *Otherwise, without a key (whether PK or UQ), how could
the software even attempt to guess at matching rows on its own?
First of all - apologies for the long post; feel free to skip it - as
a short answer let me say that when compared tables have no key or
indexes (or: one of the tables has keys, and another - not), the user
can additionally specify "sorting" columns for the purpose of data
comparison. All options in SCT (SQL Server Comparison Tool) were
added to handle particular real life situations, and over the years
SCT "accumulated" such options like: "Columns to be ignored during
comparison" or "Comparison offset".

I'll try to explain this I short as possible, but at the beginning let
me say that over the years I've seen lot of databases and what's
always astonishing for me is how un-structured the data in many cases
is. There is surprising (for me at least) amount of tables with no
primary keys and no indexes whatsoever. I've seen tables with close
to hundred columns, no PK nor indexes, and *tons* of duplicate rows.
Everything has a reason, so I assume that business requirements
dictate such design.

It is not uncommon that people just dump the data out of the tables
into text files, remove duplicates using some kind of home-brew
programs, and compare data 'by hand'.

Over the years, I had to adjust SCT to handle user's data; lot of data
required extra fine-tuning before actual comparison.

While in "row-by-row comparison" mode, program "pulls" the data out of
compared tables in their "original" (or "neutral") sequence, using
equivalent of "SELECT * " statement. The user has additional option
of sorting data before comparing, where the user can specify sorting
columns (different sorting can be set up on source table and different
- on target table). This covers situations where for example the
source table has some indexes set, but target - none. This also
allows for comparing tables with different names of the columns.
Users have also options to ignore some columns, which allows for
comparing data in tables with different number of columns.

I had even users comparing data in numeric columns to data in text
columns.

All in all, it seems that current set of options for data comparison
covers pretty much all of "real life" data comparisons performed by
current users of SCT - for long time I didn't have any requests for
additional options.

Having said that, I have to admit that I never checked how data
comparison is handled by other programs. It is quite possible that
all the options could be implemented in simpler, more efficient way.

Uff - I promise in future not to post so long responses.


Reply With Quote
  #37  
Old   
Dariusz Dziewialtowski
 
Posts: n/a

Default Re: comparison tool - 09-17-2008 , 08:53 PM



Quote:
Can you explain "row-by-rowcomparison"? *Do you mean you inspect the data
in both tables and match rows manually? *Or do you pick columns to "pretend"
they are the key? *Otherwise, without a key (whether PK or UQ), how could
the software even attempt to guess at matching rows on its own?
First of all - apologies for the long post; feel free to skip it - as
a short answer let me say that when compared tables have no key or
indexes (or: one of the tables has keys, and another - not), the user
can additionally specify "sorting" columns for the purpose of data
comparison. All options in SCT (SQL Server Comparison Tool) were
added to handle particular real life situations, and over the years
SCT "accumulated" such options like: "Columns to be ignored during
comparison" or "Comparison offset".

I'll try to explain this I short as possible, but at the beginning let
me say that over the years I've seen lot of databases and what's
always astonishing for me is how un-structured the data in many cases
is. There is surprising (for me at least) amount of tables with no
primary keys and no indexes whatsoever. I've seen tables with close
to hundred columns, no PK nor indexes, and *tons* of duplicate rows.
Everything has a reason, so I assume that business requirements
dictate such design.

It is not uncommon that people just dump the data out of the tables
into text files, remove duplicates using some kind of home-brew
programs, and compare data 'by hand'.

Over the years, I had to adjust SCT to handle user's data; lot of data
required extra fine-tuning before actual comparison.

While in "row-by-row comparison" mode, program "pulls" the data out of
compared tables in their "original" (or "neutral") sequence, using
equivalent of "SELECT * " statement. The user has additional option
of sorting data before comparing, where the user can specify sorting
columns (different sorting can be set up on source table and different
- on target table). This covers situations where for example the
source table has some indexes set, but target - none. This also
allows for comparing tables with different names of the columns.
Users have also options to ignore some columns, which allows for
comparing data in tables with different number of columns.

I had even users comparing data in numeric columns to data in text
columns.

All in all, it seems that current set of options for data comparison
covers pretty much all of "real life" data comparisons performed by
current users of SCT - for long time I didn't have any requests for
additional options.

Having said that, I have to admit that I never checked how data
comparison is handled by other programs. It is quite possible that
all the options could be implemented in simpler, more efficient way.

Uff - I promise in future not to post so long responses.


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.