dbTalk Databases Forums  

Looking for faster integer validation

comp.databases.ms-sqlserver comp.databases.ms-sqlserver


Discuss Looking for faster integer validation in the comp.databases.ms-sqlserver forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Fred.
 
Posts: n/a

Default Looking for faster integer validation - 04-23-2011 , 05:02 PM






I process a weekly extract which, unfortunately, comes from a table
with a 20 unvalidated integer fields which I need in the form of
numbers when they are valid. Until the most recent update I'd been
sliding by bulk importing the extract and then converting to a
validated table using an append query cased on ISNUMERIC returning 1.,
NULL otherwise for these fields.

With the last update to SQL Server Standard, this finally caught up
with me, when ISNUMERIC('\') started returning 1 (the value was in a
record which hadn't been touched in some time). I have a patch in
which this particualr field in this record, and am working on a more
robust solution.

The best I've been able to come up with is a function which validates
character by character and returns the converted value converts if
valid. This takes about 5 minutes per million records, more than
original query (20 fields per record, 16 microseconds per field) which
is sort of acceptable, but which I would like to improve.

Fred.

Reply With Quote
  #2  
Old   
Erland Sommarskog
 
Posts: n/a

Default Re: Looking for faster integer validation - 04-24-2011 , 04:11 AM






Fred. (ghrno-google (AT) yahoo (DOT) com) writes:
Quote:
I process a weekly extract which, unfortunately, comes from a table
with a 20 unvalidated integer fields which I need in the form of
numbers when they are valid. Until the most recent update I'd been
sliding by bulk importing the extract and then converting to a
validated table using an append query cased on ISNUMERIC returning 1.,
NULL otherwise for these fields.

With the last update to SQL Server Standard, this finally caught up
with me, when ISNUMERIC('\') started returning 1 (the value was in a
record which hadn't been touched in some time). I have a patch in
which this particualr field in this record, and am working on a more
robust solution.

The best I've been able to come up with is a function which validates
character by character and returns the converted value converts if
valid. This takes about 5 minutes per million records, more than
original query (20 fields per record, 16 microseconds per field) which
is sort of acceptable, but which I would like to improve.
isnumeric has always been useless.

If you need to validate for unsigned integers, you can use this expression:

col NOT LIKE '%[^0-9]%'

This expression returns false, as soon there is a value which contains any
non-digit character including space.

--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx

Reply With Quote
  #3  
Old   
Pedro Eu
 
Posts: n/a

Default Re: Fred. (ghrno-google@yahoo.com) writes:isnumeric has always beenuseless. - 05-10-2011 , 01:19 PM



Fred, I'm trying to use your solution but I think something is going wrong. Try this:

IF 'abd' LIKE '%b%'
BEGIN
PRINT '1st case OK'
END

IF 'aaa' LIKE '%b%'
BEGIN
PRINT '2nd case OK'
END

IF '123' LIKE '%[^0-9]%'
BEGIN
PRINT '3rd case OK'
END

IF '123a' LIKE '%[^0-9]%'
BEGIN
PRINT '4th case OK'
END

Sql Server result:
1st case OK
4th case OK

The 1st case is ok to me, but I do not understood why the 3rd case is not ok and the 4th is.

Any idea?


Quote:
On Saturday, April 23, 2011 6:02 PM Fred. wrote:

I process a weekly extract which, unfortunately, comes from a table
with a 20 unvalidated integer fields which I need in the form of
numbers when they are valid. Until the most recent update I'd been
sliding by bulk importing the extract and then converting to a
validated table using an append query cased on ISNUMERIC returning 1.,
NULL otherwise for these fields.

With the last update to SQL Server Standard, this finally caught up
with me, when ISNUMERIC('\') started returning 1 (the value was in a
record which had not been touched in some time). I have a patch in
which this particualr field in this record, and am working on a more
robust solution.

The best I have been able to come up with is a function which validates
character by character and returns the converted value converts if
valid. This takes about 5 minutes per million records, more than
original query (20 fields per record, 16 microseconds per field) which
is sort of acceptable, but which I would like to improve.

Fred.

Quote:
On Sunday, April 24, 2011 5:11 AM Erland Sommarskog wrote:

Fred. (ghrno-google (AT) yahoo (DOT) com) writes:

isnumeric has always been useless.

If you need to validate for unsigned integers, you can use this expression:

col NOT LIKE '%[^0-9]%'

This expression returns false, as soon there is a value which contains any
non-digit character including space.

--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Links for SQL Server Books Online:
SQL 2008: http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005: http://msdn.microsoft.com/en-us/sqlserver/bb895970.aspx

Reply With Quote
  #4  
Old   
Bob Barrows
 
Posts: n/a

Default Re: Fred. (ghrno-google@yahoo.com) writes:isnumeric has always been useless. - 05-10-2011 , 01:56 PM



Pedro Eu wrote:
Quote:
Fred, I'm trying to use your solution but I think something is going
It wasn't Fred's solution, it was Erland's suggestion.

Quote:
wrong. Try this:

snip
IF '123' LIKE '%[^0-9]%'
BEGIN
PRINT '3rd case OK'
END

IF '123a' LIKE '%[^0-9]%'
BEGIN
PRINT '4th case OK'
END

Sql Server result:
1st case OK
4th case OK

The 1st case is ok to me, but I do not understood why the 3rd case is
not ok and the 4th is.

Any idea?
Yes. You did not follow the suggested solution. Specifically, you left out
the "NOT" keyword:

IF '123' NOT LIKE '%[^0-9]%'
PRINT '''123'' contains only numeric characters'
ELSE
PRINT '''123'' contains at least one non-numeric character'

and

IF '123a' NOT LIKE '%[^0-9]%'
PRINT '''123a'' contains only numeric characters'
ELSE
PRINT '''123a'' contains at least one non-numeric character'


The pattern [^0-9] means "not 0 to 9", so the expression LIKE '%[^0-9]%'
returns true if any characters that are not 0 to 9 are found in the string.
Does that help?

<snip>
Quote:
On Sunday, April 24, 2011 5:11 AM Erland Sommarskog wrote:

Fred. (ghrno-google (AT) yahoo (DOT) com) writes:

isnumeric has always been useless.

If you need to validate for unsigned integers, you can use this
expression:

col NOT LIKE '%[^0-9]%'

This expression returns false, as soon there is a value which
contains any non-digit character including space.

Reply With Quote
  #5  
Old   
Fred.
 
Posts: n/a

Default Re: Fred. (ghrno-google@yahoo.com) writes:isnumeric has always been useless. - 05-10-2011 , 02:20 PM



On May 10, 2:19*pm, Pedro Eu <pedro.... (AT) gmail (DOT) com> wrote:
Quote:
Fred, I'm trying to use your solution but I think something is going wrong. Try this:

IF 'abd' LIKE '%b%'
BEGIN
* * * * PRINT '1st case OK'
END

IF 'aaa' LIKE '%b%'
BEGIN
* * * * PRINT '2nd case OK'
END

IF '123' LIKE '%[^0-9]%'
BEGIN
* * * * PRINT '3rd case OK'
END

IF '123a' LIKE '%[^0-9]%'
BEGIN
* * * * PRINT '4th case OK'
END

Sql Server result:
1st case OK
4th case OK

The 1st case is ok to me, but I do not understood why the 3rd case is notok and the 4th is.

Any idea?



On Saturday, April 23, 2011 6:02 PM Fred. wrote:
I process a weekly extract which, unfortunately, comes from a table
with a 20 unvalidated integer fields which I need in the form of
numbers when they are valid. *Until the most recent update I'd been
sliding by bulk importing the extract and then converting to a
validated table using an append query cased on ISNUMERIC returning 1.,
NULL otherwise for these fields.

With the last update to SQL Server Standard, this finally caught up
with me, when ISNUMERIC('\') started returning 1 (the value was in a
record which had not been touched in some time). *I have a patch in
which this particualr field in this record, and am working on a more
robust solution.

The best I have been able to come up with is a function which validates
character by character and returns the converted value converts if
valid. *This takes about 5 minutes per million records, more than
original query (20 fields per record, 16 microseconds per field) which
is sort of acceptable, but which I would like to improve.

Fred.
On Sunday, April 24, 2011 5:11 AM Erland Sommarskog wrote:
Fred. (ghrno-goo... (AT) yahoo (DOT) com) writes:

isnumeric has always been useless.

If you need to validate for unsigned integers, you can use this expression:

col NOT LIKE '%[^0-9]%'

This expression returns false, as soon there is a value which containsany
non-digit character including space.

--
Erland Sommarskog, SQL Server MVP, esq... (AT) sommarskog (DOT) se

Links for SQL Server Books Online:
SQL 2008:http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005:http://msdn.microsoft.com/en-us/sqls...bb895970.aspx- Hidequoted text -

- Show quoted text -
Actually, it is Erland's solution. I can understand your confusion
because the logic of this rather simple seeming test is very
convoluted.

The pattern '[0-9]' will match a character which is a digit (0 through
9).

But, the pattern '[^0-9]' will match a character which is NOT a digit
(not zero thru nine). The '^ character takes on a "NOT" meaning in
regular expressions.

So, the pattern '%[^0-9]%' will match any string with a non-digit in
any position.

So,the only strings which fail to match the last pattern all all
digits. Thus,

12345 LIKE '%[^0-9]%'. should test FALSE, but

12345 NOT LIKE '%[^0-9]%'. should test TRUE

Fred.

Reply With Quote
  #6  
Old   
Fred.
 
Posts: n/a

Default Re: Fred. (ghrno-google@yahoo.com) writes:isnumeric has always been useless. - 05-10-2011 , 02:50 PM



On May 10, 2:19*pm, Pedro Eu <pedro.... (AT) gmail (DOT) com> wrote:
Quote:
Fred, I'm trying to use your solution but I think something is going wrong. Try this:

IF 'abd' LIKE '%b%'
BEGIN
* * * * PRINT '1st case OK'
END

IF 'aaa' LIKE '%b%'
BEGIN
* * * * PRINT '2nd case OK'
END

IF '123' LIKE '%[^0-9]%'
BEGIN
* * * * PRINT '3rd case OK'
END

IF '123a' LIKE '%[^0-9]%'
BEGIN
* * * * PRINT '4th case OK'
END

Sql Server result:
1st case OK
4th case OK

The 1st case is ok to me, but I do not understood why the 3rd case is notok and the 4th is.

Any idea?



On Saturday, April 23, 2011 6:02 PM Fred. wrote:
I process a weekly extract which, unfortunately, comes from a table
with a 20 unvalidated integer fields which I need in the form of
numbers when they are valid. *Until the most recent update I'd been
sliding by bulk importing the extract and then converting to a
validated table using an append query cased on ISNUMERIC returning 1.,
NULL otherwise for these fields.

With the last update to SQL Server Standard, this finally caught up
with me, when ISNUMERIC('\') started returning 1 (the value was in a
record which had not been touched in some time). *I have a patch in
which this particualr field in this record, and am working on a more
robust solution.

The best I have been able to come up with is a function which validates
character by character and returns the converted value converts if
valid. *This takes about 5 minutes per million records, more than
original query (20 fields per record, 16 microseconds per field) which
is sort of acceptable, but which I would like to improve.

Fred.
On Sunday, April 24, 2011 5:11 AM Erland Sommarskog wrote:
Fred. (ghrno-goo... (AT) yahoo (DOT) com) writes:

isnumeric has always been useless.

If you need to validate for unsigned integers, you can use this expression:

col NOT LIKE '%[^0-9]%'

This expression returns false, as soon there is a value which containsany
non-digit character including space.

--
Erland Sommarskog, SQL Server MVP, esq... (AT) sommarskog (DOT) se

Links for SQL Server Books Online:
SQL 2008:http://msdn.microsoft.com/en-us/sqlserver/cc514207.aspx
SQL 2005:http://msdn.microsoft.com/en-us/sqls...bb895970.aspx- Hidequoted text -

- Show quoted text -
PS, the query

SELECT ''''+S+'''' AS [STR],
CASE
WHEN S NOT LIKE '%[^0-9]%' THEN 'ALL DIGITS'
ELSE 'NOT ALL DIGITS'
END AS [TYPE]
FROM ( SELECT '0123456789' AS S UNION ALL
SELECT '123X6' UNION ALL
SELECT 'A' UNION ALL
SELECT '' UNION ALL
SELECT '0' UNION ALL
SELECT NULL
) AS t

has the result set:

STR TYPE
'0123456789' ALL DIGITS
'123X6' NOT ALL DIGITS
'A' NOT ALL DIGITS
'' ALL DIGITS
'0' ALL DIGITS
NULL NOT ALL DIGITS

Note that if you want to eliminate the null string, you need an added
condition.

Fred.

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.