dbTalk Databases Forums  

[BUGS] BUG #1671: Long interval string representation rejected

mailing.database.pgsql-bugs mailing.database.pgsql-bugs


Discuss [BUGS] BUG #1671: Long interval string representation rejected in the mailing.database.pgsql-bugs forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Mark Dilger
 
Posts: n/a

Default [BUGS] BUG #1671: Long interval string representation rejected - 05-16-2005 , 11:19 AM







The following bug has been logged online:

Bug reference: 1671
Logged by: Mark Dilger
Email address: markdilger (AT) yahoo (DOT) com
PostgreSQL version: 8.0.2
Operating system: Mandrake Linux 9.1 (2.4.21-0.13mdkenterprise #1 SMP)
Description: Long interval string representation rejected
Details:

<QUOTE>

bash-2.05b$ psql
Welcome to psql 8.0.2, the PostgreSQL interactive terminal.

Type: \copyright for distribution terms
\h for help with SQL commands
\? for help with psql commands
\g or terminate with semicolon to execute query
\q to quit

mark=# select '4 millenniums 5 centuries 4 decades 1 year 4 months 4 days 17
minutes 31 seconds'::interval;
ERROR: invalid input syntax for type interval: "4 millenniums 5 centuries 4
decades 1 year 4 months 4 days 17 minutes 31 seconds"
mark=# select '4 millenniums 5 centuries 4 decades 1 year 4 months 4 days 17
minutes'::interval;
interval
-----------------------------------
4541 years 4 mons 4 days 00:17:00
(1 row)

</QUOTE>

It appears that any string representation of an interval of length greater
than 76 is rejected. (76 = 51 + 25 = MAXDATELEN + MAXDATEFIELDS). This
appears to be a limitation enforced within function interval_in() in the
file src/backend/utils/adt/timestamp.c

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Reply With Quote
  #2  
Old   
Neil Conway
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-18-2005 , 12:44 AM






Mark Dilger wrote:
Quote:
It appears that any string representation of an interval of length greater
than 76 is rejected. (76 = 51 + 25 = MAXDATELEN + MAXDATEFIELDS). This
appears to be a limitation enforced within function interval_in() in the
file src/backend/utils/adt/timestamp.c
Yeah, this seems bogus. It's not even clear to me why MAXDATELEN +
MAXDATEFIELDS is used as the size of that buffer in the first place. I
don't know the datetime code particularly well; perhaps someone who does
can shed some light on this?

-Neil

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings


Reply With Quote
  #3  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-18-2005 , 01:04 AM



Neil Conway <neilc (AT) samurai (DOT) com> writes:
Quote:
Yeah, this seems bogus. It's not even clear to me why MAXDATELEN +
MAXDATEFIELDS is used as the size of that buffer in the first place. I
don't know the datetime code particularly well; perhaps someone who does
can shed some light on this?
My rule of thumb with the datetime code is that if it looks bogus,
it probably is :-(

There are a lot of fixed-size local buffers in that code. The ones
used in output routines seem defensible since the string to be generated
is predictable. The ones that are used for processing input are likely
wrong. OTOH I'm not eager to throw a palloc into each of those code
paths ... can we avoid that?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
  #4  
Old   
Neil Conway
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-18-2005 , 08:57 PM



Tom Lane wrote:
Quote:
There are a lot of fixed-size local buffers in that code. The ones
used in output routines seem defensible since the string to be generated
is predictable. The ones that are used for processing input are likely
wrong. OTOH I'm not eager to throw a palloc into each of those code
paths ... can we avoid that?
I'm not sure offhand what the upper bounds on legal input for each of
the datetime types is. Why not just allocate a larger but still
fixed-size buffer -- say, 256 bytes?

(While we're on the subject, it seems rather silly for ParseDateTime()
not to do its own bounds checking -- all of its call sites do a strlen()
on the input buffer before calling it, which could be avoided if
ParseDateTime() we passed the size of `lowstr')

-Neil

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


Reply With Quote
  #5  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-18-2005 , 10:25 PM



Neil Conway <neilc (AT) samurai (DOT) com> writes:
Quote:
Tom Lane wrote:
There are a lot of fixed-size local buffers in that code. The ones
used in output routines seem defensible since the string to be generated
is predictable. The ones that are used for processing input are likely
wrong.

I'm not sure offhand what the upper bounds on legal input for each of
the datetime types is.
Well, if you allow for whitespace between tokens then it's immediately
clear that there is no fixed upper bound. Perhaps it would work to
downcase just one token at a time, so that the max buffer length equals
the max acceptable token?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org


Reply With Quote
  #6  
Old   
Neil Conway
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-18-2005 , 10:42 PM



Tom Lane wrote:
Quote:
Well, if you allow for whitespace between tokens then it's immediately
clear that there is no fixed upper bound.
Good point -- there is no upper bound on the input string, but since we
skip whitespace, AFAICS this shouldn't affect the requirements for the
size of the working buffer (lowstr). So if we passed the size of the
working buffer to ParseDateTime() (per earlier gripe), we could only
bail out when we actually need to use more working space than was
allocated, not simply when strlen(input) >= sizeof(buffer). The
implementation might be a bit ugly, though.

Quote:
Perhaps it would work to downcase just one token at a time, so that
the max buffer length equals the max acceptable token?
Not sure I follow you.

-Neil

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


Reply With Quote
  #7  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-18-2005 , 11:08 PM



Neil Conway <neilc (AT) samurai (DOT) com> writes:
Quote:
Perhaps it would work to downcase just one token at a time, so that
the max buffer length equals the max acceptable token?

Not sure I follow you.
I believe that the reason for the local buffer is to hold a downcased
version of the input, which we can compare to the all-lower-case tables
of relevant keywords. So plan A for fixing it is to downcase and
compare one token at a time, instead of downcasing the whole string
at once.

Plan B is to make the token comparison routines case-insensitive
themselves, so that there's no need for a temporary copy of the input.

Probably someone can think of plans C and D too ... I don't have any
strong allegiance to any one way to fix it.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


Reply With Quote
  #8  
Old   
Neil Conway
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-18-2005 , 11:46 PM



Tom Lane wrote:
Quote:
I believe that the reason for the local buffer is to hold a downcased
version of the input, which we can compare to the all-lower-case tables
of relevant keywords.
Well, that's one of the reasons, but not the only one. For example, how
do you parse '17 minutes31 seconds'::interval without either a working
buffer or the ability to resize the input buffer? (We need to split the
input into two fields, each of which must be a NUL-terminated string.
These fields are currently implemented as into the working buffer. If we
changed that to be pointers into the input string, we have no where to
put the NUL terminator. So you either need a separate buffer, repalloc()
plus shuffling everything down, or palloc'd allocation for the fields
themselves.) So I don't see that fixing the casing issue is sufficient.

On closer inspection, the current code seems to get this wrong as well
:-( For example, given the query:

select '4millenniums5centuries4decades1year4months4days17 minutes31
seconds'::interval;

attaching via gdb, with a breakpoint at datetime.c:890 --

(gdb) p lp
$1 = 0xbfffdde1 "ÿÞ@"
(gdb) p lowstr + 25 + 51
$4 = 0xbfffdddc "onds"

(i.e. lp points past the end of lowstr)

-Neil

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


Reply With Quote
  #9  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-19-2005 , 12:07 AM



Neil Conway <neilc (AT) samurai (DOT) com> writes:
Quote:
Tom Lane wrote:
I believe that the reason for the local buffer is to hold a downcased
version of the input, which we can compare to the all-lower-case tables
of relevant keywords.

Well, that's one of the reasons, but not the only one. For example, how
do you parse '17 minutes31 seconds'::interval without either a working
buffer or the ability to resize the input buffer?
Sorry, s/downcased/downcased and null-terminated/. I have not read the
parsing code in question lately, but offhand it seems like transferring
one token at a time into a work buffer isn't a completely broken idea...

Quote:
On closer inspection, the current code seems to get this wrong as well
:-(
Wouldn't be surprised :-(

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend


Reply With Quote
  #10  
Old   
Neil Conway
 
Posts: n/a

Default Re: [BUGS] BUG #1671: Long interval string representation rejected - 05-19-2005 , 12:31 AM



Tom Lane wrote:
Quote:
Sorry, s/downcased/downcased and null-terminated/. I have not read the
parsing code in question lately, but offhand it seems like transferring
one token at a time into a work buffer isn't a completely broken idea...
I wouldn't call it "broken", but I don't see how it could be done
without a significant refactoring of how datetime parsing is done, and
your handwaving has yet to convince me The gist of the current code is:

1. parse the input string into fields, which are arrays of pointers into
a working buffer, via ParseDateTime()
2. decode the fields as appropriate for the datatype via
DecodeInterval(), DecodeDatetime(), DecodeTimeOnly(), etc.

i.e. I don't see an easy way to do field decoding one field at a time.

-Neil

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.