[BUGS] Turkish downcasting in PL/pgSQL

#1 ntufar, 08-12-2004, 07:33 AM

Your name :
Your email address :


System Configuration
---------------------
Architecture (example: Intel Pentium) : Intel Pentium

Operating System (example: Linux 2.4.18) : Debian unstable
Linux 2.6.6-1-k7

PostgreSQL version (example: PostgreSQL-8.0): PostgreSQL-8.0 CVS HEAD

Compiler used (example: gcc 2.95.2) : gcc 3.3.4


Please enter a FULL description of your problem:
------------------------------------------------

Problems with the Turkish locale are widely known to developers.
Another one, this time in PL/pgSQL, has reared its ugly head.
Regression tests are failing at triggers, plpgsql, copy2
and rangefuncs. Examination of regression.diff showed that
the failures were due to unrecognised statements like
BEGIN, RAISE and IF in PL/pgSQL functions. Replacing
capital "I" with lower-case "i" (BEGiN, RAiSE, iF) completely
solves the problem.


If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------

Apparently the problem is caused by the following directive:

%option case-insensitive

on line 76 in file src/pl/plpgsql/src/scan.l

flex (flex version 2.5.4) apparently bakes case-insensitivity into
its state tables when it runs, because if I run the flex stage with
LANG=C everything works fine. A quick and dirty fix could be
implemented by placing

LANG=C
export LANG

in file src/pl/plpgsql/src/Makefile before calling flex.

A long-term fix could be made by implementing a keyword lookup
function like ScanKeywordLookup() in
src/backend/parser/keywords.c.
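
A minimal sketch of the long-term idea, assuming the point is to fold only ASCII letters by hand so that the result never depends on the build-time or run-time locale. The keyword table and the names demo_keywords, ascii_lower and demo_keyword_lookup are hypothetical and are not taken from keywords.c:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical keyword table for illustration; not PL/pgSQL's real list.
 * Entries are lower-case and sorted so that binary search works. */
static const char *const demo_keywords[] = { "begin", "end", "if", "raise" };
#define N_DEMO_KEYWORDS (sizeof(demo_keywords) / sizeof(demo_keywords[0]))

/* Fold only ASCII 'A'..'Z'; the current locale is never consulted. */
static char ascii_lower(char ch)
{
    if (ch >= 'A' && ch <= 'Z')
        return (char) (ch - 'A' + 'a');
    return ch;
}

/* Return the index of word in demo_keywords, or -1 if it is not a keyword. */
static int demo_keyword_lookup(const char *word)
{
    char   folded[32];
    size_t len;
    size_t i;
    int    lo = 0;
    int    hi = (int) N_DEMO_KEYWORDS - 1;

    assert(word != NULL);
    len = strlen(word);
    if (len >= sizeof(folded))
        return -1;              /* too long to be any keyword in the table */

    for (i = 0; i < len; i++)
        folded[i] = ascii_lower(word[i]);
    folded[len] = '\0';

    /* Plain binary search over the sorted, lower-case table. */
    while (lo <= hi)
    {
        int mid = lo + (hi - lo) / 2;
        int cmp = strcmp(folded, demo_keywords[mid]);

        if (cmp == 0)
            return mid;
        if (cmp < 0)
            hi = mid - 1;
        else
            lo = mid + 1;
    }
    return -1;
}
```

With this approach demo_keyword_lookup("BEGIN") and demo_keyword_lookup("begin") find the same entry whatever LANG is set to, because tolower() is never called.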

I would gladly prepare a patch and send it for your consideration
tomorrow morning.

Best regards,
Nicolai Tufar


---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

#2 Tom Lane, 08-12-2004, 11:35 AM

ntufar <ntufar (AT) pisem (DOT) net> writes:
> flex (flex version 2.5.4) incorporates case-insensitivity in its
> state tables because if I run flex stage with LANG=C everything
> works fine.

Ick. That is of course why it worked for me when I tested it :-(

> A quick and dirty fix could be implemented by placing
> LANG=C
> export LANG
> in file src/pl/plpgsql/src/Makefile before calling flex.

This is probably what we'd better do. Otherwise we have
build-context-dependency in the system's behavior, which is bad.

Peter, any thoughts on this one way or the other? At the moment
plpgsql's scan.l seems to be the only use of '%option case-insensitive'
but we have enough flex lexers lying about that I wouldn't be surprised
to have this same risk elsewhere. Is it reasonable to try to force
LANG=C in some global fashion during the build?

regards, tom lane

#3 Tom Lane, 08-12-2004, 03:03 PM

ntufar <ntufar (AT) pisem (DOT) net> writes:
> I attached a diff of a fix that adds LANG=C; before the call to $(FLEX).
> It fixes the problem here, but I don't know if adding an environment
> variable assignment like this is appropriate. I am not too fluent in the
> PostgreSQL build environment and do not know where one can put the
> global definition you are talking about below.

Um, the attachment was unreadable :-( but I get the idea.

As for the global solution, I was wondering if it would work to put
"LANG=C" right inside the definition of $(FLEX). That would ensure
the right behavior from all our flex builds without unnecessarily
messing up people's build environments otherwise. I don't know however
whether this would parse properly.

regards, tom lane

#4 ntufar, 08-12-2004, 03:31 PM

On Thursday, 12-08-2004, at around 22:27, Tom Lane said:
> ntufar <ntufar (AT) pisem (DOT) net> writes:
> > I attached a diff of a fix that adds LANG=C; before the call to $(FLEX).
> > It fixes the problem here, but I don't know if adding an environment
> > variable assignment like this is appropriate. I am not too fluent in the
> > PostgreSQL build environment and do not know where one can put the
> > global definition you are talking about below.
>
> Um, the attachment was unreadable :-( but I get the idea.

Something to do with my mail provider, sorry.
In file src/pl/plpgsql/src/Makefile:
LANG=C;$(FLEX) $(FLEXFLAGS) -Pplpgsql_base_yy -o'$@' $<
instead of
$(FLEX) $(FLEXFLAGS) -Pplpgsql_base_yy -o'$@' $<

> As for the global solution, I was wondering if it would work to put
> "LANG=C" right inside the definition of $(FLEX). That would ensure
> the right behavior from all our flex builds without unnecessarily
> messing up people's build environments otherwise. I don't know however
> whether this would parse properly.

The only thing that comes to mind is that it may break the Win32 port.
Can someone comment on this?

> regards, tom lane

Regards,
Nicolai Tufar


#5 ntufar, 08-12-2004, 03:39 PM

Greetings,


On Thursday, 12-08-2004, at around 18:32, Tom Lane said:
> ntufar <ntufar (AT) pisem (DOT) net> writes:
> > flex (flex version 2.5.4) incorporates case-insensitivity in its
> > state tables because if I run flex stage with LANG=C everything
> > works fine.
>
> Ick. That is of course why it worked for me when I tested it :-(
>
> > A quick and dirty fix could be implemented by placing
> > LANG=C
> > export LANG
> > in file src/pl/plpgsql/src/Makefile before calling flex.
>
> This is probably what we'd better do. Otherwise we have
> build-context-dependency in the system's behavior, which is bad.

I attached a diff of a fix that adds LANG=C; before the call to $(FLEX).
It fixes the problem here, but I don't know if adding an environment
variable assignment like this is appropriate. I am not too fluent in the
PostgreSQL build environment and do not know where one can put the
global definition you are talking about below.

> Peter, any thoughts on this one way or the other? At the moment
> plpgsql's scan.l seems to be the only use of '%option case-insensitive'
> but we have enough flex lexers lying about that I wouldn't be surprised
> to have this same risk elsewhere. Is it reasonable to try to force
> LANG=C in some global fashion during the build?
>
> regards, tom lane

Best regards,
Nicolai Tufar


[Attachment: TurkishFlex.diff]



#6 Peter Eisentraut, 08-14-2004, 03:22 AM

Tom Lane wrote:
> Peter, any thoughts on this one way or the other? At the moment
> plpgsql's scan.l seems to be the only use of '%option
> case-insensitive' but we have enough flex lexers lying about that I
> wouldn't be surprised to have this same risk elsewhere. Is it
> reasonable to try to force LANG=C in some global fashion during the
> build?

You'd have to set LC_ALL=C to be really sure to override everything.
But I would stay away from doing that globally, because all the
translation work in gcc and make would go to waste.
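
The reason LANG=C alone may not be enough can be sketched as follows. POSIX gives LC_ALL precedence over the per-category variables such as LC_CTYPE, which in turn take precedence over LANG. The function name and its parameters are hypothetical stand-ins for the environment, and the final fallback is only an assumption, since the default locale is implementation-defined:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pick the setting that governs LC_CTYPE, following POSIX precedence:
 * LC_ALL first, then LC_CTYPE, then LANG. Unset (NULL) or empty values
 * are skipped. The parameters stand in for getenv() results so that the
 * logic can be exercised without touching the real environment. */
static const char *effective_ctype_locale(const char *lc_all,
                                          const char *lc_ctype,
                                          const char *lang)
{
    if (lc_all != NULL && lc_all[0] != '\0')
        return lc_all;
    if (lc_ctype != NULL && lc_ctype[0] != '\0')
        return lc_ctype;
    if (lang != NULL && lang[0] != '\0')
        return lang;
    return "C";     /* assumed default; really implementation-defined */
}

/* Self-check of the two cases discussed in the thread. */
static void check_precedence(void)
{
    /* LANG=C does not win if the user has LC_CTYPE set. */
    assert(strcmp(effective_ctype_locale(NULL, "tr_TR", "C"), "tr_TR") == 0);
    /* LC_ALL=C overrides everything else. */
    assert(strcmp(effective_ctype_locale("C", "tr_TR", "tr_TR"), "C") == 0);
}
```

So a build rule that only sets LANG=C would still pick up a Turkish LC_CTYPE or LC_ALL from the user's environment.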

I would also suggest that Nicolai report this issue to the flex
developers. It's only bound to reappear everywhere case-insensitive
flex scanners are used.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


#7 Peter Eisentraut, 08-14-2004, 03:35 AM

ntufar wrote:
> Apparently the problem is caused by the following directive:
>
> %option case-insensitive
>
> on line 76 in file src/pl/plpgsql/src/scan.l
>
> flex (flex version 2.5.4) incorporates case-insensitivity in its
> state tables because if I run flex stage with LANG=C everything
> works fine. A quick and dirty fix could be implemented by placing
>
> LANG=C
> export LANG
>
> in file src/pl/plpgsql/src/Makefile before calling flex.

I have tried running flex (2.5.4) with a number of different locales
including tr_TR, but the output file is always the same. Can you show
us a diff of the generated files?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/


#8 Tom Lane, 08-14-2004, 10:22 AM

Peter Eisentraut <peter_e (AT) gmx (DOT) net> writes:
> You'd have to set LC_ALL=C to be really sure to override everything.
> But I would stay away from doing that globally, because all the
> translation work in gcc and make would go to waste.

Agreed. I was toying with changing the FLEX variable to contain
"LC_ALL=C flex" but I'm a bit worried about breaking the build on
some platforms (especially Windows).

> I would also suggest that Nicolai report this issue to the flex
> developers. It's only bound to reappear everywhere case-insensitive
> flex scanners are used.

True. Maybe we should just call it a flex bug and wait for them to
fix it. It's not going to affect builds from tarballs anyway, only
people who build from CVS.

regards, tom lane

#9 Tom Lane, 08-14-2004, 11:06 AM

Peter Eisentraut <peter_e (AT) gmx (DOT) net> writes:
> I have tried running flex (2.5.4) with a number of different locales
> including tr_TR, but the output file is always the same. Can you show
> us a diff of the generated files?

Hmm ... a quick look at the flex sources shows that flex does rely on
the <ctype.h> routines for case-folding, so I have no doubt that
ntufar's report is accurate. Maybe you used the wrong tr_TR locale?

(Just for the record, though, I can't see any change in the generated
pl_scan.c output in any of the tr_TR variants available on either HPUX
or OS X. I don't have a full set of locales installed on my Linux
machine so I can't try it there.)
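
One way to check whether a given locale changes <ctype.h> case-folding is to probe tolower() directly. This is only a sketch: the helper name is hypothetical, and a locale name such as "tr_TR.ISO-8859-9" is an assumption because names differ between platforms. In a single-byte Turkish locale the result may be the code for dotless ı rather than 'i', while in the C locale it is 'i':

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>
#include <string.h>

/* Return tolower('I') as seen under locale_name, or -1 if that locale
 * cannot be selected (for example because it is not installed).
 * The previous LC_CTYPE setting is restored before returning. */
static int lower_of_capital_i(const char *locale_name)
{
    char        saved[128];
    const char *current;
    int         result;

    assert(locale_name != NULL);

    /* setlocale() may reuse its return buffer, so copy the old name. */
    current = setlocale(LC_CTYPE, NULL);
    if (current == NULL || strlen(current) >= sizeof(saved))
        return -1;
    strcpy(saved, current);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;              /* locale not available on this system */

    result = tolower((unsigned char) 'I');

    setlocale(LC_CTYPE, saved); /* put the caller's locale back */
    return result;
}
```

Comparing lower_of_capital_i("C") with the value for each installed tr_TR variant would show which of them, if any, folds 'I' differently on a given machine.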

regards, tom lane

#10 Devrim GUNDUZ, 08-16-2004, 06:17 AM

Hi,

On Thu, 12 Aug 2004, Tom Lane wrote:

> > flex (flex version 2.5.4) incorporates case-insensitivity in its
> > state tables because if I run flex stage with LANG=C everything
> > works fine.
>
> Ick. That is of course why it worked for me when I tested it :-(

Nicolai is on holiday now. I tested on my Fedora Core 2 and RHEL 3 ES
systems and all regression tests passed:

======================
All 96 tests passed.
======================

I'm using the latest tr_TR locale of glibc, and flex-2.5.4a-29 (of RHEL)
and flex-2.5.4a-31 (of FC 2).

What am I missing?

Regards,
--
Devrim GUNDUZ
devrim~gunduz.org devrim.gunduz~linux.org.tr
http://www.tdmsoft.com
http://www.gunduz.org

