dbTalk Databases Forums  

Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data

mailing.database.pgsql-bugs mailing.database.pgsql-bugs


Discuss Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data in the mailing.database.pgsql-bugs forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Kalador Tech Support
 
Posts: n/a

Default Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data - 02-05-2006 , 10:54 AM






This is a table with just one entry created to test the problem. Should
not have any indexing issues.

I've since isolated the problem to the unescape_bytea function not the
SELECT.

I inserted the same image to a bytea column using base64 encoding, and
extracted it from the table (using base64 decoding) and this worked very
fast (<1 second). So, it is the unescape_bytea function that is to blame.

Hope that helps,

Kai Ronan
Technical Support
Kalador Entertainment Inc.



Alvaro Herrera wrote:

Quote:
Kai Ronan wrote:



// Get the bytea data
$res = pg_query("SELECT data FROM image WHERE name='big.gif'");



Do you have an index in the image.name column? What does an

EXPLAIN ANALYZE SELECT data FROM image WHERE name='big.gif'

say?



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq


Reply With Quote
  #2  
Old   
Michael Fuhr
 
Posts: n/a

Default Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data - 02-08-2006 , 12:59 AM






On Sat, Feb 04, 2006 at 04:06:11PM -0800, Kalador Tech Support wrote:
Quote:
I've since isolated the problem to the unescape_bytea function not the
SELECT.

I inserted the same image to a bytea column using base64 encoding, and
extracted it from the table (using base64 decoding) and this worked very
fast (<1 second). So, it is the unescape_bytea function that is to blame.
pg_unescape_bytea is fast here; I just unescaped an 850K jpeg image
in about 0.18 seconds on a slow (500MHz) machine.

How did you determine that pg_unescape_bytea was the problem? What
does something like the following show?

$tstart = microtime(true);
$data = pg_unescape_bytea(pg_fetch_result($res, 'data'));
$dt = microtime(true) - $tstart;
header("Content-Type: text/plain");
printf("unescape time = %.3fms, %d bytes\n", $dt * 1000.0, strlen($data));

--
Michael Fuhr

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster


Reply With Quote
  #3  
Old   
Michael Fuhr
 
Posts: n/a

Default Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data - 02-08-2006 , 04:50 PM



On Wed, Feb 08, 2006 at 09:52:33AM -0800, Kalador Tech Support wrote:
Quote:
$tfetch = microtime(true);

// Convert to binary and send to the browser
// header('Content-type: image/gif');
header('Content-type: text/plain');
$data = pg_unescape_bytea($raw);
$dt = microtime(true) - $tfetch;
The header call should be outside the section of code you're timing.
It probably doesn't matter, but the idea is to time pg_unescape_bytea
so that's all that should be timed (I agree with your decision to
time pg_fetch_result separately).

Quote:
When run from a browser, accessing an Apache webserver, the results
displayed were:

fetch time = 3.632ms, 296043 bytes
unescape time = 70625.518ms, 296043 bytes

When running the same code from command line php, the results were:

fetch time = 3.618ms, 296043 bytes
unescape time = 8.298ms, 296043 bytes

Am running Apache 1.3.33. I can't figure the difference between
environments - command line php and apache php are compiled the same
(except for the --with-apxs=... instruction for apache version).
Something must be different, although offhand I can't think of what it
might be. Locale settings, perhaps? I don't know if pg_unescape_bytea's
performance could be affected by that, but I've seen locale settings
cause horrible performance in string-manipulating code before.

Quote:
As an aside, I'm now using base64 encode/decode to insert data without
problems. For GIF images, at least, it is also a much smaller encoding
than the escape_bytea encoding.
Too bad PHP's PostgreSQL extension doesn't handle binary results
(or does it?). With a binary cursor or a way to tell pg_query and
friends that you want binary results, you could retrieve the binary
image data without having to mess with encoding and decoding.

Even if you're satisfied with what you're doing, it would be
interesting to find an explanation for pg_unescape_bytea's performance
discrepancy and determine whether it's a problem with PHP or with
the underlying libpq function. Please let us know if you find
anything.

--
Michael Fuhr

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


Reply With Quote
  #4  
Old   
Kalador Tech Support
 
Posts: n/a

Default Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data - 02-09-2006 , 08:47 AM



I modified your code slightly to isolate the pg_unescape_bytea() from
the pg_fetch_result():

<?php
// Connect to the database
$dbconn = pg_connect('dbname=foo');

// Get the bytea data
$res = pg_query("SELECT data FROM bar WHERE name='image'");
$tstart = microtime(true);
$raw = pg_fetch_result($res, 'data');
$fetch = microtime(true) - $tstart;
$tfetch = microtime(true);

// Convert to binary and send to the browser
// header('Content-type: image/gif');
header('Content-type: text/plain');
$data = pg_unescape_bytea($raw);
$dt = microtime(true) - $tfetch;
printf("fetch time = %.3fms, %d bytes\n", $fetch*1000.0,strlen($data));
printf("unescape time = %.3fms, %d bytes\n", $dt*1000.0,strlen($data));
?>

When run from a browser, accessing an Apache webserver, the results
displayed were:

fetch time = 3.632ms, 296043 bytes
unescape time = 70625.518ms, 296043 bytes

When running the same code from command line php, the results were:

fetch time = 3.618ms, 296043 bytes
unescape time = 8.298ms, 296043 bytes

Am running Apache 1.3.33. I can't figure the difference between
environments - command line php and apache php are compiled the same
(except for the --with-apxs=... instruction for apache version).

The code to insert into the database was:

<?php
// Connect to the database
$dbconn = pg_connect('dbname=foo');

// Read in a binary file
$data = file_get_contents('big.gif');

// Escape the binary data
$escaped = pg_escape_bytea($data);

// Insert it into the database
pg_query("INSERT INTO bar (name, data) VALUES ('image', '{$escaped}')");
?>

As an aside, I'm now using base64 encode/decode to insert data without
problems. For GIF images, at least, it is also a much smaller encoding
than the escape_bytea encoding.

Kai Ronan
Technical Support
Kalador Entertainment Inc.



Michael Fuhr wrote:

Quote:
On Sat, Feb 04, 2006 at 04:06:11PM -0800, Kalador Tech Support wrote:


I've since isolated the problem to the unescape_bytea function not the
SELECT.

I inserted the same image to a bytea column using base64 encoding, and
extracted it from the table (using base64 decoding) and this worked very
fast (<1 second). So, it is the unescape_bytea function that is to blame.



pg_unescape_bytea is fast here; I just unescaped an 850K jpeg image
in about 0.18 seconds on a slow (500MHz) machine.

How did you determine that pg_unescape_bytea was the problem? What
does something like the following show?

$tstart = microtime(true);
$data = pg_unescape_bytea(pg_fetch_result($res, 'data'));
$dt = microtime(true) - $tstart;
header("Content-Type: text/plain");
printf("unescape time = %.3fms, %d bytes\n", $dt * 1000.0, strlen($data));



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly


Reply With Quote
  #5  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data - 02-09-2006 , 09:25 AM



Kalador Tech Support <support (AT) kalador (DOT) com> writes:
Quote:
When run from a browser, accessing an Apache webserver, the results
displayed were:

fetch time = 3.632ms, 296043 bytes
unescape time = 70625.518ms, 296043 bytes
Is it possible that you are relying on an old (like 7.3) libpq?
I find this in the CVS logs:

2003-06-11 21:17 momjian

* src/interfaces/libpq/fe-exec.c: I found the libpq function
PGunescapeBytea a little slow. It was taking a minute and a half to
decode a 500Kb on a fairly fast machine. I think the culprit is
sscanf.
...
The new function is significantly faster on my machine with the
JPEGs being decoded in less than a second. I attach a modified
libpq example program that I used for my testing.

Ben Lamb.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


Reply With Quote
  #6  
Old   
Alvaro Herrera
 
Posts: n/a

Default Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data - 02-09-2006 , 09:47 AM



Tom Lane wrote:
Quote:
Kalador Tech Support <support (AT) kalador (DOT) com> writes:
When run from a browser, accessing an Apache webserver, the results
displayed were:

fetch time = 3.632ms, 296043 bytes
unescape time = 70625.518ms, 296043 bytes

Is it possible that you are relying on an old (like 7.3) libpq?
I find this in the CVS logs:

2003-06-11 21:17 momjian

* src/interfaces/libpq/fe-exec.c: I found the libpq function
PGunescapeBytea a little slow. It was taking a minute and a half to
decode a 500Kb on a fairly fast machine. I think the culprit is
sscanf.
...
The new function is significantly faster on my machine with the
JPEGs being decoded in less than a second. I attach a modified
libpq example program that I used for my testing.
I note in the PHP 4 sources that the PQunescapeBytea function seems to
have been copied there, "for the benefit of PostgreSQL 7.2 users". It
says that it comes from 7.3 but I don't see any sscanf call.

There is no PQunescapeBytea call in the whole source that I can see, so
my guess is that the libpq function is not called at all. So this may
be a PHP bug rather than a Postgres bug.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org


Reply With Quote
  #7  
Old   
Michael Fuhr
 
Posts: n/a

Default Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data - 02-09-2006 , 09:59 AM



On Thu, Feb 09, 2006 at 12:46:46PM -0300, Alvaro Herrera wrote:
Quote:
I note in the PHP 4 sources that the PQunescapeBytea function seems to
have been copied there, "for the benefit of PostgreSQL 7.2 users". It
says that it comes from 7.3 but I don't see any sscanf call.

There is no PQunescapeBytea call in the whole source that I can see, so
my guess is that the libpq function is not called at all. So this may
be a PHP bug rather than a Postgres bug.
The OP claimed to be using PHP 5.1.2, which does have a call to
PQunescapeBytea(), although it also has the old code you're seeing
and a HAVE_PQUNESCAPEBYTEA macro that determines which to use.

Interesting that the command line php and the Apache module behave
differently. I wonder if ldd would show the php executable and
libphp5.so linked against different versions of libpq; that would
add weight to Tom's suggestion that an old libpq might be responsible.

--
Michael Fuhr

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


Reply With Quote
  #8  
Old   
Kalador Tech Support
 
Posts: n/a

Default Re: [BUGS] BUG #2236: extremely slow to get unescaped bytea data - 02-09-2006 , 09:24 PM



You guys are smart!

Apache was running against an old libpq. I shutdown apache, updated
/etc/ld.so.conf with the postgres lib dir, ran ldconfig, restarted
apache, and the problem went away.

The old libpq was libpq.so.3.0 (pre-installed on machine). The new one
is libpq.so.3.2 (installed with 8.0.1)

Sorry for the false alarm - thanks for the help.

Kai Ronan
Technical Support
Kalador Entertainment Inc.



Michael Fuhr wrote:

Quote:
On Thu, Feb 09, 2006 at 12:46:46PM -0300, Alvaro Herrera wrote:


I note in the PHP 4 sources that the PQunescapeBytea function seems to
have been copied there, "for the benefit of PostgreSQL 7.2 users". It
says that it comes from 7.3 but I don't see any sscanf call.

There is no PQunescapeBytea call in the whole source that I can see, so
my guess is that the libpq function is not called at all. So this may
be a PHP bug rather than a Postgres bug.



The OP claimed to be using PHP 5.1.2, which does have a call to
PQunescapeBytea(), although it also has the old code you're seeing
and a HAVE_PQUNESCAPEBYTEA macro that determines which to use.

Interesting that the command line php and the Apache module behave
differently. I wonder if ldd would show the php executable and
libphp5.so linked against different versions of libpq; that would
add weight to Tom's suggestion that an old libpq might be responsible.



---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.