dbTalk Databases Forums  

snowflaking

comp.databases.postgresql.novice comp.databases.postgresql.novice


Discuss snowflaking in the comp.databases.postgresql.novice forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Nabil Sayegh
 
Posts: n/a

Default snowflaking - 04-04-2004 , 10:29 PM






Hi all,

I need some help optimizing a snowflaked db structure.

I'm using 7.3.4 at the moment

Scenario:
I have only a few tables holding real data:
e.g. txt,flt,tf,datum

these tables only hold 1 column of real data and information what 'virtual' type they are.

I have a table containing meta information about 'virtual' tables, i.e. tables that don't really
exist in postgresql but have to be joined on the fly via the meta information.

At runtime a query is build:
---------------------------------------------------------------------------------------------------
SELECT
o.id_objekt as id_objekt
,kapsel.id2_objekt as id2_kapsel
,vater.id2_objekt as id2_vater
,anzeige.id_objekt_objekt as id_anzeige
,anzeige.id2_objekt as anzeige
,anzeige.id2_objekt as id2_anzeige
,anzeige_datum_display_von.id_datum as id_anzeige_datum_display_von
,anzeige_datum_display_von.datum_fld as anzeige_datum_display_von
,anzeige_datum_display_bis.id_datum as id_anzeige_datum_display_bis
,anzeige_datum_display_bis.datum_fld as anzeige_datum_display_bis
,anzeige_enabled.id_tf as id_anzeige_enabled
,anzeige_enabled.tf as anzeige_enabled
,headline.id_objekt_objekt as id_headline
,headline.id2_objekt as headline
[...MANY MANY MORE...]
FROM
objekt o

JOIN ( SELECT id_objekt, id2_objekt FROM objekt_objekt JOIN meta USING (
id_meta ) WHERE id_meta = 91 ) vater ON ( vater.id_objekt = o.id_objekt )
JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING (
id_meta ) WHERE id_meta = 110 ) Kapsel ON ( kapsel.id_objekt = o.id_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta
USING ( id_meta ) WHERE id_meta=84 ) anzeige ON ( anzeige.id_objekt=o.id_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_datum, datum_fld FROM datum JOIN meta USING ( id_meta )
WHERE id_meta=73 ) anzeige_datum_display_von ON (
anzeige_datum_display_von.id_objekt=anzeige.id2_ob jekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_datum, datum_fld FROM datum JOIN meta USING ( id_meta )
WHERE id_meta=74 ) anzeige_datum_display_bis ON (
anzeige_datum_display_bis.id_objekt=anzeige.id2_ob jekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_tf, tf FROM tf JOIN meta USING ( id_meta ) WHERE
id_meta=75 ) anzeige_enabled ON ( anzeige_enabled.id_objekt=anzeige.id2_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta
USING ( id_meta ) WHERE id_meta=35 ) headline ON ( headline.id_objekt=o.id_objekt )
[...MANY MANY MORE...]
----------------------------------------------------------------------------------------------------

Execution takes approximately 0.05s
When I EXPLAIN ANALYZE the query I get:

----------------------------------------------------------------------------------------------------
Hash Join (cost=123.24..212.68 rows=1 width=576) (actual time=11.93..12.50 rows=1 loops=1)
Hash Cond: ("outer".id_objekt = "inner".id_objekt)
-> Hash Join (cost=118.56..207.99 rows=1 width=552) (actual time=11.24..11.80 rows=1 loops=1)
Hash Cond: ("outer".id_objekt = "inner".id_objekt)
-> Nested Loop (cost=107.33..196.76 rows=1 width=535) (actual time=10.40..10.94 rows=1
loops=1)
Join Filter: ("inner".id_objekt = "outer".id2_objekt)
-> Nested Loop (cost=105.41..185.51 rows=1 width=478) (actual time=9.82..10.34
rows=1 loops=1)
Join Filter: ("inner".id_objekt = "outer".id2_objekt)
-> Nested Loop (cost=96.53..174.35 rows=1 width=421) (actual time=9.12..9.63
rows=1 loops=1)
[... MANY MORE COLUMNS LOOKING SIMILAR ...]
Total runtime: 17.36 msec
(171 rows)
----------------------------------------------------------------------------------------------------

Is there anything I can do to speed it a bit up?
We can assume, that
a) inserts are done very rarely and
b) the number of actual results are very low

--
e-Trolley Sayegh & John, Nabil Sayegh
Tel.: 0700 etrolley /// 0700 38765539
Fax.: +49 69 8299381-8
PGP : http://www.e-trolley.de

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html


Reply With Quote
  #2  
Old   
Joe Conway
 
Posts: n/a

Default Re: snowflaking - 04-05-2004 , 12:27 AM






Nabil Sayegh wrote:
[...big snowflake query and explain analyze results...]
Quote:
Is there anything I can do to speed it a bit up?
We can assume, that
a) inserts are done very rarely and
b) the number of actual results are very low
One simple idea is to use PREPARE and EXECUTE. This is only useful if
you can PREPARE once and then EXECUTE multiple times (i.e. as long as
you don't need to reconnect prior to every execution of the query).

HTH,

Joe

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo (AT) postgresql (DOT) org so that your
message can get through to the mailing list cleanly



Reply With Quote
  #3  
Old   
Nabil Sayegh
 
Posts: n/a

Default Re: snowflaking - 04-05-2004 , 07:02 AM



Joe Conway wrote:
Quote:
Nabil Sayegh wrote:
[...big snowflake query and explain analyze results...]

Is there anything I can do to speed it a bit up?
We can assume, that
a) inserts are done very rarely and
b) the number of actual results are very low


One simple idea is to use PREPARE and EXECUTE. This is only useful if
you can PREPARE once and then EXECUTE multiple times (i.e. as long as
you don't need to reconnect prior to every execution of the query).
Hm, unfortunately my queries (at least the expensive ones) are only executed once per session.
If only it could be made persistent, then it would help

Any other ideas?
What if I create views for such queries?
Would these be faster (i.e. prepared automatically/persistent)?

TFYH
--
e-Trolley Sayegh & John, Nabil Sayegh
Tel.: 0700 etrolley /// 0700 38765539
Fax.: +49 69 8299381-8
PGP : http://www.e-trolley.de

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo (AT) postgresql (DOT) org



Reply With Quote
  #4  
Old   
Nabil Sayegh
 
Posts: n/a

Default Re: 7.4 dramatically slower than 7.3? (was: snowflaking) - 04-06-2004 , 08:22 AM



When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output:
----------------------------------------------------------------------------------------------------
Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1)
Hash Cond: ("outer".id_objekt = "inner".id_objekt)
-> Hash Join (cost=80.98..208.48 rows=1 width=228) (actual time=10.49..10.94 rows=1 loops=1)
Hash Cond: ("outer".id_objekt = "inner".id_objekt)
-> Nested Loop (cost=69.75..197.24 rows=1 width=216) (actual time=9.65..10.09 rows=1
loops=1)
Join Filter: ("inner".id_objekt = "outer".id2_objekt)
-> Nested Loop (cost=67.83..185.99 rows=1 width=204) (actual time=9.17..9.58
rows=1 loops=1)
Join Filter: ("inner".id_objekt = "outer".id2_objekt)
-> Nested Loop (cost=65.91..174.72 rows=1 width=192) (actual time=8.68..9.08
rows=1 loops=1)
Join Filter: ("inner".id_objekt = "outer".id2_objekt)
-> Nested Loop (cost=63.99..163.44 rows=1 width=180) (actual
time=8.19..8.58 rows=1 loops=1)
Join Filter: ("inner".id_objekt = "outer".id2_objekt)
-> Hash Join (cost=59.66..149.14 rows=1 width=164) (actual
time=7.63..8.00 rows=1 loops=1)
Hash Cond: ("outer".id_objekt = "inner".id_objekt)
-> Nested Loop (cost=45.61..135.07 rows=1 width=148)
(actual time=6.48..6.83 rows=1
[etc etc etc]
----------------------------------------------------------------------------------------------------

Now I tested it with 7.4 to see if gets faster, but guess what?
It's about 40 times slower(!):

----------------------------------------------------------------------------------------------------
Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1)
Merge Cond: ("outer".id_objekt = "inner".id_objekt)
-> Sort (cost=136.19..136.20 rows=2 width=4) (actual time=18.464..18.465 rows=1 loops=1)
Sort Key: o.id_objekt
-> Hash Left Join (cost=136.01..136.18 rows=2 width=4) (actual time=18.289..18.293
rows=1 loops=1)
Hash Cond: ("outer".id_objekt = "inner".id_objekt)
-> Merge Left Join (cost=129.95..130.03 rows=2 width=4) (actual
time=17.860..17.863 rows=1 loops=1)
Merge Cond: ("outer".id2_objekt = "inner".id_objekt)
-> Sort (cost=123.05..123.06 rows=2 width=8) (actual time=17.568..17.569
rows=1 loops=1)
Sort Key: public.objekt_objekt.id2_objekt
-> Hash Left Join (cost=122.77..123.04 rows=2 width=8) (actual
time=17.543..17.547 rows=1 loops=1)
Hash Cond: ("outer".id2_objekt = "inner".id_objekt)
-> Merge Left Join (cost=116.05..116.13 rows=2 width=12) (actual
time=16.933..16.936 rows=1 loops=1)
Merge Cond: ("outer".id2_objekt = "inner".id_objekt)
-> Sort (cost=109.11..109.12 rows=2 width=12) (actual
time=16.622..16.623 rows=1 loops=1)
[etc etc etc]
----------------------------------------------------------------------------------------------------

Any idea?

Additional Information:
I'm JOINing exclusively on PKeys/FKeys and the WHERE clause also only uses PKeys (these should have
indexes automatically, right?)
I did VACUUM ANALYZE on both machines, didn't help.

--
e-Trolley Sayegh & John, Nabil Sayegh
Tel.: 0700 etrolley /// 0700 38765539
Fax.: +49 69 8299381-8
PGP : http://www.e-trolley.de

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match


Reply With Quote
  #5  
Old   
Stephan Szabo
 
Posts: n/a

Default Re: 7.4 dramatically slower than 7.3? (was: snowflaking) - 04-06-2004 , 09:20 AM




On Tue, 6 Apr 2004, Nabil Sayegh wrote:

Quote:
When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output:
----------------------------------------------------------------------------------------------------
Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1)

Now I tested it with 7.4 to see if gets faster, but guess what?
It's about 40 times slower(!):

----------------------------------------------------------------------------------------------------
Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1)
Hmm, this actual time number seems about 1/2 the above unless I'm missing
something (34 vs 19). Is it possible you missed some lines or something
while posting?


---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster



Reply With Quote
  #6  
Old   
Nabil Sayegh
 
Posts: n/a

Default Re: 7.4 dramatically slower than 7.3? - 04-06-2004 , 09:33 AM



Stephan Szabo wrote:
Quote:
On Tue, 6 Apr 2004, Nabil Sayegh wrote:


When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output:
----------------------------------------------------------------------------------------------------
Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1)


Now I tested it with 7.4 to see if gets faster, but guess what?
It's about 40 times slower(!):

----------------------------------------------------------------------------------------------------
Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1)


Hmm, this actual time number seems about 1/2 the above unless I'm missing
something (34 vs 19). Is it possible you missed some lines or something
while posting?
I posted only the first few lines (it's repeating).

But I found the solution/problem:
If I use JOIN instead of LEFT OUTER JOIN it is on 7.4 as fast as on 7.3
(At the moment there are no NULL values, so for the moment I can live without the LEFT OUTER JOIN).

But how can it be, that the query with LEFT OUTER JOIN has the same speed on 7.3 but is 40 times
slower on 7.4?

TFYH
--
e-Trolley Sayegh & John, Nabil Sayegh
Tel.: 0700 etrolley /// 0700 38765539
Fax.: +49 69 8299381-8
PGP : http://www.e-trolley.de

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo (AT) postgresql (DOT) org



Reply With Quote
  #7  
Old   
Tom Lane
 
Posts: n/a

Default Re: 7.4 dramatically slower than 7.3? (was: snowflaking) - 04-06-2004 , 09:41 AM



Nabil Sayegh <postgresql (AT) e-trolley (DOT) de> writes:
Quote:
When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output:
----------------------------------------------------------------------------------------------------
Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1)

Now I tested it with 7.4 to see if gets faster, but guess what?
It's about 40 times slower(!):

Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1)
You're showing actual time of 18 msec vs 34 msec in 7.3, which doesn't
work out to 40 times slower on my calculator. If there's a problem here
you're not showing it.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster



Reply With Quote
  #8  
Old   
Stephan Szabo
 
Posts: n/a

Default Re: 7.4 dramatically slower than 7.3? - 04-06-2004 , 10:09 AM



On Tue, 6 Apr 2004, Nabil Sayegh wrote:

Quote:
Stephan Szabo wrote:
On Tue, 6 Apr 2004, Nabil Sayegh wrote:


When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output:
----------------------------------------------------------------------------------------------------
Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1)


Now I tested it with 7.4 to see if gets faster, but guess what?
It's about 40 times slower(!):

----------------------------------------------------------------------------------------------------
Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1)


Hmm, this actual time number seems about 1/2 the above unless I'm missing
something (34 vs 19). Is it possible you missed some lines or something
while posting?

I posted only the first few lines (it's repeating).
Like I said, if that's the first line up there, it's saying that 7.4 is
twice as fast as 7.3 (at 19 ms vs 34 ms). We don't know where the other
39.5x the cost is going since it isn't into the plan AFAICS.

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo (AT) postgresql (DOT) org



Reply With Quote
  #9  
Old   
Nabil Sayegh
 
Posts: n/a

Default Re: 7.4 dramatically slower than 7.3? - 04-06-2004 , 10:14 AM



Tom Lane wrote:

Quote:
You're showing actual time of 18 msec vs 34 msec in 7.3, which doesn't
work out to 40 times slower on my calculator. If there's a problem here
you're not showing it.
Sorry for the confusion with 40 times slower I meant the execution of my app. due to this 1 query.

To clarify I modified the query:

7.3 with LEFT OUTER JOIN:
-------------------------------------
$ time psql test2 -f query.sql
?column?
----------

(1 row)


real 0m0.079s
user 0m0.000s
sys 0m0.010s
-------------------------------------

7.4. with LEFT OUTER JOIN:
-------------------------------------
$ time psql test2 -f query.sql
?column?
----------

(1 Zeile)


real 0m3.256s
user 0m0.038s
sys 0m0.026s

3.256 / 0.079 = 41 (That's what I meant)

7.3 without LEFT OUTER JOIN
---------------------------------------
$ time psql test2 -f query.sql
?column?
----------

(1 row)


real 0m0.072s = same as with LEFT OUTER JOIN
user 0m0.000s
sys 0m0.020s
---------------------------------------

7.4 without LEFT OUTER JOIN
$ time psql test2 -f query.sql
?column?
----------

(1 Zeile)


real 0m0.149s
user 0m0.035s
sys 0m0.020s
----------------------------------------



The query was (with LEFT OUTER JOIN):
SELECT
NULL
FROM
objekt o

JOIN ( SELECT id_objekt, id2_objekt FROM objekt_objekt JOIN meta USING (
id_meta ) WHERE id_meta = 91 ) vater ON ( vater.id_objekt = o.id_objekt )
JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta USING (
id_meta ) WHERE id_meta = 110 ) Kapsel ON ( kapsel.id_objekt = o.id_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta
USING ( id_meta ) WHERE id_meta=84 ) anzeige ON ( anzeige.id_objekt=o.id_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_datum, datum_fld FROM datum JOIN meta USING ( id_meta )
WHERE id_meta=73 ) anzeige_datum_display_von ON (
anzeige_datum_display_von.id_objekt=anzeige.id2_ob jekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_datum, datum_fld FROM datum JOIN meta USING ( id_meta )
WHERE id_meta=74 ) anzeige_datum_display_bis ON (
anzeige_datum_display_bis.id_objekt=anzeige.id2_ob jekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_tf, tf FROM tf JOIN meta USING ( id_meta ) WHERE
id_meta=75 ) anzeige_enabled ON ( anzeige_enabled.id_objekt=anzeige.id2_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta
USING ( id_meta ) WHERE id_meta=35 ) headline ON ( headline.id_objekt=o.id_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta
USING ( id_meta ) WHERE id_meta=72 ) headline_CSS ON ( headline_CSS.id_objekt=headline.id2_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE
id_meta=16 ) headline_CSS_regeln ON ( headline_CSS_regeln.id_objekt=headline_CSS.id2_obj ekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE
id_meta=144 ) headline_CSS_hover ON ( headline_CSS_hover.id_objekt=headline_CSS.id2_obje kt )
LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE
id_meta=71 ) headline_txt ON ( headline_txt.id_objekt=headline.id2_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta
USING ( id_meta ) WHERE id_meta=30 ) fliesstext ON ( fliesstext.id_objekt=o.id_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_objekt_objekt, id2_objekt FROM objekt_objekt JOIN meta
USING ( id_meta ) WHERE id_meta=72 ) fliesstext_CSS ON (
fliesstext_CSS.id_objekt=fliesstext.id2_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE
id_meta=16 ) fliesstext_CSS_regeln ON ( fliesstext_CSS_regeln.id_objekt=fliesstext_CSS.id2 _objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE
id_meta=144 ) fliesstext_CSS_hover ON ( fliesstext_CSS_hover.id_objekt=fliesstext_CSS.id2_ objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_txt, txt FROM txt JOIN meta USING ( id_meta ) WHERE
id_meta=71 ) fliesstext_txt ON ( fliesstext_txt.id_objekt=fliesstext.id2_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_tf, tf FROM tf JOIN meta USING ( id_meta ) WHERE
id_meta=34 ) html_translate ON ( html_translate.id_objekt=o.id_objekt )
LEFT OUTER JOIN ( SELECT id_objekt, id_flt, flt FROM flt JOIN meta USING ( id_meta ) WHERE
id_meta=33 ) pos_y ON ( pos_y.id_objekt=o.id_objekt )
WHERE
o.id_objekt=26377;


Again, sorry for the confusion, I hope now I made it a bit clearer.

TFYH
--
e-Trolley Sayegh & John, Nabil Sayegh
Tel.: 0700 etrolley /// 0700 38765539
Fax.: +49 69 8299381-8
PGP : http://www.e-trolley.de

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend



Reply With Quote
  #10  
Old   
Stephan Szabo
 
Posts: n/a

Default Re: 7.4 dramatically slower than 7.3? - 04-06-2004 , 10:15 AM




On Tue, 6 Apr 2004, Nabil Sayegh wrote:

Quote:
Stephan Szabo wrote:
On Tue, 6 Apr 2004, Nabil Sayegh wrote:


When I take the query of my first post and EXPLAIN ANALYZE it with 7.3 I get the following output:
----------------------------------------------------------------------------------------------------
Hash Join (cost=85.66..213.16 rows=1 width=240) (actual time=34.01..34.47 rows=1 loops=1)


Now I tested it with 7.4 to see if gets faster, but guess what?
It's about 40 times slower(!):

----------------------------------------------------------------------------------------------------
Merge Left Join (cost=142.22..142.28 rows=2 width=0) (actual time=18.681..18.684 rows=1 loops=1)


Hmm, this actual time number seems about 1/2 the above unless I'm missing
something (34 vs 19). Is it possible you missed some lines or something
while posting?

I posted only the first few lines (it's repeating).
One other possibility is that the planning is taking a really long time.
How long do explain (non-analyze) on the two systems take? And how many
joins are there precisely?

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org



Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.