Johannes Bauer schrieb:
Quote:
Postgres is really *very* cool! |
Very much so. But is so powerful I do not really udnerstand what it's
doing. Although my problem is solved, there is still something curious
which made me wonder. When finding the correct query, I played around (a
lot!). During my way, I found a statement which worked, but was very
slow. This does not require any extra tables so you can play around with
it immediately:
SELECT ip FROM (SELECT set_masklen(('192.0.0.0/8'::cidr + series)::inet,
32) AS ip FROM generate_series(1, (2 ^ (32 -
masklen('192.0.0.0/8'::cidr)))::integer - 2) AS series) AS sub WHERE ip
NOT IN ('192.0.0.4'::inet, '192.0.0.8'::inet) LIMIT 10;
Then, here is the blazingly fast one (which I use in the final version):
SELECT sub.ip FROM (SELECT set_masklen(((generate_series(1, (2 ^ (32 -
masklen('192.0.0.0/8'::cidr)))::integer - 2) +
'192.0.0.0/8'::cidr)::inet), 32) as ip) AS sub WHERE sub.ip NOT IN
('192.0.0.4'::inet, '192.0.0.8'::inet) LIMIT 10;
They look pretty similar, but the slow one takes about 10 seconds, the
fast one returns almost immediately. I guess this is because the slow
query first builds up the whole 16777214 addresses by generate_series(),
then starts removing the 192.0.0.4 and 192.0.0.8 and only then limits.
The fast one seems to make the LIMIT clause effective immediately, so it
is done very fast.
I would therefore guess that without the LIMIT clause both would be
(almost) equally fast. Looking at the execution plans for the slow version:
Limit (cost=0.00..0.30 rows=10 width=4)
-> Function Scan on generate_series series (cost=0.00..29.93
rows=990 width=4)
Filter: (set_masklen(('192.0.0.0/8'::inet + (series)::bigint),
32) <> ALL ('{192.0.0.4,192.0.0.8}'::inet[]))
and with the ANALYZE clause:
Limit (cost=0.00..0.30 rows=10 width=4) (actual
time=6701.258..6701.273 rows=10 loops=1)
-> Function Scan on generate_series series (cost=0.00..29.93
rows=990 width=4) (actual time=6701.257..6701.270 rows=10 loops=1)
Filter: (set_masklen(('192.0.0.0/8'::inet + (series)::bigint),
32) <> ALL ('{192.0.0.4,192.0.0.8}'::inet[]))
Total runtime: 8183.585 ms
and for the fast version:
Limit (cost=0.00..0.03 rows=1 width=32)
-> Subquery Scan sub (cost=0.00..0.03 rows=1 width=32)
Filter: (sub.ip <> ALL ('{192.0.0.4,192.0.0.8}'::inet[]))
-> Result (cost=0.00..0.02 rows=1 width=0)
with analyze:
Limit (cost=0.00..0.03 rows=1 width=32) (actual time=0.010..0.022
rows=10 loops=1)
-> Subquery Scan sub (cost=0.00..0.03 rows=1 width=32) (actual
time=0.009..0.019 rows=10 loops=1)
Filter: (sub.ip <> ALL ('{192.0.0.4,192.0.0.8}'::inet[]))
-> Result (cost=0.00..0.02 rows=1 width=0) (actual
time=0.006..0.014 rows=12 loops=1)
Total runtime: 0.039 ms
Does someone know why those two (almost identical) statements produce so
different runtimes?
Thanks in advance,
Kind regards,
Johannes