dbTalk Databases Forums  

[BUGS] Disk space is consumed by UPDATE query

mailing.database.pgsql-bugs mailing.database.pgsql-bugs


Discuss [BUGS] Disk space is consumed by UPDATE query in the mailing.database.pgsql-bugs forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Patrick Clery
 
Posts: n/a

Default [BUGS] Disk space is consumed by UPDATE query - 10-01-2004 , 08:39 AM






Your name : Patrick Clery
Your email address : patrick (AT) phpforhire (DOT) com


System Configuration
---------------------
Architecture (example: Intel Pentium) : Intel
Operating System (example: Linux 2.4.18) : FreeBSD 4.10-stable
PostgreSQL version (example: PostgreSQL-8.0): PostgreSQL-8.0
Compiler used (example: gcc 2.95.2) : gcc 2.95.4


Please enter a FULL description of your problem:
------------------------------------------------

This query appears to enter an infinite loop and fill up my /usr partition=
=20
(PGDATA=3D/usr/local/pgsql/data) at a rapid rate:

UPDATE people_locations
SET postalcode =3D foo.pcode, city_id =3D foo.cid
FROM (
SELECT
p.postalcode AS pcode,
p.city_id AS cid,
c.state_id AS sid
FROM postalcodes p
JOIN cities c USING (city_id)
) foo
WHERE foo.sid =3D state_id AND old_postalcode =3D foo.pcode

psql:/usr/local/www/beano/datingsite/sql/import_people.sql:363: ERROR: cou=
ld=20
not write to hash-join temporary file: No space left on device


From when the query is first run (somehow the disk space goes up initially):
=20
$ while : ; do df -h /usr/; sleep 3; done
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.8G 7.8G 43% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.2G 8.3G 39% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.1G 8.4G 38% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.2G 8.4G 38% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.2G 8.3G 39% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.3G 8.3G 39% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.3G 8.3G 39% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.4G 8.2G 40% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.4G 8.2G 40% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.5G 8.1G 40% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.5G 8.1G 41% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.6G 8.0G 41% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.6G 8.0G 41% /usr
Filesystem Size Used Avail Capacity Mounted on
/dev/ad0s1g 15G 5.7G 7.9G 42% /usr
.... and on and on until it reaches zero.


Here's the query plan:

QUERY PLAN=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=2 0=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=2 0=20=20=20=20=20=20=20=20
---------------------------------------------------------------------------=
-------------------------------------------
Hash Join (cost=3D18770.77..185690.90 rows=3D20626 width=3D140)
Hash Cond: ((("outer".postalcode)::text =3D ("inner".old_postalcode)::te=
xt)=20
AND ("outer".city_id =3D "inner".city_id))
-> Seq Scan on postalcodes p (cost=3D0.00..14742.12 rows=3D825012 widt=
h=3D18)
-> Hash (cost=3D9955.64..9955.64 rows=3D366625 width=3D126)
-> Merge Join (cost=3D69.83..9955.64 rows=3D366625 width=3D126)
Merge Cond: ("outer".state_id =3D "inner".state_id)
-> Index Scan using cities_state_id on cities c=20=20
(cost=3D0.00..4203.13 rows=3D73325 width=3D8)
-> Sort (cost=3D69.83..72.33 rows=3D1000 width=3D122)
Sort Key: people_locations.state_id
-> Seq Scan on people_locations (cost=3D0.00..20.00=
=20
rows=3D1000 width=3D122)
(10 rows)


Here's the inner query by itself:
=20
datingsite=3D> EXPLAIN ANALYZE SELECT
datingsite-> p.postalcode AS pcode,
datingsite-> p.city_id AS cid,
datingsite-> c.state_id AS sid
datingsite-> FROM postalcodes p
datingsite-> JOIN cities c USING (city_id);
QUERY PLAN=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=2 0=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20=2 0=20=20=20=20=20=20=20=20=
=20=20=20=20=20=20=20
---------------------------------------------------------------------------=
----------------------------------------------------
Hash Join (cost=3D2091.56..47451.98 rows=3D825012 width=3D22) (actual=20
time=3D1132.994..16764.241 rows=3D825012 loops=3D1)
Hash Cond: ("outer".city_id =3D "inner".city_id)
-> Seq Scan on postalcodes p (cost=3D0.00..14742.12 rows=3D825012 widt=
h=3D18)=20
(actual time=3D0.077..4657.842 rows=3D825012 loops=3D1)
-> Hash (cost=3D1585.25..1585.25 rows=3D73325 width=3D8) (actual=20
time=3D1131.010..1131.010 rows=3D0 loops=3D1)
-> Seq Scan on cities c (cost=3D0.00..1585.25 rows=3D73325 width=
=3D8)=20
(actual time=3D0.031..738.582 rows=3D73325 loops=3D1)
Total runtime: 20475.610 ms
(6 rows)



Both tables are rather large:


datingsite=3D> select count(*) from people_locations ;
count=20=20
--------
131266
(1 row)
=20
Time: 2566.282 ms
datingsite=3D> select count(*) from postalcodes;=20=20=20=20=20=20
count=20=20
--------
825012
(1 row)
=20
Time: 4246.360 ms



Here is the schema:



datingsite=3D> \d postalcodes;
Table "public.postalcodes"
Column | Type | Modifiers=20
------------+-----------------------+-----------
postalcode | character varying(10) | not null
city_id | integer | not null
Indexes:
"postalcodes_pkey" PRIMARY KEY, btree (postalcode, city_id)
Foreign-key constraints:
"postalcodes_city_id_fkey" FOREIGN KEY (city_id) REFERENCES=20
cities(city_id) ON DELETE CASCADE
=20
datingsite=3D> \d people_locations;
Table "public.people_locations"
Column | Type | Modifiers=20=20=
=20=20=20=20=20=20=20=20=20=20=20=20=20=20=20
-----------------+------------------------+--------------------------------=
-----------
person_id | integer | not null
city_id | integer | not null default 0
postalcode | character varying(30) | not null default 'N/A'::charact=
er=20
varying
postalcode_city | integer | not null default 0
country_iso | integer | not null default 0
state_id | integer | not null default 0
areacode | integer | not null default 0
old_postalcode | character varying(10) | not null default ''::character=
=20
varying
old_cityname | character varying(128) | not null default ''::character=
=20
varying
Indexes:
"people_locations_pkey" PRIMARY KEY, btree (person_id)
"people_loc_postalcode" btree (postalcode)
Foreign-key constraints:
"people_locations_person_id_fkey" FOREIGN KEY (person_id) REFERENCES=20
people(person_id) ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
"people_locations_city_id_fkey" FOREIGN KEY (city_id) REFERENCES=20
cities(city_id) ON UPDATE CASCADE ON DELETE RESTRICT
"people_locations_country_iso_fkey" FOREIGN KEY (country_iso) REFERENCE=
S=20
countries(country_iso)
"people_locations_state_id_fkey" FOREIGN KEY (state_id) REFERENCES=20
states(state_id)


Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:=20
----------------------------------------------------------------------
I can repeat the problem each time I run that query. If there is any furthe=
r=20
debugging info needed, I'm willing to provide it.




If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Reply With Quote
  #2  
Old   
Tom Lane
 
Posts: n/a

Default Re: [BUGS] Disk space is consumed by UPDATE query - 10-01-2004 , 10:07 AM






Patrick Clery <patrick (AT) phpforhire (DOT) com> writes:
Quote:
Here's the query plan:
...
-> Seq Scan on people_locations (cost=0.00..20.00
rows=1000 width=122)
The query planner evidently does not know that the people_locations
table is large. A VACUUM or ANALYZE on it would help a lot.

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo (AT) postgresql (DOT) org


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.