dbTalk Databases Forums  

Concurrent Database Maintenance?

comp.databases.informix comp.databases.informix


Discuss Concurrent Database Maintenance? in the comp.databases.informix forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
shorti
 
Posts: n/a

Default Concurrent Database Maintenance? - 09-02-2010 , 01:02 PM






OK..more questions! Informix on Linux. I read on one site that
Informix does not allow concurrent index reorgs...that you have to
drop table indexes and rebuilt them (ALTER INDEX CLUSTER). I dont
know if it was an outdated website. Can anyone tell me if this is
still true and how it affects high availability of the database?

Also, I read that reorganizing the tables is a huge performance hit
(ALTER FRAGMENT)...is this also true?

There is also database statistics...are there issues with running this
concurrently?

Or anything else needed to be performed to keep the database running
optimally that I did not mention?

Once again, thanks for your time.

Reply With Quote
  #2  
Old   
Art Kagel
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-02-2010 , 03:18 PM






See my responses inline below:

Art

Art S. Kagel
Advanced DataTools (www.advancedatatools.com)
IIUG Board of Directors (art (AT) iiug (DOT) org)

Disclaimer: Please keep in mind that my own opinions are my own opinions and
do not reflect on my employer, Advanced DataTools, the IIUG, nor any other
organization with which I am associated either explicitly, implicitly, or by
inference. Neither do those opinions reflect those of other individuals
affiliated with any entity with which I am affiliated nor those of the
entities themselves.



On Thu, Sep 2, 2010 at 2:02 PM, shorti <lbryan21 (AT) juno (DOT) com> wrote:

Quote:
OK..more questions! Informix on Linux. I read on one site that
Informix does not allow concurrent index reorgs...that you have to

First, because Informix has a built-in thread, known as the Btree Scanner
Thread, which removes deleted keys and compresses partially empty adjacent
index nodes constantly, index reorgs are rarely required. That said, while
you have to drop an index (or at least disable it and re-enable it) to
rebuild it, this operation does NOT require that the server be offline nor
does it necessarily lock out access to the table during the index build
(online index build is optional - offline index builds are normally a bit
faster). If you are reorging the index to migrate its storage, the ALTER
FRAGMENT ON INDEX... INIT IN... command can do that without even taking the
index offline.


Quote:
drop table indexes and rebuilt them (ALTER INDEX CLUSTER). I dont

ALTER INDEX ... CLUSTER is one way to reorganize a table's storage or to
physically sort a table's rows into the order of a given index key to
improve sequential access in the order of that particular key. It is not a
way to reorg an index.


Quote:
know if it was an outdated website. Can anyone tell me if this is
still true and how it affects high availability of the database?

Outdated, and either wrong or at least misunderstood. Informix's
availability is usually better than five 9's (99.999% available).


Quote:
Also, I read that reorganizing the tables is a huge performance hit
(ALTER FRAGMENT)...is this also true?

Not sure what the question is here. Are you asking if table performance
suffers if you don't reorg the table periodically? Are you asking if the
reorg itself is a resource hog? Informix tables in general do not require
reorganization very often. Certain situations and certain data patterns may
require you to periodically reorg a table to improve performance or to
prevent the table from running out of specific resources.

These are the usual reasons to reorg a table:

1. The table's data is requiring more data pages than are permitted in a
single partition (~16million pages).
1. You can reorganize the table into a dbspace with a larger page size
which will reduce the number of pages.
2. You can partition the table into multiple partitions. This can
improve performance greatly if the partitions can be logically defined to
permit parallel query access or partition elimination to reduce
the amount
of data that needs to be examined.
2. The table's data extents have become fragmented due to insufficient
planning in sizing the table's initial extent allocations. This can
sometimes hurt performance if a large number of non-local extents within a
table need to be accessed by typical queries. In addition, a partition can
only have about 230 extents (more or less depending on the default pagesize
of the server and the number of indexes and "special" columns in the table).
1. Modify the table's extent allocation strategy and reorganize it into
fewer extents.
2. Repartition the table into multiple partitions to relieve the
pressure on the number of extents.
3. The table experiences frequent deletions of large numbers of data
in a fairly random way relative to the order in which data was inserted into
the database. This can cause new rows to be scattered into the slots
vacated by deleted rows, reducing the locality of recently inserted data.
4. The table has experienced a large permanent deletion event and is
carrying large numbers of empty pages which could be freed for reuse by
other tables/indexes.


There are multiple ways to reorg a table:

1. Export data, drop table, recreate table, reload data;
2. ALTER FRAGMENT ON TABLE... INIT IN....
3. Use the new 11.50 Administration API REPACK function.
1. with the OFFLINE option
2. without the OFFLINE option

This is pretty much ordered from the slowest, most intrusive, method to the
fastest and least intrusive method. The Adminstrative API REPACK function,
usually used with the SHRINK option, will reorganize the table's storage
compressing out unused space and optionally freeing unused data pages for
other data and index partitions to use. It can be done completely online
with minimal server access performance impact. The ALTER FRAGMENT...INIT
IN... function performs similarly but it will lock the table. However, if
your need is to reorder and reorganize the data within the table's data
pages, then only the first method, combined with some ordering of the data
at export time, will accomplish this.


Quote:
There is also database statistics...are there issues with running this
concurrently?

No. The only issue is that statements PREPARED/optimized before the
database statistics are generated will have their query plans invalidated
and will need to be rePREPARED after the stats have been completed for the
table(s) that the query accesses. In the latest versions of Informix,
creating an index concurrently generates a basic level of statistics
reducing the requirement to update statistics immediately after index
creation. Also, if you do not disable it (personally I recommend that you
do so), Informix has a built-in scheduled task known as AUS (Automated
Update Statistics), which runs daily and recreates any statistics which it
can determine have become stale. The AUS evaluator uses substantial server
resources when it runs if your database has a large number of tables and
indexes. There are also external tools, my dostats utility the most
notable, that will take care of this task for you with lower overhead than
AUS if you decide to disable AUS.


Quote:
Or anything else needed to be performed to keep the database running
optimally that I did not mention?

Informix engines, once tuned, tend to run themselves quite well until
something changes. Only monitoring of the server's reported performance
data can tell you when something changed that you did not plan as an
intentional change.


Quote:
Once again, thanks for your time.
_______________________________________________
Informix-list mailing list
Informix-list (AT) iiug (DOT) org
http://www.iiug.org/mailman/listinfo/informix-list

Reply With Quote
  #3  
Old   
Art Kagel
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-02-2010 , 03:51 PM



See below:

Art

Art S. Kagel
Advanced DataTools (www.advancedatatools.com)
IIUG Board of Directors (art (AT) iiug (DOT) org)

Disclaimer: Please keep in mind that my own opinions are my own opinions and
do not reflect on my employer, Advanced DataTools, the IIUG, nor any other
organization with which I am associated either explicitly, implicitly, or by
inference. Neither do those opinions reflect those of other individuals
affiliated with any entity with which I am affiliated nor those of the
entities themselves.



<SNIP>
Quote:
The Clown said:
Actually, if you set things up right from the get-go, you have to do
very little indeed to maintain good performance with Informix.

You asked:
Can I ask what you mean by setting up things from the start to avoid
the need for statistics updates and reorgs?

Properly configuring a table's extent allocation sizing is a big one.

Quote:
I know you can place tables and indexes on separate disks for
optimization. Also, utilizing indexed table partitions in DB2 allows
the ability to reorg partitions separately. These do not eliminate or
reduce the need for stats or reorgs...just makes it more efficient to
run them.

This is an OLTP database and there will be continuous updates, inserts
and deletes that will cause fragmentation.

Actually, normally, OLTP activity, including in tables that have frequent
DELETES, rarely causes any of the kind of fragmentation that will affect
performance. Here is the most common scenario that causes such a problem:


- New rows are added with a strictly increasing key or access value. Say
a SERIAL number as a invoice number or a DATETIME as an order or shipping
date.
- Deletions occur to rows for reasons other than purging the oldest
data. Say because a customer is no longer a customer or a product is no
longer kept in inventory.
- This means that records inserted serially will be deleted from across a
large percentage of the pages of the table.
- When new rows are inserted after this big DELETE, they will fill in the
holes in the existing pages, which mainly contain older transaction data,
rather than be inserted into new pages containing only newer transaction
data.
- Now when you run a daily transaction report it must access single rows
in slots spread across a large percentage of the table's storage pages
rather than just accessing a relative few pages containing the most recent
transaction data which will most likely be in memory already.

This is rare in the real world. Such random (relative to the insertion
order) deletes just are not part of an OLTP system. Even history purges are
relatively rare things, but they don't affect performance in the same way
anyway.

As Obnoxio said, it happens, I've seen a couple of instances myself over the
past 25 years or so, but it is exceedingly rare. Normal OLTP databases at
worst purge oldest data that is no longer needed, and which were mostly
inserted at around the same time, freeing up a contiguous block of pages
that then contain no data or only a few remaining active rows. Inserts
happening after this kind of purge load up blocks of contiguous rows with
many recent rows on a page, just as if the deletes never happened. It's
just that a more extensive report than a daily report, say a weekly report,
will have to access two contiguous blocks of data pages instead of one for a
while and always with many rows that are interesting on a single page.
That's no big deal to a well built RDBMS, and Informix is exceedingly well
built! Hell, even Orable doesn't have a problem in this area.


Quote:
Pray tell please...the suspense is killing me!
_______________________________________________
Informix-list mailing list
Informix-list (AT) iiug (DOT) org
http://www.iiug.org/mailman/listinfo/informix-list

Reply With Quote
  #4  
Old   
shorti
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-02-2010 , 04:47 PM



On Sep 2, 1:18*pm, Art Kagel <art.ka... (AT) gmail (DOT) com> wrote:

Quote:
*That said, while
you have to drop an index (or at least disable it and re-enable it) to
rebuild it, this operation does NOT require that the server be offline nor
does it necessarily lock out access to the table during the index build
(online index build is optional - offline index builds are normally a bit
faster). *
Are you saying that, "Yes...you can definitely perform an online index
reorg
that will not affect write access and with a minor performance hit"?
I am not
necessarily concerned with the speed of completing the task. My main
concern is if we do have to perform the index reorg, that we can
continue to
run normally. We can work around other issues like scheduling it
during
slower periods and such.


Quote:
Also, I read that reorganizing the tables is a huge performance hit
(ALTER FRAGMENT)...is this also true?

Not sure what the question is here. *Are you asking if table performance
suffers if you don't reorg the table periodically? *Are you asking if the
reorg itself is a resource hog? *
I was actually asking if the reorg is a resource hog but mainly
because I assumed
that all DBMS worked similarly when dealing with fragmentation. In
our product
we would normally be plagued by reason #3:


Quote:
* * * 3. The table experiences frequent deletions of large numbers of data
* *in a fairly random way relative to the order in which data was inserted into
* *the database. *This can cause new rows to be scattered into the slots
* *vacated by deleted rows, reducing the locality of recently inserted data.
....along with frequent inserts. We dont normally delete a large
number of rows
at the same time but over a span of time (around 5k per hour maybe).
Inserts, however
do come in chunks of 5k or 10k periodically but infrequently (maybe a
bulk insert once a
month). So nothing extreme. With our current DB2 system have to
REORG about once
every 4 months on a very busy machine with maybe 800k rows in the main
table.

The REORG we do is online and allows write access. Im sure it does
not do a physical
data reorganization but probably more like what API REPACK does.
Although, I dont
know techincal details on Online REORG.

So, my phrasing may be misleading. When I say reorg I dont mean
physical reorganization
of the tables. One thing to add DB2 REORG also does online index
reorgs as well. This is
all done without impacting the update availability.


Quote:
There is also database statistics...are there issues with running this
concurrently?

No. *The only issue is that statements PREPARED/optimized before the
database statistics are generated will have their query plans invalidated
and will need to be rePREPARED after the stats have been completed for the
table(s) that the query accesses.
Does the application need to redrive the PREPARED statement or is that
handled internally by IDS? In DB2 statistics take some time to run
(perhaps
a minute) so are you saying queries coming in during the time of the
stats
update have to wait seconds or minutes for stats to finish?


Our system requires 24/7 availability and would not tolerate any
significant
performance degradation, even for a short time, or any delay in query
operations,
even for a few seconds. Further, a maintenance outage, even once a
year, is
intolerable for most of our customers. So my goal is high
availability...no
database outage. (My next research will have to do with concurrent
database
migrations!)

Reply With Quote
  #5  
Old   
shorti
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-02-2010 , 05:19 PM



Quote:
* *- When new rows are inserted after this big DELETE, they will fillin the
* *holes in the existing pages, which mainly contain older transaction data,
* *rather than be inserted into new pages containing only newer transaction
* *data.
* *- Now when you run a daily transaction report it must access single rows
* *in slots spread across a large percentage of the table's storage pages
* *rather than just accessing a relative few pages containing the most recent
* *transaction data which will most likely be in memory already.

This is rare in the real world. Such random (relative to the insertion
order) deletes just are not part of an OLTP system. *Even history purges are
relatively rare things, but they don't affect performance in the same way
anyway.

This may be our difference since we dont delete the "oldest" row
inserted. However, I disagree with this being rare. We may not delete
what would be perceived as the oldest row inserted but we do delete
when records expire. A record reaching its expiration date may be due
to other elements besides just being the oldest one inserted.
Therefore, you can have pockets of data deleted within your table. I
would think this would be more common than having a database table
that always deletes the oldest record. Things in this world dont
normally happen so serially!

Reply With Quote
  #6  
Old   
shorti
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-02-2010 , 05:28 PM



No REORG..I'm not sure I know how to function.......does not
compute...does not compute.....does not

Reply With Quote
  #7  
Old   
Art Kagel
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-02-2010 , 07:12 PM



Answers below again:

Art

Art S. Kagel
Advanced DataTools (www.advancedatatools.com)
IIUG Board of Directors (art (AT) iiug (DOT) org)

Disclaimer: Please keep in mind that my own opinions are my own opinions and
do not reflect on my employer, Advanced DataTools, the IIUG, nor any other
organization with which I am associated either explicitly, implicitly, or by
inference. Neither do those opinions reflect those of other individuals
affiliated with any entity with which I am affiliated nor those of the
entities themselves.



On Thu, Sep 2, 2010 at 5:47 PM, shorti <lbryan21 (AT) juno (DOT) com> wrote:

Quote:
On Sep 2, 1:18 pm, Art Kagel <art.ka... (AT) gmail (DOT) com> wrote:


That said, while
you have to drop an index (or at least disable it and re-enable it) to
rebuild it, this operation does NOT require that the server be offline
nor
does it necessarily lock out access to the table during the index build
(online index build is optional - offline index builds are normally a bit
faster).

Are you saying that, "Yes...you can definitely perform an online index
reorg
that will not affect write access and with a minor performance hit"?

Yes. But as OC pointed out, you REALLY almost never have to reorg indexes.
Honestly. I managed over 50 servers with over 200 databases and nearly a TB
of data for 14 years and I don't remember ever having to reorg any index for
performance reasons.


Quote:
I am not
necessarily concerned with the speed of completing the task. My main
concern is if we do have to perform the index reorg, that we can
continue to
run normally. We can work around other issues like scheduling it
during
slower periods and such.


Also, I read that reorganizing the tables is a huge performance hit
(ALTER FRAGMENT)...is this also true?

Not sure what the question is here. Are you asking if table performance
suffers if you don't reorg the table periodically? Are you asking if the
reorg itself is a resource hog?

I was actually asking if the reorg is a resource hog but mainly
because I assumed
that all DBMS worked similarly when dealing with fragmentation. In
our product
we would normally be plagued by reason #3:

REPACK then.


Quote:

3. The table experiences frequent deletions of large numbers of
data
in a fairly random way relative to the order in which data was
inserted into
the database. This can cause new rows to be scattered into the slots
vacated by deleted rows, reducing the locality of recently inserted
data.

...along with frequent inserts. We dont normally delete a large
number of rows
at the same time but over a span of time (around 5k per hour maybe).
Inserts, however
do come in chunks of 5k or 10k periodically but infrequently (maybe a
bulk insert once a
month). So nothing extreme. With our current DB2 system have to
REORG about once
every 4 months on a very busy machine with maybe 800k rows in the main
table.

The REORG we do is online and allows write access. Im sure it does
not do a physical
data reorganization but probably more like what API REPACK does.
Although, I dont
know techincal details on Online REORG.

So, my phrasing may be misleading. When I say reorg I dont mean
physical reorganization
of the tables. One thing to add DB2 REORG also does online index
reorgs as well. This is
all done without impacting the update availability.


Yes, but DB2 doesn't clean indexes live, so reorgs of indexes are needed.
Not needed in Informix because of the Btree Scanner Thread. You just have
to configure that correctly for your workload and resources.


Quote:

There is also database statistics...are there issues with running this
concurrently?

No. The only issue is that statements PREPARED/optimized before the
database statistics are generated will have their query plans invalidated
and will need to be rePREPARED after the stats have been completed for
the
table(s) that the query accesses.

Does the application need to redrive the PREPARED statement or is that

Under most circumstances it will now happen automatically (used to be manual
always) and the application won't know about it at all. There are still
some conditions where the DECLARE CURSOR or the cursor OPEN will return a
-710 error in which case your application will have to manually reprepare
the statement and start over. Sometimes it's easier to just exit out on a
-710 and restart the app from scratch. However, there are not many -710
conditions left.


Quote:
handled internally by IDS? In DB2 statistics take some time to run
(perhaps
a minute) so are you saying queries coming in during the time of the
stats
update have to wait seconds or minutes for stats to finish?

Update stats can take a while to run properly, from a few seconds to several
minutes depending, just like DB2, but apps can continue to run while the
update statistics are running. If you are running your sessions with SET
LOCK MODE TO WAIT <nsecs>; then even the transient lock that is taken on the
data distributions table (sysdistrib) and the momentary lock on the
sysprocplan table while stored procedures are being recompiled will be
ignored by your sessions.


Quote:
Our system requires 24/7 availability and would not tolerate any
significant
performance degradation, even for a short time, or any delay in query
operations,
even for a few seconds. Further, a maintenance outage, even once a
year, is
intolerable for most of our customers. So my goal is high
availability...no
database outage. (My next research will have to do with concurrent
database
migrations!)
_______________________________________________
Informix-list mailing list
Informix-list (AT) iiug (DOT) org
http://www.iiug.org/mailman/listinfo/informix-list

Reply With Quote
  #8  
Old   
Superboer
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-03-2010 , 06:37 AM



You may want to use the expiration date in a fragmentation(partition)
strategy,
this may reduce the issue a bit....

ideally
You could run a purge expired data as a detach fragment. this
eliminates the issue of "haveing pockets of data deleted within your
table."
Dono how long the expired data my be in your database; suppose you
could have a week of that stuff in your db
then at the end of the week detach the current week of expired data.
however this may not be possible in your case.

Superboer.



Quote:
This may be our difference since we dont delete the "oldest" row
inserted. *However, I disagree with this being rare. We may not delete
what would be perceived as the oldest row inserted but we do delete
when records expire. *A record reaching its expiration date may be due
to other elements besides just being the oldest one inserted.
Therefore, you can have pockets of data deleted within your table. *I
would think this would be more common than having a database table
that always deletes the oldest record. *Things in this world dont
normally happen so serially!

Reply With Quote
  #9  
Old   
Art Kagel
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-03-2010 , 07:26 AM



I get the definite impression that there is no "expiration_date" column in
the table, but that certain conditions on these rows and rows in related
tables determine the expiration of a row.

Art

Art S. Kagel
Advanced DataTools (www.advancedatatools.com)
IIUG Board of Directors (art (AT) iiug (DOT) org)

Disclaimer: Please keep in mind that my own opinions are my own opinions and
do not reflect on my employer, Advanced DataTools, the IIUG, nor any other
organization with which I am associated either explicitly, implicitly, or by
inference. Neither do those opinions reflect those of other individuals
affiliated with any entity with which I am affiliated nor those of the
entities themselves.



On Fri, Sep 3, 2010 at 7:37 AM, Superboer <superboer7 (AT) t-online (DOT) de> wrote:

Quote:
You may want to use the expiration date in a fragmentation(partition)
strategy,
this may reduce the issue a bit....

ideally
You could run a purge expired data as a detach fragment. this
eliminates the issue of "haveing pockets of data deleted within your
table."
Dono how long the expired data my be in your database; suppose you
could have a week of that stuff in your db
then at the end of the week detach the current week of expired data.
however this may not be possible in your case.

Superboer.



This may be our difference since we dont delete the "oldest" row
inserted. However, I disagree with this being rare. We may not delete
what would be perceived as the oldest row inserted but we do delete
when records expire. A record reaching its expiration date may be due
to other elements besides just being the oldest one inserted.
Therefore, you can have pockets of data deleted within your table. I
would think this would be more common than having a database table
that always deletes the oldest record. Things in this world dont
normally happen so serially!

_______________________________________________
Informix-list mailing list
Informix-list (AT) iiug (DOT) org
http://www.iiug.org/mailman/listinfo/informix-list

Reply With Quote
  #10  
Old   
shorti
 
Posts: n/a

Default Re: Concurrent Database Maintenance? - 09-03-2010 , 11:43 AM



On Sep 3, 5:26*am, Art Kagel <art.ka... (AT) gmail (DOT) com> wrote:
Quote:
I get the definite impression that there is no "expiration_date" column in
the table, but that certain conditions on these rows and rows in related
tables determine the expiration of a row.

Art

Art S. Kagel
Advanced DataTools (www.advancedatatools.com)
IIUG Board of Directors (a... (AT) iiug (DOT) org)

Disclaimer: Please keep in mind that my own opinions are my own opinions and
do not reflect on my employer, Advanced DataTools, the IIUG, nor any other
organization with which I am associated either explicitly, implicitly, orby
inference. *Neither do those opinions reflect those of other individuals
affiliated with any entity with which I am affiliated nor those of the
entities themselves.

On Fri, Sep 3, 2010 at 7:37 AM, Superboer <superbo... (AT) t-online (DOT) de> wrote:
You may want to use the expiration date *in a fragmentation(partition)
strategy,
this may reduce the issue a bit....

ideally
You could run a purge expired data as a detach fragment. this
eliminates the issue of *"haveing pockets of data deleted within your
table."
Dono how long the expired data my be in your database; suppose you
could have a week of that stuff in your db
then at the end of the week detach the current week of expired data.
however this may not be possible in your case.

Superboer.

This may be our difference since we dont delete the "oldest" row
inserted. *However, I disagree with this being rare. We may not delete
what would be perceived as the oldest row inserted but we do delete
when records expire. *A record reaching its expiration date may be due
to other elements besides just being the oldest one inserted.
Therefore, you can have pockets of data deleted within your table. *I
would think this would be more common than having a database table
that always deletes the oldest record. *Things in this world dont
normally happen so serially!

_______________________________________________
Informix-list mailing list
Informix-l... (AT) iiug (DOT) org
http://www.iiug.org/mailman/listinfo/informix-list
Art, that is correct. It is much more complex than expiring oldest
data. The expiration lengths are actually user defined and differ
depending on the importance. We have a separate table that is used by
our customer to define categories. The categories can contain an
expiration plan. The main data is set to one of these categories. So
depending on what the customer has set determines how often a record
would expire (If at all). On some machines very few deletes may occur
and on others it may be frequent.

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.