dbTalk Databases Forums  

transaction logs filling up: detect offending job

comp.databases.ingres comp.databases.ingres


Discuss transaction logs filling up: detect offending job in the comp.databases.ingres forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
seldom isnice
 
Posts: n/a

Default transaction logs filling up: detect offending job - 10-26-2011 , 04:59 AM






Hi

We have numerous batch jobs that run each night. Some jobs might run
weekly.

In the course of the week there are times when the batch jobs reach
72% and that causes *another* job to fail!
This is because when the transaction logs reach 72% the oldest running
transaction is rolled back while the current job that is filling the
tx logs continues on. It will only fail if it reaches a limit such as
90%.

So we're in a situation where the logs occasionally go above 72% but
do not reach 90%.

(Also the job that is rolled back is only rolled back when it goes to
commit which is at the very end so that "time" is not very useful
either)

Is there any way to identify the culprit that is causing the
transaction logs to fill ? Is there some kind of monitoring that would
cross-reference the tx logs to the unix process?

Please note there are hundreds of jobs.

So far I have some monitoring in place that shows:

Tuesday 25 October 2011 21:17:27 IST Percentage of log file in use or
reserved: 14
Tuesday 25 October 2011 21:18:27 IST Percentage of log file in use or
reserved: 15
Tuesday 25 October 2011 21:19:27 IST Percentage of log file in use or
reserved: 15
Tuesday 25 October 2011 21:20:27 IST Percentage of log file in use or
reserved: 16
Tuesday 25 October 2011 21:21:27 IST Percentage of log file in use or
reserved: 16
Tuesday 25 October 2011 21:22:27 IST Percentage of log file in use or
reserved: 17
Tuesday 25 October 2011 21:23:28 IST Percentage of log file in use or
reserved: 19
Tuesday 25 October 2011 21:24:28 IST Percentage of log file in use or
reserved: 20
Tuesday 25 October 2011 21:25:28 IST Percentage of log file in use or
reserved: 19
Tuesday 25 October 2011 21:26:28 IST Percentage of log file in use or
reserved: 19
Tuesday 25 October 2011 21:27:28 IST Percentage of log file in use or
reserved: 20
Tuesday 25 October 2011 21:28:28 IST Percentage of log file in use or
reserved: 21
Tuesday 25 October 2011 21:29:28 IST Percentage of log file in use or
reserved: 21
Tuesday 25 October 2011 21:30:28 IST Percentage of log file in use or
reserved: 22
Tuesday 25 October 2011 21:31:29 IST Percentage of log file in use or
reserved: 22
Tuesday 25 October 2011 21:32:29 IST Percentage of log file in use or
reserved: 37
Tuesday 25 October 2011 21:33:29 IST Percentage of log file in use or
reserved: 31
Tuesday 25 October 2011 21:34:29 IST Percentage of log file in use or
reserved: 32
Tuesday 25 October 2011 21:35:29 IST Percentage of log file in use or
reserved: 32
Tuesday 25 October 2011 21:36:29 IST Percentage of log file in use or
reserved: 33
Tuesday 25 October 2011 21:37:29 IST Percentage of log file in use or
reserved: 33
Tuesday 25 October 2011 21:38:29 IST Percentage of log file in use or
reserved: 34
Tuesday 25 October 2011 21:39:29 IST Percentage of log file in use or
reserved: 35
Tuesday 25 October 2011 21:40:29 IST Percentage of log file in use or
reserved: 35
Tuesday 25 October 2011 21:41:30 IST Percentage of log file in use or
reserved: 36
Tuesday 25 October 2011 21:42:30 IST Percentage of log file in use or
reserved: 36
Tuesday 25 October 2011 21:43:30 IST Percentage of log file in use or
reserved: 37
Tuesday 25 October 2011 21:44:30 IST Percentage of log file in use or
reserved: 37
Tuesday 25 October 2011 21:45:30 IST Percentage of log file in use or
reserved: 38
Tuesday 25 October 2011 21:46:30 IST Percentage of log file in use or
reserved: 38
Tuesday 25 October 2011 21:47:30 IST Percentage of log file in use or
reserved: 39
Tuesday 25 October 2011 21:48:30 IST Percentage of log file in use or
reserved: 39
Tuesday 25 October 2011 21:49:30 IST Percentage of log file in use or
reserved: 39
Tuesday 25 October 2011 21:50:31 IST Percentage of log file in use or
reserved: 40
Tuesday 25 October 2011 21:51:31 IST Percentage of log file in use or
reserved: 42
Tuesday 25 October 2011 21:52:31 IST Percentage of log file in use or
reserved: 43
Tuesday 25 October 2011 21:53:31 IST Percentage of log file in use or
reserved: 46
Tuesday 25 October 2011 21:54:31 IST Percentage of log file in use or
reserved: 47
Tuesday 25 October 2011 21:55:31 IST Percentage of log file in use or
reserved: 51
Tuesday 25 October 2011 21:56:31 IST Percentage of log file in use or
reserved: 52
Tuesday 25 October 2011 21:57:31 IST Percentage of log file in use or
reserved: 54
Tuesday 25 October 2011 21:58:31 IST Percentage of log file in use or
reserved: 55
Tuesday 25 October 2011 21:59:31 IST Percentage of log file in use or
reserved: 56
Tuesday 25 October 2011 22:00:31 IST Percentage of log file in use or
reserved: 58
Tuesday 25 October 2011 22:01:31 IST Percentage of log file in use or
reserved: 59
Tuesday 25 October 2011 22:02:31 IST Percentage of log file in use or
reserved: 60

Reply With Quote
  #2  
Old   
Roy Hann
 
Posts: n/a

Default Re: transaction logs filling up: detect offending job - 10-26-2011 , 06:34 AM






seldom isnice wrote:

Quote:
Hi

We have numerous batch jobs that run each night. Some jobs might run
weekly.

In the course of the week there are times when the batch jobs reach
72% and that causes *another* job to fail!
This is because when the transaction logs reach 72% the oldest running
transaction is rolled back while the current job that is filling the
tx logs continues on. It will only fail if it reaches a limit such as
90%.

So we're in a situation where the logs occasionally go above 72% but
do not reach 90%.

(Also the job that is rolled back is only rolled back when it goes to
commit which is at the very end so that "time" is not very useful
either)

Is there any way to identify the culprit that is causing the
transaction logs to fill ? Is there some kind of monitoring that would
cross-reference the tx logs to the unix process?

Please note there are hundreds of jobs.
Run this query repeatedly in imadb, looking for large/growing values of
tx_state_write:

SELECT s.client_pid, t.tx_state_write
FROM ima_server_sessions s JOIN ima_log_transactions t
ON s.session_id = t.tx_session_id
WHERE s.client_pid <> '';

--
Roy

UK Ingres User Association Conference 2012 will be on Tuesday June 19 2012.
The latest information is available from www.uk-iua.org.uk.

Reply With Quote
  #3  
Old   
Karl Schendel
 
Posts: n/a

Default Re: [Info-Ingres] transaction logs filling up: detect offending job - 10-26-2011 , 07:15 AM



On Oct 26, 2011, at 5:59 AM, seldom isnice wrote:

Quote:
In the course of the week there are times when the batch jobs reach
72% and that causes *another* job to fail!
...

(Also the job that is rolled back is only rolled back when it goes to
commit which is at the very end so that "time" is not very useful
either)
That might be a clue. Force-abort actually aborts the target transaction
right away. If the target is in a query, the query is aborted, and upon
return to the sequencer, the transaction is aborted. If the target is in
between queries, the session thread is awakened to do the abort, but
the force-abort can't be reported to the target until the next time it
issues a query. It would seem that in your case, that "next query" is
a commit. Perhaps the commit is being delayed for some reason?


Quote:
Is there any way to identify the culprit that is causing the
transaction logs to fill ? Is there some kind of monitoring that would
cross-reference the tx logs to the unix process?

Please note there are hundreds of jobs.
You might be able to use the number of log writes per transaction,
although that's not a sure thing since it doesn't tell you the size of
the log records. ipm can tell you this, or logstat -user_transactions.
Logstat or ipm should also tell you when the force-abort actually
happened, so that you can correlate the time with what's running.
I imagine IMA can tell you as well, but I'm too lazy to look up the
query at the moment.

It would also be worth looking into the job being aborted to make
sure that it's issuing its commit in a timely manner.

Karl

Reply With Quote
  #4  
Old   
seldom isnice
 
Posts: n/a

Default Re: transaction logs filling up: detect offending job - 10-26-2011 , 12:01 PM



On Oct 26, 1:15*pm, Karl Schendel <schen... (AT) kbcomputer (DOT) com> wrote:
Quote:
On Oct 26, 2011, at 5:59 AM, seldom isnice wrote:

In the course of the week there are times when the batch jobs reach
72% and that causes *another* job to fail!
...

(Also the job that is rolled back is only rolled back when it goes to
commit which is at the very end so that "time" is not very useful
either)

That might be a clue. *Force-abort actually aborts the target transaction
right away. *If the target is in a query, the query is aborted, and upon
return to the sequencer, the transaction is aborted. *If the target is in
between queries, the session thread is awakened to do the abort, but
the force-abort can't be reported to the target until the next time it
issues a query. *It would seem that in your case, that "next query" is
a commit. *Perhaps the commit is being delayed for some reason?

Is there any way to identify the culprit that is causing the
transaction logs to fill ? Is there some kind of monitoring that would
cross-reference the tx logs to the unix process?

Please note there are hundreds of jobs.

You might be able to use the number of log writes per transaction,
although that's not a sure thing since it doesn't tell you the size of
the log records. ipm can tell you this, or logstat -user_transactions.
Logstat or ipm should also tell you when the force-abort actually
happened, so that you can correlate the time with what's running.
I imagine IMA can tell you as well, but I'm too lazy to look up the
query at the moment.

It would also be worth looking into the job being aborted to make
sure that it's issuing its commit in a timely manner.

Karl
Guys

Thanks. That's certainly useful.

The transaction that is being rolled back is embedded SQL/C that
starts with:
set autocommit on
set nojournaling

It is doing the following:
truncate
copy in
indexing

The copy-in statement gives the error: E_QE0024 The transaction log
file is full. The transaction will be aborted.

Thanks.

Reply With Quote
  #5  
Old   
Roy Hann
 
Posts: n/a

Default Re: transaction logs filling up: detect offending job - 10-26-2011 , 12:19 PM



seldom isnice wrote:


Quote:
The transaction that is being rolled back is embedded SQL/C that
starts with:
set autocommit on
set nojournaling

It is doing the following:
truncate
copy in
indexing

The copy-in statement gives the error: E_QE0024 The transaction log
file is full. The transaction will be aborted.
Ideally the COPY...FROM wouldn't be doing any logging at all. It
appears it may be getting tricked into doing an incremental load, which
is very slow and does lots of logging.

To ensure it does a bulk load the table needs to be empty (which it
will be, because it's just been truncated); not journaled; no secondary
index (which there won't be unless it was declared WITH PERSISTENCE, and
it mustn't have any system maintained object_key or table_key columns.

--
Roy

UK Ingres User Association Conference 2012 will be on Tuesday June 19 2012.
The latest information is available from www.uk-iua.org.uk.

Reply With Quote
  #6  
Old   
seldom isnice
 
Posts: n/a

Default Re: transaction logs filling up: detect offending job - 10-26-2011 , 03:08 PM



On Oct 26, 6:19*pm, Roy Hann <specia... (AT) processed (DOT) almost.meat> wrote:
Quote:
seldom isnice wrote:
The transaction that is being rolled back is embedded SQL/C that
starts with:
set autocommit on
set nojournaling

It is doing the following:
* *truncate
* *copy in
* *indexing

The copy-in statement gives the error: E_QE0024 The transaction log
file is full. The transaction will be aborted.

Ideally the COPY...FROM wouldn't be doing any logging at all. *It
appears it may be getting tricked into doing an incremental load, which
is very slow and does lots of logging.

To ensure it does a bulk load the table needs to be empty (which it
will be, because it's just been truncated); not journaled; no secondary
index (which there won't be unless it was declared WITH PERSISTENCE, and
it mustn't have any system maintained object_key or table_key columns.

--
Roy

UK Ingres User Association Conference 2012 will be on Tuesday June 19 2012. *
The latest information is available fromwww.uk-iua.org.uk.
I'm not suggesting that the copy-in sql/c job is *causing* the
logging.
We have been running this job for years and it hasn't been a problem.
ie the table is truncated. There are no secondary indexes.
autocommit is on
nojournaling is on
So unless there is some kind of intermittent behaviour.

So the point I'm making is that when the copy in reaches the end of
loading all the 7m records it appears to do a commit
at which point the server says: tx aborted. (due to a log full that
happened a while earlier)

Just wondering if that is consistent with what Karl said about
aborting right away.

Thanks

Reply With Quote
  #7  
Old   
Karl Schendel
 
Posts: n/a

Default Re: [Info-Ingres] transaction logs filling up: detect offending job - 10-26-2011 , 03:51 PM



On Oct 26, 2011, at 4:08 PM, seldom isnice wrote:
Quote:

I'm not suggesting that the copy-in sql/c job is *causing* the
logging.
We have been running this job for years and it hasn't been a problem.
ie the table is truncated. There are no secondary indexes.
autocommit is on
nojournaling is on
If the job is in fact doing "set nojournaling" that will not disable journaling
for a table already being journaled. You need "set nojournaling on tablename"
for that.

I don' t know that the copy-in job is causing the problem either, but if the
target table is journaled than it is certainly doing its share of logging.

Quote:
So the point I'm making is that when the copy in reaches the end of
loading all the 7m records it appears to do a commit
at which point the server says: tx aborted. (due to a log full that
happened a while earlier)
The way a copy-from interacts with force abort is that when the server
decides that the copy-from session is the force abort target, it posts
an unsolicited message to the client. The copy-from code eventually
polls for this and replies with an interrupt ack, at which point the
server rolls back the transaction and ends the copy with the force-abort
messages. The poll time varies depending on Ingres version, but it's
unlikely to be more than a few hundred rows worth of input, i.e. a second
or three.

if the copy were to complete successfully, it would indeed commit since
autocommit is turned on.

If the copy-from is really doing a bulk copy, it will write a handful of log
records and then chug away doing its thing. At the end, it will write a few
more log records and finish up. Abort due to force-abort would work
the same way, though, bulk copy or no bulk copy.

Have you mentioned which Ingres version is being used?

Karl

Reply With Quote
  #8  
Old   
seldom isnice
 
Posts: n/a

Default Re: transaction logs filling up: detect offending job - 10-27-2011 , 03:45 AM



On Oct 26, 9:51*pm, Karl Schendel <schen... (AT) kbcomputer (DOT) com> wrote:
Quote:
On Oct 26, 2011, at 4:08 PM, seldom isnice wrote:



I'm not suggesting that the copy-in sql/c job is *causing* the
logging.
We have been running this job for years and it hasn't been a problem.
ie the table is truncated. There are no secondary indexes.
autocommit is on
nojournaling is on

If the job is in fact doing "set nojournaling" that will not disable journaling
for a table already being journaled. *You need "set nojournaling on tablename"
for that.

I don' t know that the copy-in job is causing the problem either, but if the
target table is journaled than it is certainly doing its share of logging..



So the point I'm making is that when the copy in reaches the end of
loading all the 7m records it appears to do a commit
at which point the server says: tx aborted. (due to a log full that
happened a while earlier)

The way a copy-from interacts with force abort is that when the server
decides that the copy-from session is the force abort target, it posts
an unsolicited message to the client. *The copy-from code eventually
polls for this and replies with an interrupt ack, at which point the
server rolls back the transaction and ends the copy with the force-abort
messages. *The poll time varies depending on Ingres version, but it's
unlikely to be more than a few hundred rows worth of input, i.e. a second
or three.

if the copy were to complete successfully, it would indeed commit since
autocommit is turned on.

If the copy-from is really doing a bulk copy, it will write a handful of log
records and then chug away doing its thing. *At the end, it will write a few
more log records and finish up. * Abort due to force-abort would work
the same way, though, bulk copy or no bulk copy.

Have you mentioned which Ingres version is being used?

Karl
The table is set to nojournaling
The version of Ingres is 9.2

Roy: when I try to connect to imadb it says: The major database
compatibility does not match that of the server.
Be sure the server is compatible with the requested database.

Reply With Quote
  #9  
Old   
On net
 
Posts: n/a

Default Re: transaction logs filling up: detect offending job - 10-27-2011 , 03:58 AM



On 26/10/2011 10:59, seldom isnice wrote:
Quote:
Hi

We have numerous batch jobs that run each night. Some jobs might run
weekly.

In the course of the week there are times when the batch jobs reach
72% and that causes *another* job to fail!
snip

Quote:
Is there any way to identify the culprit that is causing the
transaction logs to fill ? Is there some kind of monitoring that would
cross-reference the tx logs to the unix process?

Please note there are hundreds of jobs.
As a software developer, if I look back on the development of software
and the shift in emphasis from bare functionality to considering the
whole user experience, it rather dismays me that a DBA is even having to
ask this question. That's not a reflection on the DBA, but the fact that
it's not an easy thing to do, yet the world over, for twenty years or
more DBAs have probably encountered exactly the same issue and companies
have had upheaval and downtime because of it.

It would be good if Actian took a look at the user experience of their
DBAs and moved it forward from the distant past into the modern era.

DBAs shouldn't need to scrabble about on mailing lists to be able to
solve this very basic stuff. Ingres (as in the software) should provide
this information easily. Isn't it embarrassing?

Reply With Quote
  #10  
Old   
Roy Hann
 
Posts: n/a

Default Re: transaction logs filling up: detect offending job - 10-27-2011 , 04:32 AM



On net wrote:

Quote:
On 26/10/2011 10:59, seldom isnice wrote:
Hi

We have numerous batch jobs that run each night. Some jobs might run
weekly.

In the course of the week there are times when the batch jobs reach
72% and that causes *another* job to fail!

snip

Is there any way to identify the culprit that is causing the
transaction logs to fill ? Is there some kind of monitoring that would
cross-reference the tx logs to the unix process?

Please note there are hundreds of jobs.

As a software developer, if I look back on the development of software
and the shift in emphasis from bare functionality to considering the
whole user experience, it rather dismays me that a DBA is even having to
ask this question. That's not a reflection on the DBA, but the fact that
it's not an easy thing to do, yet the world over, for twenty years or
more DBAs have probably encountered exactly the same issue and companies
have had upheaval and downtime because of it.

It would be good if Actian took a look at the user experience of their
DBAs and moved it forward from the distant past into the modern era.

DBAs shouldn't need to scrabble about on mailing lists to be able to
solve this very basic stuff. Ingres (as in the software) should provide
this information easily. Isn't it embarrassing?
For close to 20 years DMT Inc. and Rational Commerce Ltd. have been
trying to sell DBAnalyzer which (among many other things) spots,
diagnoses, reports, and recommends a fix for exactly this problem. For
close 20 years I can count the number of requests for a free
evaluation I have received on the fingers of my hands.

http://www.dmt.com

I can understand people not wanting to pay money for a tool that comes
close to providing lights-out Ingres support, but the lack of any
interest at all tells me there's no particular appetite for the user
experience you describe. And without expressed interest, why would
anyone bother?

--
Roy

UK Ingres User Association Conference 2012 will be on Tuesday June 19 2012.
The latest information is available from www.uk-iua.org.uk.

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.