dbTalk Databases Forums  

[Info-ingres] commit takes very long time

comp.databases.ingres comp.databases.ingres


Discuss [Info-ingres] commit takes very long time in the comp.databases.ingres forum.



Reply
 
Thread Tools Display Modes
  #11  
Old   
Roy Hann
 
Posts: n/a

Default Re: [Info-ingres] commit takes very long time - 03-06-2005 , 08:40 AM






<martin.bowes (AT) ctsu (DOT) ox.ac.uk> wrote

Quote:
Hi Dudes,

Hi Piotr,

That looks pretty good. You clearly have a lot of buffers
configured. At least 3584 and have only managed to use that many once.

16 log writer threads may be a bit small for that number of
buffers. It might be worthwhile trying to increase that value
substantially given the number of times we see 'All logwriters busy'.
You might like to check how many threads your allowed to start first.

I'll take that back now. I just tried setting the number of
log_writer_threads up substantially (to 150) on a Linux IngresII2.6 and
promptlly put the server into a spin. 100% cpu! So if you do raise the
number of threads - don't go crazy.
That may be more of a Linux problem than an Ingres limit. The "standard"
Linux thread mechanism is notorious rubbish. Make sure you have a kernel
that implements Posix threads (e.g. 2.6).

Roy




Reply With Quote
  #12  
Old   
Betty & Karl Schendel
 
Posts: n/a

Default Re: [Info-ingres] commit takes very long time - 03-06-2005 , 09:15 AM






At 2:40 PM +0000 3/6/05, Roy Hann wrote:
Quote:
martin.bowes (AT) ctsu (DOT) ox.ac.uk> wrote in message
news:mailman.1110103382.9699.info-ingres (AT) cariboulake (DOT) com...
Hi Dudes,

Hi Piotr,

That looks pretty good. You clearly have a lot of buffers
configured. At least 3584 and have only managed to use that many once.

16 log writer threads may be a bit small for that number of
buffers. It might be worthwhile trying to increase that value
substantially given the number of times we see 'All logwriters busy'.
You might like to check how many threads your allowed to start first.

I'll take that back now. I just tried setting the number of
log_writer_threads up substantially (to 150) on a Linux IngresII2.6 and
promptlly put the server into a spin. 100% cpu! So if you do raise the
number of threads - don't go crazy.

That may be more of a Linux problem than an Ingres limit. The "standard"
Linux thread mechanism is notorious rubbish. Make sure you have a kernel
that implements Posix threads (e.g. 2.6).
That won't do much good, alas, with Ingres 2.6. Ingres 2.6 is compiled
to assume linuxthreads, not NPTL threads. The relevant code inside
the server is controlled by a compile time switch, not a runtime switch,
although I suppose in principle the detection could be done at runtime.
It's likely that a 2.6 kernel will run Ingres 2.6 a little better than
a 2.4 kernel, but not much better.

Marty's observation may also be caused by a couple of outright bugs
in the logwriter and log buffer chain handling in 2.6. I'm pretty sure
the bugs are fixed in recent patches, but I don't know what exactly "recent"
means. 100% CPU spinning certainly sounds like the description of one
of the bugs, though. I would avoid large numbers of log buffers and logwriters
in Ingres 2.6. (Where large is say more than a few dozen logwriters, and
more than a couple hundred log buffers. Your mileage may vary, test to
see what works for you.)

Karl


Reply With Quote
  #13  
Old   
Artur Fajok
 
Posts: n/a

Default Re: [Info-ingres] commit takes very long time - 03-07-2005 , 03:28 AM



Quote:
That's an awful lot of log buffers. I question whether that many buffers
are doing you any good. (It's not impossible, if you have really large
updating transactions, just unlikely.)

Starting with 2.6 (I think), multiple rollbacks can proceed concurrently,
and in any case I don't think that rollback would stall a commit.

I suspect that you ran into either some sort of hardware hiccup, or
a consistency-point wait (BCP wait) that took a really long time for
some unknown reason. But I don't really know.

Karl

There are 10000(4k) buffers configured. I didn't found any performace
gains in further increasing this amount. But it serves us much better
than previously configured 4000.


Those buffers are for:
4 primary dbms servers (16 log writers each),
2 secondary dbms servers (1 log writer each)
Recovery server (16 log writers).
Total: 80 log writers (125 buffers per thread).

Mystery of long commit resolved: transaction log was under heavy load -
see attachment for key values from logstat before, and after transaction.
BTW unfortunately transaction log is sharing devices with other
filesystems - configuring a server is always a compromise.

Regards,
Artur Fajok




+---------------+---------------+-------------------------+-------------+-------------+-------------+-------------+-------------+--------------+-------------+--------------+--------------+-------------------+-------------------+
Quote:
machine_name |machine_ip |date |log_read_ios |log_write_ios|log_writes |log_forces |log_split_wai|log_free_waits|log_bcp_waits|kbytes _written|write_complete|all_logwriters_busy|max_wr ite_queue_cnt|
+---------------+---------------+-------------------------+-------------+-------------+-------------+-------------+-------------+--------------+-------------+--------------+--------------+-------------------+-------------------+
sun480 |10.66.134.69 |050303 21:16:20 | 752082| 747115| 9678627| 103909| 6742| 83| 41| 1096434| 747113| 93972| 82|
sun480 |10.66.134.69 |050303 21:32:29 | 818072| 822283| 10164158| 105886| 10742| 115| 44| 1243593| 822279| 149460| 114|
+---------------+---------------+-------------------------+-------------+-------------+-------------+-------------+-------------+--------------+-------------+--------------+--------------+-------------------+-------------------+



Reply With Quote
  #14  
Old   
Gibson Jonathan
 
Posts: n/a

Default RE: [Info-ingres] commit takes very long time - 03-07-2005 , 04:21 AM



I'd seriously look at moving your log file to it's own device. Disks
are cheap cheap cheap. You might have a problem with finding a free bay
to put one in but there's always the refurbished option (we just did
that and saved a fortune and the refurbished disks were 18 months newer
than our originals anyway). As Roy stated last week, the transaction
log is the critical and most vulnerable part of an Ingres installation.


-----Original Message-----
From: info-ingres-admin (AT) cariboulake (DOT) com
[mailto:info-ingres-admin (AT) cariboulake (DOT) com] On Behalf Of Artur Fajok
Sent: 07 March 2005 09:28
To: info-ingres (AT) cariboulake (DOT) com
Subject: Re: [Info-ingres] commit takes very long time


Quote:
That's an awful lot of log buffers. I question whether that many
buffers
are doing you any good. (It's not impossible, if you have really
large
updating transactions, just unlikely.)

Starting with 2.6 (I think), multiple rollbacks can proceed
concurrently,
and in any case I don't think that rollback would stall a commit.

I suspect that you ran into either some sort of hardware hiccup, or
a consistency-point wait (BCP wait) that took a really long time for
some unknown reason. But I don't really know.

Karl

There are 10000(4k) buffers configured. I didn't found any performace
gains in further increasing this amount. But it serves us much better
than previously configured 4000.


Those buffers are for:
4 primary dbms servers (16 log writers each),
2 secondary dbms servers (1 log writer each)
Recovery server (16 log writers).
Total: 80 log writers (125 buffers per thread).

Mystery of long commit resolved: transaction log was under heavy load -
see attachment for key values from logstat before, and after
transaction.
BTW unfortunately transaction log is sharing devices with other
filesystems - configuring a server is always a compromise.

Regards,
Artur Fajok





__________________________________________________ ___________________
This message has been checked for all known viruses by bluesource. For
further information visit www.blue-source.com

powered by Messagelabs


<font size='1' face='arial'>************************************* *********************************
This email and any files transmitted with it are confidential and
intended solely for the use of the individual or entity to whom they
are addressed. No one else is authorised to distribute, forward,
print, copy or act upon any information contained in this email.
If you have received this email in error, please notify the sender.

Hiscox Syndicates Limited, Hiscox Insurance Company Limited,
Hiscox Connect Limited, Hiscox Underwriting Limited and
Hiscox Investment Management Limited are authorised and
regulated by the Financial Services Authority. Hiscox plc is a
company registered in England and Wales under company
registration number 2837811 and registered office at
1 Great St Helen's, London EC3A 6HX
************************************************** ********************


__________________________________________________ ___________________
This message has been checked for all known viruses by blue-source. For further information visit www.blue-source.com

powered by Messagelabs



Reply With Quote
  #15  
Old   
Roy Hann
 
Posts: n/a

Default Re: [Info-ingres] commit takes very long time - 03-07-2005 , 04:48 AM



"Artur Fajok" <forrest_NOSPAM (AT) gazeta (DOT) pl> wrote


Quote:
Those buffers are for:
4 primary dbms servers (16 log writers each),
2 secondary dbms servers (1 log writer each)
Recovery server (16 log writers).
Total: 80 log writers (125 buffers per thread).

Mystery of long commit resolved: transaction log was under heavy load -
I don't think that really is the problem actually. I think it is even
simpler than that.

Quote:
see attachment for key values from logstat before, and after transaction.
BTW unfortunately transaction log is sharing devices with other
filesystems - configuring a server is always a compromise.
I couldn't really make sense of the logging stats you posted because of the
formatting (well, more accurately, I couldn't be bothered to try making
sense of them :-). However, let's do a little arithmetic. 10,000 4k log
buffers could take 10,000 I/Os to write, and they have to be written in
order and synchronously, so you are not going to manage more than about 100
per second at best, so that is 100 seconds of writing right there,
regardless of what else the file system is doing.

It would appear that those 10,000 log buffers have allowed you to save up
all the cost of logging a big update until it fills thousands of log
buffers, and then incur all the cost at COMMIT. I admit that I've not seen
that happen before, but it makes sense now. The good news is that your
total transaction running time is probably no different than if you'd used
fewer log buffers in a shorter cycle, and it might even be a tad faster
overall. Although this "slow" commit phenomenon is unfamiliar (to me), it
is probably to be expected and not in fact a sign of a problem.

You might benefit from using a RAM disk for your transaction log. The
reliability of RAM disks is very attractive all by itself, and the blazing
speed is a bonus. I would have suggested Platypus Technology, a company
that we did some work with a few years ago, but sadly they've gone out of
business. Still, I am sure that five minutes alone with Google will turn up
somebody selling 2Gb RAM disks, and they can only be getting cheaper.

Roy




Reply With Quote
  #16  
Old   
martin.bowes@ctsu.ox.ac.uk
 
Posts: n/a

Default Re: [Info-ingres] commit takes very long time - 03-07-2005 , 08:18 AM



Hi Roy et al,

Man this is just a fascinating thread...Sorry about the length of
this, I'm just free wheeling ideas around...

Artur/Piotr may have configured 10,000 log buffers but
according to the logstat output have never managed to queue up more
than 3584 of them - and anaged that only the once. Ingres 'should'
never attempt to write more than the currently used buffers to the log
file.

Furthermore we don't know with which querie(s) they managed
to do that. Its just the known high water mark. Nor do we know the
distribution of the log buffer queue over time. They could be in a
situation where the vast majority of writes to the log buffers are handled
in one or more pages, with only a few processes needing more.

This is why I wanted to know if this was a general failing or in response
to a specific query.

A couple of extra things to point out from Piotrs logstat was that
he isnt getting much out of the group commit feature. Ratio of 'Log
group count' to 'log group commit' is 1.14 So rarely does he have more
than one transaction in a buffer.

Further, checking the the various wait states is interesting in
that the Log waits= 209995 is half the log write i/o figure of 560558.
This would seem to indicate that Piotrs log is in a fair deal of bother,
writes have a very good chance of stalling.

We may be able to measure the disk performance with iostat.
However, each UNIX implementation has a different implementation of
this command - some notably more verbose/useful than others. For
example the DUNIX iostat merely notes the transfer rate in Kbytes/sec -
not differentiating between reads and writes.

But lets talk about the process in general, and how logstat can
be used. Feel free to jump in and kick my head where I'm wrong.

A transaction commits. It has associated dirty pages in the
DMF Cache. That Cache could be a shared memory segment for
shared servers or part of the server process memory for a sole server.
It can be firther complicated by being a 2k page, a 4k page etc. But
either way, something does a memory to memory write of the modifed
page to the log buffers. I don't know if that's part of the logwriter threads
duties. I suspect not, I'm pretty sure it just gets log buffers to the log file.
But as we configure threads in the the dbms server, I'm not sure. And
as I've written this up, I've swung from one view to the other on that.

The ' Log writes' field in logstat is supposed to show the
number of such writes from the DMF Cache to the log buffers. These
are memory to memory writes and should be x1000's time faster than
the disk i/o to log file. In Piotrs case this was 9391333. Refering to
Piotrs logstat, the log knows of 1102419 transactions. Which tends to
suggest that the majority of transactions are small.

I presume a single log writer thread starts up to write the committed
transaction to the log buffers. We only get 'All logwriters busy' if there
are more committing transactions in the system than threads to handle
them. Refering to Piotrs logstat, the log knows of 1102419 transactions
and has maxed out logwriters on 31101 occasions. About 3%. Is that
too much?

Now we know that log free wait is zero in Piotrs case. So there
are no occassions where the logwriters have to stall waiting for a free
buffer. But its an interesting meaure. Personally whenever I see a 'log
wait' or 'log free wait' or 'log split wait', I'd bump the number of log
buffers and use the 'Max write queue len' to tell me if I'm covered.

As each log buffer is started I presume a group commit feature
starts some ticking process to see if it can be used by another
transaction. But as stated earlier, Piotr isnt getting much out of that.
Now I cant recall how long it will wait for another transaction to require a
log buffer. IIRC there is some 'you must commit another transaction
within 3 'ticks', and in any case I'll write this log buffer after 5ticks
situation going on. How long is a tick? Could this be a source of
problems in Piotrs case? Because whether or not he ultimatly uses
group commit - he's going to have to wait to find out. I wonder if that's
on a transaction basis or on a per log buffer basis.

Thats enough for now. I'm getting back to work. But if anyone
can clear this up I'd be happy.

Martin Bowes

--
Random Duckman Quote #82:
Cornfed: We've got to get to the banquet and stop that toast.
Ajax: ...But I like toast...it's the muffins that must be stopped.
Duckman: Okay son, now you are starting to scare the viewers.



Reply With Quote
  #17  
Old   
Artur Fajok
 
Posts: n/a

Default Re: [Info-ingres] commit takes very long time - 03-07-2005 , 11:02 AM




Quote:
Thats enough for now. I'm getting back to work. But if
anyone can clear this up I'd be happy.

Martin Bowes

Just few words:
Logstat presented by Piotr is not quite accureate - as part of daily
maintenance, ingres is restarted on daily basis (remnants of few bugs).
Output that you have seen is representation of rather late beginning
of intensive data processing. If anyone would be interested, I can post
some logstat output from really heavy load (may be correlated with iostat).
For now I'm attaching a little better formatted output from archived
logstat values with proper timeframe.

Regards,
Artur Fajok

+---------------+-------------------------
Quote:
machine_name |date
+---------------+-------------------------
sun480 |050303 21:16:20
sun480 |050303 21:32:29
+---------------+---------------+-------------------------
+-------------+-------------+-------------+-------------
Quote:
log_read_ios |log_write_ios|log_writes |log_forces
+-------------+-------------+-------------+-------------
752082| 747115| 9678627| 103909
818072| 822283| 10164158| 105886
+-------------+-------------+-------------+-------------
+-------------+--------------+-------------+--------------
Quote:
log_split_wai|log_free_waits|log_bcp_waits|kbytes_ written
+-------------+--------------+-------------+--------------
6742| 83| 41| 1096434
10742| 115| 44| 1243593
+-------------+--------------+-------------+--------------
+--------------+-------------------+-------------------+
Quote:
write_complete|all_logwriters_busy|max_write_queue _cnt|
+--------------+-------------------+-------------------+
747113| 93972| 82|
822279| 149460| 114|
+--------------+-------------------+-------------------+




Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.