dbTalk Databases Forums  

[Info-ingres] Mutex on QSR sem

  #1  
Old   
martin.bowes@ctsu.ox.ac.uk
 
Posts: n/a

Default [Info-ingres] Mutex on QSR sem - 03-09-2006 , 08:09 AM






Hi Dudes,

I'm getting a LOT of mutexes on QSR sem, which is ultimately trashing
the server.

Anyone got any idea what a QSR Sem is and why I should care?

Martin Bowes
--
Random Duckman Quote #115:
King Chicken: My morals are beyond reproach and I'll brutally kill
anyone who says otherwise.


  #2  
Old   
David Richard
 
Posts: n/a

Default RE: [Info-ingres] Mutex on QSR sem - 03-09-2006 , 08:24 AM






Hi Martin,

Two initial questions:-

1. What's your O/S version and Ingres version?

2. What's going on in your QSF Pool when you see QSR mutexes?

FWIW a quick work-around is to start another iidbms...

Looking forward to hearing your answers to 1 & 2.

Regards,

Richard

************************************************** **********************
DISCLAIMER
The information contained in this e-mail is confidential and is intended
for the recipient only.
If you have received it in error, please notify us immediately by reply
e-mail and then delete it from your system. Please do not copy it or
use it for any other purposes, or disclose the content of the e-mail
to any other person or store or copy the information in any medium.
The views contained in this e-mail are those of the author and not
necessarily those of Admenta UK Group.
************************************************** **********************


  #3  
Old   
martin.bowes@ctsu.ox.ac.uk
 
Posts: n/a

Default RE: [Info-ingres] Mutex on QSR sem - 03-09-2006 , 08:43 AM



Hi David

Quote:
Hi Martin,

Two initial questions:-

1. What's your O/S version and Ingres version?

Red Hat Linux release 7.1 (Seawolf)
IngresII 2.6/0305 patch11279

Does Linux have an equivalent to 'sem_opm'? If so, how can I
determine what the setting is?

Quote:
2. What's going on in your QSF Pool when you see QSR mutexes?

I'm only monitoring with qs501 at 5-minute intervals.
But it does look as if the %memory used is peaking: I get readings of
over 80% used from a normal base of 70%. I'm not getting any errors in
the errlog regarding the QSF pool, though.

QSF memory is set to 2660000.

I'll bump it to 10M and retry.

Quote:
FWIW a quick work-around is to start another iidbms...

Sadly not possible. The kernel doesn't have enough shared memory
configured. I can only start a single server.

Quote:
Looking forward to hearing your answers to 1 & 2.

Regards,

Richard

Marty




  #4  
Old   
David Richard
 
Posts: n/a

Default RE: [Info-ingres] Mutex on QSR sem - 03-09-2006 , 09:33 AM



Hi Martin,

I'm not 100% sure, but I think that RedHat's equivalent to Unix's
'sem_opm' is 'Max ops per semop call', and you can see what it's set
at as follows:-

ipcs -l | grep semop
or
cat /proc/sys/kernel/sem | awk '{ print $3 }'

If you want to change the value to, say, 55, you can try the following:-
$ su root
# Make the change in /proc
$ cat /proc/sys/kernel/sem | awk '{ print $1, $2, 55, $4 }' >| /proc/sys/kernel/sem
# Make the change permanent after reboot (the \$ keeps the shell from
# expanding $1, $2 and $4 before the line is written to rc.local)
$ echo "cat /proc/sys/kernel/sem | awk '{ print \$1, \$2, 55, \$4 }' >| /proc/sys/kernel/sem" >> /etc/rc.local

You don't need to reboot for this setting to take effect.
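
The commands above rely on the third field of /proc/sys/kernel/sem being the 'max ops per semop call' value (the four fields are SEMMSL, SEMMNS, SEMOPM and SEMMNI, in that order). A minimal sketch of the same read-and-replace step, with the parsing pulled into two small functions so it can be tried on a sample line first; the function names sem_opm and with_sem_opm are made up for illustration and are not part of Ingres or the kernel tools:

```shell
#!/bin/sh
# Field order in /proc/sys/kernel/sem: SEMMSL SEMMNS SEMOPM SEMMNI

# Print the SEMOPM field (third value) from a line of four numbers.
sem_opm() {
    echo "$1" | awk '{ print $3 }'
}

# Print the same line with only SEMOPM replaced by a new value.
# (with_sem_opm is a hypothetical helper name used for this sketch.)
with_sem_opm() {
    echo "$1" | awk -v opm="$2" '{ print $1, $2, opm, $4 }'
}

# Only report on the live setting if the file exists on this machine.
if [ -r /proc/sys/kernel/sem ]; then
    current=$(cat /proc/sys/kernel/sem)
    echo "current SEMOPM: $(sem_opm "$current")"
    echo "line to write for SEMOPM=55: $(with_sem_opm "$current" 55)"
fi
```

The sketch only prints the line it would write; actually changing the setting still needs root and a write to /proc/sys/kernel/sem as described above.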

Cheers,

Richard

  #5  
Old   
Betty & Karl Schendel
 
Posts: n/a

Default Re: [Info-ingres] Mutex on QSR sem - 03-09-2006 , 10:12 AM



At 2:09 PM +0000 3/9/06, martin.bowes (AT) ctsu (DOT) ox.ac.uk wrote:
Quote:
Hi Dudes,

I'm getting a LOT of mutexes on QSR sem, which is ultimatly trashing
the server.

Anyone got any idea what a QSR Sem is and why I should care?

If this is 2.6 or r3, my guess is that you are way short on QSF memory
and it's going thru the LRU reclaim a lot. QSF mutexing was split up
starting with 2.6 and the QSR mutex is held for only a short time in normal
allocation. We only need to hold QSR when working with the head of the
QSF memory lists, or when scanning the LRU reclaim list.

If you can raise QSF memory, try that first.

Karl


  #6  
Old   
martin.bowes@ctsu.ox.ac.uk
 
Posts: n/a

Default Re: [Info-ingres] Mutex on QSR sem - 03-10-2006 , 04:03 AM



Hi Karl,

IngresII2.6/0305 patch11279. The problem has come since I upgraded
to this patch from 11063 - which seemed to be cool in this area.

I bumped the QSF memory from 2.5M to 10M. But all this did was delay
the inevitable.

The qsf memory used builds up to around 80% used with trace point
qs505 showing no LRU objects destroyed. My guess is that as it finally
decides to reclaim some space and process the LRU list it never
releases the semaphore.

Marty

--
Random Duckman Quote #62:
Cornfed: In the Judeo-Christian iconography, the apple represents
forbidden
fruit, the ultimate sin, implying a desire to engage in the forbidden
act, hence becoming a symbol of the ultimate harassment...
and they were such a good source of vitamin A.




  #7  
Old   
Peter Gale
 
Posts: n/a

Default RE: [Info-ingres] Mutex on QSR sem - 03-10-2006 , 04:16 AM



Hi Marty,

I can't speak with authority on what is going on inside Ingres, but I have
always used a rule of thumb that 70-80% utilisation = FULL.

10mb is still not that much memory in the grand scheme of things. We have
happily supported >1000 concurrent users with ~40mb, and in R3 the defaults
for a "large" configuration would be 100mb and for "huge" it is 250mb.

So I would suggest you bump this up by quite a lot until you are
consistently below 70% utilisation.
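
As a rough illustration of that rule of thumb, here is a small sketch that works out the smallest pool size that would keep a given amount of in-use memory at or below a target utilisation. The function name min_pool_bytes is made up for this example, and the input figure assumes the 10M pool is 10,000,000 bytes and about 80% used, as reported earlier in the thread:

```shell
#!/bin/sh
# min_pool_bytes USED_BYTES TARGET_PCT
# Smallest pool size (in bytes) for which USED_BYTES is at most
# TARGET_PCT percent of the pool. Integer arithmetic, rounded up.
# (Hypothetical helper for illustration; not an Ingres utility.)
min_pool_bytes() {
    echo $(( ($1 * 100 + $2 - 1) / $2 ))
}

# Assumed figures: 80% of a 10,000,000-byte pool in use, 70% target.
min_pool_bytes 8000000 70    # prints 11428572
```

This is only the arithmetic behind the rule of thumb; it says nothing about how much the pool would actually fill once it is enlarged.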

--
Peter
T: +44 (0)1398 341777
PGale (AT) Comp-Soln (DOT) co.uk


_______________________________________________
Info-ingres mailing list
Info-ingres (AT) cariboulake (DOT) com
http://mailman.cariboulake.com/mailm...py/info-ingres



  #8  
Old   
Peter Gale
 
Posts: n/a

Default RE: [Info-ingres] Mutex on QSR sem - 03-10-2006 , 05:57 AM



Hi Marty,

Don't get me wrong. I'm not saying there isn't an issue here with Ingres.
What I am saying is that if you bump up qsf_memory you will, in all
likelihood, reach a point where it stops filling up and therefore the LRU
discarding does not happen. 10M is quite a small amount of qsf_memory,
especially when seen in the context of the new R3 defaults and based on our
experience. And remember, 10M is the MAX it will take.
You will be able to bump up qsf_memory much sooner than any fix will arrive,
and I have never known this to cause a problem (unless of course you went
mad and used up all available memory).

--
Peter
T: +44 (0)1398 341777
PGale (AT) Comp-Soln (DOT) co.uk


-----Original Message-----
From: martin.bowes (AT) ctsu (DOT) ox.ac.uk [mailto:martin.bowes (AT) ctsu (DOT) ox.ac.uk]
Sent: 10 March 2006 11:51
To: PGale (AT) Comp-Soln (DOT) co.uk
Cc: info-ingres (AT) cariboulake (DOT) com
Subject: RE: [Info-ingres] Mutex on QSR sem

Hi Peter,

But the usage of QSF is supposed to grow - like a goldfish - it
expands to fill the available space, and won't start ditching stuff until it
needs to reclaim the space.

Although I have to confess I'm not sure if the old objects stuck
on the LRU queue could be re-used like a repeatable query. I suspect
repeatable queries just mark the object as a higher save priority. Over
to Karl on that one.

Nonetheless this all used to work in the prior patch. It's clear to
me that the new patch has a bug in regards to reclaiming the space.

Marty

  #9  
Old   
Betty & Karl Schendel
 
Posts: n/a

Default Re: [Info-ingres] Mutex on QSR sem - 03-10-2006 , 06:11 AM



At 10:03 AM +0000 3/10/06, martin.bowes (AT) ctsu (DOT) ox.ac.uk wrote:
Quote:
Hi Karl,

IngresII2.6/0305 patch11279. The problem has come since I upgraded
to this patch from 11063 - which seemed to be cool in this area.

I bumped the QSF memory from 2.5M to 10M. But all this did was delay
the inevitable.

The qsf memory used builds up to around 80% used with trace point
qs505 showing no LRU objects destroyed. My guess is that as it finally
decides to reclaim some space and process the LRU list it never
releases the semaphore.

I wonder if that is the version in which clearing session objects on
a rollback was introduced. Do your jobs do many rollbacks, or do
you have a high connect/disconnect rate?

10M isn't really all that much for QSF. Remember that it has to
hold named objects (repeated, dbp's) as well as unnamed objects
(on-the-fly query stuff, query plans, parse trees).

I doubt that it isn't releasing the QSR sem at all, because then
it would hang solid; but it sounds like it's holding it way too
long for some reason. How long is your LRU queue? (I forget which
QS5xx trace point tells you that.)

Karl


  #10  
Old   
martin.bowes@ctsu.ox.ac.uk
 
Posts: n/a

Default Re: [Info-ingres] Mutex on QSR sem - 03-10-2006 , 06:43 AM



Hi Karl ,

Quote:
I wonder if that is the version that clearing session objects on
a rollback was introduced. Do your jobs do many rollbacks, or do you
have a high connect/disconnect rate?

No rollbacks at all. The connect/disconnect rate is very small.

The primary job is doing a copydb for a few databases (about 5).
The secondary job is simply running a select count(*) from iitables on
several databases (about 8) on remote hosts, i.e. it's using a vnode, so
the query isn't really running in this server. Yet it seems to be the onset
of that secondary job that causes the grief.

Quote:
10M isn't really all that much for QSF. Remember that it has to
hold named objects (repeated, dbp's) as well as unnamed objects
(on-the-fly query stuff, query plans, parse trees).

Given that it was 2.5M until the new patch went in...

Quote:
I doubt that it isn't releasing the QSR sem at all, because then
it would hang solid; but it sounds like it's holding it way too
long for some reason. How long is your LRU queue? (I forget which
QS5xx trace point tells you that.)

At the moment...

I've rerun the jobs (this has taken nearly 2 hours!) and built the
qsf memory to 71.13% leaving 288700 bytes free.

#number of objects in LRU queue = 519 with 0 object destroyed.
Most ever unnamed = 10
Most ever named = 519
Most ever index objs = 1038
Most ever = 1562
# on LRU chain = 519
# with wait 0 = 519
# with QSO_FREE = 519
# with both = 519
Also, Largest QSF piece allocated/requested = 8192
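
For anyone checking the percentage against the free-byte figure, a quick sketch of the arithmetic. The total of 1,000,000 bytes is not stated anywhere above; it is only inferred from 288700 bytes free at 71.13% used, so treat it as an assumption. The function name pct_used is made up for this example:

```shell
#!/bin/sh
# pct_used TOTAL_BYTES FREE_BYTES
# Percentage of the pool in use, to two decimal places.
# (Hypothetical helper for illustration; not an Ingres utility.)
pct_used() {
    awk -v total="$1" -v free="$2" \
        'BEGIN { printf "%.2f\n", (total - free) / total * 100 }'
}

# Assumed total of 1,000,000 bytes, with the reported 288700 bytes free.
pct_used 1000000 288700    # prints 71.13
```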

Just to prove the point about the secondary job, I'm going to rerun the
primary job with the secondary switched off and see what happens -
last time it worked okay!

Marty
--
Random Duckman Quote #114:
King Chicken: How dare you insult me in front of my wife, whose still
dangerously coherent.


