dbTalk Databases Forums  

ideas needed to track down a mysterious problem

comp.databases.oracle.server comp.databases.oracle.server


Discuss ideas needed to track down a mysterious problem in the comp.databases.oracle.server forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
prunoki
 
Posts: n/a

Default ideas needed to track down a mysterious problem - 05-13-2007 , 03:36 AM






Hi,

Currently we are in the process of establishing a diagnostic sheet in
order to help us track down a problem. One of our clients is running a
8.1.7.2 database on Solaris. Sometimes the application this database
supports comes to a halt. When this happened we made sure that there
were no locks to justify this situation and we started to log off
everyone, set job_queue_processes to 0, killed the jobs, so basically
stopped everyone from using the database. The database was still
unresponsive and we could not get a system state dump (it just would
not finish doing the dump). After restarting everything became normal.

Right now I started to establish a protocol, organizing the todo-s
into steps which should be taken before restarting. It is something
like:

- examine the active sessions
- examine the logs
- check the CPU and disk resources
- etc

Any ideas what to add to this list? Any queries that might help
finding out what happens? Or better, any ideas why this might happen?
What does the fact that a system state dump could not be taken mean to
you? From my point of view this could well be an Oracle bug, or
anything.

The application uses traditional heap organized tables, some global
temp tables, advanced queueing, some java to send letters using
javamail, dblinks to other systems and usually more than 70 users are
logged in. The clients are mainly Magic 9.3, but there are some web
apps logging in using PHP, and a java client is calling some stored
procedures from time to time. There are two sessions running
continously but they are sitting in a DBMS_AQ procedure waiting for
messages.

Regards,

Krisztian


Reply With Quote
  #2  
Old   
Matthias Hoys
 
Posts: n/a

Default Re: ideas needed to track down a mysterious problem - 05-13-2007 , 06:01 AM







"prunoki" <hegyvari (AT) ardents (DOT) hu> wrote

Quote:
Hi,

Currently we are in the process of establishing a diagnostic sheet in
order to help us track down a problem. One of our clients is running a
8.1.7.2 database on Solaris. Sometimes the application this database
supports comes to a halt. When this happened we made sure that there
were no locks to justify this situation and we started to log off
everyone, set job_queue_processes to 0, killed the jobs, so basically
stopped everyone from using the database. The database was still
unresponsive and we could not get a system state dump (it just would
not finish doing the dump). After restarting everything became normal.

Right now I started to establish a protocol, organizing the todo-s
into steps which should be taken before restarting. It is something
like:

- examine the active sessions
- examine the logs
- check the CPU and disk resources
- etc

Any ideas what to add to this list? Any queries that might help
finding out what happens? Or better, any ideas why this might happen?
What does the fact that a system state dump could not be taken mean to
you? From my point of view this could well be an Oracle bug, or
anything.

The application uses traditional heap organized tables, some global
temp tables, advanced queueing, some java to send letters using
javamail, dblinks to other systems and usually more than 70 users are
logged in. The clients are mainly Magic 9.3, but there are some web
apps logging in using PHP, and a java client is calling some stored
procedures from time to time. There are two sessions running
continously but they are sitting in a DBMS_AQ procedure waiting for
messages.

Regards,

Krisztian

Is it 8.1.7.2 ? Then you really should upgrade to 8.1.7.4 (or a higher
version of Oracle, since 8i is desupported).
To examine the problem, you could also use STATSPACK.
See http://www.oracle-base.com/articles/8i/Statspack8i.php. Easy to install
and use. Snapshots can be analysed on www.oraperf.com.


Matthias




Reply With Quote
  #3  
Old   
sybrandb@hccnet.nl
 
Posts: n/a

Default Re: ideas needed to track down a mysterious problem - 05-13-2007 , 06:22 AM



On 13 May 2007 01:36:44 -0700, prunoki <hegyvari (AT) ardents (DOT) hu> wrote:

Quote:
Right now I started to establish a protocol, organizing the todo-s
into steps which should be taken before restarting. It is something
like:

- examine the active sessions
- examine the logs
- check the CPU and disk resources
- etc

Any ideas what to add to this list? Any queries that might help
finding out what happens? Or better, any ideas why this might happen?
What does the fact that a system state dump could not be taken mean to
you? From my point of view this could well be an Oracle bug, or
anything.
8i is desupported, Oracle still has extended support for 8i, but only
for 8.1.7.4. You should upgrade to 8.1.7.4 asap.

Other than that you shouldn't resort to Microsoft tactics. Just
restarting the system isn't going to help you out.
The *very first* thing you should do is examine the alert log. You
should above all look for a database which can't archive it's redo
logs anymore, as this is one of the primary causes why a database
becomes 'hung'
Then you should check 'what it is waiting for'. If you can connect,
installing statspack might be handy, because it will provide all
relevant information.
Install it using $ORACLE_HOME/rdbms/admin/spcreate, run statspack.snap
two times within 15 minutes, and run
$ORACLE_HOME/rdbms/admin/spreport.sql to get relevant output.

It is most likely a hung archiver and NOT an Oracle bug.

--
Sybrand Bakker
Senior Oracle DBA


Reply With Quote
  #4  
Old   
prunoki
 
Posts: n/a

Default Re: ideas needed to track down a mysterious problem - 05-13-2007 , 06:46 AM




sybrandb (AT) hccnet (DOT) nl írta:
Quote:
On 13 May 2007 01:36:44 -0700, prunoki <hegyvari (AT) ardents (DOT) hu> wrote:

Right now I started to establish a protocol, organizing the todo-s
into steps which should be taken before restarting. It is something
like:

- examine the active sessions
- examine the logs
- check the CPU and disk resources
- etc

Any ideas what to add to this list? Any queries that might help
finding out what happens? Or better, any ideas why this might happen?
What does the fact that a system state dump could not be taken mean to
you? From my point of view this could well be an Oracle bug, or
anything.

8i is desupported, Oracle still has extended support for 8i, but only
for 8.1.7.4. You should upgrade to 8.1.7.4 asap.

Other than that you shouldn't resort to Microsoft tactics. Just
restarting the system isn't going to help you out.
The *very first* thing you should do is examine the alert log. You
should above all look for a database which can't archive it's redo
logs anymore, as this is one of the primary causes why a database
becomes 'hung'
Then you should check 'what it is waiting for'. If you can connect,
installing statspack might be handy, because it will provide all
relevant information.
Install it using $ORACLE_HOME/rdbms/admin/spcreate, run statspack.snap
two times within 15 minutes, and run
$ORACLE_HOME/rdbms/admin/spreport.sql to get relevant output.

It is most likely a hung archiver and NOT an Oracle bug.

--
Sybrand Bakker
Senior Oracle DBA
Thanks for the ideas so far. I will include checking the alert log in
the list.

This is a production database, so we must restart it as soon as
possible, if all else fails.

I also think we should upgrade to 8.1.7.4, but it requires testing
which the business people would provide more easily if we knew that
the upgrade would help. Since we do not know that for sure, they are a
bit reluctant, but we will proceed in this direction if enough people
vote for it.

Krisztian



Reply With Quote
  #5  
Old   
Vladimir M. Zakharychev
 
Posts: n/a

Default Re: ideas needed to track down a mysterious problem - 05-14-2007 , 02:15 AM



On May 13, 3:46 pm, prunoki <hegyv... (AT) ardents (DOT) hu> wrote:
Quote:
sybra... (AT) hccnet (DOT) nl írta:



On 13 May 2007 01:36:44 -0700, prunoki <hegyv... (AT) ardents (DOT) hu> wrote:

Right now I started to establish a protocol, organizing the todo-s
into steps which should be taken before restarting. It is something
like:

- examine the active sessions
- examine the logs
- check the CPU and disk resources
- etc

Any ideas what to add to this list? Any queries that might help
finding out what happens? Or better, any ideas why this might happen?
What does the fact that a system state dump could not be taken mean to
you? From my point of view this could well be an Oracle bug, or
anything.

8i is desupported, Oracle still has extended support for 8i, but only
for 8.1.7.4. You should upgrade to 8.1.7.4 asap.

Other than that you shouldn't resort to Microsoft tactics. Just
restarting the system isn't going to help you out.
The *very first* thing you should do is examine the alert log. You
should above all look for a database which can't archive it's redo
logs anymore, as this is one of the primary causes why a database
becomes 'hung'
Then you should check 'what it is waiting for'. If you can connect,
installing statspack might be handy, because it will provide all
relevant information.
Install it using $ORACLE_HOME/rdbms/admin/spcreate, run statspack.snap
two times within 15 minutes, and run
$ORACLE_HOME/rdbms/admin/spreport.sql to get relevant output.

It is most likely a hung archiver and NOT an Oracle bug.

--
Sybrand Bakker
Senior Oracle DBA

Thanks for the ideas so far. I will include checking the alert log in
the list.

This is a production database, so we must restart it as soon as
possible, if all else fails.

I also think we should upgrade to 8.1.7.4, but it requires testing
which the business people would provide more easily if we knew that
the upgrade would help. Since we do not know that for sure, they are a
bit reluctant, but we will proceed in this direction if enough people
vote for it.

Krisztian
Not having any kind of support from Oracle for your current version
isn't enough justification for patching it up to a supported version?
And to be more persuasive, you might want to show them decision makers
the list of defects fixed in 8.1.7.3 and 8.1.7.4 and all subsequent
CPUs and the most important one-offs since 8.1.7.4...

Regards,
Vladimir M. Zakharychev
N-Networks, makers of Dynamic PSP(tm)
http://www.dynamicpsp.com



Reply With Quote
  #6  
Old   
NetComrade
 
Posts: n/a

Default Re: ideas needed to track down a mysterious problem - 05-24-2007 , 03:14 AM



On Sun, 13 May 2007 13:01:03 +0200, "Matthias Hoys" <anti (AT) spam (DOT) com>
wrote:

Quote:
"prunoki" <hegyvari (AT) ardents (DOT) hu> wrote in message
news:1179045404.300974.260720 (AT) y80g2000hsf (DOT) googlegroups.com...
Hi,

Currently we are in the process of establishing a diagnostic sheet in
order to help us track down a problem. One of our clients is running a
8.1.7.2 database on Solaris. Sometimes the application this database
supports comes to a halt. When this happened we made sure that there
were no locks to justify this situation and we started to log off
everyone, set job_queue_processes to 0, killed the jobs, so basically
stopped everyone from using the database. The database was still
unresponsive and we could not get a system state dump (it just would
not finish doing the dump). After restarting everything became normal.

Right now I started to establish a protocol, organizing the todo-s
into steps which should be taken before restarting. It is something
like:

- examine the active sessions
- examine the logs
- check the CPU and disk resources
- etc
1) hire a dba that knows wtf he's doing
2) patch oracle so it's supported if you value your data.. upgrades
are not as scary as they seem.
........
We run Oracle 9iR2,10gR1/2 on RH4/RH3 and Solaris 10 (Sparc)
remove NSPAM to email


Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.