application failover with RDB access

comp.databases.rdb

#1
Nazim
application failover with RDB access - 11-07-2006, 10:52 AM

Hi guys,

We are running Oracle Rdb V7.0-61 and SQL/Services V7.1-59 on a
5-node cluster spread over 2 sites, under OpenVMS V7.3.

Site 1:

MP1 sys$sysroot = DSA200:[SYS0.]
MP2 sys$sysroot = DSA200:[SYS1.]

Site 2:

OP1 sys$sysroot = DSA100:[SYS0.]
OP2 sys$sysroot = DSA100:[SYS1.]
QRM sys$sysroot = DSA300:[SYS0.]

Our application runs on node MP1 and uses the following database:
DSA618:[DB_DISK001.DB]DB.RDB

For the failover scenario we need to do an RMU/OPEN of
DSA618:[DB_DISK001.DB]DB.RDB on node OP1. Are there any problems with
doing this?

Rdb is started on nodes MP1 and OP1, but in normal operation the
database DSA618:[DB_DISK001.DB]DB.RDB is opened only on node MP1.

Thanks for your answers,

N.Manser



SYSMAN> do rmu/show system
%SYSMAN-I-OUTPUT, command execution on node QRM
%DCL-W-IVVERB, unrecognized command verb - check validity and spelling
 \RMU\
%SYSMAN-I-OUTPUT, command execution on node OP2
%DCL-W-ACTIMAGE, error activating image RDMPRV
-CLI-E-IMGNAME, image file DSA100:[SYS1.SYSCOMMON.][SYSLIB]RDMPRV.EXE;8
-SYSTEM-F-PROTINSTALL, protected images must be installed
%SYSMAN-I-OUTPUT, command execution on node OP1
Oracle Rdb V7.0-61 on node OP1 7-NOV-2006 17:40:14.74
    - monitor started 8-APR-2006 22:29:10.31 (uptime 212 19:11:04)
    - monitor log filename is "SYS$SYSROOT:[SYSEXE]RDMMON.LOG;107"
database DSA0:[DDAL.DATABASE]DDAL$TR_DB.RDB;1
    - first opened 8-APR-2006 22:30:00.82 (elapsed 212 19:10:13)
    - current after-image journal file is DB_DISKA01:[AIJ]DDAL_AIJ001.AIJ;1575
    - AIJ Log Server is active
    - 2 active database users
    - database also open on these nodes:
        MP1
%SYSMAN-I-OUTPUT, command execution on node MP2
%DCL-W-ACTIMAGE, error activating image RDMPRV
-CLI-E-IMGNAME, image file DSA200:[SYS1.SYSCOMMON.][SYSLIB]RDMPRV.EXE;8
-SYSTEM-F-PROTINSTALL, protected images must be installed
%SYSMAN-I-OUTPUT, command execution on node MP1
Oracle Rdb V7.0-61 on node MP1 7-NOV-2006 17:40:13.59
    - monitor started 2-NOV-2006 07:36:25.92 (uptime 5 10:03:47)
    - monitor log filename is "SYS$SYSROOT:[SYSEXE]RDMMON.LOG;79"
database DSA618:[DB_DISK001.DB]DB.RDB;1
    - first opened 2-NOV-2006 08:32:01.47 (elapsed 5 09:08:12)
    * database is opened by an operator
    - current after-image journal file is DB_DISKA01:[AIJ]AIJ25.AIJ;1
    - global buffer count is 30000; 22250 global buffers free
    - maximum global buffer count per user is 100
    - global section resides in system space
    - AIJ Log Server is active
    - 156 active database users
database DSA0:[DDAL.DATABASE]DDAL$TR_DB.RDB;1
    - first opened 2-NOV-2006 07:37:17.02 (elapsed 5 10:02:56)
    - current after-image journal file is DB_DISKA01:[AIJ]DDAL_AIJ001.AIJ;1578
    - AIJ Log Server is active
    - 2 active database users
    - database also open on these nodes:
        OP1


#2
gxys@uk2.net
Re: application failover with RDB access - 11-07-2006, 10:54 AM

I'd guess the Rdb startup procedure has not been run on those nodes, since
it is what installs those images.
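
One way to check this guess on a node that reports PROTINSTALL (OP2 or MP2 in the output above) is sketched below. It assumes the image lives in SYS$LIBRARY, as the [SYSLIB] path in the error text suggests, and that RMONSTART is the startup procedure used at this site; both are assumptions to verify locally, not a tested procedure.

```dcl
$! Sketch only: run on a node that reports PROTINSTALL (e.g. OP2 or MP2).
$! Ask the Install utility for the known-image entry of RDMPRV.
$! If no entry is found, the image has not been installed on this node.
$ INSTALL LIST SYS$LIBRARY:RDMPRV.EXE
$!
$! Assumption: the Rdb monitor startup procedure installs the image.
$ @SYS$STARTUP:RMONSTART
$!
$! Retry the command that failed.
$ RMU/SHOW SYSTEM
```

If Rdb is deliberately not started on a node, the PROTINSTALL message there is expected and nothing needs fixing.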


#3
Norman Lastovica
Re: application failover with RDB access - 11-07-2006, 11:08 AM

Nazim wrote:
Quote:
we are running RDB (Oracle Rdb V7.0-61), SQLSERVICES (v7.1-59) on a 5
node cluster on 2 sites.

Make sure that you've executed RMONSTART70 (or RMONSTART) on all the
nodes.

If you are using multi-version Rdb, then you'll probably need to execute
SYS$SHARE:RDB$SETVER prior to using RMU. If this is the case, you could
build a little DCL procedure to do:

$ @SYS$SHARE:RDB$SETVER 70
$ RMU/SHOW SYSTEM

and execute that procedure from SYSMAN.
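
A sketch of how that could look is given below. The procedure name SHOW_RDB.COM and the [TOOLS] directory are hypothetical. Because the two sites boot from different system disks, the file is assumed to sit on a disk that every target node can see; DSA618 is used purely for illustration.

```dcl
$! --- SHOW_RDB.COM (hypothetical name and location) ---
$! Select Rdb V7.0 in a multi-version environment, then show the monitor state.
$ @SYS$SHARE:RDB$SETVER 70
$ RMU/SHOW SYSTEM
$ EXIT
$!
$! --- interactive session: run the procedure on selected nodes ---
$ RUN SYS$SYSTEM:SYSMAN
SYSMAN> SET ENVIRONMENT/NODE=(MP1,OP1)
SYSMAN> DO @DSA618:[TOOLS]SHOW_RDB.COM
SYSMAN> EXIT
```

Restricting the SYSMAN environment to the nodes where Rdb is actually started avoids the IVVERB and PROTINSTALL noise from the other nodes; SET ENVIRONMENT/CLUSTER would run the procedure everywhere instead.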

--
- - - - -
opinions expressed here are mine and mine alone
and certainly are not intended in any way to
express or represent any opinions or commitment
of oracle corporation.

norman lastovica / oracle rdb engineering


#4
Nazim
Re: application failover with RDB access - 11-07-2006, 11:29 AM

Norman Lastovica wrote:

Quote:
make sure that you've executed RMONSTART70 (or RMONSTART) on all
the nodes.

For our site, Rdb needs to run only on nodes MP1 and OP1,
so on both nodes the startup sequence is:

.....
$ @SYS$STARTUP:RMONSTART ! RDB V7.0-6
$ @SYS$STARTUP:SQLSRV$STARTUP71
$ @SYS$STARTUP:DDAL$START_TR_MON.COM DISK$DTC_COMMON:[DDAL.DATABASE] -

.........

My question is: are there any issues with opening a database with
RMU/OPEN/WAIT/ACCESS=unrestricted <DB> on node OP1 when the usual
node MP1 is down?

Regards,

Nazim Manser


#5
Richard Maher
Re: application failover with RDB access - 11-07-2006, 04:30 PM

Hi Nazim,

If this system runs anything other than "Mom and Dad's corner Deli VAT return"
then I suspect that you (or the company you support) are in big trouble!
Get yourself a professional DBA and pay them what they ask to do the job
properly. The questions turning up here (and more so in the ITRC) about Rdb
are truly frightening. I wish I could find out who these companies are and
turn up to their next risk-assessment or shareholders' meeting :-(

Anyway, no one can answer your question directly unless they know a bit more
about MP and OP. I suggest "yes", but if you've never tried a failover before
then what are the extra machines there for? The fact that you appear to be
running Data Distributor raises an eyebrow, but my advice is to open the
database on *all* nodes and use them *all* *all* of the time, possibly in a
wide-area cluster configuration.

Rdb engineering hates clusters 'cos Norm doesn't get to use his beloved
Row-Ca$h, but don't let that bother you.

Regards Richard Maher

#6
Nazim
Re: application failover with RDB access - 11-08-2006, 04:26 AM

Richard Maher wrote:

Quote:
If this system runs anything other than "Mom and Dad's corner Deli VAT return"
then I suspect that you (or the company you support) are in big trouble!
Get yourself a professional DBA and pay them what they ask to do the job
properly. The questions turning up here (and more so in the ITRC) about Rdb
are truly frightening. I wish I could find out who these companies are and
turn up to their next risk-assessment or shareholders' meeting :-(

That is why I was assigned the task of ensuring a correct failover
strategy.

Quote:
Anyway, no one can answer your question directly unless they know a bit more
about MP and OP. I suggest "yes", but if you've never tried a failover before
then what are the extra machines there for? The fact that you appear to be
running Data Distributor raises an eyebrow, but my advice is to open the
database on *all* nodes and use them *all* *all* of the time, possibly in a
wide-area cluster configuration.

MP1 and OP1 are on 2 sites but share the same file system.
To be precise, the file layout of the Rdb files is as follows:

Root file location: dsa618:[db_disk001.db]
RDA & SNP files:    dsa618:[db_disk001.db]
                    dsa618:[db_disk002.db]
                    dsa618:[db_disk003.db]
AIJ files:          dsa616:[db_diskA01.db]
                    dsa616:[db_diskA02.db]
RUJ files:          dsa617:[rdms$ruj]

MP1>sh dev dsa618

Device                 Device           Error  Volume     Free      Trans  Mnt
 Name                  Status           Count  Label      Blocks    Count  Cnt
DSA618:                Mounted              0  DMG_DB     32582436   7696    4
$1$DGA230:    (OP2)    ShadowSetMember      0  (member of DSA618
$1$DGA430:    (MP1)    ShadowSetMember      0  (member of DSA618

MP1>sh dev dsa621

Device                 Device           Error  Volume     Free      Trans  Mnt
 Name                  Status           Count  Label      Blocks    Count  Cnt
DSA621:                Mounted              0  DMG_DB2    12936924      5    4
$1$DGA214:    (OP2)    ShadowSetMember      0  (member of DSA621
$1$DGA414:    (MP1)    ShadowSetMember      0  (member of DSA621

MP1>sh dev dsa616

Device                 Device           Error  Volume     Free      Trans  Mnt
 Name                  Status           Count  Label      Blocks    Count  Cnt
DSA616:                Mounted              0  DMG_AIJ     8673228    100    4
$1$DGA210:    (OP2)    ShadowSetMember      0  (member of DSA616
$1$DGA410:    (MP1)    ShadowSetMember      0  (member of DSA616

MP1>sh dev dsa617

Device                 Device           Error  Volume     Free      Trans  Mnt
 Name                  Status           Count  Label      Blocks    Count  Cnt
DSA617:                Mounted              0  DMG_RUJ    17359776    165    4
$1$DGA211:    (OP2)    ShadowSetMember      0  (member of DSA617
$1$DGA411:    (MP1)    ShadowSetMember      0  (member of DSA617


Usually an RMU/OPEN on that database is done only on MP1.
I would like to know what happens when, in case of failover (MP1
crashes), I do an RMU/OPEN on node OP1.

As it is a mission-critical production database, I want to be 100% sure
before updating our documentation.

So as you say, the RMU/OPEN should be done on both MP1 and OP1 as soon
as they reboot. Correct?

I am new (2 months) and I inherited the task of supporting the
application and its underlying Rdb database.

Regards,

Nazim Manser

#7
Richard Maher
Re: application failover with RDB access - 11-08-2006, 06:00 AM

Hi Nazim,

Quote:
that is why i was assigned the task to ensure correct failover
strategy.

And you're a contractor, right? (Or your boss is a contractor?) Let's hope the
customer's not reading this, eh :-) I'd love to know how much the contract's
for, but then it's Cologne and not Munich and it's none of my business.

Anyway, is there not a UAT or other test environment that this can be tested
in first?

I'll assume not. When you do an RMU/DUMP/HEADER on the database(s), do they
say the number of cluster nodes is "1"? If they do, then you'll have to make sure
the databases are closed on MP1 before trying to open them on OP1. If not,
just open them up on both nodes and fire up the application on both nodes
(if it's cluster tolerant) and get the application testing people involved.
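
A rough sketch of that check, and of the close-then-open sequence for the single-node case, follows. The output file name DB_HEADER.TXT is made up for illustration, and the search string assumes the header dump reports the setting on a "node count" line. The qualifiers should be checked against the RMU reference for V7.0 before anything is run against production.

```dcl
$! Sketch only; DB_HEADER.TXT is an illustrative file name.
$! 1. Dump the root file header and look for the node-count setting.
$ RMU/DUMP/HEADER/OUTPUT=DB_HEADER.TXT DSA618:[DB_DISK001.DB]DB.RDB
$ SEARCH DB_HEADER.TXT "node count"
$!
$! 2. If the count is 1 and MP1 is still up: on MP1, close the database first.
$ RMU/CLOSE/WAIT DSA618:[DB_DISK001.DB]DB.RDB
$!
$! 3. Then, on OP1, open the database and confirm where it is open.
$ RMU/OPEN/WAIT/ACCESS=UNRESTRICTED DSA618:[DB_DISK001.DB]DB.RDB
$ RMU/SHOW SYSTEM
```

Step 2 only applies to a planned switch. What happens when MP1 has crashed is the open question in this thread, and this sketch does not answer it.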

Quote:
so as you say, the RMU/open should be done on both MP1 and OP1 as soon
as they reboot. correct ?

No, I was suggesting that the beauty of VMS clusters and Rdb is that you
don't have to "fail-over" because, personally, I would open the database and
the application on all of the nodes all of the time. If MP1 goes down then
there would be a pregnant pause followed by MP1 users having to log in
again, but that's it. The cluster took a lickin' but it kept on tickin'.
With Rdb partitioned lock trees and all the work VMS engineering has been
doing with the DLM *and* the new interconnect stuff coming along, I see no
point in restricting a database to one node. (Never have :-)

The fact that you're using Data Distributor (why?) leads me to suspect that
not all disks are accessible cluster-wide or there's something dodgy with
the application. Our DR used to be copying RBFs over to the mirror machine,
restoring them and rolling forward AIJs. Once every couple of years
we'd be forced to run in DR for a week and then switch back with no loss of
data. They were *never* able to get the Unix systems to achieve the same
thing! (They'd just get someone to log on and that would be that, i.e.
production never shifted.) The VMS guys were moving to a Disaster Tolerant
setup when I left.

My *guess* is everything will be OK except for DNS cache flushes and
hard-coded SQL/Services server names. (But then, if I was getting paid to do
it, I'd make sure :-)

Regards Richard Maher

#8
Nazim
Re: application failover with RDB access - 11-08-2006, 06:49 AM

Richard Maher wrote:

Quote:
And you're a contractor, right? (Or your boss is a contractor?) Let's hope the
customer's not reading this, eh :-) I'd love to know how much the contract's
for, but then it's Cologne and not Munich and it's none of my business.

It is neither Cologne nor Munich.

Yes, I am a contractor. My boss is permanent but has been here for only 1 year,
so he inherited the setup as it is.
My role is to implement the failover scenario for our app, including the
underlying Rdb database.
The Rdb setup was implemented a long time ago; the team has since left,
and the handover to my boss was not done properly. (Since he has been there
everything has worked fine; the last time the DB was opened was over a year ago.)


MP1>rmu/show system sql$database
Oracle Rdb V7.0-61 on node EVAMP1 10-OCT-2006 11:22:47.97
    - monitor started 6-NOV-2004 11:12:09.67 (uptime 703 00:10:38)
    - monitor log filename is "SYS$SYSROOT:[SYSEXE]RDMMON.LOG;77"

database DSA0:[DDAL.DATABASE]DDAL$TR_DB.RDB;1
    - first opened 11-NOV-2005 14:18:37.78 (elapsed 332 21:04:10)
    * database is opened by an operator
    - current after-image journal file is TPZH_DAL_DB_DISKA01:[AIJ]DDAL_AIJ001.AIJ;1559
    - AIJ Log Server is active
    - 2 active database users
    - database also open on these nodes:
        OP1

This is our prod DB:

database DSA618:[DB_DISK001.DB]DMG_DB.RDB;1
    - first opened 15-MAY-2005 07:08:43.83 (elapsed 513 04:14:04)
    * database is opened by an operator
    - current after-image journal file is DB_DISKA02:[AIJ]AIJ18.AIJ;1
    - global buffer count is 30000; 20550 global buffers free
    - maximum global buffer count per user is 100
    - global section resides in system space
    - AIJ Log Server is active
    - 190 active database users


Quote:
Anyway, is there not a UAT or other test environment that this can be tested
in first?

Unfortunately the UAT environment is on a standalone VMS machine.

Quote:
I'll assume not. When you do an RMU/DUMP/HEADER on the database(s), do they
say the number of cluster nodes is "1"? If they do, then you'll have to make sure
the databases are closed on MP1 before trying to open them on OP1. If not,
just open them up on both nodes and fire up the application on both nodes
(if it's cluster tolerant) and get the application testing people involved.

The RMU/DUMP/HEADER output has no reference to "cluster nodes"; the only
related entry is the node count. (The DDAL database is the one that is
also open on OP1.)


MP1>search ddal_dump.txt node
Maximum node count is 16
- WARNING: Maximum node count is 16 instead of 1
MP1>search dmg_dump.txt node
Maximum node count is 1 ----> yes.


But what if MP1 crashes? Is there any danger in opening the database on
the other node?

Our application is designed to run on only 1 node at a time, but the
Rdb database can also be opened on OP1 as a standby solution.
OK, before doing this I must close the DB on MP1, then open it on MP1 and OP1.

I have to implement the application failover scenario on the VMS side,
and the testing can only be done in a very restricted window
at the weekend.

I first have to work out the theoretical part, then schedule a test
plan.

Quote:
so as you say, the RMU/open should be done on both MP1 and OP1 as soon
as they reboot. correct ?

No, I was suggesting that the beauty of VMS clusters and Rdb is that you
don't have to "fail-over" because, personally, I would open the database and
the application on all of the nodes all of the time. If MP1 goes down then
there would be a pregnant-pause followed by MP1 users having to log in
again, but that's it. The cluster took a lickin' but it kept on tickin'.
With Rdb partitioned lock trees and all the work VMS engineering has been
doing with the DLM *and* the new interconnect stuff coming along, I see no
point in restricting a database to one node. (Never have :-)

This was set up by another team, and they did not document why they did
it this way.

Quote:
The fact that you're using Data Distributor (why?) leads me to suspect that
not all disks are accessible cluster wide or there's something dodgy with
the application. Our DR used to be copying RBFs over to the mirror machine
and restoring them and rolling forward AIJs. Once every couple of years
we'd be forced to run in DR for a week and then switch back with no loss of
data. They were *never* able to get the Unix systems to achieve the same
thing! (They'd just get someone to log on and that would be that. i.e.
production never shifted) VMS guys were moving to a Disaster Tolerant set up
when I left.

By Data Distributor, do you mean the DDAL$TR_DB.RDB?

All the DSAn disks are accessible clusterwide.

Quote:
My *guess* is everything will be ok except for DNS cache flushes and
hard-coded SQL/Services server names. (But then, if I was getting paid to do
it, I'd make sure :-)

The application-specific SQL/Services are set up identically on MP1 and
OP1.

The DNS cache switch also needs to be checked with the downstream applications
that connect to our Rdb database, but that's another story.


regards,

Nazim Manser

Quote:
Regards Richard Maher

"Nazim" <nmanser (AT) progis (DOT) de> wrote in message
news:1162981579.381769.299700 (AT) k70g2000cwa (DOT) googlegroups.com...

Richard Maher schrieb:

Hi Nazim,

If this system runs anything other than "Mom and Dad's corner Deli VAT return"
then I suspect that you (or the company you support) are in big trouble!
Get yourself a professional DBA and pay them what they ask to do the job
properly. The questions turning up here (and more so in the ITRC) about Rdb
are truly frightening. I wish I could find out who these companies are and
turn up to their next risk-assessment or shareholders meeting :-(

that is why i was assigned the task to ensure correct failover
strategy.


Anyway no one can answer your question directly unless they know a bit more
about MP and OP. I suggest "yes" but if you've never tried a failover before
then what are the extra machines there for. The fact that you appear to be
running Data Distributor raises an eyebrow, but my advice is to open the
database on *all* nodes and use them *all* *all* of the time in possibly a
wide-area cluster configuration.


MP1 and OP1 are on 2 sites but share the same file system.
To be precise,

the file layout of the RDB stuff is as follows:

root file location : dsa618:[db_disk001.db]
RDA & SNP files    : dsa618:[db_disk001.db]
                     dsa618:[db_disk002.db]
                     dsa618:[db_disk003.db]
AIJ files          : dsa616:[db_diskA01.db]
                     dsa616:[db_diskA02.db]
RUJ files          : dsa617:[rdms$ruj]


MP1>sh dev dsa618

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA618:                 Mounted              0  DMG_DB        32582436  7696   4
$1$DGA230:    (OP2)     ShadowSetMember      0  (member of DSA618
$1$DGA430:    (MP1)     ShadowSetMember      0  (member of DSA618

MP1>sh dev dsa621

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA621:                 Mounted              0  DMG_DB2       12936924     5   4
$1$DGA214:    (OP2)     ShadowSetMember      0  (member of DSA621
$1$DGA414:    (MP1)     ShadowSetMember      0  (member of DSA621

MP1>sh dev dsa616

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA616:                 Mounted              0  DMG_AIJ        8673228   100   4
$1$DGA210:    (OP2)     ShadowSetMember      0  (member of DSA616
$1$DGA410:    (MP1)     ShadowSetMember      0  (member of DSA616

MP1>sh dev dsa617

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA617:                 Mounted              0  DMG_RUJ       17359776   165   4
$1$DGA211:    (OP2)     ShadowSetMember      0  (member of DSA617
$1$DGA411:    (MP1)     ShadowSetMember      0  (member of DSA617


Usually an RMU/OPEN on that DB is done only on MP1.
I would like to know what happens when, in case of failover (MP1
crashes), I do an RMU/OPEN on the OP1 node.

As it is a mission-critical production DB, I want to be 100% sure
before updating our documentation.

So as you say, the RMU/OPEN should be done on both MP1 and OP1 as soon
as they reboot. Correct?

I am new (2 months) and I inherited the task of supporting the
application and its underlying Rdb database.

regards,

Nazim Manser

Rdb engineering hates clusters 'cos Norm doesn't get to use his beloved
Row-Ca$h, but don't let that bother you.

Regards Richard Maher




  #9  
Old   
Richard Maher
 
Posts: n/a

Default Re: application failover with RDB access - 11-09-2006 , 06:33 AM



Hi Nazim,

Quote:
it is neither cologne nor munich.

Frankfurt? How's the Spring workload shaping up? :-)

Anyway, what downtime window do you have? All I can suggest, on the
information that you've given, is that you go in early one Sunday morning
and shut down the application on MP1 followed by a close of all the
databases. Then *before anything else* do a full off-line backup of all
databases (probably followed by a complete rmu/verify if you haven't been
doing them). On second thoughts, best not to ask too many questions eh :-)
Then open the database(s) and applications up on OP1 and let the testers do
their work. If the System Startups/UAFs/logicals/configs and specs are the
same then I foresee no problems. Does the RDMS$RUJ logical point to the same
place on all nodes? Anything in sys$specific?
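
The order of operations described above could be written down as a small checklist generator. This is only a sketch in Python: the RMU commands are shown without qualifiers, which would have to be checked against the RMU documentation for the installed version, and the backup file name is a hypothetical placeholder.

```python
def failover_test_steps(db_root, backup_file, old_node, new_node):
    """Return (node, action) pairs in the order described above.

    Nothing is executed; the list is meant to be reviewed and pasted
    into a test plan.
    """
    return [
        (old_node, "stop the application"),
        (old_node, f"RMU/CLOSE {db_root}"),
        # Full off-line backup *before anything else* is touched.
        (old_node, f"RMU/BACKUP {db_root} {backup_file}"),
        # Complete verify, in case it has not been run regularly.
        (old_node, f"RMU/VERIFY/ALL {db_root}"),
        (new_node, f"RMU/OPEN {db_root}"),
        (new_node, "start the application and hand over to the testers"),
    ]


steps = failover_test_steps(
    db_root="DSA618:[DB_DISK001.DB]DMG_DB.RDB",
    backup_file="BACKUP_DISK:[RBF]DMG_DB.RBF",  # hypothetical location
    old_node="MP1",
    new_node="OP1",
)
for number, (node, action) in enumerate(steps, start=1):
    print(f"{number}. [{node}] {action}")
```

Keeping the backup and verify ahead of the open on OP1 mirrors the "before anything else" advice: if the test goes wrong there is a known-good copy to restore from.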

In summary Nazim, apart from the suck-it-and-see approach, I see no way
forward.

The one question I'd be sure to ask yourself before attempting the fail-over
is "when was the last time that I've had to do a production restore in
anger?". If the answer ends up "Buggered if I know!" then I suggest that you
practice restoring the database to the test box, maybe rolling forward AIJs,
enabling AIJs again.

Are you running circular AIJs or single/extensible? ALS? You don't say
you're running hot-standby but you are running DDAL; what transfers will
stop when you switch over?

Do you have a support contract? If so call Oracle Rdb support for help. If
not, someone should bring this to the attention of the manager of the
dickhead that made that decision! Probably the same dickhead that sacked all
the real DBAs in the first place :-(

You're on your own. Good-Luck.

Regards Richard Maher

$ pipe rmu/dump/head mf_personnel | sea sys$pipe node
Maximum node count is 16
- WARNING: Maximum node count is 16 instead of 1

  #10  
Old   
Nazim
 
Posts: n/a

Default Re: application failover with RDB access - 11-16-2006 , 01:05 PM




Richard Maher schrieb:

Quote:
Hi Nazim,

it is neither cologne nor munich.
Frankfurt? How's the Spring workload shaping up? :-)

Neither; we are located outside Germany.


Quote:
Anyway, what downtime window do you have?

In production, the maintenance window must be scheduled in advance and
involves a lot of other teams (downstream applications).
Before doing such a test in prod, it must have been tested successfully
in the test environment.


So I have to do it first in the test environment; the problem is that it
consists of only 1 standalone machine.
I have to ask those responsible to configure the test environment like
prod (i.e. a 2-node cluster with quorum disk and shared storage).

Quote:
All I can suggest, on the
information that you've given, is that you go in early one Sunday morning
and shut down the application on MP1 followed by a close of all the
databases. Then *before anything else* do a full off-line backup of all
databases (probably followed by a complete rmu/verify if you haven't been
doing them.

We back up the Rdb database daily in hot backup mode;
the DB remains open.

If I do $rmu/verify/root, is that sufficient? It takes 13 min.

Quote:
On second thoughts, best not to ask too many questions eh :-)
Then open the database(s) and applications up on OP1 and let the testers do
their work. If the System Startups/UAFs/logicals/configs and specs are the
same then I foresee no problems.

I have checked:

system startups come from a common source
UAF and rightslist come from a common area
logicals and config are centralized; only a parameter decides on which
node the app runs.


Quote:
Does the RDMS$RUJ logical point to the same
place on all nodes? Anything in sys$specific?

SYSMAN> do show log /all /full rdms$ruj
%SYSMAN-I-OUTPUT, command execution on node QRM
"RDMS$RUJ" [exec] = "DB_DISKRUJ" (LNM$SYSTEM_TABLE)
1 "DB_DISKRUJ" [exec] = "DSA617:[RDMS$RUJ]" [terminal] (LNM$SYSTEM_TABLE)
%SYSMAN-I-OUTPUT, command execution on node OP2
"RDMS$RUJ" [exec] = "DB_DISKRUJ" (LNM$SYSTEM_TABLE)
1 "DB_DISKRUJ" [exec] = "DSA617:[RDMS$RUJ]" [terminal] (LNM$SYSTEM_TABLE)
%SYSMAN-I-OUTPUT, command execution on node OP1
"RDMS$RUJ" [exec] = "DB_DISKRUJ" (LNM$SYSTEM_TABLE)
1 "DB_DISKRUJ" [exec] = "DSA617:[RDMS$RUJ]" [terminal] (LNM$SYSTEM_TABLE)
%SYSMAN-I-OUTPUT, command execution on node MP2
"RDMS$RUJ" [exec] = "DB_DISKRUJ" (LNM$SYSTEM_TABLE)
1 "DB_DISKRUJ" [exec] = "DSA617:[RDMS$RUJ]" [terminal] (LNM$SYSTEM_TABLE)
%SYSMAN-I-OUTPUT, command execution on node MP1
"RDMS$RUJ" [exec] = "DB_DISKRUJ" (LNM$SYSTEM_TABLE)
1 "DB_DISKRUJ" [exec] = "DSA617:[RDMS$RUJ]" [terminal] (LNM$SYSTEM_TABLE)
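
A rough way to automate this comparison is sketched below in Python. It assumes the SYSMAN output has been captured to a string, that each node section starts with a "command execution on node" line, and that the final translation is the quoted value followed by "[terminal]", as in the listing above. The function names are illustrative only.

```python
import re

NODE_RE = re.compile(r"command execution on node (\S+)")
# The quoted value immediately before the [terminal] attribute.
TERMINAL_RE = re.compile(r'=\s*"([^"]+)"\s*\[terminal\]')


def terminal_translations(sysman_output):
    """Map each node name to the terminal translation reported for it."""
    result = {}
    node = None
    for line in sysman_output.splitlines():
        node_match = NODE_RE.search(line)
        if node_match:
            node = node_match.group(1)
            continue
        value_match = TERMINAL_RE.search(line)
        if value_match and node is not None:
            result[node] = value_match.group(1)
    return result


def all_nodes_agree(sysman_output):
    """True only if at least one node reported and all values are identical."""
    values = set(terminal_translations(sysman_output).values())
    return len(values) == 1


# Two of the five node sections from the listing above.
sample = '''%SYSMAN-I-OUTPUT, command execution on node OP1
"RDMS$RUJ" [exec] = "DB_DISKRUJ" (LNM$SYSTEM_TABLE)
1 "DB_DISKRUJ" [exec] = "DSA617:[RDMS$RUJ]" [terminal]
%SYSMAN-I-OUTPUT, command execution on node MP1
"RDMS$RUJ" [exec] = "DB_DISKRUJ" (LNM$SYSTEM_TABLE)
1 "DB_DISKRUJ" [exec] = "DSA617:[RDMS$RUJ]" [terminal]
'''

print(terminal_translations(sample))
print(all_nodes_agree(sample))  # True
```

The same idea would apply to any other logical name that has to resolve identically on MP1 and OP1 before a switch-over.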


Quote:
In summary Nazim, apart from the suck-it-and-see approach, I see no way
forward.

The one question I'd be sure to ask yourself before attempting the fail-over
is "when was the last time that I've had to do a production restore in
anger?". If the answer ends up "Buggered if I know!" then I suggest that you
practice restoring the database to the test box, maybe rolling forward AIJs,
enabling AIJs again.

On the test box AIJ is disabled. I have to enable it once they agree
to increase the disk space.

Disk space status in test:

root file, SNP and RDA ---> DSA1: 2,5 GB free of a total of 18 GB
SNP and RDA            ---> DSA2: 1,6 GB free of a total of 18 GB
SNP and RDA and RUJ    ---> DSA3: 2,8 GB free of a total of 18 GB


Disk space status on prod:

root file, RDA & SNP ---> DSA618: 15,7 GB free of a total of 69,5 GB
RDA & SNP            ---> DSA621: 6,3 GB free of a total of 8 GB
AIJ                  ---> DSA616: 4,2 GB free of a total of 8 GB
RUJ                  ---> DSA617: almost all free of 8 GB total



Quote:
Are you running circular AIJs or single/extensible?

- After-image journaling is enabled
- Database is configured for 70 journals
- Reserved journal count is 70
- Available journal count is 36
- LogMiner is disabled
- Journal switches to next available when full
- 1 journal has been modified with transaction data
- 34 journals can be created while database is active
- Journal "AIJ35" is current
- All journals are accessible
- Shutdown time is 120 minutes
- Backup operation is automatic via server
- Backup uses no-quiet-point
- Default backup filename edits are not used
- Log server startup is AUTOMATIC
- Operator notification is enabled for the following operators
Central
Cluster
- Journal overwrite is disabled
- AIJ cache on "electronic disk" is disabled
- Default journal allocation is 250000 blocks
- Default journal extension is 25000 blocks
Default extension ignored because multiple journals active
- Default journal initialization is 250000 blocks

Quote:
ALS?

OpenVMS V7.3 on node MP1 16-NOV-2006 19:51:47.90 Uptime 14 12:25:31
  Pid    Process Name    State  Pri      I/O       CPU       Page flts  Pages
23200430 RDMS_MONITOR    LEF     15   160402   0 00:01:09.79    110471     88
23200561 RDM_ALS_1       HIB     15   211180   0 00:02:56.15       219    314
23200643 RDM_ALS_2       HIB     15     6253   0 00:01:18.45       402    491



Quote:
You don't say
you're running hot-standby but you are running DDAL; what transfers will
stop when you switch over?

The DDAL only replicates a subset of the data, destined for the public.
Its purpose is selective distribution, not availability.

Quote:
Do you have a support contract?

i was engaged to do the work. :-)

Quote:
If so call Oracle Rdb support for help. If
not, someone should bring this to the attention of the manager of the
dickhead that made that decision! Probably the same dickhead that sacked all
the real DBAs in the first place :-(

You're on your own. Good-Luck.

Regards Richard Maher

$ pipe rmu/dump/head mf_personnel | sea sys$pipe node
Maximum node count is 16
- WARNING: Maximum node count is 16 instead of 1

"Nazim" <nmanser (AT) progis (DOT) de> wrote in message
news:1162990172.492625.213000 (AT) f16g2000cwb (DOT) googlegroups.com...

Richard Maher schrieb:

Hi Nazim,

that is why i was assigned the task to ensure correct failover
strategy.

And you're a contractor right? (Or you boss is a contractor?) Let's hope
the
customers not reading this eh :-) I'd love to know how much the
contract's
for, but then it's Cologne and not Munich and it's none of my business.


it is neither cologne nor munich.

yes i am contractor, my boss is permanent and only since 1 year, so he
inherited the stuff as it is.
My role is to implement the failover scenario of our app, including the
underlying RDB.
the RDB stuff was implemented long time ago, and the team left since
and the handover was not done correctly to my boss. (since he was there
all worked fine, last time the DB was opened is over a year ago.


MP1>rmu/show system sql$database
Oracle Rdb V7.0-61 on node EVAMP1 10-OCT-2006 11:22:47.97
- monitor started 6-NOV-2004 11:12:09.67 (uptime 703 00:10:38)
- monitor log filename is "SYS$SYSROOT:[SYSEXE]RDMMON.LOG;77"

database DSA0:[DDAL.DATABASE]DDAL$TR_DB.RDB;1
- first opened 11-NOV-2005 14:18:37.78 (elapsed 332 21:04:10)
* database is opened by an operator
- current after-image journal file is
TPZH_DAL_DB_DISKA01:[AIJ]DDAL_AIJ001.AIJ;1559
- AIJ Log Server is active
- 2 active database users
- database also open on these nodes:
OP1

this is our prod DB

database DSA618:[DB_DISK001.DB]DMG_DB.RDB;1
- first opened 15-MAY-2005 07:08:43.83 (elapsed 513 04:14:04)
* database is opened by an operator
- current after-image journal file is DB_DISKA02:[AIJ]AIJ18.AIJ;1
- global buffer count is 30000; 20550 global buffers free
- maximum global buffer count per user is 100
- global section resides in system space
- AIJ Log Server is active
- 190 active database users


Anyway, is there not a UAT or other test environment that this can be
tested
in first?


unfortunately the UAT environment is on a standalone VMS machine


I'll assume not. When you do an RMU/DUMP/HEADER on the database(s) do
they
say number of cluster nodes is "1"? If they do then you'll have to make
sure
the databases are closed on MP1 before trying to open them on OP1. If
not
just open them up on both nodes and fire up the application on both
nodes
(if it's cluster tolerant) and get the application testing people
involved.


on the RMU/DUMP/HEADER there is no reference of cluster nodes, the only
reference is
in the case of the DDAL database is also open on OP1.

it is the node numbers.


MP1>search ddal_dump.txt node
Maximum node count is 16
- WARNING: Maximum node count is 16 instead of 1
MP1>search dmg_dump.txt node
Maximum node count is 1 ----> yes.


but what if MP1 crashes ? is there any danger to open the database on
the other node ?


our application is designed to be run only on 1 node at a time, but the
RDB can be opened also on OP1 as a standby solution.
OK before doing this i must close DB on MP1, then open on MP1 and OP1.

i have to implement the application failover scenario on the VMS side,
and the testing activities can only be done in a very restricted window
on the week end.

i have first to implement the theoretical stuff, then schedule a test
plan.


so as you say, the RMU/open should be done on both MP1 and OP1 as soon
as they reboot. correct ?

No, I was suggesting that the beauty of VMS clusters and Rdb is that you
don't have to "fail-over" because, personally, I would open the database
and
the application on all of the nodes all of the time. If MP1 goes down
then
there would be a pregnant-pause followed by MP1 users having to log in
again, but that's it. The cluster took a lickin' but it kept on tickin'.
With Rdb partitioned lock trees and all the work VMS engineering has
been
doing with the DLM *and* the new interconnect stuff coming along, I see
no
point in restricting a database to one node. (Never have :-)

this was done by other team, and they did not document why they did
that like this.


The fact that you're using Data Distributor (why?) leeds me to suspect
that
not all disks are accessible cluster wide or there's something dodgy
with
the application. Our DR used to be copying RBFs over to the mirror
machine
and restoring them and rolloing forward AIJs. Once every couple of years
we'd be forced to run in DR for a week and then switch back with no loss
of
data. They were *never* able to get the Unix systems to achieve the same
thing! (They'd just get someone to log on and that would be that. i.e.
production never shifted) VMS guys were moving to a Disaster Tolerant
set up
when I left.

do you mean by data distributor the DDAL$TR_DB.RDB ?

all the DSAn disks are accessible clustewide.


My *guess* is everything will be ok except for DNS cache flushes and
hard-coded SQL/Services server names. (But then, if I was getting paid
to do
it, I'd make sure :-)

the application specific sqlservices are setup identically on MP1 and
OP1

DNS cache switch needs also be checked with the downstram applications
which connects to our RDB, but thats another story.


regards,

Nazim Manser


Regards Richard Maher

"Nazim" <nmanser (AT) progis (DOT) de> wrote in message
news:1162981579.381769.299700 (AT) k70g2000cwa (DOT) googlegroups.com...

Richard Maher schrieb:

Hi Nazim,

If this system runs anything other than "Mom and Dad's corner Deli VAT
return" then I suspect that you (or the company you support) are in big
trouble! Get yourself a professional DBA and pay them what they ask to
do the job properly. The questions turning up here (and more so in the
ITRC) about Rdb are truly frightening. I wish I could find out who these
companies are and turn up to their next risk-assessment or shareholders
meeting :-(

that is why i was assigned the task of ensuring a correct failover
strategy.


Anyway, no one can answer your question directly unless they know a bit
more about MP and OP. I suggest "yes", but if you've never tried a
failover before then what are the extra machines there for? The fact
that you appear to be running Data Distributor raises an eyebrow, but my
advice is to open the database on *all* nodes and use them *all* *all*
of the time in possibly a wide-area cluster configuration.


MP1 and OP1 are on 2 sites but share the same file system.
to be precise:

the file layout of the RDB stuff is as follows:

root file location: dsa618:[db_disk001.db]
RDA & SNP files:    dsa618:[db_disk001.db]
                    dsa618:[db_disk002.db]
                    dsa618:[db_disk003.db]
AIJ files:          dsa616:[db_diskA01.db]
                    dsa616:[db_diskA02.db]
RUJ files:          dsa617:[rdms$ruj]


MP1>sh dev dsa618

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA618:                 Mounted              0  DMG_DB        32582436  7696   4
$1$DGA230:    (OP2)     ShadowSetMember      0  (member of DSA618
$1$DGA430:    (MP1)     ShadowSetMember      0  (member of DSA618

MP1>sh dev dsa621

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA621:                 Mounted              0  DMG_DB2       12936924     5   4
$1$DGA214:    (OP2)     ShadowSetMember      0  (member of DSA621
$1$DGA414:    (MP1)     ShadowSetMember      0  (member of DSA621

MP1>sh dev dsa616

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA616:                 Mounted              0  DMG_AIJ        8673228   100   4
$1$DGA210:    (OP2)     ShadowSetMember      0  (member of DSA616
$1$DGA410:    (MP1)     ShadowSetMember      0  (member of DSA616

MP1>sh dev dsa617

Device                  Device           Error    Volume         Free  Trans Mnt
 Name                   Status           Count     Label        Blocks Count Cnt
DSA617:                 Mounted              0  DMG_RUJ       17359776   165   4
$1$DGA211:    (OP2)     ShadowSetMember      0  (member of DSA617
$1$DGA411:    (MP1)     ShadowSetMember      0  (member of DSA617


usually an RMU/OPEN on that DB is done only on MP1.
i would like to know what happens when, in case of failover (MP1
crashes), i do an RMU/OPEN on node OP1.

as it is a mission-critical production DB, i want to be 100% sure
before updating our documentation.

so as you say, the RMU/OPEN should be done on both MP1 and OP1 as soon
as they reboot. correct?

i am new (2 months) and i inherited the task of supporting the
application and its underlying RDB.

regards,

Nazim Manser

Rdb engineering hates clusters 'cos Norm doesn't get to use his beloved
Row-Ca$h, but don't let that bother you.

Regards Richard Maher

"Nazim" <nmanser (AT) progis (DOT) de> wrote in message
news:1162918356.022076.305610 (AT) b28g2000cwb (DOT) googlegroups.com...
Hi guys,

we are running RDB (Oracle Rdb V7.0-61), SQLSERVICES (v7.1-59) on a 5
node cluster on 2 sites.
OpenVMS V7.3


site 1:

MP1 sys$sysroot = DSA200:[SYS0.]
MP2 sys$sysroot = DSA200:[SYS1.]

site 2:

OP1 sys$sysroot = DSA100:[SYS0.]
OP2 sys$sysroot = DSA100:[SYS1.]
QRM sys$sysroot = DSA300:[SYS0.]

our application runs on node MP1 and uses the following DB
database DSA618:[DB_DISK001.DB]DB.RDB

but for failover scenario we need to do a RMU/OPEN
DSA618:[DB_DISK001.DB]DB.RDB on node OP1, are there any problems doing
this ?

RDB is started on nodes MP1 and OP1 but in normal operations the DB
database DSA618:[DB_DISK001.DB]DB.RDB is opened only on node MP1

thanks for your answers

N.Manser



SYSMAN> do rmu/show system
%SYSMAN-I-OUTPUT, command execution on node QRM
%DCL-W-IVVERB, unrecognized command verb - check validity and spelling
\RMU\
%SYSMAN-I-OUTPUT, command execution on node OP2
%DCL-W-ACTIMAGE, error activating image RDMPRV
-CLI-E-IMGNAME, image file DSA100:[SYS1.SYSCOMMON.][SYSLIB]RDMPRV.EXE;8
-SYSTEM-F-PROTINSTALL, protected images must be installed
%SYSMAN-I-OUTPUT, command execution on node OP1
Oracle Rdb V7.0-61 on node OP1 7-NOV-2006 17:40:14.74
- monitor started 8-APR-2006 22:29:10.31 (uptime 212 19:11:04)
- monitor log filename is "SYS$SYSROOT:[SYSEXE]RDMMON.LOG;107"
database DSA0:[DDAL.DATABASE]DDAL$TR_DB.RDB;1
- first opened 8-APR-2006 22:30:00.82 (elapsed 212 19:10:13)
- current after-image journal file is
DB_DISKA01:[AIJ]DDAL_AIJ001.AIJ;1575
- AIJ Log Server is active
- 2 active database users
- database also open on these nodes:
MP1
%SYSMAN-I-OUTPUT, command execution on node MP2
%DCL-W-ACTIMAGE, error activating image RDMPRV
-CLI-E-IMGNAME, image file DSA200:[SYS1.SYSCOMMON.][SYSLIB]RDMPRV.EXE;8
-SYSTEM-F-PROTINSTALL, protected images must be installed
%SYSMAN-I-OUTPUT, command execution on node MP1
Oracle Rdb V7.0-61 on node MP1 7-NOV-2006 17:40:13.59
- monitor started 2-NOV-2006 07:36:25.92 (uptime 5 10:03:47)
- monitor log filename is "SYS$SYSROOT:[SYSEXE]RDMMON.LOG;79"
database DSA618:[DB_DISK001.DB]DB.RDB;1
- first opened 2-NOV-2006 08:32:01.47 (elapsed 5 09:08:12)
* database is opened by an operator
- current after-image journal file is
DB_DISKA01:[AIJ]AIJ25.AIJ;1
- global buffer count is 30000; 22250 global buffers free
- maximum global buffer count per user is 100
- global section resides in system space
- AIJ Log Server is active
- 156 active database users
database DSA0:[DDAL.DATABASE]DDAL$TR_DB.RDB;1
- first opened 2-NOV-2006 07:37:17.02 (elapsed 5 10:02:56)
- current after-image journal file is
DB_DISKA01:[AIJ]DDAL_AIJ001.AIJ;1578
- AIJ Log Server is active
- 2 active database users
- database also open on these nodes:
OP1
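
The output above can also be checked mechanically, e.g. before and
after a failover test, to confirm which nodes report each database as
open. A rough sketch in Python, assuming the SYSMAN output has been
captured as text with lines in the shape shown above; the function name
and the trimmed sample are illustrative only:

```python
import re
from collections import defaultdict

# Line SYSMAN prints before each node's output.
NODE_RE = re.compile(r"^%SYSMAN-I-OUTPUT, command execution on node (\S+)")
# Line that starts a per-database block in the RMU/SHOW SYSTEM output.
# Lines such as "- database also open on these nodes:" do not match,
# because they do not begin with the word "database".
DB_RE = re.compile(r"^\s*database\s+(\S+)")


def open_databases(sysman_text):
    """Map each database file spec to the set of nodes reporting it open."""
    result = defaultdict(set)
    node = None
    for line in sysman_text.splitlines():
        m = NODE_RE.match(line)
        if m:
            node = m.group(1)
            continue
        m = DB_RE.match(line)
        if m and node is not None:
            result[m.group(1)].add(node)
    return dict(result)


# Trimmed copy of the output posted above, used only as sample input.
SAMPLE = """\
%SYSMAN-I-OUTPUT, command execution on node OP1
Oracle Rdb V7.0-61 on node OP1 7-NOV-2006 17:40:14.74
database DSA0:[DDAL.DATABASE]DDAL$TR_DB.RDB;1
- database also open on these nodes:
MP1
%SYSMAN-I-OUTPUT, command execution on node MP1
Oracle Rdb V7.0-61 on node MP1 7-NOV-2006 17:40:13.59
database DSA618:[DB_DISK001.DB]DB.RDB;1
* database is opened by an operator
database DSA0:[DDAL.DATABASE]DDAL$TR_DB.RDB;1
- database also open on these nodes:
OP1
"""

if __name__ == "__main__":
    for db, nodes in sorted(open_databases(SAMPLE).items()):
        print(db, "->", ", ".join(sorted(nodes)))
```

With the sample above, DB.RDB is reported only by MP1, while
DDAL$TR_DB.RDB is reported by both MP1 and OP1, which matches the
situation described in the post.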




