dbTalk Databases Forums  

Replication question

comp.databases.informix


#1
Laurie Gustin
Replication question - 11-21-2007, 11:08 AM






IDS 10FC6 HP-UX 11.11


My server went into DDR Block mode last night. In the process of getting the rest of my systems back online, the instance got restarted. There was a bunch of data in the recv queue on the receiving server that finally got processed, but it appears that no data is replicating now. I think it is because the instance was restarted while in DDR Block, so it kind of lost its mind and didn't know it was supposed to go into catch-up mode. I can deal with that.

My question has to do with the onstat -g ddr output (see below): just what is the value under Tossed (LBC full)? I'm thinking that is all my data that is just being tossed and not replicating. Also, the snoopy and replay positions are not moving at all.

I have already resolved to stop/delete and recreate all my replicates (good thing I saved my scripts!) - but any additional light someone can shed on this would be appreciated.


IBM Informix Dynamic Server Version 10.00.FC6 -- On-Line -- Up 11:21:51 -- 1949156 Kbytes

DDR -- Running --

# Event  Snoopy   Snoopy   Replay   Replay   Current  Current
Buffers  ID       Position ID       Position ID       Position
3088     38464    c7fe700  38464    c7f919c  38501    4c654000

Log Pages Snooped:
     From      From      Tossed
    Cache      Disk  (LBC full)
        0     51199     2538285

Total dynamic log requests: 0

DDR events queue

Type TX id Partnum Row id




Thanks

Laurie!
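
A side note for anyone who wants to confirm that the snoopy and replay positions really are stuck: the Python sketch below parses the data row of the position table from onstat -g ddr and compares two samples. The field order is taken from the output in this post. The names parse_ddr_row and is_stalled, the second sample row, and the assumption that positions are hexadecimal are all illustrative, not anything the server provides.

```python
from typing import NamedTuple


class DdrPositions(NamedTuple):
    """One data row of the position table in `onstat -g ddr`."""
    event_buffers: int
    snoopy_id: int
    snoopy_pos: int
    replay_id: int
    replay_pos: int
    current_id: int
    current_pos: int


def parse_ddr_row(row: str) -> DdrPositions:
    """Parse a row such as '3088 38464 c7fe700 38464 c7f919c 38501 4c654000'."""
    fields = row.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}: {row!r}")
    buffers, s_id, s_pos, r_id, r_pos, c_id, c_pos = fields
    # Assumption: log IDs are decimal, positions are hexadecimal.
    return DdrPositions(
        int(buffers), int(s_id), int(s_pos, 16),
        int(r_id), int(r_pos, 16), int(c_id), int(c_pos, 16),
    )


def is_stalled(earlier: DdrPositions, later: DdrPositions) -> bool:
    """True if the current position advanced while snoopy and replay did not."""
    snoopy_same = (earlier.snoopy_id, earlier.snoopy_pos) == (later.snoopy_id, later.snoopy_pos)
    replay_same = (earlier.replay_id, earlier.replay_pos) == (later.replay_id, later.replay_pos)
    current_moved = (earlier.current_id, earlier.current_pos) != (later.current_id, later.current_pos)
    return snoopy_same and replay_same and current_moved


# First sample: the row posted above.
first = parse_ddr_row("3088 38464 c7fe700 38464 c7f919c 38501 4c654000")
# Hypothetical second sample taken a little later: only "current" has moved.
second = parse_ddr_row("3088 38464 c7fe700 38464 c7f919c 38502 1a000")

# How many logical logs the replay position trails the current log by.
logs_behind = first.current_id - first.replay_id
print(is_stalled(first, second), logs_behind)  # prints: True 37
```

With the posted numbers, the replay position sits 37 logical logs behind the current log, which is the gap that matters for whether the replay log can be overwritten.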




#2
jprenaut@yahoo.com
Re: Replication question - 11-21-2007, 12:38 PM







Quote:
My question has to do with the onstat -g ddr output (see below): just what is the value under Tossed (LBC full)? I'm thinking that is all my data that is just being tossed and not replicating. Also, the snoopy and replay positions are not moving at all.

The Tossed value works like this: as your system generates log
records and puts them in the logical log buffer, and that buffer is
flushed to disk, CDR attempts to copy it from memory into a
CDR-specific log record buffer cache, so that the snoopy thread does
not have to read those records from disk. However, there is a limited
number of those buffers, and if the cache is already full, we
increment the tossed count and those records simply have to be snooped
from disk. I don't believe that would be a reason why data isn't
replicating. It is more of a performance value: ideally the snooper
would run more quickly if it could get its work from this cache
rather than retrieve it from disk.
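
To put numbers on that, here is a small Python sketch that reads the "Log Pages Snooped" row from the onstat -g ddr output in the original post and reports how much of the snooping was served from the cache. The function name snoop_summary and the idea of expressing it as a ratio are illustrative assumptions, not something onstat reports itself.

```python
def snoop_summary(row: str) -> dict:
    """Summarise the 'Log Pages Snooped' row: From Cache, From Disk, Tossed."""
    from_cache, from_disk, tossed = (int(value) for value in row.split())
    snooped = from_cache + from_disk
    # Share of snooped pages that came from the in-memory cache.
    cache_ratio = from_cache / snooped if snooped else 0.0
    return {
        "from_cache": from_cache,
        "from_disk": from_disk,
        "snooped": snooped,
        "cache_ratio": cache_ratio,
        "tossed": tossed,
    }


# Row taken from the posted output: From Cache, From Disk, Tossed (LBC full).
summary = snoop_summary("0 51199 2538285")
print(summary)
```

For the posted values the cache ratio is 0.0: every snooped page was read from disk. That fits the explanation above, in that the tossed count reflects a performance cost rather than lost data.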

Did the replay position get overwritten? When the server was bounced,
were there any messages in the online.log concerning the ability of CDR
to start back up? If logical log id 38464 (uniqid from onstat -l
output) is no longer on disk (i.e. the replay position was
overwritten), then yes, CDR is in big trouble: that is the spot the
snoopy thread needs to reach to start snooping the logs again, and if
it can't, I don't believe anything will get replicated, since the
snooper isn't reading any log records. However, if logical log id
38464 is still on disk and the replay position hasn't been
overwritten, it would be hard to say why the replay position isn't
advancing without seeing other information.
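
The check described above (is the replay log's uniqid still listed by onstat -l?) can be sketched in Python as follows. This is only a sketch under assumptions: the column layout of the log rows (address, number, flags, uniqid, ...) is assumed and should be compared against your own onstat -l output, the sample rows are made up for illustration, and the function names are hypothetical.

```python
import re

# Assumed layout of a logical-log row in `onstat -l`:
# address  number  flags(7 chars)  uniqid  begin  size  used  %used
LOG_ROW = re.compile(r"^\s*[0-9a-fA-F]+\s+\d+\s+[A-Z-]{7}\s+(\d+)\s+")


def log_uniqids(onstat_l_text: str) -> set:
    """Collect the uniqid of every logical-log row found in the text."""
    ids = set()
    for line in onstat_l_text.splitlines():
        match = LOG_ROW.match(line)
        if match:
            ids.add(int(match.group(1)))
    return ids


def replay_log_on_disk(onstat_l_text: str, replay_id: int) -> bool:
    """True if the replay log ID still appears among the logs on disk."""
    return replay_id in log_uniqids(onstat_l_text)


# Made-up sample rows; real output has more lines above the log list.
SAMPLE = """\
address          number   flags    uniqid   begin                size     used    %used
c0000000a1f2d7e8 1        U-B----  38464    2:53                 5000     5000   100.00
c0000000a1f2d850 2        U-B----  38465    2:5053               5000     5000   100.00
c0000000a1f2d8b8 3        U---C-L  38501    2:10053              5000     1210    24.20
"""

print(replay_log_on_disk(SAMPLE, 38464))  # replay ID from the onstat -g ddr output
```

If the function returns False for the replay ID shown by onstat -g ddr, that corresponds to the "replay position overwritten" case.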

Jacques
IBM Informix


#3
Laurie Gustin
Re: Replication question - 11-21-2007, 01:05 PM



Hmmm - the replay position hasn't been overwritten. I have attached onstat -l and onstat -g ddr outputs.

I added a bunch of logs last night to make sure we didn't block again until I could do more research this morning (and think more clearly). As you can see from the onstat -l output, there are tons of new logs out there, but the system just dynamically created log #89. That really confused me.

So maybe the problem is just with my logs...

Thanks.
Laurie

_______________________________________________
Informix-list mailing list
Informix-list (AT) iiug (DOT) org
http://www.iiug.org/mailman/listinfo/informix-list




#4
TBP
Re: Replication question - 11-21-2007, 03:16 PM




Run the following and post the output:

cdr list server

onstat -g rqm brief

onstat -g nif

Perhaps also attach an onstat -g cat.

With the above output, we may be able to determine whether the problem is on the
source or the target(s) of this server.
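
If it helps to gather all of that in one go, here is a minimal Python sketch that runs the commands listed above and joins their output into one report to attach to a reply. The commands are the ones named in this post; the helper names (run_command, collect, format_report) are hypothetical, and the script assumes the Informix environment is already set so that cdr and onstat are on the PATH.

```python
import subprocess
from typing import Callable, Dict, List

# The diagnostics requested above, one argv list per command.
COMMANDS: List[List[str]] = [
    ["cdr", "list", "server"],
    ["onstat", "-g", "rqm", "brief"],
    ["onstat", "-g", "nif"],
    ["onstat", "-g", "cat"],
]


def run_command(argv: List[str]) -> str:
    """Run one command and return its combined stdout and stderr."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr


def collect(commands: List[List[str]] = COMMANDS,
            runner: Callable[[List[str]], str] = run_command) -> Dict[str, str]:
    """Run every command, recording a note instead of failing if one cannot start."""
    report: Dict[str, str] = {}
    for argv in commands:
        name = " ".join(argv)
        try:
            report[name] = runner(argv)
        except OSError as exc:  # e.g. the command is not on the PATH
            report[name] = f"could not run: {exc}"
    return report


def format_report(report: Dict[str, str]) -> str:
    """Join the outputs into one text block with a header per command."""
    sections = [f"===== {name} =====\n{output.rstrip()}\n" for name, output in report.items()]
    return "\n".join(sections)


if __name__ == "__main__":
    print(format_report(collect()))
```

Run it on both the source and the target server so the two reports can be compared side by side.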


#5
Madison Pruet
Re: Replication question - 11-21-2007, 05:14 PM



Can you post the onstat -g stk output of the ddr thread?





#6
Laurie Gustin
Re: Replication question - 11-21-2007, 05:43 PM



UPDATE - The engine went into DDR Block mode again, taking down all my systems (again...). This time I restarted the engine and ran cdr stop on both servers. When I started cdr again, the snoopy and replay positions appeared to be correct, and they were 'moving'. However, I am trying to cdr check-repair one replicate and it is taking forever (there are only 348 rows in the table and it has been running for over an hour).

How do I know which CDR thread? I did onstat -g stk all and just deleted all but the CDR stuff.

Things appear to be moving through... but my sync jobs still don't seem to be working.

Thanks in advance for any insight..

Laurie

Laurie Gustin
IT Programmer Analyst
Department of Public Safety
lgustin (AT) utah (DOT) gov
801-965-4410








#7
Madison Pruet
Re: Replication question - 11-21-2007, 05:57 PM



The cdr check --repair runs as an sqlexec thread, not as a CDR thread.
Are there blobs/sblobs? That's the only reason I can think of
that the repair would take so long.








#8
Superboer
Re: Replication question - 11-22-2007, 04:09 AM




Maybe a trigger that updates a table which is also replicated, without
a WHERE clause, or one that executes an SPL routine which does a lot of work...
No long transaction rollbacks?

I would check the schema of the tables...

Superboer
