rep_process_message and Invalid argument

sarah
 
rep_process_message and Invalid argument - 10-14-2005, 07:08 PM

Hi,
I have two nodes (one master and one client). The application creates
four databases (a, b, c, d). Suppose my database environment and
database file directory is TEST. If I start the master with an empty
TEST and later start the client, they synchronize with each other. At
this point, before I make any change (inserting or deleting a record)
to the database files in TEST on the master, I shut down the client
application. When I start the client again, the master begins to
synchronize the client. However, the client throws a DbException (C++)
from the rep_process_message method:

DbEnv::rep_process_message: Invalid argument

Here is the log file for master:
...............
Three database files are already created

[1][26426]__db_debug: rec: 47 txnid 80000008 prevlsn [0][0]
op:
fileid: 0
key:
data:
arg_flags: 0

[1][26474]__fop_create: rec: 143 txnid 80000009 prevlsn [0][0]
name: /TEST/1.673a0
appname: 1
mode: 600

[1][26545]__fop_write: rec: 145 txnid 80000009 prevlsn [1][26474]
name: /TEST/1.673a0
appname: 1
pgsize: 4096
pageno: 0
offset: 0
page: long data (omitted)
flag: 1

[1][30728]__fop_write: rec: 145 txnid 80000009 prevlsn [1][26545]
name: /TEST/1.673a0
appname: 1
pgsize: 4096
pageno: 1
offset: 0
page: long data (omitted)
flag: 1

[1][34911]__fop_rename: rec: 146 txnid 80000009 prevlsn [1][30728]
oldname: /TEST/1.673a0
newname: /TEST/d0
fileid: 0xa0 0xe9 0x9 0 0x3 0x3 0 0 aSx0xdf 0xfd 0xe1 0x4 0 0 0 0 0
appname: 1

[1][35045]__txn_child: rec: 12 txnid 80000008 prevlsn [1][26426]
child: 0x80000009
c_lsn: [1][34911]

[1][35085]__dbreg_register: rec: 2 txnid 80000008 prevlsn [1][35045]
opcode: 3
name: /TEST/d0
uid: 0xa0 0xe9 0x9 0 0x3 0x3 0 0 aSx0xdf 0xfd 0xe1 0x4 0 0 0 0 0
fileid: 3
ftype: 0x1
meta_pgno: 0
id: 0x80000009

[1][35200]__txn_regop: rec: 10 txnid 80000008 prevlsn [1][35085]
opcode: 1
timestamp: 1129331029 (Fri Oct 14 16:03:49 2005, 200510141603.49)
locks:

[1][35240]__dbreg_register: rec: 2 txnid 0 prevlsn [0][0]
opcode: 1
name: /TEST/d0
uid: 0xa0 0xe9 0x9 0 0x3 0x3 0 0 aSx0xdf 0xfd 0xe1 0x4 0 0 0 0 0
fileid: 3
ftype: 0x1
meta_pgno: 0
id: 0x0

[1][35355]__dbreg_register: rec: 2 txnid 0 prevlsn [0][0]
opcode: 1
name: /TEST/c0
uid: 0x9f 0xe9 0x9 0 0x3 0x3 0 0 T0xd7 0xb1 0x99 ][0x3 0 0 0 0 0
fileid: 2
ftype: 0x3
meta_pgno: 0
id: 0x0

[1][35463]__dbreg_register: rec: 2 txnid 0 prevlsn [0][0]
opcode: 1
name: /TEST/b0
uid: 0x9e 0xe9 0x9 0 0x3 0x3 0 0 k0x80 &0xc6 0xbd 0xd4 0x1 0 0 0 0 0
fileid: 1
ftype: 0x1
meta_pgno: 0
id: 0x0

[1][35576]__dbreg_register: rec: 2 txnid 0 prevlsn [0][0]
opcode: 1
name: /TEST/a0
uid: 0x9d 0xe9 0x9 0 0x3 0x3 0 0 0x8f 0xfa 0x9f 0xa6 0x1d N0 0 0 0 0 0
fileid: 0
ftype: 0x3
meta_pgno: 0
id: 0x0

[1][35682]__txn_ckp: rec: 11 txnid 0 prevlsn [0][0]
ckp_lsn: [1][35200]
last_ckp: [0][0]
timestamp: 1129331044 (Fri Oct 14 16:04:04 2005, 200510141604.04)
envid: -1851516827
rep_gen: 1

The client log file has every LSN except the last one, [1][35682].

What could cause the "Invalid argument" error? Also, why are there
transactions with ID 0 (35240-35682)?

Thank you for your help.


Susan LoVerso
 
Re: rep_process_message and Invalid argument - 10-19-2005, 12:57 PM

sarah wrote:
Quote:
[snip]

The client log file has every LSN except the last one, [1][35682].

What could cause the "Invalid argument" error?

If you configure error messages, you should get a message giving more
information about the error. See the docs for dbenv->set_errfile.
However, I suspect I know why. In 4.3, when a client joins a
replication group it synchronizes on a checkpoint record. It does this
so it can compare the envid field of a checkpoint in its own log with
the one in the master's log, which guarantees that this client was (at
some point) part of this replication group. If there is no checkpoint
in the client's log, DB simply synchronizes. If there is a checkpoint
but it doesn't match, and DB backs up all the way to the beginning of
the log without finding a matching one, it concludes that these sites
were never part of the same group.

Your statement that the client log has all the LSNs except the
checkpoint record indicates to me that you are restarting the client
with DB_RECOVER, and recovery is writing its own checkpoint record. So
when the client rejoins, it finds a checkpoint (one that doesn't match)
and no other, and it gives that error. If you turn on error messages
you'll see:
Client was never part of master's environment.
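
The join check described above can be modelled in a few lines of C++. This is only a sketch of the reasoning in this reply, not Berkeley DB's actual code: LogRecord, JoinResult and decide_join are hypothetical names invented for illustration, and the real library compares more than a single integer.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one log record (not a Berkeley DB type).
struct LogRecord {
    bool is_checkpoint;  // true for a __txn_ckp record
    std::int32_t envid;  // envid field; meaningful only for checkpoints
};

// Hypothetical outcome of a client trying to join the group.
enum class JoinResult { Synchronize, NeverPartOfGroup };

// Walk the client's log from newest to oldest, looking for a checkpoint
// whose envid matches the master's.
JoinResult decide_join(const std::vector<LogRecord>& client_log,
                       std::int32_t master_envid) {
    bool saw_checkpoint = false;
    for (auto it = client_log.rbegin(); it != client_log.rend(); ++it) {
        if (!it->is_checkpoint) continue;
        saw_checkpoint = true;
        if (it->envid == master_envid) return JoinResult::Synchronize;
    }
    // No checkpoint at all: nothing to compare, so just synchronize.
    // Checkpoints present but none matching: treated as a foreign site.
    return saw_checkpoint ? JoinResult::NeverPartOfGroup
                          : JoinResult::Synchronize;
}
```

Under this model, a client log that holds only a checkpoint written by recovery (with an envid different from the master's -1851516827) yields NeverPartOfGroup, which is the case this reply associates with the Invalid argument error. A log with no checkpoint at all yields Synchronize.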

Quote:
Also, why are there transactions with ID 0 (35240-35682)?

Only operations that are transactionally protected have transaction IDs
in their log records. The records generated by checkpoint do not.
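
To see which records in a printed log carry no transaction ID, the header lines can be scanned for "txnid 0". The following is a minimal sketch that assumes the header layout shown in the log excerpt in this thread ("[file][offset]type: rec: N txnid X prevlsn ..."); RecordHeader, parse_headers and non_transactional are hypothetical helper names, not part of Berkeley DB.

```cpp
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical summary of one record header line from the printed log.
struct RecordHeader {
    unsigned file;
    unsigned offset;
    std::string type;     // e.g. "__txn_ckp"
    unsigned long txnid;  // printed in hex in the log
};

// Parse lines shaped like
// "[1][35682]__txn_ckp: rec: 11 txnid 0 prevlsn [0][0]".
// Field lines such as "opcode: 1" do not match and are skipped.
std::vector<RecordHeader> parse_headers(const std::string& text) {
    std::vector<RecordHeader> out;
    std::istringstream in(text);
    std::string line;
    while (std::getline(in, line)) {
        unsigned file = 0, offset = 0, rec = 0;
        unsigned long txnid = 0;
        char type[64] = {0};
        if (std::sscanf(line.c_str(), "[%u][%u]%63[^:]: rec: %u txnid %lx",
                        &file, &offset, type, &rec, &txnid) == 5) {
            out.push_back({file, offset, type, txnid});
        }
    }
    return out;
}

// Keep only the records written outside any transaction (txnid 0).
std::vector<RecordHeader> non_transactional(
        const std::vector<RecordHeader>& headers) {
    std::vector<RecordHeader> out;
    for (const auto& h : headers)
        if (h.txnid == 0) out.push_back(h);
    return out;
}
```

Run over the master log in the original post, a filter like this would leave the four __dbreg_register records and the __txn_ckp record at 35240 through 35682, which are the ones the question asks about.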

Sue LoVerso
Sleepycat Software


