Hi All,
On Ingres II 10.0.0 (a64.lnx/132)NPTL + p14139
We have a procedure which, when executed several times in succession, fails with a SIGSEGV (see below), and at a different point in the sequence each time. I've seen cases where we run the sequence and get no errors, then roll back, repeat, and get errors aplenty.
The SIGSEGV seems to implicate the Parallel Query Thread and the server in question has opf_pq_dop = 8. I've recovered the database to another installation and duplicated the results there. I then reset opf_pq_dop to zero on the recovery installation and the sequence always executes OK.
Having reset the recovery box to use opf_pq_dop = 8, I then altered the sequence to commence with set noparallel and it works fine every time.
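A minimal sketch of that workaround, assuming the sequence is run from a single SQL session. Only the two calls that appear in the error log are shown; the full sequence and its exact order are not reproduced here.

```sql
/* Disable parallel query plans for this session only;
   the server-wide opf_pq_dop = 8 setting is left untouched. */
set noparallel;

/* Two consecutive calls taken from the error log; the real
   sequence contains further calls with other fmin/fmax ranges. */
execute procedure value_summary(fmin = 36000, fmax = 36249);
execute procedure value_summary(fmin = 36250, fmax = 36499);

/* Roll back so the sequence can be repeated from the same state. */
rollback;
```

Because the setting is per session, the same sequence can be run with and without it on the same installation to compare behaviour.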
So I'm pretty convinced we have a bug related to parallel query processing.
Now the fun part of building a test case and sending to support....
The database is far too large and holds far too much sensitive data to be sent, so I've tried creating empty tables with the same structure as the originals, copying the stats over to the new tables and repeating the procedure sequence (modified for the new tables)... and it works every time, regardless of the parallelism setting.
Anyone got any ideas on what might be going wrong?
The error log showing a SIGSEGV typically looks like:
E_QE0002_INTERNAL_ERROR A QEF internal error occurred.
Associated error messages which provide more detailed information about the problem can be found in the error log (errlog.log)
An error occurred in the following session:
Session 00002AAAD92383C0:1080228160
DB Name: ace_trove_live (Owned by: ace)
User: dispatch ( <Parallel Query Thread> )
User Name at Session Startup: <Parallel Query Thread>
Terminal: pts/2
Group Id: ace_group
Role Id:
Application Code: 00000000 Current Facility: DMF (00000003)
Description:
Query: EXECUTE PROCEDURE value_summary( fmin=36250, fmax=36499 )
Last Query: EXECUTE PROCEDURE value_summary( fmin=36000, fmax=36249 )
bb5.ctsu::[53935 , d92383c0]: Mon Oct 10 11:16:34 2011 Segmentation Violation (SIGSEGV) @PC 0000000000814031
RSP 00000000406227b0 RBP 0000000040622840 RSI 00002aaad902eaa0
RDI 00002aaad902e7c0 RAX 00002aaad902ff00 RBX 0000000000000001
RCX 0000000000000000 RDX 0000000000000000
-----------BEGIN STACK TRACE------------
bb5_ctsu::[53935 , 2aaad92383c0]: pid 22966:0:40622840 iidbms(qen_position+0x269) [0x814031]( ... )
bb5_ctsu::[53935 , 2aaad92383c0]: pid 22966:1:406229b0 iidbms(qen_orig+0xc9a) [0x81579b]( ... )
bb5_ctsu::[53935 , 2aaad92383c0]: pid 22966:2:40622d90 iidbms(qen_exchange_child+0x1b44) [0x80c2f4]( ... )
bb5_ctsu::[53935 , 2aaad92383c0]: pid 22966:3:40622e50 iidbms(scs_dbms_task+0xcdf) [0x78f013]( ... )
bb5_ctsu::[53935 , 2aaad92383c0]: pid 22966:4:40627030 iidbms(scs_sequencer+0x349) [0x48cf7e]( ... )
bb5_ctsu::[53935 , 2aaad92383c0]: pid 22966:5:4062f120 iidbms(CSMT_setup+0x528) [0x74eb27]( ... )
-----------END STACK TRACE----------
Martin Bowes