dbTalk Databases Forums  

transaction tables consistent reads - undo records applied

comp.databases.oracle.server comp.databases.oracle.server


Discuss transaction tables consistent reads - undo records applied in the comp.databases.oracle.server forum.



Reply
 
Thread Tools Display Modes
  #1  
Old   
Steve Howard
 
Posts: n/a

Default transaction tables consistent reads - undo records applied - 06-24-2010 , 09:20 PM






Hi All,

10.2.0.4 three node cluster EE on SLES 10

Can someone give me a good definition of *exactly* what this means and
what causes it (mainly the latter). The documentation is not very
descriptive, with "Number of undo records applied to transaction
tables that have been rolled back for consistent read purposes".

It sounds like undo on undo, but we don't have any larger number for
rollbacks (or commits) when this happens than we do at any other time.

We have been plagued by this for over a year, and after multilpe SR's
where the support analyst just reads us the documentation, I am at my
wits end.

We have a fairly large table (almost 1TB with about 300 million rows)
with a large XMLTYPE column. Once a day, a job scans this table for
records added that day for propagation to an external system. The
longer the query runs, the more we see the session doing single block
reads against the undo tablespace, with the stat in the subject
climbing into the millions. Eventually, after several hours, an
ORA-01555 is thrown.

I even grabbed one of the P1/P2 parameters for the session querying
and dumped the undo block in the P2 value. While it was a second or
two after the event was posted, the block itself didn't even contain
any references to the table being queried!

Can anyone shed some light?

Thanks,

Steve

Reply With Quote
  #2  
Old   
Steve Howard
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-24-2010 , 09:25 PM






On Jun 24, 10:20*pm, Steve Howard <stevedhow... (AT) gmail (DOT) com> wrote:
Quote:
Hi All,

10.2.0.4 three node cluster EE on SLES 10

....I should also mention the explain plan for the query is OK. It is
a range scan for the time period being retrieved, which may be two or
three hours, representing 300,000 rows (out of 300 million).

Also, when I look at "table fetch by rowid" for the querying session,
it will periodically just stall. It is as if the session is reading
thousands upon thousands of undo blocks for that next row. The
arraysize is 15 (standard SQL*Plus)

Sometimes this happens, sometimes it doesn't.

We did apply patch 7527908

Reply With Quote
  #3  
Old   
joel garry
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-25-2010 , 11:01 AM



On Jun 24, 7:25*pm, Steve Howard <stevedhow... (AT) gmail (DOT) com> wrote:
Quote:
On Jun 24, 10:20*pm, Steve Howard <stevedhow... (AT) gmail (DOT) com> wrote:

Hi All,

10.2.0.4 three node cluster EE on SLES 10

...I should also mention the explain plan for the query is OK. *It is
a range scan for the time period being retrieved, which may be two or
three hours, representing 300,000 rows (out of 300 million).

Also, when I look at "table fetch by rowid" for the querying session,
it will periodically just stall. *It is as if the session is reading
thousands upon thousands of undo blocks for that next row. *The
arraysize is 15 (standard SQL*Plus)

Sometimes this happens, sometimes it doesn't.

We did apply patch 7527908
I have no idea, but I speculate you can use Tanel Poders' latchprofx
and poke around on his site about in memory undo to figure this out.
Does your plan show lots of recursion?

jg
--
@home.com is bogus.
http://www.businessweek.com/news/201...ofit-gain.html

Reply With Quote
  #4  
Old   
Jonathan Lewis
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-25-2010 , 11:37 AM



This happens when your query hits a block that was updated
"a long time" in the past by a transaction that has committed
but not been cleaned out.

Your query can tell that it is a committed transaction because
the ITL entry for the transaction points to transaction table slot
(in an undo segment header block) that has been re-used for
a newer transaction. (Part of the transaction id is the "transaction
sequence number", which is counting the number of times a transaction
slot has been used).

Your query therefore needs to know WHEN the transaction committed,
so that it can decide whether or not it's supposed to see the new version
or the old version of the row. (If the transaction committed before the
query then the query doesn't need to know exactly when the transaction
committed, if it started after the query then it has to be rolled back -
and it's possible that the "snapshot too old" is the result of the data
rollback
than the transaction table rollback.)

To find out when the transaction committed, your query copies the undo
segment header block and starts rolling it back. The number of times this
happens is recorded as:
"transaction tables consistent read rollbacks"

To perform the rollback, your query will read the transaction control block
(another part of the undo segment header) which contains a number of
important
details - including the first undo block address of the most recent
transaction
to use that undo segment header. This undo block address will hold the
first
record of that transaction *** - which include information about the
PREVIOUS
state of the transaction control block. By using this undo record your
query
can take the undo segment header block backwards in time by one step -
at which point it reads the older version of the transaction control block
and
repeats the process until it reaches the point where the transaction slot
it's
interested in has been taken back to the correct sequence number (or a
change
has taken the undo segment header block back to a point in time before the
start of the query). Each record it reads in this process is counted in
the
"transaction tables consistent reads - undo records applied"


(*** This is why the block you dumped had nothing to do with your table.)

The trouble with your requirement is that we really need to do a backwards
tablescan - because it's probably the data near the end of the table that
is
changing while you are "wasting" time reading all the data from the start
of
the table.

Unfortunately there is no such hint - but if it's really critical, you
could write
some code to scan the table one extent at a time in reverse order.


--
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com


"Steve Howard" <stevedhoward (AT) gmail (DOT) com> wrote

Quote:
Hi All,

10.2.0.4 three node cluster EE on SLES 10

Can someone give me a good definition of *exactly* what this means and
what causes it (mainly the latter). The documentation is not very
descriptive, with "Number of undo records applied to transaction
tables that have been rolled back for consistent read purposes".

It sounds like undo on undo, but we don't have any larger number for
rollbacks (or commits) when this happens than we do at any other time.

We have been plagued by this for over a year, and after multilpe SR's
where the support analyst just reads us the documentation, I am at my
wits end.

We have a fairly large table (almost 1TB with about 300 million rows)
with a large XMLTYPE column. Once a day, a job scans this table for
records added that day for propagation to an external system. The
longer the query runs, the more we see the session doing single block
reads against the undo tablespace, with the stat in the subject
climbing into the millions. Eventually, after several hours, an
ORA-01555 is thrown.

I even grabbed one of the P1/P2 parameters for the session querying
and dumped the undo block in the P2 value. While it was a second or
two after the event was posted, the block itself didn't even contain
any references to the table being queried!

Can anyone shed some light?

Thanks,

Steve

Reply With Quote
  #5  
Old   
joel garry
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-25-2010 , 03:31 PM



On Jun 25, 9:37*am, "Jonathan Lewis" <jonat... (AT) jlcomp (DOT) demon.co.uk>
wrote:
Quote:
This happens when your query hits a block that was updated
"a long time" in the past by a transaction that has committed
but not been cleaned out.

Your query can tell that it is a committed transaction because
the ITL entry for the transaction points to transaction table slot
(in an undo segment header block) that has been re-used for
a newer transaction. *(Part of the transaction id is the "transaction
sequence number", which is counting the number of times a transaction
slot has been used).

Your query therefore needs to know WHEN the transaction committed,
so that it can decide whether or not it's supposed to see the new version
or the old version of the row. *(If the transaction committed before the
query then the query doesn't need to know exactly when the transaction
committed, if it started after the query then it has to be rolled back -
and it's possible that the "snapshot too old" is the result of the data
rollback
than the transaction table rollback.)

To find out when the transaction committed, your query copies the undo
segment header block and starts rolling it back. The number of times this
happens is recorded as:
* * "transaction tables consistent read rollbacks"

To perform the rollback, your query will read the transaction control block
(another part of the undo segment header) which contains a number of
important
details - including the first undo block address of the most recent
transaction
to use that undo segment header. *This undo block address will hold the
first
record of that transaction *** - which include information about the
PREVIOUS
state of the transaction control block. *By using this undo record your
query
can take the undo segment header block backwards in time by one step -
at which point it reads the older version of the transaction control block
and
repeats the process until it reaches the point where the transaction slot
it's
interested in has been taken back to the correct sequence number (or a
change
has taken the undo segment header block back to a point in time before the
start of the query). *Each record it reads in this process is counted in
the
* * "transaction tables consistent reads - undo records applied"

(*** This is why the block you dumped had nothing to do with your table.)

The trouble with your requirement is that we really need to do a backwards
tablescan - because it's probably the data near the end of the table that
is
changing while you are "wasting" time reading all the data from the start
of
the table.
Excellent explanation, but I lost you here. He says plan says doing a
range scan, for 1% of the table? (Maybe you hadn't seen subsequent
post yet, where he mentions a fetch suddenly exhibiting the
characteristics you describe.)

Quote:
Unfortunately there is no such hint - but if it's really critical, you
could write
some code to scan the table one extent at a time in reverse order.
This cleaning makes perfect sense, but I'm wondering if there is some
administrative tuning like adjusting undo size or retention or some
fiddling with initrans? Sounds critical if it's interrupting data
extraction. I'm wondering if the mysterious translation of xmltype
from a column might be a problem here. Steve, how exactly are you
inserting and accessing this column?

jg
--
@home.com is bogus.
snake oil 2.0 http://www.gapingvoidgallery.com/pro...oducts_id=1614

Reply With Quote
  #6  
Old   
Jonathan Lewis
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-26-2010 , 01:05 AM



"joel garry" <joel-garry (AT) home (DOT) com> wrote

]On Jun 25, 9:37 am, "Jonathan Lewis" <jonat... (AT) jlcomp (DOT) demon.co.uk>
]wrote:
]>
]> The trouble with your requirement is that we really need to do a
backwards
]> tablescan - because it's probably the data near the end of the table
that
]> is
]> changing while you are "wasting" time reading all the data from the
start
]> of
]> the table.
]
]Excellent explanation, but I lost you here. He says plan says doing a
]range scan, for 1% of the table? (Maybe you hadn't seen subsequent
]post yet, where he mentions a fetch suddenly exhibiting the
]characteristics you describe.)
]

By the time I'd written this much, I'd forgotten that he'd added the note
about the index - but it doesn't really make any difference (a) to the
explanation or (b) to the concept in the solution - except that you
can put in an "index_desc()" hint and that might be enough to help.
It depends on the how the query is written, what index it uses, and
the distribution of the changed data.

]>
]> Unfortunately there is no such hint - but if it's really critical, you
]> could write
]> some code to scan the table one extent at a time in reverse order.
]
]This cleaning makes perfect sense, but I'm wondering if there is some
]administrative tuning like adjusting undo size or retention or some
]fiddling with initrans? Sounds critical if it's interrupting data
]extraction.

The error is "just" the same as a traditional 1555 problem when it gets
that far so a "large enough" undo retention should stop the 1555 - but
that won't stop the amount of work it takes. Thinking about initrans is
a good idea - but that won't have any effect either because the problem
is the number of backward steps that have to be taken and the value of
initrans only eliminates the first few (i.e. a few relating to the size of
INITRANS).


--
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Reply With Quote
  #7  
Old   
Steve Howard
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-26-2010 , 08:02 AM



On Jun 25, 12:01*pm, joel garry <joel-ga... (AT) home (DOT) com> wrote:
Quote:
On Jun 24, 7:25*pm, Steve Howard <stevedhow... (AT) gmail (DOT) com> wrote:



On Jun 24, 10:20*pm, Steve Howard <stevedhow... (AT) gmail (DOT) com> wrote:

Hi All,

10.2.0.4 three node cluster EE on SLES 10

...I should also mention the explain plan for the query is OK. *It is
a range scan for the time period being retrieved, which may be two or
three hours, representing 300,000 rows (out of 300 million).

Also, when I look at "table fetch by rowid" for the querying session,
it will periodically just stall. *It is as if the session is reading
thousands upon thousands of undo blocks for that next row. *The
arraysize is 15 (standard SQL*Plus)

Sometimes this happens, sometimes it doesn't.

We did apply patch 7527908

I have no idea, but I speculate you can use Tanel Poders' latchprofx
and poke around on his site about in memory undo to figure this out.
Does your plan show lots of recursion?

jg
--
@home.com is bogus.http://www.businessweek.com/news/201...s-after-sun-ac...
Hi Joel,

I got a tip from Dion Cho regarding this that was also suggesting
dc_rollbacks as latching issue, but I could never prove any similarity
between his test case and our situation.

http://dioncho.wordpress.com/2009/04...a-01555-error/

Thanks,

Steve

Reply With Quote
  #8  
Old   
Steve Howard
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-26-2010 , 08:10 AM



On Jun 26, 2:05*am, "Jonathan Lewis" <jonat... (AT) jlcomp (DOT) demon.co.uk>
wrote:
Quote:
"joel garry" <joel-ga... (AT) home (DOT) com> wrote in message

news:29c017c9-7c3a-40db-b422-1b1f2d861431 (AT) i9g2000prn (DOT) googlegroups.com...
]On Jun 25, 9:37 am, "Jonathan Lewis" <jonat... (AT) jlcomp (DOT) demon.co.uk
]wrote:
]
]> The trouble with your requirement is that we really need to do a
backwards
]> tablescan - because it's probably the data near the end of the table
that
]> is
]> changing while you are "wasting" time reading all the data from the
start
]> of
]> the table.
]
]Excellent explanation, but I lost you here. *He says plan says doing a
]range scan, for 1% of the table? *(Maybe you hadn't seen subsequent
]post yet, where he mentions a fetch suddenly exhibiting the
]characteristics you describe.)
]

By the time I'd written this much, I'd forgotten that he'd added the note
about the index - but it doesn't really make any difference (a) to the
explanation or (b) to the concept in the solution - except that you
can put in an "index_desc()" hint and that might be enough to help.
It depends on the how the query is written, what index it uses, and
the distribution of the changed data.

]
]> Unfortunately there is no such hint - but if it's really critical, you
]> could write
]> some code to scan the table one extent at a time in reverse order.
]
]This cleaning makes perfect sense, but I'm wondering if there is some
]administrative tuning like adjusting undo size or retention or some
]fiddling with initrans? *Sounds critical if it's interrupting data
]extraction.

The error is "just" the same as a traditional 1555 problem when it gets
that far so a "large enough" undo retention should stop the 1555 - but
that won't stop the amount of work it takes. *Thinking about initrans is
a good idea - but that won't have any effect either because the problem
is the number of backward steps that have to be taken and the value of
initrans only eliminates the first few (i.e. a few relating to the size of
INITRANS).

--
Regards

Jonathan Lewishttp://jonathanlewis.wordpress.com
Thanks much , Jonathan (and Joel).

I found a comment by you several months ago similar to this that I
always go back to. I can't tell you how many times I would go for a
cup of coffee with someone and mention that link and the fact that I
would like to send you a note for clarification, but just never did.

http://www.orafaq.com/usenet/comp.da...02/01/0022.htm

I really appreciate you taking the time to respond, as this has been
driving me nutty.

Let me take some time to digest what you posted and then I will
respond.

Thanks,

Steve

Reply With Quote
  #9  
Old   
Steve Howard
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-28-2010 , 08:37 AM



On Jun 26, 2:05*am, "Jonathan Lewis" <jonat... (AT) jlcomp (DOT) demon.co.uk>
wrote:
Quote:
"joel garry" <joel-ga... (AT) home (DOT) com> wrote in message

news:29c017c9-7c3a-40db-b422-1b1f2d861431 (AT) i9g2000prn (DOT) googlegroups.com...
]On Jun 25, 9:37 am, "Jonathan Lewis" <jonat... (AT) jlcomp (DOT) demon.co.uk
]wrote:
]
]> The trouble with your requirement is that we really need to do a
backwards
]> tablescan - because it's probably the data near the end of the table
that
]> is
]> changing while you are "wasting" time reading all the data from the
start
]> of
]> the table.
]
]Excellent explanation, but I lost you here. *He says plan says doing a
]range scan, for 1% of the table? *(Maybe you hadn't seen subsequent
]post yet, where he mentions a fetch suddenly exhibiting the
]characteristics you describe.)
]

By the time I'd written this much, I'd forgotten that he'd added the note
about the index - but it doesn't really make any difference (a) to the
explanation or (b) to the concept in the solution - except that you
can put in an "index_desc()" hint and that might be enough to help.
It depends on the how the query is written, what index it uses, and
the distribution of the changed data.

]
]> Unfortunately there is no such hint - but if it's really critical, you
]> could write
]> some code to scan the table one extent at a time in reverse order.
]
]This cleaning makes perfect sense, but I'm wondering if there is some
]administrative tuning like adjusting undo size or retention or some
]fiddling with initrans? *Sounds critical if it's interrupting data
]extraction.

The error is "just" the same as a traditional 1555 problem when it gets
that far so a "large enough" undo retention should stop the 1555 - but
that won't stop the amount of work it takes. *Thinking about initrans is
a good idea - but that won't have any effect either because the problem
is the number of backward steps that have to be taken and the value of
initrans only eliminates the first few (i.e. a few relating to the size of
INITRANS).

--
Regards

Jonathan Lewishttp://jonathanlewis.wordpress.com
What is really odd about this is that several months ago, I started
running a job to “pre-scan” all the rows we would need, before ‘the
“real” job got there. My assumption was this had something to do with
block cleanout, even though none of the cleanout statistics were
incremented like the “transaction tables consistent reads – undo
records applied” counter was.

This doesn’t seem to help, though. My “pre-scan” job never has an a
issue, but I run one hour windows for the range to scan.

A little more background. This is a “transaction history” table of
sorts. It is partitioned by month, and records are only added, never
updated.

SQL> desc big_table
Name Null? Type
----------------------------------------- --------
----------------------------
PK NOT NULL NUMBER
FK NOT NULL NUMBER
COL3 NOT NULL NUMBER(3)
CREATE_TIME TIMESTAMP(6)
COL5 NOT NULL VARCHAR2(50)
COL6 VARCHAR2(50)
COL7 XMLTYPE

SQL>

We query as follows:

SELECT concatenated_xml_string_of_columns_from_big_table,
a.xml_col.getClobVal()
FROM big_table a
WHERE create_time between trunc(sysdate) + (:1 / 1440) and
trunc(sysdate) + (:2 / 1440)

…where the window is three hours. This does a range scan on the
create_time column, which is good as it is by far the most selective
filter.

The selected records are retrieved in PL/SQL (no bulk collect), and
run through a few more XML tagging operations and written to a file.
They are then propagated to a mainframe for additional business usage
to which I am not privy.

If the query runs “fast enough” (less than 30 minutes or so), we don’t
see the issue. If it starts to “get slow” for whatever reason, we
start reading tons of undo.

Based on what you wrote, and the fact that I “pre-scan” the rows,
shouldn’t I pay the price for the cleanout? Or could it be we *do*
have other transactions hitting this table of which I am not aware?
In other words,

* I pre-scan
* A row *is* changed after my query finishes
* They run the “real” query

Thanks,

Steve

Reply With Quote
  #10  
Old   
joel garry
 
Posts: n/a

Default Re: transaction tables consistent reads - undo records applied - 06-28-2010 , 12:15 PM



On Jun 28, 6:37*am, Steve Howard <stevedhow... (AT) gmail (DOT) com> wrote:
Quote:
On Jun 26, 2:05*am, "Jonathan Lewis" <jonat... (AT) jlcomp (DOT) demon.co.uk
wrote:


This doesn’t seem to help, though. *My “pre-scan” job never has an a
issue, but I run one hour windows for the range to scan.

A little more background. *This is a “transaction history” table of
sorts. *It is partitioned by month, and records are only added, never
updated.

SQL> desc big_table
*Name * * * * * * * * * * * * * * * * * * *Null? * *Type
*----------------------------------------- --------
----------------------------
PK * * * * * * * * * * * * * * * * * NOT NULL NUMBER
FK * * * * * * * * * * * * * * * * *NOTNULL NUMBER
COL3 * * * * * * * * * * * * * * * * * *NOT NULL NUMBER(3)
*CREATE_TIME * * * * * * * * * * * * * * * * * * * *TIMESTAMP(6)
COL5 * * * * * * * * * * * * * * * *NOT NULL VARCHAR2(50)
COL6 * * * * * * * * * * * * * * * * * * VARCHAR2(50)
COL7 * * * * * * * * * * * * * * * * * * * * *XMLTYPE

SQL

We query as follows:

SELECT concatenated_xml_string_of_columns_from_big_table,
* * * *a.xml_col.getClobVal()
* FROM big_table a
* WHERE create_time between trunc(sysdate) + (:1 / 1440) and
trunc(sysdate) + (:2 / 1440)

…where the window is three hours. *This does a range scan on the
create_time column, which is good as it is by far the most selective
filter.

The selected records are retrieved in PL/SQL (no bulk collect), and
run through a few more XML tagging operations and written to a file.
They are then propagated to a mainframe for additional business usage
to which I am not privy.

If the query runs “fast enough” (less than 30 minutes or so), we don’t
see the issue. *If it starts to “get slow” for whatever reason, we
start reading tons of undo.
Something old but new to me I learned today (from Lob retention not
changing when undo_retention is changed [ID 563470.1]):

"...It is assumed that when UNDO_RETENTION is changed the lobs
connected to that retention are also changed which is not the case .

If a lob is modified from RETENTION to PCTVERSION and back to
RETENTION again then the lob retention is updated. ..."

Of course I have no idea if it is related to your problem, unless you
say something like you've changed your undo retention from 30 minutes
or so and didn't know about this...

A bit more of a reach, maybe Bug 2931779 - False ORA-1555 accessing
"cache read" LOBs in RAC [ID 2931779.8] or related has reanimated in
some form.

I guess you need to start tracing and digging deep to figure this one
out. Those mysterious xml packages may be doing something strange...
(I've run into 3rd party app code at times that does stuff like update
and rollback, unexpectedly).

jg
--
@home.com is bogus.
http://thehill.com/blogs/hillicon-va...-cybersecurity

Reply With Quote
Reply




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is On
HTML code is Off



Powered by vBulletin Version 3.5.3
Copyright ©2000 - 2012, Jelsoft Enterprises Ltd.