Apparent DB engine bug in SQL Server 2005

comp.databases.ms-sqlserver

#1
Dimitri Furman
Apparent DB engine bug in SQL Server 2005 - 08-31-2007, 06:03 PM

SQL Server 2005 SP2 (build 3054)

Consider the following scenario:

- A complex multi-statement table valued function is created. Let's call
it dbo.tfFunc(@Param1, @Param2)
- A SELECT statement is executed that calls the above function twice,
each time with a different set of parameters. In pseudocode:

SELECT <column list>
FROM dbo.tfFunc(1, 2) AS f1
<some JOIN operator> dbo.tfFunc(3, 4) AS f2
ON f1.col = f2.col
INNER JOIN dbo.Table1 AS t1
ON ...
etc.

The exact statement is probably irrelevant, as long as the same table-
valued function is called twice (I have observed the issue in two very
different statements calling the same function). The statement is
executed in a SNAPSHOT isolation level transaction, although this may
also be irrelevant.

- The statement continues executing for a long time. If sp_who2 is run at
that time, the following row is returned for the statement connection
(only relevant columns are shown):

SPID  Status     BlkBy  Command  CPUTime  DiskIO  LastBatch
63    SUSPENDED  63     SELECT   29282    683     08/31 18:17:37

The statement appears to be blocked by itself. If sp_lock is run at that
time, the following rows are returned:

spid  dbid  ObjId       IndId  Type  Resource  Mode   Status
63    2     1316624641  0      TAB             Sch-S  GRANT
63    2     1316624641  0      TAB             Sch-M  WAIT

It appears that SQL Server waits indefinitely trying to obtain a schema-
modification lock on a resource which already has a schema-stability lock
placed on it by the same connection.
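
As a side note, the lock rows that sp_lock reports can also be read from the sys.dm_tran_locks view, which additionally shows which execution context requested each lock. A minimal sketch, assuming the session id 63 from the sp_who2 output (the variable name is only for illustration):

```sql
-- Session id taken from the sp_who2 output; adjust as needed.
DECLARE @spid int;
SET @spid = 63;

SELECT request_session_id,
       request_exec_context_id,        -- 0 for the main thread of the session
       resource_database_id,
       resource_type,                  -- table-level locks show up as OBJECT
       resource_associated_entity_id,  -- the object id when resource_type = 'OBJECT'
       request_mode,                   -- e.g. Sch-S, Sch-M
       request_status                  -- GRANT, WAIT or CONVERT
FROM sys.dm_tran_locks
WHERE request_session_id = @spid
ORDER BY resource_associated_entity_id, request_status;
```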

The following is pure speculation, but it seems reasonable to assume that
the server has materialized the result of the first call to the function
using a temporary table in tempdb, and is trying to materialize the
result of the second call using the same temporary table (same ObjId in
sp_lock results).

I do not know why this does not cause a deadlock error.

Unfortunately, I do not have a simple repro script for this. The actual
code is rather complex. While I can devise a workaround, this does look
like a bug. I am posting it here before submitting a bug on Connect, in
case anyone can shed some light. Thanks.

--
remove a 9 to reply by email

#2
Erland Sommarskog
Re: Apparent DB engine bug in SQL Server 2005 - 09-01-2007, 07:39 AM

Dimitri Furman (dfurman (AT) cloud99 (DOT) net) writes:
Quote:
> - The statement continues executing for a long time. If sp_who2 is run at
> that time, the following row is returned for the statement connection

Long time? But does it ever complete?

Quote:
> SPID  Status     BlkBy  Command  CPUTime  DiskIO  LastBatch
> 63    SUSPENDED  63     SELECT   29282    683     08/31 18:17:37
>
> The statement appears to be blocked by itself. If sp_lock is run at that
> time, the following rows are returned:
>
> spid  dbid  ObjId       IndId  Type  Resource  Mode   Status
> 63    2     1316624641  0      TAB             Sch-S  GRANT
> 63    2     1316624641  0      TAB             Sch-M  WAIT
>
> It appears that SQL Server waits indefinitely trying to obtain a schema-
> modification lock on a resource which already has a schema-stability lock
> placed on it by the same connection.

Is this a parallel plan? In that case different threads could be
blocking each other.

Quote:
> The following is pure speculation, but it seems reasonable to assume that
> the server has materialized the result of the first call to the function
> using a temporary table in tempdb, and is trying to materialize the
> result of the second call using the same temporary table (same ObjId in
> sp_lock results).

The table in question is likely to be the return table for the UDF.
You should be able to find out more about this table by looking in
sys.objects and sys.columns.
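
A lookup along those lines could be sketched as below. It assumes the ObjId from the sp_lock output (1316624641) and that dbid 2 is tempdb, so the catalog views are queried there; the variable name is hypothetical.

```sql
USE tempdb;

-- ObjId reported by sp_lock for the blocked session.
DECLARE @objid int;
SET @objid = 1316624641;

-- One row per column of the locked object, with the object's name and type.
SELECT o.name                    AS object_name,
       o.type_desc,
       o.create_date,
       c.column_id,
       c.name                    AS column_name,
       TYPE_NAME(c.user_type_id) AS type_name,
       c.max_length
FROM sys.objects AS o
JOIN sys.columns AS c
    ON c.object_id = o.object_id
WHERE o.object_id = @objid
ORDER BY c.column_id;
```

Comparing the column list with the declarations in the function should show which table the object corresponds to.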

Quote:
> Unfortunately, I do not have a simple repro script for this. The actual
> code is rather complex. While I can devise a workaround, this does look
> like a bug. I am posting it here before submitting a bug on Connect, in
> case anyone can shed some light. Thanks.

Without a repro it will of course be difficult to address the issue.
I would suggest that when you file the bug you include:

1) The query.
2) The code for the UDF.
3) If possible, the table definitions as well.
4) The XML showplan. (You can save this from the graphical plan in Mgmt
Studio.)
5) The output from sys.dm_os_waiting_tasks and sys.dm_tran_locks.
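
For the wait information in item 5, a query of roughly this shape could be run from a second connection while the statement hangs. This is only a sketch; the session id 63 comes from the sp_who2 output earlier in the thread and the variable name is illustrative.

```sql
-- Session id of the hung statement; adjust as needed.
DECLARE @spid int;
SET @spid = 63;

SELECT session_id,
       exec_context_id,
       wait_type,
       wait_duration_ms,
       blocking_session_id,
       blocking_exec_context_id,
       resource_description
FROM sys.dm_os_waiting_tasks
WHERE session_id = @spid;
```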


--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/pro...ads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinf...ons/books.mspx


#3
Dan Guzman
Re: Apparent DB engine bug in SQL Server 2005 - 09-01-2007, 08:10 AM

Regarding Erland's comment about a parallel plan, try running the query with
an OPTION (MAXDOP 1) hint if you see parallelism. That might provide an
easier workaround and/or provide additional info for the Connect bug report.
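
Applied to the pseudocode from the original post, the hint goes at the end of the statement. A minimal sketch; the column list and the join type are placeholders, since the real statement was not posted:

```sql
SELECT f1.col
FROM dbo.tfFunc(1, 2) AS f1
INNER JOIN dbo.tfFunc(3, 4) AS f2
    ON f1.col = f2.col
OPTION (MAXDOP 1);  -- restrict this statement to a serial plan
```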

--
Hope this helps.

Dan Guzman
SQL Server MVP

#4
Dimitri Furman
Re: Apparent DB engine bug in SQL Server 2005 - 09-02-2007, 10:55 AM

On Sep 01 2007, 08:39 am, Erland Sommarskog <esquel (AT) sommarskog (DOT) se> wrote
in news:Xns999E95B819D01Yazorman (AT) 127 (DOT) 0.0.1:

Quote:
> Dimitri Furman (dfurman (AT) cloud99 (DOT) net) writes:
>> - The statement continues executing for a long time.
>
> Long time? But does it ever complete?

The longest time I let it run for is 40 minutes. Considering that it
usually runs in less than 10 seconds, the likely answer is no.

Quote:
> Is this a parallel plan?

Hard to tell. I forgot to mention that the problem is intermittent. When
the statement completes successfully, there is no indication of parallelism
in the actual plan. When it does not, there is obviously no plan to look at
(in fact, the only way to kill the connection in that case is to restart
the server). The estimated plan doesn't show any parallelism either. I am
talking here about the plan for the statement, not the plan for the called
function, which I apparently cannot see.

I did try OPTION (MAXDOP 1) in both the statement and the function, and
have not been able to reproduce the issue so far. But this is inconclusive;
sometimes it works for days without a problem.

Quote:
> The table in question is likely to be the return table for the UDF.
> You should be able to find out more about this table by looking in
> sys.objects and sys.columns.

I did, and this is where it gets a bit interesting. The UDF in question
includes a table variable, and it turns out that the mentioned schema locks
are placed on the table in tempdb corresponding to that table variable, not
the return table for the UDF. I am not sure if this makes any substantive
difference though.
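
For anyone attempting a repro from the description so far, the shape of the scenario can be sketched roughly as follows. This is only an illustration: the column names, the function body and the table variable @Work are made up, the real function is far more complex, and nothing here is known to trigger the hang.

```sql
-- Hypothetical stand-in for the real function; only the overall shape matters:
-- a multi-statement TVF that fills an internal table variable before the return table.
CREATE FUNCTION dbo.tfFunc (@Param1 int, @Param2 int)
RETURNS @Result TABLE (col int NOT NULL, val int NOT NULL)
AS
BEGIN
    -- Internal table variable (the object the schema locks were seen on).
    DECLARE @Work TABLE (col int NOT NULL);

    INSERT INTO @Work (col)
    SELECT @Param1
    UNION ALL
    SELECT @Param2;

    INSERT INTO @Result (col, val)
    SELECT w.col, w.col * 10
    FROM @Work AS w;

    RETURN;
END;
GO

-- The same function referenced twice in one statement, with different parameters.
-- The original statement ran inside a SNAPSHOT isolation transaction; that is omitted here.
SELECT f1.col, f1.val, f2.val
FROM dbo.tfFunc(1, 2) AS f1
INNER JOIN dbo.tfFunc(2, 3) AS f2
    ON f1.col = f2.col;
```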

Quote:
> 4) The XML showplan. (You can save this from the graphical plan in
> Mgmt Studio.)

I'm not sure how I could save the plan if the statement never completes...

--
remove a 9 to reply by email


#5
Erland Sommarskog
Re: Apparent DB engine bug in SQL Server 2005 - 09-02-2007, 12:39 PM

Dimitri Furman (dfurman (AT) cloud99 (DOT) net) writes:
Quote:
> The longest time I let it run for is 40 minutes. Considering that it
> usually runs in less than 10 seconds, the likely answer is no.

If you have to restart the server to resolve the situation, it certainly
sounds as if the prospects for completion are utterly bleak.

Quote:
> Hard to tell. I forgot to mention that the problem is intermittent. When
> the statement completes successfully, there is no indication of
> parallelism in the actual plan. When it does not, there is obviously no
> plan to look at (in fact, the only way to kill the connection in that
> case is to restart the server). The estimated plan doesn't show any
> parallelism either. I am talking here about the plan for the statement,
> not the plan for the called function, which I apparently cannot see.

If you run the function alone, you should see its plan, I think.

But I was mainly interested in whether the main query had any parallelism.
In that case it could be one thread blocking another. Hm, then again,
if the UDF causes parallelism, I guess that could also be an issue.
But I don't think this is likely, since if you insert into a table
variable, there cannot be parallelism. And it's difficult to do anything
in a UDF without modifying a table variable.

Anyway, you can easily examine this next time it happens by running

SELECT * FROM sys.dm_os_tasks WHERE session_id = <trouble spid>

If there are rows with non-zero exec_context_id, there are parallel
threads.

The output from sys.dm_os_waiting_tasks would also be interesting.

Quote:
> I did, and this is where it gets a bit interesting. The UDF in question
> includes a table variable, and it turns out that the mentioned schema
> locks are placed on the table in tempdb corresponding to that table
> variable, not the return table for the UDF. I am not sure if this makes
> any substantive difference though.

At least it is a clue for anyone who is trying to produce a repro.
Given that you say it's intermittent, I am not going to try.

Quote:
>> 4) The XML showplan. (You can save this from the graphical plan in
>> Mgmt Studio.)
>
> I'm not sure how I could save the plan if the statement never completes...

It's also available through sys.dm_exec_text_query_plan. A way to get the
plan, sys.dm_os_waiting_tasks and more packaged into one result set is
to use my beta_lockinfo, available at
http://www.sommarskog.se/sqlutil/beta_lockinfo.html
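
Without beta_lockinfo, the plan of a statement that is still running can be fetched by hand along these lines. This is a sketch only: it assumes the session id 63 from earlier in the thread, a build that has sys.dm_exec_text_query_plan (SP2 or later), and a database at compatibility level 90 for the APPLY. The plan returned is the compiled plan, without actual row counts.

```sql
-- Session id of the hung statement; adjust as needed.
DECLARE @spid int;
SET @spid = 63;

SELECT r.session_id,
       r.status,
       r.wait_type,
       p.query_plan   -- showplan XML as text; save it with an .sqlplan extension to open it in Mgmt Studio
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_text_query_plan(
        r.plan_handle,
        r.statement_start_offset,
        r.statement_end_offset) AS p
WHERE r.session_id = @spid;
```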


There have been some bugs around temp-table caching; I don't know if they
could be related to what you see. There is a Cumulative Update, including
two such bugfixes, at http://support.microsoft.com/kb/939537.


--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se



#6
Dimitri Furman
Re: Apparent DB engine bug in SQL Server 2005 - 09-02-2007, 10:44 PM

On Sep 02 2007, 01:39 pm, Erland Sommarskog <esquel (AT) sommarskog (DOT) se> wrote
in news:Xns999FC8952760AYazorman (AT) 127 (DOT) 0.0.1:

Quote:
> Anyway, you can easily examine this next time it happens by running
>
> SELECT * FROM sys.dm_os_tasks WHERE session_id = <trouble spid>
>
> If there are rows with non-zero exec_context_id, there are parallel
> threads.

There are not, so I guess we can rule out parallelism.

Quote:
> It's also available through sys.dm_exec_text_query_plan. A way to get the
> plan, sys.dm_os_waiting_tasks and more packaged into one result set is
> to use my beta_lockinfo, available at
> http://www.sommarskog.se/sqlutil/beta_lockinfo.html

This should be handy. Thanks.

Quote:
> There have been some bugs around temp-table caching; I don't know if they
> could be related to what you see. There is a Cumulative Update,
> including two such bugfixes, at
> http://support.microsoft.com/kb/939537.

I'll get that and watch how it goes for a few days. If it still happens,
I will try to find some time to work on a repro. Will follow up with any
news.

--
remove a 9 to reply by email


#7
Dimitri Furman
Re: Apparent DB engine bug in SQL Server 2005 - 09-27-2007, 04:25 PM

On Sep 02 2007, 11:44 pm, Dimitri Furman <dfurman (AT) cloud99 (DOT) net> wrote in
news:Xns999FF19D6AD96dfurmancloud99 (AT) 127 (DOT) 0.0.1:

Quote:
> If it still happens,
> will try to find some time to work on a repro. Will follow-up with any
> news.

Submitted feedback on Connect that includes a repro:
https://connect.microsoft.com/SQLSer...Feedback.aspx?FeedbackID=300465

--
remove a 9 to reply by email


#8
Erland Sommarskog
Re: Apparent DB engine bug in SQL Server 2005 - 09-27-2007, 05:01 PM

Dimitri Furman (dfurman (AT) cloud99 (DOT) net) writes:
Quote:
> On Sep 02 2007, 11:44 pm, Dimitri Furman <dfurman (AT) cloud99 (DOT) net> wrote in
> news:Xns999FF19D6AD96dfurmancloud99 (AT) 127 (DOT) 0.0.1:
>
>> If it still happens,
>> will try to find some time to work on a repro. Will follow-up with any
>> news.
>
> Submitted feedback on Connect that includes a repro:
> https://connect.microsoft.com/SQLSer...Feedback.aspx?FeedbackID=300465

Thanks Dimitri. Looks like an excellent bug report. I hope that it will
be sufficient for the SQL Server people to track down the bug.

Unfortunately, it is not possible to access attachments on Connect, so
I cannot try the repro. I tried to compose my own from your description,
but it was not really that simple. Given the trouble you had in recreating
it, I wasn't surprised.

If it's possible for you to post the repro files here, I'd be interested.



--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se



#9
Dimitri Furman
Re: Apparent DB engine bug in SQL Server 2005 - 09-27-2007, 09:05 PM

On Sep 27 2007, 06:01 pm, Erland Sommarskog <esquel (AT) sommarskog (DOT) se> wrote in
news:Xns99B9FF37989Yazorman (AT) 127 (DOT) 0.0.1:

Quote:
> If it's possible for you to post the repro files here, I'd be interested.

Here it is:
http://iridule.net/cu/files/SS2005LockingBugRepro1.zip

Thanks for helping me nail it down.

--
remove a 9 to reply by email


#10
Erland Sommarskog
Re: Apparent DB engine bug in SQL Server 2005 - 09-28-2007, 04:28 PM

Dimitri Furman (dfurman (AT) cloud99 (DOT) net) writes:
Quote:
> On Sep 27 2007, 06:01 pm, Erland Sommarskog <esquel (AT) sommarskog (DOT) se> wrote
> in news:Xns99B9FF37989Yazorman (AT) 127 (DOT) 0.0.1:
>> If it's possible for you to post the repro files here, I'd be interested.
>
> Here it is:
> http://iridule.net/cu/files/SS2005LockingBugRepro1.zip

Got it, and indeed I had to reboot my server. What was missing from your
description on Connect was the RECOMPILE hint. When I remove it, the
procedure completes.

I looked in the SQL Server error log, and I found that there is a
stack dump for an unresolved deadlock.


--
Erland Sommarskog, SQL Server MVP, esquel (AT) sommarskog (DOT) se
