Query :
select * from Item1 a (NOLOCK, Index(ItemIn1))
join Item2 b (NOLOCK, Index(ItemData1))
Please explain the above nolock etc after the alias in the given query. What it does ?
Sam.They are table hints (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/instsql/in_backcomp_6rzt.asp), basically user supplied suggestions for how the optimizer ought to process the table.
NOLOCK requests that no new locks be placed, and existing locks be ignored. In my opinion, NOLOCK is a simple way to get wrong answers quickly.
Index(foo) requests that the optimizer select the index named foo on this table for processing the query. This is almost always a sign of poorly maintained database statistics or a brute force attempt to speed up a particular query. Maybe one instance in 500 is actually needed/helpful because of oddly skewed data that the developer understands but that the optimizer isn't smart enough to recognize.
As you've probably guessed, I don't think highly of table hints. There were times that they were absolutely crucial to getting work done. There are still times that they are helpful. In the VAST majority of cases with SQL-2000, while they might help in a specific circumstance, they are actually counter-productive in the long run.
-PatP|||Well, from what I've seen (quite a bit actually) about 90% of all SELECTs in a business system are or should be done with the lowest possible transaction isolation level. Reading COMMITTED ONLY data is absolutely required in situations where the most recent data vs. data being altered makes the most sense. But what is the true definition of "the most recent data"? Is it when the users clicks on OK button? If that's the case, than the result that could have been produced a split of a second earlier is different from the result that was actually acquired which in turn may be different from the result that would have been acquired if the user waited for another second...I don't believe in dogmatic approach to isolation levels, it all needs to be carefully tailored to the business needs and the way of application usage.|||See Pat...I thought this was a test or interview question...
especially since it didn't have a real example...
So how would they have know to get it syntactically correct?|||Well, from what I've seen (quite a bit actually) about 90% of all SELECTs in a business system are or should be done with the lowest possible transaction isolation level.Granted, but it should be done at the spid level using SET TRANSACTION ISOLATION LEVEL (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_set-set_74bw.asp), not for one query in a batch or even worse than that for one table within a query.
I think that you and I are on the "same page" here, but that is a tough concept to explain to someone asking what NOLOCK means. As long as the isolation level is applied consistantly for a transaction, it can be done safely. I don't agree that reading uncommitted data is anywhere near safe, but that's because I've spent literally months proving out statements that were corrupted due to the big rush at the end of the year where people were frantically entering and querying data... Those kind of memories tend to stick with me!
I feel very strongly that changing the isolation level for a single table within a query is a receipe for disaster. Every time I've found it, it has been a culprit.
-PatP|||Please explain to me how the example below may be used as a "receipe for disaster"? And activities like this one are the most typical for cut-over manufacturing+sales analysis for the day ;)
select o.order_id, o.order_date, o.customer_id, c.customer_name, d.item_id, im.item_description
from orders o (/*!!!*/TABLOCK/*!!!*/)
inner join customers c (/*!!!*/NOLOCK/*!!!*/)
on o.customer_id = c.customer_id
inner join order_detail d (/*!!!*/NOLOCK/*!!!*/)
on o.order_id = d.order_id
inner join item_master im (/*!!!*/NOLOCK/*!!!*/)
on d.item_id = im.item_id
where ship_date = convert(char(10), getdate(), 101)|||The easiest way to imagine a disaster is to think about how this kind of query is normally used. The "batch rollover" typically happens at a point in time when there is very little or no user interaction with the system... Other batch updates are often in progress. This transaction takes a full table lock on the orders table, but it does not take or respect any other locks.
Several customers that were input today got their names switched during the entry process. Two vendors sent updated product descriptions that need to be incorporated into your inventory master. As all these queries are running, you get everything absolutely correct from the orders table, but only some of the customer and item master information is corrupted... This can take months to find, but it destroys the user's faith in the system.
A more technical and more insideous problem is the "phantom deadlock"... If one of these queries is run by a scheduled task (with no UI) and another is run by a user task (with a UI) and both of them have existing locks on different tables that this query accesses with NOLOCK, even though it shouldn't produce a deadlock, on multiple CPU machines it can still produce one (unless you set MAXDOP to 1). This can cause a spooky error that is nearly impossible to resolve because of its timing sensitivity.
There are more ways to hurt yourself with this kind of programming practice, but they get progressively more difficult to reproduce.
Using NOLOCK with UPDATE statements is actually a larger problem, because it can cause code to break in ways that you can't reproduce. This is more visible to the average user, so it draws more attention and further deteriorates user faith in your application.
I don't mind running low isolation levels for "yardstick" queries. When you need a quick guestimate of today's volume, they are great. When you need to balance, to the penny, with absolutely accurate data, they are just problems waiting to happen. Think about how you'd feel about your bank if they promised you statements that were "pretty close, most of the time"!
-PatP|||Sorry, can't write such large posts, very little memory and information retention. But just to pick the bone, - phantom deadlock?? How can it happen with tables that don't have any footprint in syslocks? It's called PARANOIA, Pat. Relax, it must be something else that you're mistaking for the reasult of TABLOCK/NOLOCK combination ;)|||That's what the first two MS-PSS folks thought... My TAM was confident enough in me and my observations to push the issue until we reached a developer. After a LOT of work, we figured out how it could happen and what they had to change so we could remove the MAXDOP hint and have the code work without "assistance".
-PatP
Subscribe to:
Post Comments (Atom)
No comments:
Post a Comment