SQL Server Query Performance

Home » SQL and Databases » SQL Server Query Performance
Photograph of a SSD Drive Array

Database systems such as SQL/Server are getting an immense set of features these days, to the point where its impossible to be an expert in all areas. However a lot of the basics are similar to what they were 20 years ago. These basics are an area where a lot of software companies get it wrong, even in 2015. The sheer number of poor database and query designs that affect sql server query performance and application design that I have seen over the years from different software houses that really should know how to do so much better continues to amaze me, but the one I’m going to talk about in this post really takes the biscuit.

Designing a database is mostly science but part art as well, in the sense that you can’t just design everything to sixth normal form. You have to design a database that works in the real world, so this generally means on making tables that reflect real world business objects and then developing the design from there. Just putting in six normal form generic, abstract objects that don’t mean anything, so to store a business object in five tables just because you can, is not good practice in the real world. If you did this because you are being anal about performance, did you check performance first and see what you are saving by your approach? An approach that probably makes the code more complicated and less readable? No I thought not.

Just that one tip alone could save a lot of heartache for many people, developers and users. Don’t optimise anything without checking first, because computer hardware relatively speaking is cheap. Finding the wrong balance, or just simply getting it wrong, can be expensive.

The other tip I would give is make sure you get the level of table normal form optimisation right and all naming of objects and columns right. You should use table and column names that your users would understand where possible – again doing that is a really good start that can prevent many mistakes when designing databases.

Anyway, in this post I’d like to focus on one query that I came across the other year. Its long enough ago now, and I’ve changed all the names, so it will not be possible for you to work out who this is. All I will say its the supplier of a package to one of my customers of the past few years.

I don’t know how many people actually read this blog, but I know my brothers dog does (he is half poodle half border terrier), and I’m sure even he will agree how poor this SQL query is I’m about to show you.

I was running a query to try to find out why a database was slow. If users are complaining about speed, and its not your queries, or your package, or your database for that matter, then it can be hard to find out what the worst thing is. I run the following query and came across probably the worse SQL I have ever come across in my career, and that really is saying something, because my SQL career goes back to 1988:

This query was used to check for the longest running queries:

This produced the following query in the top 5, which was taking a long long time, 11 minutes plus, to run. It was also being run quite regularly by the third party application (reminder: all names have been changed for confidentiality reasons):

The functions and views that the above view called were trivial and not worth repeating, but one must question why such a view has to call other views in the way it was. At one time this would have confused the optimiser, although thankfully its less of a problem in recent years.

I strongly feel that NOLOCK is an anti pattern that does a poor job of doing READ UNCOMMITTED in SQL/Server (the database default isolation level was READ COMMITTED), it does not do what it says, using NOLOCK still generates locks that can lead to lock promotion. There are many posts and a little bit of controversy about NOLOCK if you google for it so I won’t repeat any here apart from to say LOCK PROMOTION and if it is occurring understanding why it is occurring is the answer, not putting NOLOCK on all your queries. DOH! Not only that, I can imagine this software company interviewing developers and rejecting anybody who doesn’t use NOLOCK as being stupid, so the real result being that they only hire inexperienced developers and/or developers who don’t bother to understand fully what they are doing. Double DOH!

I think its fair to say that the above code generated the largest number of WTF’s per line from me from any query, anywhere, anytime ever. Why 11 minutes? The WHERE clause was in the third party application, thats why, so it was doing at least one full table scan without a where clause of any kind and the CROSS and LEFT JOINs would ensure that it did quite a bit more inefficient reading.

The solution was to rewrite the query so as not to use the CROSS JOIN and read all those records:

When I ran my new query with explain plan, it suggested that I index the DateTo column which I did. This was not a recommendation for the old query because its very likely it couldn’t use indexing efficiently no matter what indexes were on the tables.

The new query took around a second to return 250,000 rows, and if a customerId where clause was added, the time was 20ms. I couldn’t get the original to run in less than 11 minutes although I ran it a few times.

The main thing to take away from this post is that if you read many rows, its better doing it in a query (in SQL) rather than in IL (e.g. C# or VB.NET code behind), but its very bad all the same and you should minimize the number of rows being retrieved by good database and query design as far as possible. I managed to make such improvements just by altering the query but sometimes you have to alter the design of the tables just to reduce the sheer complexity of some of the SQL required to get at the data. Like your code, the database design should reflect, and show the user requirements wherever possible. You can always change it if user requirements change and this way you end up with good simple, reliable code in your business layer rather than trying to sort the mess out there (at a possible impact on performance and readability of more of your code).

As an example, if look at one of your most common queries – e.g. searching for customers, and you don’t use SKIP/TAKE just to return the page of information that you are reading, then consider doing so. This alone might remove a potential lock promotion problem and make everything run faster. See this link for more details – http://stackoverflow.com/questions/548475/efficient-way-to-implement-paging. I might write a post on SKIP/TAKE at some point in the future because its a good technique that’s well worth knowing about.

So Boo, when you read this post, don’t forget that when other humans are around you must always pronounce WTF as WOOF. Thanks mate!

About Phil

I have been working as a software developer since 1983. This blog could have been called "From Fortran 77, C and Cobol to C# in 20 (not so) easy years", but it doesn't sound quite right somehow. Besides I'm talking about what's happened since 2003, not before!

Leave a Reply

Your email address will not be published.


You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code class="" title="" data-url=""> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <pre class="" title="" data-url=""> <span class="" title="" data-url="">




Top Posts & Pages

Categories

Recent Posts

Recent Comments

Archives

Blogroll

  • Mike Cohn's Blog
  • Scott Hanselman's Blog
- mike@mountaingoatsoftware.com

Too often, agile teams are expected to finish everything every iteration. This expectation is harmfu [...]

- mike@mountaingoatsoftware.com

When a product backlog becomes too big, it hinders agility. Discover four steps your team can take t [...]

- mike@mountaingoatsoftware.com

Pervasive myths about agile get in the way of success. It’s time to bust six of those myths. [...]

- mike@mountaingoatsoftware.com

Agile teams strive to finish work in the same iteration in which its begun. Here’s why that is so im [...]

- mike@mountaingoatsoftware.com

I wrote 25 blog posts during 2018. In case you missed some of them, here are the most popular. [...]

- Scott Hanselman

My Xbox user name is Glucose for a reason. This is a passion project of mine. You've likely see [...]

- Scott Hanselman

I've been really enjoying my Xbox lately (when the family is asleep) as well as some fun Retrog [...]

- Scott Hanselman

So you've been asked to parse some dates, except the years are two digit years. For example, da [...]

- Scott Hanselman

I've been working on a little idea where I'd have an app (maybe a mobile app with Xamarin [...]

- Scott Hanselman

"EditorConfig helps maintain consistent coding styles for multiple developers working on the sa [...]

Meta