Filtered Indexes: The Conversation

A few months back as one of my very first blog posts I showed off some functionality that I had learned in Bruce Payette’s book “PowerShell In Action”  (Coincidentally a new update to the MEAP for the Second Edition was released tonight).  I showed how it was not too difficult to Archive the wisdom of Paul Randal’s Tweets for the PASS Summit.  Tonight I had a conversation with Adam Machanic and wanted to save that off as well. 

I got an idea when watching Allen White talk about Filtered Indexes on Tuesday (insert obligatory AppDev VC of PASS plug here: “Allen White Presents:  SQL Server Indexing” ).  I have a situation where I have six or seven columns that I want to index but I would prefer to only index the rows that have been created after the first 90 days.  After the first 90 days our people just don’t query those columns anywhere near as much.  My original idea was to create a filtered index on the ‘CreationDate’ column and then INCLUDE the 5 or 6 other columns that tend to get queried heavily during those first 90 days.  I also have some flash drives on my SAN that I want to use for these indexes but the drives aren’t very big and if I create the indexes across the whole table they will use up a lot of space and only a fraction of the entire index would get hit hard (very hard). 

The conversation that transpired between Adam and I was a great dive into the topic of Filtered Indexes for me; I’ve worked with them before but I’ve never tried this particular scenario before.  Adam said that this convo gave him idea to blog about.  I can’t wait to see it.  Until then you can read up on what we discussed below.
adam_machanic

Having worked with PowerShell for several more months now I have learned a few things and was able to implement them on the fly to make this process even easier.  This time instead of using the format options I just select-ed the columns that I wanted to keep out of the Pipeline and exported it to CSV.  The process of favorite’ing the individual tweets in the conversation is still a manual process; but I’m sure that a few more weeks of PowerShell learning will fix that. 

Here’s the code that I used tonight:

([xml] (new-object net.webclient).DownloadString(http://twitter.com/favorites/76699854.rss)).rss.channel.item | select description, pubDate, link | Export-CSV c:\Temp\FilteredIndexes.CSV

And now, you can read the conversation that I had with Adam Machanic without all the clutter of the other tweets during that time period:

descriptionpubDatelink
SQLvariant: Listening to Allen White speak about Indexing for the AppDev VC of #SQLPASS http://bit.ly/bgM06yTue, 09 Feb 2010 17:03:37http://twitter.com/SQLvariant/statuses/8862223289
MidnightDBA: I’m behind the times, work on SQL05. Can’t wait to work w/sql 08 professionally, filtered indexes are awesomesauce.Tue, 09 Feb 2010 17:10:55http://twitter.com/MidnightDBA/statuses/8862493660
AdamMachanic: @MidnightDBA They’re not as great in practice as they seem when you first learn about them, IMO. Lots of limitations and few use casesTue, 09 Feb 2010 17:18:54http://twitter.com/AdamMachanic/statuses/8862783462
SQLvariant: @AdamMachanic Well then you should probably tell Allen to quit giving us good ideas on how to use them 😉Tue, 09 Feb 2010 17:24:34http://twitter.com/SQLvariant/statuses/8862986987
AdamMachanic: @SQLvariant Such as?Tue, 09 Feb 2010 17:49:13http://twitter.com/AdamMachanic/statuses/8863863635
SQLvariant: @AdamMachanic Only indexing the last quarter’s worth of data.Tue, 09 Feb 2010 17:59:54http://twitter.com/SQLvariant/statuses/8864239382
SQLvariant: @AdamMachanic … I realize that table partitioning would normally present a better solution but I can still see some useful scenariosTue, 09 Feb 2010 18:01:14http://twitter.com/SQLvariant/statuses/8864294018
AdamMachanic: @SQLvariant I was hoping you would send me a link or twoTue, 09 Feb 2010 18:01:43http://twitter.com/AdamMachanic/statuses/8864311604
AdamMachanic: @SQLvariant Hmm, I’ve been thinking about only indexing last qtr w/ filtered index, and that strikes me as a really bad idea. Not only 1/?Tue, 09 Feb 2010 20:25:45http://twitter.com/AdamMachanic/statuses/8869378189
AdamMachanic: @SQLvariant will you have to hardcode a date in the index–meaning that you’ll have to keep it up to date somehow (manually? a process?) 2/?Tue, 09 Feb 2010 20:26:15http://twitter.com/AdamMachanic/statuses/8869395660
AdamMachanic: @SQLvariant your queries on the last qtr will also have to include that date, hardcoded, or else your plans won’t use the index 3/?Tue, 09 Feb 2010 20:27:32http://twitter.com/AdamMachanic/statuses/8869441985
AdamMachanic: @SQLvariant ex. SELECT * FROM tbl WHERE dt > @dt — would have to either not use the idx or recompile every time 4/4Tue, 09 Feb 2010 20:29:06http://twitter.com/AdamMachanic/statuses/8869496991
AdamMachanic: @SQLvariant Any thoughts on my comments yesterday regarding filtered indexes and indexing for the last quarter?Thu, 11 Feb 2010 01:04:53http://twitter.com/AdamMachanic/statuses/8928304859
SQLvariant: @AdamMachanic yes. Chewing on those now. (Just read them a few minutes ago.)Thu, 11 Feb 2010 01:05:39http://twitter.com/SQLvariant/statuses/8928338093
SQLvariant: @AdamMachanic So on 4/4: Were you saying rebuild every time you run the query? or rebuild query every time you rebuild the index?Thu, 11 Feb 2010 01:06:55http://twitter.com/SQLvariant/statuses/8928391399
AdamMachanic: @SQLvariant Your query has to have literals for it to use the index–I tried. It might work w/ the RECOMPILE hint but I didn’t test thatThu, 11 Feb 2010 01:41:02http://twitter.com/AdamMachanic/statuses/8929858439
AdamMachanic: @SQLvariant So rebuild query every time you rebuild the index. And you have to have more than one query: One for most recent quarter, 1/2Thu, 11 Feb 2010 01:41:35http://twitter.com/AdamMachanic/statuses/8929881524
AdamMachanic: @SQLvariant … and one for all of history … and logic in your app to choose the right query. PITA! 2/2Thu, 11 Feb 2010 01:41:54http://twitter.com/AdamMachanic/statuses/8929894671
SQLvariant: @AdamMachanic If I only have to rebuild the queries every time that I rebuild the index that’s very doable.Thu, 11 Feb 2010 01:43:12http://twitter.com/SQLvariant/statuses/8929949599
SQLvariant: @AdamMachanic Actually…. We already have more complex logic our app. Just put it there last month, before it was doing full scans.Thu, 11 Feb 2010 01:44:13http://twitter.com/SQLvariant/statuses/8929992415
AdamMachanic: @SQLvariant And you want to have to re-create your indexes every quarter? Not simply run ALTER REBUILD, but actually re-createThu, 11 Feb 2010 01:44:17http://twitter.com/AdamMachanic/statuses/8929995070
AdamMachanic: @SQLvariant And maintain every query in the system that might rely on that index, and make sure the app chooses the right query every time?Thu, 11 Feb 2010 01:44:46http://twitter.com/AdamMachanic/statuses/8930015395
SQLvariant: @AdamMachanic Why it was doing full scans is a long story itself. But this is definitely giving me ideas to test out.Thu, 11 Feb 2010 01:45:08http://twitter.com/SQLvariant/statuses/8930031641
AdamMachanic: @SQLvariant But a normal index could be used instead, without all of this maintenance overhead, at the cost of storage spaceThu, 11 Feb 2010 01:45:36http://twitter.com/AdamMachanic/statuses/8930051825
SQLvariant: @AdamMachanic Not rebuild them every quarter, rebuild them every weekend to reflect the last 13 weeks.Thu, 11 Feb 2010 01:46:05http://twitter.com/SQLvariant/statuses/8930072522
SQLvariant: @AdamMachanic Right, see that’s the thing. I’m pondering this because I want to save on storage.Thu, 11 Feb 2010 01:47:02http://twitter.com/SQLvariant/statuses/8930112826
AdamMachanic: @SQLvariant Even worse!Thu, 11 Feb 2010 01:47:05http://twitter.com/AdamMachanic/statuses/8930115641
SQLvariant: @AdamMachanic Here’s my situation: I have a handful of Flash drives for my @EMCCLARiiON but they’re not enough to store all of my indexes.Thu, 11 Feb 2010 01:48:09http://twitter.com/SQLvariant/statuses/8930161731
AdamMachanic: @SQLvariant I don’t think the chance of screwing up and missing a query is worth the benefit… unless you use a macro system to 1/2Thu, 11 Feb 2010 01:48:35http://twitter.com/AdamMachanic/statuses/8930180307
SQLvariant: @AdamMachanic …2/2 but if some of the indexes were filtered indexes they could all fit.Thu, 11 Feb 2010 01:48:51http://twitter.com/SQLvariant/statuses/8930191531
AdamMachanic: @SQLvariant maintain the queries… hmm… could be a use case for this: http://datamanipulation.net/tsqlmacro/Thu, 11 Feb 2010 01:49:09http://twitter.com/AdamMachanic/statuses/8930204248
SQLvariant: @AdamMachanic Are you suggesting the TSQLMacro for rebuilding the queries or something?Thu, 11 Feb 2010 01:50:48http://twitter.com/SQLvariant/statuses/8930274857
AdamMachanic: @SQLvariant But are they filtered for something like “last quarter” or something more static? I’m not saying filtered indexes are bad, 1/2Thu, 11 Feb 2010 01:51:12http://twitter.com/AdamMachanic/statuses/8930292580
AdamMachanic: @SQLvariant but rather that it’s dangerous and tricky to properly filter on a moving target 2/2Thu, 11 Feb 2010 01:51:32http://twitter.com/AdamMachanic/statuses/8930307008
AdamMachanic: @SQLvariant Right, you could create a macro for the dynamic predicate, then just re-process all of your procs along w/ the queriesThu, 11 Feb 2010 01:51:59http://twitter.com/AdamMachanic/statuses/8930325441
AdamMachanic: @SQLvariant As a matter of fact, I’m about to release a new version of the framework, and this could be interesting to add to the docs!Thu, 11 Feb 2010 01:52:30http://twitter.com/AdamMachanic/statuses/8930348061
SQLvariant: @AdamMachanic I have a very static use pattern for records in the last 90 days; after that, they don’t get queried as much.Thu, 11 Feb 2010 01:53:18http://twitter.com/SQLvariant/statuses/8930382479
SQLvariant: @AdamMachanic Oh. Well I was just going to do it with PowerShell. We have a TSQL query that rebuilds them now (a bit manual though).Thu, 11 Feb 2010 01:54:46http://twitter.com/SQLvariant/statuses/8930445413
AdamMachanic: @SQLvariant That rebuilds queries? You’re already doing that work?Thu, 11 Feb 2010 01:55:34http://twitter.com/AdamMachanic/statuses/8930479667
SQLvariant: @AdamMachanic … Sorry, re-phrase: We already have a TSQL Query that rebuilds other things based on current month.Thu, 11 Feb 2010 01:55:36http://twitter.com/SQLvariant/statuses/8930481170
AdamMachanic: @SQLvariant So what are you doing now? Maintaining a separate table for the previous 90 days worth of data?Thu, 11 Feb 2010 01:56:27http://twitter.com/AdamMachanic/statuses/8930517242
SQLvariant: @AdamMachanic um, kinda-sorta. But it’s for the B/I system, not for the relational system.Thu, 11 Feb 2010 01:57:55http://twitter.com/SQLvariant/statuses/8930578819
AdamMachanic: @SQLvariant Seems like partitioning (as you mentioned yesterday) would be the easiest solution for the case you mentioned… if you have 1/2Thu, 11 Feb 2010 02:00:06http://twitter.com/AdamMachanic/statuses/8940513547
SQLvariant: @AdamMachanic I was already working on replacing that semi-automatic T-SQL process with a fully-automatic PowerShell script.Thu, 11 Feb 2010 02:00:22http://twitter.com/SQLvariant/statuses/8940528365
AdamMachanic: @SQLvariant less than 20 years worth of data, you could even partition by week and it would be dead simple to maintain 2/2Thu, 11 Feb 2010 02:00:25http://twitter.com/AdamMachanic/statuses/8940530606
SQLvariant: @AdamMachanic Right. Because I can create indexes that are only on certain partitions right?Thu, 11 Feb 2010 02:01:46http://twitter.com/SQLvariant/statuses/8940593964
AdamMachanic: @SQLvariant I don’t believe so, but you can put partitions on whatever filegroup you want and easily move them around at will, so if you 1/2Thu, 11 Feb 2010 02:02:35http://twitter.com/AdamMachanic/statuses/8940630194
AdamMachanic: @SQLvariant had one partition per week you could slide the new week onto the flash drives and the old week off in two lines of code 2/2Thu, 11 Feb 2010 02:03:01http://twitter.com/AdamMachanic/statuses/8940648133
SQLvariant: @AdamMachanic Sorry, must be mixing SSAS aggregations. But I dont want to even backup the full indexes since I won’t really need all of themThu, 11 Feb 2010 02:06:24http://twitter.com/SQLvariant/statuses/8940794933
AdamMachanic: @SQLvariant No one will ever ask for old data? Or you just don’t care if the old data query takes 100x longer?Thu, 11 Feb 2010 02:07:24http://twitter.com/AdamMachanic/statuses/8940839200
SQLvariant: @AdamMachanic … I know what you’re going to say: File Group level backup.Thu, 11 Feb 2010 02:07:51http://twitter.com/SQLvariant/statuses/8940859327
AdamMachanic: @SQLvariant Wrong, I wouldn’t say that. Because I barely ever deal with backups and it wouldn’t occur to me 🙂Thu, 11 Feb 2010 02:08:35http://twitter.com/AdamMachanic/statuses/8940891847
SQLvariant: @AdamMachanic more like noone does same type of range searches after 3 months.If they’re looking for an 18 month old record they know the idThu, 11 Feb 2010 02:10:17http://twitter.com/SQLvariant/statuses/8940966189
AdamMachanic: @SQLvariant Hmm, that might change things–you wouldn’t necessarily have to bear the query maintenance nightmare. What if, instead of 1/2Thu, 11 Feb 2010 02:13:55http://twitter.com/AdamMachanic/statuses/8941122014
SQLvariant: @AdamMachanic lets put it this way, I don’t care if ‘old-record’ queries take 100x longer because they are out-numbered 12,000:1 by ‘new’Thu, 11 Feb 2010 02:14:31http://twitter.com/SQLvariant/statuses/8941147347
AdamMachanic: @SQLvariant filtering on the date col, you create a BIT col, IsInLast90Days, and update it nightly? Then filter your index on that? And 2/3Thu, 11 Feb 2010 02:14:34http://twitter.com/AdamMachanic/statuses/8941149460
AdamMachanic: @SQLvariant use that column in the WHERE clause in all of the range queries 3/3Thu, 11 Feb 2010 02:14:50http://twitter.com/AdamMachanic/statuses/8941160597
SQLvariant: @AdamMachanic oh, so what I was thinking was filter on last 90 days, then INCLUDE like 6 additional columns. Can I do that?Thu, 11 Feb 2010 02:15:50http://twitter.com/SQLvariant/statuses/8941204408
AdamMachanic: @SQLvariant Yes, you can have included columns in a filtered indexThu, 11 Feb 2010 02:16:46http://twitter.com/AdamMachanic/statuses/8941244702
AdamMachanic: @SQLvariant the important thing is you can’t filter on “last 90 days” — you have to include an actual date. That’s why I suggested BIT colThu, 11 Feb 2010 02:18:32http://twitter.com/AdamMachanic/statuses/8941317917
AdamMachanic: @SQLvariant Then it becomes 100% data-driven and you don’t have to change any indexes or queries to keep it workingThu, 11 Feb 2010 02:19:02http://twitter.com/AdamMachanic/statuses/8941338868
SQLvariant: @AdamMachanic ok. Well doing the bit column would make the filtered indexes even smaller too. I like your solution .Thu, 11 Feb 2010 02:19:22http://twitter.com/SQLvariant/statuses/8941352821
AdamMachanic: @SQLvariant I think I’ll do a blog post on this 🙂 … thanks for the interesting discussionThu, 11 Feb 2010 02:20:07http://twitter.com/AdamMachanic/statuses/8941384877
SQLvariant: @AdamMachanic I would just have to do heavy re-indexing on the weekends right? Oh no wait, I don’t!! because they’re on SSD drives 😀Thu, 11 Feb 2010 02:21:33http://twitter.com/SQLvariant/statuses/8941449444
SQLvariant: @AdamMachanic Awesome. I think we may have solved more problems than we currently realize. please remember to mention me 🙂Thu, 11 Feb 2010 02:25:00http://twitter.com/SQLvariant/statuses/8941594839
SQLvariant: @AdamMachanic Hey maybe we could do this again next week and you could help me solve some of my XML dilemmas 😀Thu, 11 Feb 2010 02:42:36http://twitter.com/SQLvariant/statuses/8942342844
AdamMachanic: @SQLvariant Sure thing! XML is something I love to hate…Thu, 11 Feb 2010 03:12:23http://twitter.com/AdamMachanic/statuses/8943583606

Please Share This:

You may also like:

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Follow:

Subcribe to Blog Via Email

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

What I'm Saying on Twitter

Subscribe via feedburner

%d bloggers like this: