A few months back as one of my very first blog posts I showed off some functionality that I had learned in Bruce Payette’s book “PowerShell In Action” (Coincidentally a new update to the MEAP for the Second Edition was released tonight). I showed how it was not too difficult to Archive the wisdom of Paul Randal’s Tweets for the PASS Summit. Tonight I had a conversation with Adam Machanic and wanted to save that off as well.
I got an idea when watching Allen White talk about Filtered Indexes on Tuesday (insert obligatory AppDev VC of PASS plug here: “Allen White Presents: SQL Server Indexing” ). I have a situation where I have six or seven columns that I want to index but I would prefer to only index the rows that have been created after the first 90 days. After the first 90 days our people just don’t query those columns anywhere near as much. My original idea was to create a filtered index on the ‘CreationDate’ column and then INCLUDE the 5 or 6 other columns that tend to get queried heavily during those first 90 days. I also have some flash drives on my SAN that I want to use for these indexes but the drives aren’t very big and if I create the indexes across the whole table they will use up a lot of space and only a fraction of the entire index would get hit hard (very hard).
The conversation that transpired between Adam and I was a great dive into the topic of Filtered Indexes for me; I’ve worked with them before but I’ve never tried this particular scenario before. Adam said that this convo gave him idea to blog about. I can’t wait to see it. Until then you can read up on what we discussed below.
Having worked with PowerShell for several more months now I have learned a few things and was able to implement them on the fly to make this process even easier. This time instead of using the format options I just select-ed the columns that I wanted to keep out of the Pipeline and exported it to CSV. The process of favorite’ing the individual tweets in the conversation is still a manual process; but I’m sure that a few more weeks of PowerShell learning will fix that.
Here’s the code that I used tonight:
([xml] (new-object net.webclient).DownloadString(“http://twitter.com/favorites/76699854.rss”)).rss.channel.item | select description, pubDate, link | Export-CSV c:\Temp\FilteredIndexes.CSV
And now, you can read the conversation that I had with Adam Machanic without all the clutter of the other tweets during that time period:
description | pubDate | link |
SQLvariant: Listening to Allen White speak about Indexing for the AppDev VC of #SQLPASS http://bit.ly/bgM06y | Tue, 09 Feb 2010 17:03:37 | http://twitter.com/SQLvariant/statuses/8862223289 |
MidnightDBA: I’m behind the times, work on SQL05. Can’t wait to work w/sql 08 professionally, filtered indexes are awesomesauce. | Tue, 09 Feb 2010 17:10:55 | http://twitter.com/MidnightDBA/statuses/8862493660 |
AdamMachanic: @MidnightDBA They’re not as great in practice as they seem when you first learn about them, IMO. Lots of limitations and few use cases | Tue, 09 Feb 2010 17:18:54 | http://twitter.com/AdamMachanic/statuses/8862783462 |
SQLvariant: @AdamMachanic Well then you should probably tell Allen to quit giving us good ideas on how to use them 😉 | Tue, 09 Feb 2010 17:24:34 | http://twitter.com/SQLvariant/statuses/8862986987 |
AdamMachanic: @SQLvariant Such as? | Tue, 09 Feb 2010 17:49:13 | http://twitter.com/AdamMachanic/statuses/8863863635 |
SQLvariant: @AdamMachanic Only indexing the last quarter’s worth of data. | Tue, 09 Feb 2010 17:59:54 | http://twitter.com/SQLvariant/statuses/8864239382 |
SQLvariant: @AdamMachanic … I realize that table partitioning would normally present a better solution but I can still see some useful scenarios | Tue, 09 Feb 2010 18:01:14 | http://twitter.com/SQLvariant/statuses/8864294018 |
AdamMachanic: @SQLvariant I was hoping you would send me a link or two | Tue, 09 Feb 2010 18:01:43 | http://twitter.com/AdamMachanic/statuses/8864311604 |
AdamMachanic: @SQLvariant Hmm, I’ve been thinking about only indexing last qtr w/ filtered index, and that strikes me as a really bad idea. Not only 1/? | Tue, 09 Feb 2010 20:25:45 | http://twitter.com/AdamMachanic/statuses/8869378189 |
AdamMachanic: @SQLvariant will you have to hardcode a date in the index–meaning that you’ll have to keep it up to date somehow (manually? a process?) 2/? | Tue, 09 Feb 2010 20:26:15 | http://twitter.com/AdamMachanic/statuses/8869395660 |
AdamMachanic: @SQLvariant your queries on the last qtr will also have to include that date, hardcoded, or else your plans won’t use the index 3/? | Tue, 09 Feb 2010 20:27:32 | http://twitter.com/AdamMachanic/statuses/8869441985 |
AdamMachanic: @SQLvariant ex. SELECT * FROM tbl WHERE dt > @dt — would have to either not use the idx or recompile every time 4/4 | Tue, 09 Feb 2010 20:29:06 | http://twitter.com/AdamMachanic/statuses/8869496991 |
AdamMachanic: @SQLvariant Any thoughts on my comments yesterday regarding filtered indexes and indexing for the last quarter? | Thu, 11 Feb 2010 01:04:53 | http://twitter.com/AdamMachanic/statuses/8928304859 |
SQLvariant: @AdamMachanic yes. Chewing on those now. (Just read them a few minutes ago.) | Thu, 11 Feb 2010 01:05:39 | http://twitter.com/SQLvariant/statuses/8928338093 |
SQLvariant: @AdamMachanic So on 4/4: Were you saying rebuild every time you run the query? or rebuild query every time you rebuild the index? | Thu, 11 Feb 2010 01:06:55 | http://twitter.com/SQLvariant/statuses/8928391399 |
AdamMachanic: @SQLvariant Your query has to have literals for it to use the index–I tried. It might work w/ the RECOMPILE hint but I didn’t test that | Thu, 11 Feb 2010 01:41:02 | http://twitter.com/AdamMachanic/statuses/8929858439 |
AdamMachanic: @SQLvariant So rebuild query every time you rebuild the index. And you have to have more than one query: One for most recent quarter, 1/2 | Thu, 11 Feb 2010 01:41:35 | http://twitter.com/AdamMachanic/statuses/8929881524 |
AdamMachanic: @SQLvariant … and one for all of history … and logic in your app to choose the right query. PITA! 2/2 | Thu, 11 Feb 2010 01:41:54 | http://twitter.com/AdamMachanic/statuses/8929894671 |
SQLvariant: @AdamMachanic If I only have to rebuild the queries every time that I rebuild the index that’s very doable. | Thu, 11 Feb 2010 01:43:12 | http://twitter.com/SQLvariant/statuses/8929949599 |
SQLvariant: @AdamMachanic Actually…. We already have more complex logic our app. Just put it there last month, before it was doing full scans. | Thu, 11 Feb 2010 01:44:13 | http://twitter.com/SQLvariant/statuses/8929992415 |
AdamMachanic: @SQLvariant And you want to have to re-create your indexes every quarter? Not simply run ALTER REBUILD, but actually re-create | Thu, 11 Feb 2010 01:44:17 | http://twitter.com/AdamMachanic/statuses/8929995070 |
AdamMachanic: @SQLvariant And maintain every query in the system that might rely on that index, and make sure the app chooses the right query every time? | Thu, 11 Feb 2010 01:44:46 | http://twitter.com/AdamMachanic/statuses/8930015395 |
SQLvariant: @AdamMachanic Why it was doing full scans is a long story itself. But this is definitely giving me ideas to test out. | Thu, 11 Feb 2010 01:45:08 | http://twitter.com/SQLvariant/statuses/8930031641 |
AdamMachanic: @SQLvariant But a normal index could be used instead, without all of this maintenance overhead, at the cost of storage space | Thu, 11 Feb 2010 01:45:36 | http://twitter.com/AdamMachanic/statuses/8930051825 |
SQLvariant: @AdamMachanic Not rebuild them every quarter, rebuild them every weekend to reflect the last 13 weeks. | Thu, 11 Feb 2010 01:46:05 | http://twitter.com/SQLvariant/statuses/8930072522 |
SQLvariant: @AdamMachanic Right, see that’s the thing. I’m pondering this because I want to save on storage. | Thu, 11 Feb 2010 01:47:02 | http://twitter.com/SQLvariant/statuses/8930112826 |
AdamMachanic: @SQLvariant Even worse! | Thu, 11 Feb 2010 01:47:05 | http://twitter.com/AdamMachanic/statuses/8930115641 |
SQLvariant: @AdamMachanic Here’s my situation: I have a handful of Flash drives for my @EMCCLARiiON but they’re not enough to store all of my indexes. | Thu, 11 Feb 2010 01:48:09 | http://twitter.com/SQLvariant/statuses/8930161731 |
AdamMachanic: @SQLvariant I don’t think the chance of screwing up and missing a query is worth the benefit… unless you use a macro system to 1/2 | Thu, 11 Feb 2010 01:48:35 | http://twitter.com/AdamMachanic/statuses/8930180307 |
SQLvariant: @AdamMachanic …2/2 but if some of the indexes were filtered indexes they could all fit. | Thu, 11 Feb 2010 01:48:51 | http://twitter.com/SQLvariant/statuses/8930191531 |
AdamMachanic: @SQLvariant maintain the queries… hmm… could be a use case for this: http://datamanipulation.net/tsqlmacro/ | Thu, 11 Feb 2010 01:49:09 | http://twitter.com/AdamMachanic/statuses/8930204248 |
SQLvariant: @AdamMachanic Are you suggesting the TSQLMacro for rebuilding the queries or something? | Thu, 11 Feb 2010 01:50:48 | http://twitter.com/SQLvariant/statuses/8930274857 |
AdamMachanic: @SQLvariant But are they filtered for something like “last quarter” or something more static? I’m not saying filtered indexes are bad, 1/2 | Thu, 11 Feb 2010 01:51:12 | http://twitter.com/AdamMachanic/statuses/8930292580 |
AdamMachanic: @SQLvariant but rather that it’s dangerous and tricky to properly filter on a moving target 2/2 | Thu, 11 Feb 2010 01:51:32 | http://twitter.com/AdamMachanic/statuses/8930307008 |
AdamMachanic: @SQLvariant Right, you could create a macro for the dynamic predicate, then just re-process all of your procs along w/ the queries | Thu, 11 Feb 2010 01:51:59 | http://twitter.com/AdamMachanic/statuses/8930325441 |
AdamMachanic: @SQLvariant As a matter of fact, I’m about to release a new version of the framework, and this could be interesting to add to the docs! | Thu, 11 Feb 2010 01:52:30 | http://twitter.com/AdamMachanic/statuses/8930348061 |
SQLvariant: @AdamMachanic I have a very static use pattern for records in the last 90 days; after that, they don’t get queried as much. | Thu, 11 Feb 2010 01:53:18 | http://twitter.com/SQLvariant/statuses/8930382479 |
SQLvariant: @AdamMachanic Oh. Well I was just going to do it with PowerShell. We have a TSQL query that rebuilds them now (a bit manual though). | Thu, 11 Feb 2010 01:54:46 | http://twitter.com/SQLvariant/statuses/8930445413 |
AdamMachanic: @SQLvariant That rebuilds queries? You’re already doing that work? | Thu, 11 Feb 2010 01:55:34 | http://twitter.com/AdamMachanic/statuses/8930479667 |
SQLvariant: @AdamMachanic … Sorry, re-phrase: We already have a TSQL Query that rebuilds other things based on current month. | Thu, 11 Feb 2010 01:55:36 | http://twitter.com/SQLvariant/statuses/8930481170 |
AdamMachanic: @SQLvariant So what are you doing now? Maintaining a separate table for the previous 90 days worth of data? | Thu, 11 Feb 2010 01:56:27 | http://twitter.com/AdamMachanic/statuses/8930517242 |
SQLvariant: @AdamMachanic um, kinda-sorta. But it’s for the B/I system, not for the relational system. | Thu, 11 Feb 2010 01:57:55 | http://twitter.com/SQLvariant/statuses/8930578819 |
AdamMachanic: @SQLvariant Seems like partitioning (as you mentioned yesterday) would be the easiest solution for the case you mentioned… if you have 1/2 | Thu, 11 Feb 2010 02:00:06 | http://twitter.com/AdamMachanic/statuses/8940513547 |
SQLvariant: @AdamMachanic I was already working on replacing that semi-automatic T-SQL process with a fully-automatic PowerShell script. | Thu, 11 Feb 2010 02:00:22 | http://twitter.com/SQLvariant/statuses/8940528365 |
AdamMachanic: @SQLvariant less than 20 years worth of data, you could even partition by week and it would be dead simple to maintain 2/2 | Thu, 11 Feb 2010 02:00:25 | http://twitter.com/AdamMachanic/statuses/8940530606 |
SQLvariant: @AdamMachanic Right. Because I can create indexes that are only on certain partitions right? | Thu, 11 Feb 2010 02:01:46 | http://twitter.com/SQLvariant/statuses/8940593964 |
AdamMachanic: @SQLvariant I don’t believe so, but you can put partitions on whatever filegroup you want and easily move them around at will, so if you 1/2 | Thu, 11 Feb 2010 02:02:35 | http://twitter.com/AdamMachanic/statuses/8940630194 |
AdamMachanic: @SQLvariant had one partition per week you could slide the new week onto the flash drives and the old week off in two lines of code 2/2 | Thu, 11 Feb 2010 02:03:01 | http://twitter.com/AdamMachanic/statuses/8940648133 |
SQLvariant: @AdamMachanic Sorry, must be mixing SSAS aggregations. But I dont want to even backup the full indexes since I won’t really need all of them | Thu, 11 Feb 2010 02:06:24 | http://twitter.com/SQLvariant/statuses/8940794933 |
AdamMachanic: @SQLvariant No one will ever ask for old data? Or you just don’t care if the old data query takes 100x longer? | Thu, 11 Feb 2010 02:07:24 | http://twitter.com/AdamMachanic/statuses/8940839200 |
SQLvariant: @AdamMachanic … I know what you’re going to say: File Group level backup. | Thu, 11 Feb 2010 02:07:51 | http://twitter.com/SQLvariant/statuses/8940859327 |
AdamMachanic: @SQLvariant Wrong, I wouldn’t say that. Because I barely ever deal with backups and it wouldn’t occur to me 🙂 | Thu, 11 Feb 2010 02:08:35 | http://twitter.com/AdamMachanic/statuses/8940891847 |
SQLvariant: @AdamMachanic more like noone does same type of range searches after 3 months.If they’re looking for an 18 month old record they know the id | Thu, 11 Feb 2010 02:10:17 | http://twitter.com/SQLvariant/statuses/8940966189 |
AdamMachanic: @SQLvariant Hmm, that might change things–you wouldn’t necessarily have to bear the query maintenance nightmare. What if, instead of 1/2 | Thu, 11 Feb 2010 02:13:55 | http://twitter.com/AdamMachanic/statuses/8941122014 |
SQLvariant: @AdamMachanic lets put it this way, I don’t care if ‘old-record’ queries take 100x longer because they are out-numbered 12,000:1 by ‘new’ | Thu, 11 Feb 2010 02:14:31 | http://twitter.com/SQLvariant/statuses/8941147347 |
AdamMachanic: @SQLvariant filtering on the date col, you create a BIT col, IsInLast90Days, and update it nightly? Then filter your index on that? And 2/3 | Thu, 11 Feb 2010 02:14:34 | http://twitter.com/AdamMachanic/statuses/8941149460 |
AdamMachanic: @SQLvariant use that column in the WHERE clause in all of the range queries 3/3 | Thu, 11 Feb 2010 02:14:50 | http://twitter.com/AdamMachanic/statuses/8941160597 |
SQLvariant: @AdamMachanic oh, so what I was thinking was filter on last 90 days, then INCLUDE like 6 additional columns. Can I do that? | Thu, 11 Feb 2010 02:15:50 | http://twitter.com/SQLvariant/statuses/8941204408 |
AdamMachanic: @SQLvariant Yes, you can have included columns in a filtered index | Thu, 11 Feb 2010 02:16:46 | http://twitter.com/AdamMachanic/statuses/8941244702 |
AdamMachanic: @SQLvariant the important thing is you can’t filter on “last 90 days” — you have to include an actual date. That’s why I suggested BIT col | Thu, 11 Feb 2010 02:18:32 | http://twitter.com/AdamMachanic/statuses/8941317917 |
AdamMachanic: @SQLvariant Then it becomes 100% data-driven and you don’t have to change any indexes or queries to keep it working | Thu, 11 Feb 2010 02:19:02 | http://twitter.com/AdamMachanic/statuses/8941338868 |
SQLvariant: @AdamMachanic ok. Well doing the bit column would make the filtered indexes even smaller too. I like your solution . | Thu, 11 Feb 2010 02:19:22 | http://twitter.com/SQLvariant/statuses/8941352821 |
AdamMachanic: @SQLvariant I think I’ll do a blog post on this 🙂 … thanks for the interesting discussion | Thu, 11 Feb 2010 02:20:07 | http://twitter.com/AdamMachanic/statuses/8941384877 |
SQLvariant: @AdamMachanic I would just have to do heavy re-indexing on the weekends right? Oh no wait, I don’t!! because they’re on SSD drives 😀 | Thu, 11 Feb 2010 02:21:33 | http://twitter.com/SQLvariant/statuses/8941449444 |
SQLvariant: @AdamMachanic Awesome. I think we may have solved more problems than we currently realize. please remember to mention me 🙂 | Thu, 11 Feb 2010 02:25:00 | http://twitter.com/SQLvariant/statuses/8941594839 |
SQLvariant: @AdamMachanic Hey maybe we could do this again next week and you could help me solve some of my XML dilemmas 😀 | Thu, 11 Feb 2010 02:42:36 | http://twitter.com/SQLvariant/statuses/8942342844 |
AdamMachanic: @SQLvariant Sure thing! XML is something I love to hate… | Thu, 11 Feb 2010 03:12:23 | http://twitter.com/AdamMachanic/statuses/8943583606 |