better method for DTS: Is this index redundant?

Let's say I have a table:
create table example (
id int,
createtime datetime,
name varchar(20))
go
create index ix_createtime on example(createtime)
go
create index ix_createtime_name on example(createtime, name)
go
Is the index ix_createtime redundant because the createtime column is
part of the composite index ix_createtime_name? If someone were to
search based only on the createtime column, would the composite index be
as useful as the index on just the createtime column?
Thanks|||<pshroads@.gmail.com> wrote in message
news:1148056676.401254.153400@.y43g2000cwc.googlegroups.com...
> Is the index ix_createtime redundant because the createtime column is
> part of the composite index ix_createtime_name? If someone were to
> search based only on the createtime column, would the composite index be
> as useful as the index on just the createtime column?
Yes, it probably is redundant. There are a few very limited edge cases
where it might not be -- for instance, if you're doing a lot of analysis
based on only a single column and every I/O counts -- but they are very few
and very far between. The composite index will certainly work for a query
based only on the createtime column.
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
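The claim that the composite index still serves a search on createtime alone can be illustrated with a small model. This is only a sketch in Python, not SQL Server internals: a sorted list of (createtime, name) pairs stands in for the leaf level of ix_createtime_name, and the sample rows and the helper seek_by_createtime are hypothetical.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

# Hypothetical entries standing in for ix_createtime_name, kept in
# (createtime, name) order the way an index keeps its keys sorted.
composite_index = sorted([
    (datetime(2006, 1, 15), "carol"),
    (datetime(2005, 12, 31), "alice"),
    (datetime(2006, 2, 1), "bob"),
    (datetime(2006, 3, 10), "dave"),
    (datetime(2006, 1, 15), "adam"),
])

# Because the pairs are sorted by createtime first, the leading column
# taken on its own is also in sorted order.
createtimes = [createtime for createtime, _ in composite_index]


def seek_by_createtime(start, end):
    """Return entries with start <= createtime <= end via binary search."""
    lo = bisect_left(createtimes, start)
    hi = bisect_right(createtimes, end)
    return composite_index[lo:hi]


matches = seek_by_createtime(datetime(2006, 1, 1), datetime(2006, 2, 2))
for createtime, name in matches:
    print(createtime.date(), name)
```

The range search never looks at the name column; it only relies on createtime being the first sort key, which is the property the single-column index also provides.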
--|||Index ix_createtime is leaner than index ix_createtime_name. As such,
it is slightly more useful. However, its advantage is not very
noticeable, and under most circumstances you can safely drop it. You
can find out for yourself by running your own tests, such as
select * from example where createtime between '20060101' and
'20060202'
There might be ranges for which access via ix_createtime is still
cheaper than scanning the whole clustered index, while access via
ix_createtime_name is already more expensive than a table scan.
Good luck!|||Unless the table is read-only, you also need to consider maintenance costs.
With the redundant index, you will have extra work to do for almost every
modification to the table, and you'll need to determine if that cost is
worth the small benefit of having the narrower index for some queries.
--
HTH
Kalen Delaney, SQL Server MVP
www.solidqualitylearning.com
<pshroads@.gmail.com> wrote in message
news:1148056676.401254.153400@.y43g2000cwc.googlegroups.com...
> Let's say I have a table:
> create table example (
> id int,
> createtime datetime,
> name varchar(20))
> go
> create index ix_createtime on example(createtime)
> go
> create index ix_createtime_name on example(createtime, name)
> go
> Is the index ix_createtime redundant because the createtime column is
> part of the composite index ix_createtime_name? If someone were to
> search based only on the createtime column, would the composite index be
> as useful as the index on just the createtime column?
> Thanks
>|||Kalen,
I agree maintenance costs need to be taken into account. Yet I don't see
why you are calling the benefit of having a lean index "small" without
knowing much about the OP's system. I think it depends on the workload.
Suppose there is a query that runs very frequently. In that case, even a
small 10% saving in its real execution cost may result in, say, a 6%
reduction of the overall workload for the whole system...|||Thanks. Am I correct in thinking that it's only redundant because
statistics are maintained on only the first column of an index? So if
the composite index had been on (name, createtime) then it would not be
redundant because the optimizer would be unlikely to choose that index
when searching on createtime because there would be no statistics?|||How will said "lean" index be used? I don't think dropping 20 bytes from an
index like this will result in "10% savings" in 99% of the cases, because
most likely a query will involve more than just the datetime column. Keep
in mind that bookmark lookups are much more expensive, in most cases, than
the additional I/Os that would be incurred reading the larger index pages.
We of course don't know what the clustering key is, or any other information
about the use, so this is all speculation, but unless this is a very
specialized situation I'd venture a guess that having both indexes is not
beneficial.
Adam Machanic
Pro SQL Server 2005, available now
http://www.apress.com/book/bookDisplay.html?bID=457
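The disagreement about how much a leaner index saves can be made concrete with back-of-the-envelope arithmetic. The Python sketch below is not a measurement: the usable page size, both entry sizes, and the "one page read per bookmark lookup" rule are all assumptions chosen for illustration, and leaf_pages and range_reads are hypothetical helpers.

```python
import math

PAGE_BYTES = 8096   # assumption: usable bytes on one 8 KB page
NARROW_ENTRY = 16   # assumption: 8-byte datetime + 8-byte row locator
WIDE_ENTRY = 36     # assumption: the same plus a full 20-byte name


def leaf_pages(rows, entry_bytes, page_bytes=PAGE_BYTES):
    """Pages needed to hold `rows` index entries of `entry_bytes` each."""
    entries_per_page = page_bytes // entry_bytes
    return math.ceil(rows / entries_per_page)


def range_reads(matching_rows, entry_bytes, lookup_per_row):
    """Rough page reads for a range query: index pages plus optional lookups."""
    # Assumption: every bookmark lookup costs exactly one extra page read.
    lookups = matching_rows if lookup_per_row else 0
    return leaf_pages(matching_rows, entry_bytes) + lookups


for needs_lookup in (False, True):
    narrow = range_reads(10_000, NARROW_ENTRY, needs_lookup)
    wide = range_reads(10_000, WIDE_ENTRY, needs_lookup)
    print(f"lookups={needs_lookup}: narrow={narrow} reads, wide={wide} reads")
```

Under these assumptions the narrow index reads fewer than half as many index pages when no lookups are needed, while with one lookup per matching row the difference shrinks to a fraction of a percent. Real figures depend on the actual row format, the clustering key, and caching, so a benchmark on the real system remains the deciding test.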
--
"Alexander Kuznetsov" <AK_TIREDOFSPAM@.hotmail.COM> wrote in message
news:1148061719.429023.48180@.i40g2000cwc.googlegroups.com...
> Kalen,
> I agree maintenance costs need to be taken into account. Yet I don't see
> why you are calling the benefit of having a lean index "small" without
> knowing much about the OP's system. I think it depends on the workload.
> Suppose there is a query that runs very frequently. In that case, even a
> small 10% saving in its real execution cost may result in, say, a 6% reduction of
> overall workload for the whole system...
>|||Well, I'm not speaking in terms of 99% of the cases or 99.999% of the
cases or whatever else. I am speaking about one particular system we
know almost nothing about: the OP's system. Suppose that system is
very simple and its primary goal is to run
something like this:
select count(distinct id) from example where createtime between
@fromdate and @todate
as fast as possible. In that case the savings would be substantial, as
there would be no bookmark lookups. However rare, such cases do exist.
That's why I was trying to encourage the original poster to benchmark
for himself rather than rely on our advice. I think in our trade all
rules have exceptions.
That's all I was trying to say...|||On 19 May 2006 09:37:56 -0700, pshroads@.gmail.com wrote:
>Let's say I have a table:
>create table example (
> id int,
> createtime datetime,
> name varchar(20))
>go
>create index ix_createtime on example(createtime)
>go
>create index ix_createtime_name on example(createtime, name)
>go
>Is the index ix_createtime redundant because the createtime column is
>part of the composite index ix_createtime_name? If someone were to
>search based only on the createtime column, would the composite index be
>as useful as the index on just the createtime column?
If there are a million rows with the same time, and you might have a
query "select * from example where createtime = @t and name like
'foo%'", then, um, is SQL Server smart enough to do its work on the
index in a case like that?
That's the only reason I can think of that you might want the second index.
Josh|||On 19 May 2006 11:06:19 -0700, pshroads@.gmail.com wrote:
>Thanks. Am I correct in thinking that it's only redundant because
>statistics are maintained on only the first column of an index? So if
>the composite index had been on (name, createtime) then it would not be
>redundant because the optimizer would be unlikely to choose that index
>when searching on createtime because there would be no statistics?
Hi pshroads,
You are right that the index on (createtime) would not be redundant if
the composite index had been on (name, createtime), but you're wrong
about the reason.
Suppose I give you a stack of paper with one-page biographies of
all famous people of the last millennium, ordered by birthdate. I'll also
give you a much smaller stack of paper that includes just the name and
the page number(s) where a person with that name is described, sorted by
last name, then first name. And I give you another index that holds just
the first name and the pages where someone with that first name is
found.
What would your strategy be if I asked you for some details about
someone with first name "Kathlyn"?
--
Hugo Kornelis, SQL Server MVP|||I suppose it would be to look up the name Kathlyn in the third index
and get the page number and then go to that page number in the large
stack. Perhaps I'm being dense but I still need the reason explained :)
Thanks!|||On 19 May 2006 16:12:32 -0700, pshroads@.gmail.com wrote:
>I suppose it would be to look up the name Kathlyn in the third index
>and get the page number and then go to that page number in the large
>stack. Perhaps I'm being dense but I still need the reason explained :)
>Thanks!
Hi pshroads,
The first stack of paper equates to the table. The second stack equates
to an index on (last name, first name). And the third stack equates to
an index on only (first name). You'll use the index on (first name) to
find all people with first name Kathlyn, and you'll use the index on
(last name, first name) to find all people with last name Harrison, or
all people named George Washington. Both indexes are useful, without
being redundant.
Now replace the index on (first name) with an index on (last name) [i.e.
a list of last names and page numbers where these last names are
mentioned]. You'll still use the (last name, first name) index for
finding George Washington. You could use the new (last name) index to
locate every Mr or Mrs Harrison, but you can still do that with the
(last name, first name) index as well. And when looking for Kathlyn,
your only option is to sit down and flip each and every page of the
biggest stack of papers (the infamous table scan). So now, you have an
index that is only usable for a kind of search that can also be done
with the other index - clearly the (last name) index is redundant.
Database searches work exactly the same as your searches in those
stacks of paper. The only difference is that the computer turns the
pages somewhat faster. :-)
--
Hugo Kornelis, SQL Server MVP
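The paper-stack analogy can be modelled in a few lines of Python. This is a sketch only: the biographies, page numbers, and function names are made up for illustration, and sorted lists with binary search stand in for real index structures.

```python
from bisect import bisect_left, bisect_right

# Hypothetical biographies: page number -> (last name, first name).
pages = {
    1: ("Washington", "George"),
    2: ("Harrison", "Kathlyn"),
    3: ("Harrison", "William"),
    4: ("Adams", "John"),
}

# Index on (last name, first name): entries sorted by last name, then first name.
last_first = sorted((last, first, page) for page, (last, first) in pages.items())
last_keys = [entry[0] for entry in last_first]

# Index on (first name) only.
first_only = sorted((first, page) for page, (_, first) in pages.items())
first_keys = [entry[0] for entry in first_only]


def by_last_name(last):
    """Seek in the (last, first) index: the leading column is sorted."""
    lo, hi = bisect_left(last_keys, last), bisect_right(last_keys, last)
    return [page for _, _, page in last_first[lo:hi]]


def by_first_name_seek(first):
    """Seek in the (first name) index."""
    lo, hi = bisect_left(first_keys, first), bisect_right(first_keys, first)
    return [page for _, page in first_only[lo:hi]]


def by_first_name_scan(first):
    """Without a first-name index, every entry has to be checked."""
    return [page for _, name, page in last_first if name == first]


print(by_last_name("Harrison"))       # seek on the leading column
print(by_first_name_seek("Kathlyn"))  # seek on the dedicated index
print(by_first_name_scan("Kathlyn"))  # same answer, but a full scan
```

A separate index on (last name) would add nothing here, because by_last_name already gets a seek from the (last, first) index. The (first name) index is different: first names are scattered through the (last, first) ordering, so without it the only option is the scan.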

Friday, March 23, 2012
