Photo by Chris Liverani on Unsplash

What to Do When the Index is Larger Than the SQL Table

SQL Server Dec 21, 2021

Working with Tables is the developer's daily routine. But SQL Development is not very often. In some cases, I saw in my past. That many developers create tables with one index.

The Problem

While the product is used in production, it will be going slow when the data amount increases. Every developer say “Yo! Create an index”.

This can result that the index will be larger than the table itself.

What causes the index size?

There are several causes that increase the size of indexes.

Too many indexes in the same columns.

First of all, analyze your indexes. Drop unused indexes. Also, reduce the number of columns in existing ones.

Remember that the clustered index column(s) is a “hidden” included column in all non clustered indexes

To identify the usage of indexes, you can use the following query:

select
i.name,
s.last_system_lookup,
last_system_scan,
last_system_seek,
last_system_update,
last_user_lookup,
last_user_scan,
last_user_scan,
last_user_seek,
last_user_update
from
sys.dm_db_index_usage_stats as s inner join
sys.indexes as i
on
s.object_id = i.object_id

The output will result like this:

This will display when and who the index will be used.

Overlapping indexes

If you have an index on a,b,c,d and an index on a,b,c you might consider dropping the second one as the first one covers the second one. So try to aggregate overlapping columns into one index.

Index Fragmentation

Indexes can get fragmented, you can get an overview about the fragmentation of the actual indexes with the following query.

SELECT S.name as 'Schema',
T.name as 'Table',
I.name as 'Index',
DDIPS.avg_fragmentation_in_percent,
DDIPS.page_count
FROM sys.dm_db_index_physical_stats (DB_ID(), NULL, NULL, NULL, NULL) AS DDIPS
INNER JOIN sys.tables T on T.object_id = DDIPS.object_id
INNER JOIN sys.schemas S on T.schema_id = S.schema_id
INNER JOIN sys.indexes I ON I.object_id = DDIPS.object_id
AND DDIPS.index_id = I.index_id
WHERE DDIPS.database_id = DB_ID()
and I.name is not null
AND DDIPS.avg_fragmentation_in_percent > 0
ORDER BY DDIPS.avg_fragmentation_in_percent desc

In this case, you can rebuild the index with the following command

ALTER INDEX INDEX_NAME; ON TABLE_NAME REBUILD

After that, the index will be rebuilt and the fragmentation will be decreased

Identify what is stored in an index file?

First, you must know, that an index will store whatever you tell to store. This query will help you tell which indexes are using the most space and for what reason (in row data, lob data)

AS pagesFROM sys.indexes i
JOIN sys.partitions p
ON i.object_id = p.object_id AND i.index_id = p.index_id
JOIN sys.allocation_units a
ON p.partition_id = a.container_id
ORDER BY pages desc

Find the best solution for you

As you can see in the diagram when you create more indexes, you get a higher consumption of index space, and hard to maintain these. But fewer indexes will decrease the query performance. It will be hard to tell anyone which practice fits. So it can be said, “It depends on your data”. When you want a rule of thumb:

Use the “5 and 5 rule”. This means that you can create 5 indexes with 5 max. 5 columns per table.

Tags