SQL Server Fragmentation Problems
Asked Answered
J

1

7

I have a couple of tables (User & UserRecord) in my database that get extremely fragmented (like 99%) and cause the whole database and therefore the website to grind to a halt.

UserRecord is kind of like a snapshot of that user at a point in time. User is like the master record for that user. User has 0 to many UserRecords. User has around a million rows, UserRecord has around 2.5 million. These tables get written to a lot. They're also being searched a lot. They're both going to get a lot bigger. The main indexes getting badly fragmented are the primary keys of the User and UserRecord tables.

The DB is SQL Server 2012 and I'm using Entity Framework and I'm not using any stored procedures.

Tables look something like this:

USER
UserName string PK ClusteredIndex
FirstName string
LastName string
+SeveralMoreRows

USER_RECORD
UserRecordId int PK ClusteredIndex
ListId int FK(List)
UserName string FK(User) NonClusteredIndex
Community string NonClusteredIndex
DateCreated datetime
+LotsMoreRows

LIST 
ListId int PK & ClusteredIndex
Name string
DateCreated datetime

(not sure if List this is important or not but thought I'd include it as it's related to User_Record. List has 0 to many UserRecords)

We've set a SQL Maintenance plan to rebuild the indexes daily which does help, but is sometimes not enough.

A friend has suggested we use two databases, one for reading, one for writing, and we sync the read DB from the write DB. Not that I know anything about doing this, but the first problem I see with this solutation is that we need up to date data when viewing the site. For example if we update a User details or a UserRecord, we want to see those changes straight away.

Does anyone have any suggestions on how I can fix this problem before it spirals out of control?

Justus answered 15/11, 2013 at 10:6 Comment(8)
What are the table definitions? Are you using GUIDs as primary keys?Bifarious
do you habe a clustered index on a uniqueidentifier column? this often leeds to fragmentation after some inserts ... because the values are random ...Austro
I've added some more details to the questionJustus
And how have you concluded that fragmentation is causing a problem anyway? Wouldn't the vast majority of queries be doing a seek on specific users?Bifarious
The site is broken and most queries are timing out, we look at the fragmentation of the indexes, they're up to like 99%, we run the SQL Server Index Rebuild/Repair task, everything works fine again.Justus
If the queries are timing out (30 second timeout?) then I would suspect a bad plan or blocking rather than fragmentation. Rebuilding the index will also rebuild the statistics which means the plan will get recompiled. You may have a parameter sniffing issue.Bifarious
I agree with Martin Smith, rebuilding the indexes also rebuilds the stats, and this is often what leads to the performance improvement, but because it was a rebuild index command that was issued, people often automatically assume that the index fragmentation was the cause.Touzle
This is true. Also, regarding my answer below, it may well be that just recreating the clustered index as it is with a lesser fillfactor, like 80, will with the combination of a good statistics update plan be more than enough. That way there will be space in the data pages to store the new values even when they're stored between the existing ones. The performance impact would only be 20% which assuming the statistics are working might be hardly noticeable.Ichnography
I
5

Clustered indexes control the order of the data on the DISK. This is one of the main reasons why it's usually recommended that you set up a always increasing integer key to act as the clustered index. This way as more data is added to the table, they're added to the end of the currently existing data.

If it's not an autoincreasing number and new rows may contain values that would be ordered somewhere between existing values, then SQL Server will basically push the data onto the disk where it belongs (to retain the order of the clustered index key values), producing fragmentation and potentially severe overhead as IO writes further slowing down the database.

I suspect you have the same problem with your UserRecord values.

So what I would do, is add a separate clustered autoincreasing primary key to each table and rework your FK references & queries where necessary.

Ichnography answered 15/11, 2013 at 11:12 Comment(7)
Why not declare the PK index as non-clustered?Tautonym
Usually it's best to have a clustered index on a table. Even if you leave it out and create a non-clustered PK, it'll store the table as a HEAP which has a number of other problems. For instance, all queries being run against it would then first have to find the nonclustered index match, then find the matching rows from the HEAP to get the other values since they don't come readily as they do with a clustered index. Again, producing unnecessary overhead slowing down the DB. Google has a few good articles on HEAP vs CLUSTERED tables.Ichnography
Thanks, sounds like SQL Server is extremely different to other DBMS (e.g. Postgres, Oracle) when it comes to indexing.Tautonym
Could an increased fill factor combined with daily index rebuilds be an alternative to adding an auto incrementing int primary key on the User table?Justus
Yes you can do that Owen, but an ever increasing value to cluster on is usually a good idea. If you have a date column you can cluster on that instead of your primary key. You can run index reorg or rebuild nightly in sql server using maintenance tasks or custom jobs.Haye
I added an auto incrementing int to my User table and set it to the clustered index. It was a lot easier than I thought it'd be and no code needed to be changed. Not sure if it'll fix the problem, but it's a start. You may be seeing more questions from me soon... Thanks for your help.Justus
Note that if your queries are still using another column to fetch the data, you might want to add a nonclustered index to that as well to avoid excessive table scans.Ichnography

© 2022 - 2024 — McMap. All rights reserved.