How to set Azure SQL to rebuild indexes automatically?
Asked Answered
Q

4

29

In on premise SQL databases, it is normal to have a maintenance plan for rebuilding the indexes once in a while, when it is not being used that much.

How can I set it up in Azure SQL DB?

P.S: I tried it before, but since I couldn't find any options for that, I thought maybe they are doing it automatically until I've read this post and tried:

SELECT
 DB_NAME() AS DBName
 ,OBJECT_NAME(ps.object_id) AS TableName
 ,i.name AS IndexName
 ,ips.index_type_desc
 ,ips.avg_fragmentation_in_percent
 FROM sys.dm_db_partition_stats ps
 INNER JOIN sys.indexes i
 ON ps.object_id = i.object_id
 AND ps.index_id = i.index_id
 CROSS APPLY sys.dm_db_index_physical_stats(DB_ID(), ps.object_id, ps.index_id, null, 'LIMITED') ips
 ORDER BY ps.object_id, ps.index_id

And found out that I have indexes that need maintaining enter image description here

Quincyquindecagon answered 8/2, 2018 at 8:38 Comment(0)
M
57

Update: Note that the engineering team has published updated guidance to better codify some of the suggestions in this answer in a more "official" from Microsoft place as some customers asked for that. SQL Server/DB Index Guidance. Thanks, Conor

original answer:

I'll point out that most people don't need to consider rebuilding indexes in SQL Azure at all. Yes, B+ Tree indexes can become fragmented, and yes this can cause some space overhead and some CPU overhead compared to having perfectly tuned indexes. So, there are some scenarios where we do work with customers to rebuild indexes. (The primary scenario is when the customer may run out of space, currently, as disk space is somewhat limited in SQL Azure due to the current architecture). So, I will encourage you to step back and consider that using the SQL Server model for managing databases is not "wrong" but it may or may not be worth your effort.

(If you do end up needing to rebuild an index, you are welcome to use the models posted here by the other posters - they are generally fine models to script tasks. Note that SQL Azure Managed Instance also supports SQL Agent which you can also use to create jobs to script maintenance operations if you so choose).

Here are some details that may help you decide if you may be a candidate for index rebuilds:

  • The link you referenced is from a post in 2013. The architecture for SQL Azure was completely redone after that post. Specifically, the hardware architecture moved from a model that was based on local spinning disks to one based on local SSDs (in most cases). So, the guidance in the original post is out of date.
  • You can have cases in the current architecture where you can run out of space with a fragmented index. You have options to rebuild the index or to move to a larger reservation size for awhile (which will cost more money) that supports a larger disk space allocation. [Since the local SSD space on the machines is limited, reservation sizes are roughly linked to proportions of the machine. As we get newer hardware with larger/more drives, you have more scale-up options].
  • SSD fragmentation impact is relatively low compared to rotating disks since the cost of a random IO is not really any higher than a sequential one. The CPU overhead of walking a few more B+ Tree intermediate pages is modest. I've usually seen an overhead of perhaps 5-20% max in the average case (which may or may not justify regular rebuilds which have a much bigger workload impact when rebuilding)
  • If you are using query store (which is on by default in SQL Azure), you can evaluate whether a specific index rebuild helps your performance visibly or not. You can do this as a test to see if your workload improves before bothering to take the time to build and manage index rebuild operations yourself.
  • Please note that there is currently no intra-database resource governance within SQL Azure for user workloads. So, if you start an index rebuild, you may end up consuming lots of resources and impacting your main workload. You can try to time things to be done off-hours, of course, but for applications with lots of customers around the world this may not be possible.
  • Additionally, I will note that many customers have index rebuild jobs "because they want stats to be updated". It is not necessary to rebuild an index just to rebuild the stats. In recent SQL Server and SQL Azure, the algorithm for stats update was made more aggressive on larger tables and the model for how we estimate cardinality in cases where customers are querying recently inserted data (since the last stats update) have been changed in later compatibility levels. So, it is often the case that the customer doesn't even need to do any manual stats update at all.
  • Finally, I will note that the impact of stats being out of date was historically that you'd get plan choice regressions. For repeated queries, a lot of the impact of this was mitigated by the introduction of the automatic tuning feature over query store (which forces prior plans if it notices a large regression in query performance compared to the prior plan).

The official recommendation that I give customers is to not bother with index rebuilds unless they have a tier-1 app where they've demonstrated real need (benefits outweigh the costs) or where they are a SaaS ISV where they are trying to tune a workload over many databases/customers in elastic pools or in a multi-tenant database design so they can reduce their COGS or avoid running out of disk space (as mentioned earlier) on a very big database. In the largest customers we have on the platform, we sometimes see value in doing index operations manually with the customer, but we often do not need to have a regular job where we do this kind of operation "just in case". The intent from the SQL team is that you don't need to bother with this at all and you can just focus on your app instead. There are always things that we can add or improve into our automatic mechanisms, of course, so I completely allow for the possibility that an individual customer database may have a need for such actions. I've not seen any myself beyond the cases I mentioned, and even those are rarely an issue.

I hope this gives you some context to understand why this isn't being done in the platform yet - it just hasn't been an issue for the vast majority of customer databases we have today in our service compared to other pressing needs. We revisit the list of things we need to build each planning cycle, of course, and we do look at opportunities like this regularly.

Good luck - whatever your outcome here, I hope this helps you make the right choice.

Sincerely, Conor Cunningham Architect, SQL

Mongol answered 28/7, 2018 at 0:28 Comment(18)
Late to the party but I will ask it anyway: Doesn't the SKU at which one is operating their elastic pools come into play here? If I were running elastic pools on Premium or Business Critical I probably wouldn't consider a run book that executes the MS-written stored procedure doing a smart and logged index rebuild but currently do this as we are on Standard. Another angle that comes into play on this is the manager, knowing just enough, wants to know how often indexes are being rebuilt. I understand the "it depends" but the manager doesn't.Epperson
My comments are really orthogonal to what SKU you use - 15% is either enough to tip you over the edge or not, but it's not going to make or break you as you won't run at 100% cpu all the time. For the manager, you measure costs (including the labor to rebuild and the time you spend minding a system without resource governance while you rebuild these indexes in a PaaS system). You are welcome to rebuild indexes if it makes you happy, but reflex actions may have more psychological benefit than actual runtime benefit :)Mongol
Thanks for the feedback.Epperson
This is an interesting response, I am finding I need to rebuild the indexes regularly (weekly at least) or else they become fragmented and the application slow and unusable.Pyrimidine
index fragmentation on SSDs is not a major source of application slowness usually. For most applications, it's in this ~15% overhead range for having extra IOs. If you can look at the query store to see the performance of key queries before and after, can you validate that the plans are the same before/after rebuild? Sometimes rebuilding indexes changes the query plans because you update the stats. In other cases, you have parameter sensitivity and rebuilding happens to force recompilation (which is not causal to the app being faster). Would love to see examples.Mongol
I was just struggling with a query that returned only 5 rows: "select top 1000 * from..." took 0 seconds. "select * from..." took more than 2 minutes (and I killed it). Then, I saw fragmentation of one particular index was 99%. I rebuilt it and now both queries run in 0 seconds. That was in my localhost (SQL Server 12). The web app runs in Azure and there are several highly fragmented indices too. Should I rebuild the indices only locally but not in Azure?Saleh
Hard to say without more data. if you were to capture the "set statistics profile on" output for the query (without the top), then we could take a better educated guess. Also, it would help if you post what storage model you have (SSD, rotating disks, SAN, etc). Doing index rebuild can update stats and this can cause better plan choices which may have fixed your problem by picking a better plan choice (instead of reducing index fragmentation as the cause)Mongol
Thank you for the answer. I was on vacation. My localhost (without Azure) storage is SSD (Toshiba KXG5AZNV512G). About the statistics: I already rebuilt the index, so I don't know if the current statistics will be helpful. Should I post them anyway?Saleh
stats profile generally can help show what kind of problem you have if it is related to the plan shape/plan choice. For some plan shapes, you can potentially infer how fragmentation would hurt on rotating disks. For SSDs, I think you should likely assume that if the rebuild caused a big change in the plan's perf, it was likely a plan choice issue more than a fragmentation issue. So, it is better to post it when there is actually a problem more than when there is notMongol
Perfect, thank you very very much. Does it mean I have to update stats regularly in my localhost? What about Azure SQL? Should I update there the stats too or it's already done by default?Saleh
you shouldn't have to update stats regularly at all (absent some bug or other plan choice issue - by default you should not upstate stats). Please read this: #48681524Mongol
I know it's a year later, but I'm adding a comment here that we consistently have queries that will slow down to take MINUTES, but upon rebuilding indexes for the tables involved, the query time drops to seconds again. So the advice "You don't need to rebuild indexes" simply isn't true in my experience. I've come across a LOT of performance issues that rebuilding indexes seems to fix. And we must do it regularly to maintain performance.Incrocci
You should separate what impact updating statistics has from rebuilding the index and look at the query plans. I am not saying rebuilding indexes is never useful - just that there are many cases in SQL Azure where people are having outages due to the rebuilds more than the performance impact of a somewhat suboptimal index structure.Mongol
@ConorCunninghamMSFT Is this applicable to full-text search catalog indices too? Do I need to run alter index to defragment the indices?Talmudist
Full-text indexes are different beasts. This guidance is for regular b-trees, mostly. I believe we've been working on an "official" doc page to codify current guidance, so I'll try to update my post once I find the link for that to redirect people to that page going forwardMongol
Any way to run the index rebuild so it doesn't spike the database dtu to 100% for an hour?Viscous
We are also seeing the need to rebuild indexes fairly regularly. Queries taking up to 10x the time to run, index rebuild, back to milliseconds. Are all versions of Azure SQL running on SSDs?Sixtasixteen
Premium/Biz critical run on SSDs. The lower tiers are on remote storage (which is abstracted). It would be good if you could consider running some XEvent sessions when performance is good vs. bad and/or looking at query store data before/after to see what is different. Fragmentation would rarely imply this kind of difference in performance, but perhaps IO limits and throttling could if you were close to the limit.Mongol
W
19

You can use Azure Automation to schedule index maintenance tasks as explained here :Rebuilding SQL Database indexes using Azure Automation

Below are steps :

1) Provision an Automation Account if you don’t have any, by going to https://portal.azure.com and select New > Management > Automation Account

enter image description here

2) After creating the Automation Account, open the details and now click on Runbooks > Browse Gallery

enter image description here

Type on the search box the word “indexes” and the runbook “Indexes tables in an Azure database if they have a high fragmentation” appears:

enter image description here

4) Note that the author of the runbook is the SC Automation Product Team at Microsoft. Click on Import:

enter image description here

5) After importing the runbook, now let’s add the database credentials to the assets. Click on Assets > Credentials and then on “Add a credential…” button. enter image description here

6) Set a Credential name (that will be used later on the runbook), the database user name and password:

enter image description here

7) Now click again on Runbooks and then select the “Update-SQLIndexRunbook” from the list, and click on the “Edit…” button. You will be able to see the PowerShell script that will be executed:

enter image description here

8) If you want to test the script, just click on the “Test Pane” button, and the test window opens. Introduce the required parameters and click on Start to execute the index rebuild. If any error occurs, the error is logged on the results window. Note that depending on the database and the other parameters, this can take a long time to complete:

enter image description here

9) Now go back to the editor, and click on the “Publish” button enable the runbook. If we click on “Start”, a window appears asking for the parameters. But as we want to schedule this task, we will click on the “Schedule” button instead:

enter image description here

10) Click on the Schedule link to create a new Schedule for the runbook. I have specified once a week, but that will depend on your workload and how your indexes increase their fragmentation over time. You will need to tweak the schedule based on your needs and by executing the initial queries between executions:

enter image description here

11) Now introduce the parameters and run settings:

enter image description here

NOTE: you can play with having different schedules with different settings, i.e. having a specific schedule for a specific table.

With that, you have finished. Remember to change the Logging settings as desired:

enter image description here

Westberry answered 8/2, 2018 at 10:50 Comment(4)
What kind of reporting do you get back from this? Is it useful? Can you run this script manually somehow - I'd like to run it once and see what it does. I'm finding many index rebuilding scripts out there and I don't want to completely blackbox the process (at least at first).Taimi
This runbook says 'Last updated: 10/12/2014' Tried and tested? Or horribly out of date :-/ I hope the former!Taimi
You can log execution results into Azure SQL Database systemcentercentral.com/…Westberry
Step 5. ("Click on Assets > Credentials and then on “Add a credential…” button.") There is no button. Not sure where to create the credential. Under "Shared Resources" from the main blade maybe?Wicketkeeper
C
1

Azure Automation is good and pricing is also negligible..

enter image description here

Some other options you have are

1.Create a execute sql task and schedule it through sql agent .The execute sql task should contain the index rebuild code along with stats rebuild

2.You also can create a linked server to SQLAZURE and create a sql agent job.To create a linked server to azure, you can see this SO link:I need to add a linked server to a MS Azure SQL Server

Carlita answered 9/2, 2018 at 3:56 Comment(0)
H
0

As @TheGamiswar suggested, add a linked server, then create a stored procedure like this:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [LinkedServerName].[RemoteDB].[dbo].[sp_RebuildReorganizIndexes]   
AS  
BEGIN  
    ALTER INDEX PK_MyTable ON MyTable REBUILD WITH (STATISTICS_NORECOMPUTE = ON, ONLINE=ON);
    ALTER INDEX IX_MyTable ON MyTable REBUILD WITH (STATISTICS_NORECOMPUTE = ON, ONLINE=ON); --Nonclustered index

    ALTER INDEX PK_MyTable ON MyTable REORGANIZE;
    ALTER INDEX IX_MyTable ON MyTable REORGANIZE;
END

Then on your linked server use "SQL Server Agent" to create a new job and a schedule:

enter image description here

For details please see https://learn.microsoft.com/en-us/sql/ssms/agent/create-a-job?view=sql-server-2017

Handful answered 9/9, 2019 at 17:51 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.