SSDT project builds much slower with MSBuild than in VS 2013
Asked Answered
T

3

11

I noticed that our SSDT projects were building much slower via TFS build definitions than via VS 2013 on my local dev box.

On the build server itself, I can build our largest SSDT project via VS 2013 in about 2 minutes. The exact same project builds from the command line on the same server using MSBuild in about 16 minutes (i.e. building in the same way TFS build definitions build SSDT projects)

I've gathered /verbosity:diagnostic output from both methods, but cannot for the life of me see any material difference in the parameters being supplied to MSBuild.

In VS 2013, the configuration was Debug:AnyCPU.

The MSBuild command I used was:

msbuild /t:rebuild /p:Configuration=Debug /p:Platform=AnyCPU .sqlproj

I've tried the same process on much smaller SSDT projects, and I get similar relative discrepancies in build times (e.g. another project I have takes 10 seconds in VS 2013, and 70-80 seconds via MSBuild)

I have the very latest versions of VS 2013 (12.0.40629.00 Update 5) and SSDT (12.0.60629.0)

Has anyone else seen this issue, or even has a solution to this problem?

Would a move to VS 2015 solve the problem?


Edited to respond to comments and to include diagnostic information - 15-08-2016

Thank you guys for getting back to me with your suggestions.

Ed, When testing VS against MSBuild I've used build target Rebuild, which I believe is a combination of Clean and Build. In the TFS build definition I have the "Clean Workspace" and "Clean build" set to true, however the time for the "Get sources" step is only 9 seconds.

Steven, the dbmdl factor sounded really promising. However, I tried the tests again after deleting all dbmdl files from the solution, and I'm still getting the same times (I verified the dbmdl files hadn't been regenerated between test builds). Also, I've deleted all the bin and obj folders before the tests, just to eliminate any possibility of cached data for VS builds.

Cece, I've effectively removed TFS from this problem, because I get build time discrepancies when testing VS and MSBuild. I think if I can crack that, TFS build times will be cracked too. You had a good point about which build step the time is been taken in. I didn't want to flood the initial post with too much info, but now's the time :-). I used the /clp:PerformanceSummary option to get a breakdown of where the time's being spent.

Below in section 1 you can see the time is almost all spent in the "SqlBuildTask". Nice to know, but not very helpful in pinpointing the issue.

In sections 2 and 3 is information I got from both build methods using the /verbosity:diagnostic output

I cannot see any material difference between the two builds. Both must be using this DLL to perform that actual build (MSBuild explicitly declares a reference to this DLL):

 C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\\Extensions\Microsoft\SQLDB\Dac\130\Microsoft.Data.Tools.Schema.Tasks.Sql.dll" 

------------- section 1 -------------

(Output from MSBuild with /clp:PerformanceSummary)

Project Performance Summary:
        82518 ms  P:\<snip>.sqlproj   1 calls
                  82518 ms  rebuild                                    1 calls

Target Performance Summary:
        0 ms  BuildOnlySettings                          2 calls
        0 ms  GetReferenceAssemblyPaths                  2 calls
        0 ms  GetSqlSymbolsPath                          1 calls
        0 ms  BeforeBuild                                2 calls
        0 ms  ResolveReferences                          2 calls
        0 ms  GetCopyToOutputDirectoryXamlAppDefs        2 calls
        0 ms  SqlStudioSourceFilesToCopy                 2 calls
        0 ms  AssignProjectConfiguration                 1 calls
        0 ms  BeforeClean                                2 calls
        0 ms  _CopySourceItemsToOutputDirectory          2 calls
        0 ms  Build                                      2 calls
        0 ms  GetCopyToOutputDirectoryItems              2 calls
        0 ms  Clean                                      2 calls
        0 ms  CleanPublishFolder                         2 calls
        0 ms  GetFrameworkPaths                          2 calls
        0 ms  GetTargetPath                              2 calls
        0 ms  _CleanGetCurrentAndPriorFileWrites         2 calls
        0 ms  _CopyFilesMarkedCopyLocal                  1 calls
        0 ms  AfterResolveReferences                     2 calls
        0 ms  PreXsdCodeGen                              2 calls
        0 ms  CopyFilesToOutputDirectory                 2 calls
        0 ms  GetNativeManifest                          1 calls
        0 ms  GetInstalledSDKLocations                   2 calls
        0 ms  IncrementalClean                           2 calls
        0 ms  PrepareForRun                              2 calls
        0 ms  _SplitProjectReferencesByFileExistence     2 calls
        0 ms  _SetupSqlBuildOutputs                      2 calls
        0 ms  BeforeResolveReferences                    2 calls
        0 ms  GetSqlTargetPath                           1 calls
        0 ms  Rebuild                                    1 calls
        0 ms  AfterClean                                 2 calls
        0 ms  AfterBuild                                 2 calls
        0 ms  ExpandSDKReferences                        2 calls
        0 ms  CleanStaticCodeAnalysis                    2 calls
        0 ms  BeforeRebuild                              1 calls
        0 ms  CheckRequiredProperties                    2 calls
        0 ms  GenerateSqlTargetFrameworkMoniker          2 calls
        0 ms  ResolveSDKReferences                       2 calls
        0 ms  AfterRebuild                               1 calls
        0 ms  ResolveArtifactReferences                  2 calls
        0 ms  _CopyOutOfDateSourceItemsToOutputDirectoryAlways   1 calls
        0 ms  _CheckForCompileOutputs                    2 calls
        0 ms  PrepareForBuild                            2 calls
        0 ms  CleanXsdCodeGen                            2 calls
       16 ms  _CheckForInvalidConfigurationAndPlatform   2 calls
       16 ms  AssignTargetPaths                          2 calls
       16 ms  CoreClean                                  2 calls
       31 ms  ResolveAssemblyReferences                  2 calls
       31 ms  SqlPrepareForRun                           2 calls
       31 ms  CleanReferencedProjects                    2 calls
       63 ms  PostBuildEvent                             2 calls
       63 ms  PreBuildEvent                              2 calls
       94 ms  CoreCompile                                2 calls
      625 ms  _SetupSqlBuildInputs                       2 calls
     6452 ms  ResolveProjectReferences                   2 calls
    81502 ms  SqlBuild                                   2 calls

Task Performance Summary:
        0 ms  AssignTargetPath                          26 calls
        0 ms  CallTarget                                 2 calls
        0 ms  FindAppConfigFile                          2 calls
        0 ms  ConvertToAbsolutePath                      2 calls
        0 ms  AssignProjectConfiguration                 1 calls
        0 ms  ReadLinesFromFile                          4 calls
        0 ms  WriteLinesToFile                           4 calls
        0 ms  Copy                                       6 calls
        0 ms  FindUnderPath                             14 calls
        0 ms  RemoveDuplicates                           6 calls
        0 ms  MakeDir                                    4 calls
       16 ms  SqlScriptDependenciesTask                  2 calls
       16 ms  Delete                                    12 calls
       31 ms  ResolveAssemblyReference                   2 calls
       31 ms  Message                                    8 calls
       94 ms  Csc                                        1 calls
      126 ms  Exec                                       4 calls
      516 ms  SqlModelResolutionTask                     2 calls
     6468 ms  MSBuild                                    7 calls
    81424 ms  SqlBuildTask                               2 calls

------------- section 2 -------------

(The diagnostic output given during the SqlBuildTask phase of the build for VS)

Target "SqlBuild" in file "C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v12.0\SSDT\Microsoft.Data.Tools.Schema.SqlTasks.targets" from project "P:<snip>.sqlproj" (target "Build" depends on it):
    Building target "SqlBuild" completely.
    Output file "P:<snip>.dacpac" does not exist.
    Task "SqlBuildTask"
        Task Parameter:SqlTarget=P:<snip>.dacpac
        Task Parameter:ContributorArguments=ConfigurationName=Debug;
        Task Parameter:BuildContributors=;
        Task Parameter:DeploymentContributors=;
        Task Parameter:CreateScriptFileName=<snip>.sql
        Task Parameter:DacApplicationName=<snip>
        Task Parameter:DacDescription=<snip>
        Task Parameter:DacFile=P:<snip>\bin\Debug\
        Task Parameter:DacVersion=3.27.0.0
        Task Parameter:DatabaseName=<snip>
        Task Parameter:DatabaseSchemaProviderName=Microsoft.Data.Tools.Schema.Sql.Sql100DatabaseSchemaProvider
        Task Parameter:DefaultSchema=dbo
        Task Parameter:DeploymentScriptName=<snip>.sql
        Task Parameter:DeployToDatabase=True
        Task Parameter:ImplicitDllAssemblyName=<snip>
        Task Parameter:ImplicitDllFileName=P:<snip>.dll
        Task Parameter:ImplicitDllSymbolsFileName=P:<snip>.pdb
        Task Parameter:ImplicitDllGenerateSqlClrDdl=true
        Task Parameter:IntermediateDirectory=P:<snip>\obj\Debug\
        Task Parameter:ModelCollation=1033,CI
        Task Parameter:OutputDirectory=P:<snip>\bin\Debug\
        Task Parameter:
    Source= <snip - list of all the source files in the SSDT project - matches list for MSBuild builds>

        Task Parameter:
    SqlCmdVariables= <snip - same for both builds>
        Task Parameter:
    SqlReferencePath=
        C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\mscorlib.dll
                CopyLocal=false
                FrameworkFile=true
                FusionName=mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
                ImageRuntime=v4.0.30319
                OriginalItemSpec=C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\mscorlib.dll
                ReferenceSourceTarget=ResolveAssemblyReference
                ResolvedFrom=C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\mscorlib.dll
                Version=4.0.0.0
        Task Parameter:SuppressTSqlWarnings=71562,71502,71558
        Task Parameter:ValidateCasingOnIdentifiers=true
        Task Parameter:CmdLineInMemoryStorage=false
        Task Parameter:DefaultCollation=SQL_Latin1_General_CP1_CI_AS
        Task Parameter:AnsiNullDefault=False
        Task Parameter:AnsiNulls=False
        Task Parameter:AnsiPadding=False
        Task Parameter:AnsiWarnings=False
        Task Parameter:ArithAbort=False
        Task Parameter:ConcatNullYieldsNull=False
        Task Parameter:QuotedIdentifier=False
        Task Parameter:NumericRoundAbort=False
        Task Parameter:RecursiveTriggersEnabled=False
        Task Parameter:DatabaseChaining=False
        Task Parameter:DatabaseState=ONLINE
        Task Parameter:CloseCursorOnCommitEnabled=False
        Task Parameter:DefaultCursor=GLOBAL
        Task Parameter:AutoClose=False
        Task Parameter:AutoCreateStatistics=True
        Task Parameter:AutoShrink=False
        Task Parameter:AutoUpdateStatistics=True
        Task Parameter:TornPageDetection=False
        Task Parameter:DatabaseAccess=MULTI_USER
        Task Parameter:Recovery=FULL
        Task Parameter:EnableFullTextSearch=False
        Task Parameter:DefaultFilegroup=PRIMARY
        Task Parameter:Trustworthy=True
        Task Parameter:AutoUpdateStatisticsAsynchronously=False
        Task Parameter:PageVerify=CHECKSUM
        Task Parameter:ServiceBrokerOption=DisableBroker
        Task Parameter:DateCorrelationOptimizationOn=False
        Task Parameter:Parameterization=SIMPLE
        Task Parameter:AllowSnapshotIsolation=False
        Task Parameter:ReadCommittedSnapshot=True
        Task Parameter:VardecimalStorageFormatOn=True
        Task Parameter:SupplementalLoggingOn=False
        Task Parameter:CompatibilityMode=100
        Task Parameter:IsChangeTrackingOn=False
        Task Parameter:IsChangeTrackingAutoCleanupOn=True
        Task Parameter:ChangeTrackingRetentionPeriod=2
        Task Parameter:ChangeTrackingRetentionUnit=Days
        Task Parameter:IsEncryptionOn=False
        Task Parameter:IsBrokerPriorityHonored=False
        Task Parameter:IncludeCompositeObjects=True
        Loading project references...
        Loading project files...
        Building the project model and resolving object interdependencies...
        Validating the project model...
        Writing model to P:<snip>\obj\Debug\Model.xml...

------------- section 3 -------------

(The diagnostic output given during the SqlBuildTask phase of the build for MSBuild)

Target "SqlBuild: (TargetId:68)" in file "C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v12.0\SSDT\Microsoft.Data.Tools.Schema.SqlTasks.targets" from project "P:<snip>.sqlproj" (target "Build" depends on it):
Building target "SqlBuild" completely.
Output file "P:<snip>.dacpac" does not exist.
Using "SqlBuildTask" task from assembly "C:\Program Files (x86)\Microsoft Visual Studio 12.0\Common7\IDE\\Extensions\Microsoft\SQLDB\Dac\130\Microsoft.Data.Tools.Schema.Tasks.Sql.dll".
Task "SqlBuildTask" (TaskId:56)
  Task Parameter:SqlTarget=P:<snip>.dacpac (TaskId:56)
  Task Parameter:ContributorArguments=ConfigurationName=Debug; (TaskId:56)
  Task Parameter:BuildContributors=; (TaskId:56)
  Task Parameter:DeploymentContributors=; (TaskId:56)
  Task Parameter:CreateScriptFileName=<snip>.sql (TaskId:56)
  Task Parameter:DacApplicationName=<snip> (TaskId:56)
  Task Parameter:DacDescription=<snip> (TaskId:56)
  Task Parameter:DacFile=P:<snip>\bin\Debug\ (TaskId:56)
  Task Parameter:DacVersion=3.27.0.0 (TaskId:56)
  Task Parameter:DatabaseName=<snip> (TaskId:56)
  Task Parameter:DatabaseSchemaProviderName=Microsoft.Data.Tools.Schema.Sql.Sql100DatabaseSchemaProvider (TaskId:56)
  Task Parameter:DefaultSchema=dbo (TaskId:56)
  Task Parameter:DeploymentScriptName=<snip>.sql (TaskId:56)
  Task Parameter:DeployToDatabase=True (TaskId:56)
  Task Parameter:ImplicitDllAssemblyName=<snip> (TaskId:56)
  Task Parameter:ImplicitDllFileName=P:<snip>.dll (TaskId:56)
  Task Parameter:ImplicitDllSymbolsFileName=P:<snip>.pdb (TaskId:56)
  Task Parameter:ImplicitDllGenerateSqlClrDdl=true (TaskId:56)
  Task Parameter:IntermediateDirectory=P:<snip>\obj\Debug\ (TaskId:56)
  Task Parameter:ModelCollation=1033,CI (TaskId:56)
  Task Parameter:OutputDirectory=P:<snip>\bin\Debug\ (TaskId:56)
  Task Parameter:
      Source= <snip - list of all the source files in the SSDT project - matches list for VS builds>
  Task Parameter:
      SqlCmdVariables= <snip - same for both builds>
  Task Parameter:
      SqlReferencePath=
          C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\mscorlib.dll
                  CopyLocal=false
                  FrameworkFile=true
                  FusionName=mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
                  ImageRuntime=v4.0.30319
                  OriginalItemSpec=C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\mscorlib.dll
                  ReferenceSourceTarget=ResolveAssemblyReference
                  ResolvedFrom=C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\mscorlib.dll
                  Version=4.0.0.0 (TaskId:56)
  Task Parameter:SuppressTSqlWarnings=71562,71502,71558 (TaskId:56)
  Task Parameter:ValidateCasingOnIdentifiers=true (TaskId:56)
  Task Parameter:CmdLineInMemoryStorage=false (TaskId:56)
  Task Parameter:DefaultCollation=SQL_Latin1_General_CP1_CI_AS (TaskId:56)
  Task Parameter:AnsiNullDefault=False (TaskId:56)
  Task Parameter:AnsiNulls=False (TaskId:56)
  Task Parameter:AnsiPadding=False (TaskId:56)
  Task Parameter:AnsiWarnings=False (TaskId:56)
  Task Parameter:ArithAbort=False (TaskId:56)
  Task Parameter:ConcatNullYieldsNull=False (TaskId:56)
  Task Parameter:QuotedIdentifier=False (TaskId:56)
  Task Parameter:NumericRoundAbort=False (TaskId:56)
  Task Parameter:RecursiveTriggersEnabled=False (TaskId:56)
  Task Parameter:DatabaseChaining=False (TaskId:56)
  Task Parameter:DatabaseState=ONLINE (TaskId:56)
  Task Parameter:CloseCursorOnCommitEnabled=False (TaskId:56)
  Task Parameter:DefaultCursor=GLOBAL (TaskId:56)
  Task Parameter:AutoClose=False (TaskId:56)
  Task Parameter:AutoCreateStatistics=True (TaskId:56)
  Task Parameter:AutoShrink=False (TaskId:56)
  Task Parameter:AutoUpdateStatistics=True (TaskId:56)
  Task Parameter:TornPageDetection=False (TaskId:56)
  Task Parameter:DatabaseAccess=MULTI_USER (TaskId:56)
  Task Parameter:Recovery=FULL (TaskId:56)
  Task Parameter:EnableFullTextSearch=False (TaskId:56)
  Task Parameter:DefaultFilegroup=PRIMARY (TaskId:56)
  Task Parameter:Trustworthy=True (TaskId:56)
  Task Parameter:AutoUpdateStatisticsAsynchronously=False (TaskId:56)
  Task Parameter:PageVerify=CHECKSUM (TaskId:56)
  Task Parameter:ServiceBrokerOption=DisableBroker (TaskId:56)
  Task Parameter:DateCorrelationOptimizationOn=False (TaskId:56)
  Task Parameter:Parameterization=SIMPLE (TaskId:56)
  Task Parameter:AllowSnapshotIsolation=False (TaskId:56)
  Task Parameter:ReadCommittedSnapshot=True (TaskId:56)
  Task Parameter:VardecimalStorageFormatOn=True (TaskId:56)
  Task Parameter:SupplementalLoggingOn=False (TaskId:56)
  Task Parameter:CompatibilityMode=100 (TaskId:56)
  Task Parameter:IsChangeTrackingOn=False (TaskId:56)
  Task Parameter:IsChangeTrackingAutoCleanupOn=True (TaskId:56)
  Task Parameter:ChangeTrackingRetentionPeriod=2 (TaskId:56)
  Task Parameter:ChangeTrackingRetentionUnit=Days (TaskId:56)
  Task Parameter:IsEncryptionOn=False (TaskId:56)
  Task Parameter:IsBrokerPriorityHonored=False (TaskId:56)
  Task Parameter:IncludeCompositeObjects=True (TaskId:56)
  Creating a model to represent the project... (TaskId:56)
  Loading project references... (TaskId:56)
  Loading project files... (TaskId:56)
  Building the project model and resolving object interdependencies... (TaskId:56)
  Validating the project model... (TaskId:56)
  Writing model to P:<snip>\obj\Debug\Model.xml... (TaskId:56)
Done executing task "SqlBuildTask". (TaskId:56)
Tearful answered 12/8, 2016 at 16:50 Comment(6)
If you do a clean then build in vs, what time do you get?Unexperienced
Specifically try closing VS, deleting the dbmdl file from the solution folder and then building again. VS has some perf optimization that involves caching data in the dbmdl file, but that isn't used when running msbuild from the command-line.Hagridden
Which version of TFS are you using? Do you mean you get the same behavior when build your project with MSBuild command line on TFS build server, as you build project with TFS? Which step cost the most time in the log?Outboard
I made one discovery. There seems to be a bug where duplicate references within sqlproj files can significantly increase build times when built via MSBUILD. Our largest project has about 11,000 entities. Build times via MSBUILD varied between 30 to 80 mins, whilst the same project took 3-4 mins when built via VS. Once I removed the dup references (about 40 of them), build time went down to a consistent 9 mins! I've not had time yet to try and deduce if the extra time is proportional to number of dups, or a specific dup. No matter - just remove your dups!Tearful
In addition to removing dups, an important change I made was to add /p:CmdLineInMemoryStorage=TRUE to the MSBUILD arguments. This was suggested by Simon D'Morias from sabin.io (thanks Simon). I know the flag was introduced to solve memory issues with builds, but it definitely helps with our build times in TFS too.Tearful
It appears that VS uses cached information about the project to speed up builds somehow (probably using the dbmdl file?). I have no confirmed proof of this. However this could explain why MSBUILD builds take longer that VS builds, and with my new revised build times, the fact MSBUILD takes 7 mins longer without any cached information on such a large project seems reasonable.Tearful
T
22

I made the following changes to our large SSDT project and TFS MSBUILD arguments that have brought the build times via TFS build definitions to an explainable level:

1) Add /p:CmdLineInMemoryStorage=TRUE to MSBUILD arguments

Adding CmdLineInMemoryStorage MSBUILD argument to TFS build definition

This reduced our build times massively. We had a build definition that went from 40 mins to 16 mins by adding this option. (Times are pre-removal of duplicate references)

2) Removed all duplicate references in the .sqlproj file

Over time the .sqlproj file managed to reference some files 2 or 3 times. We had about 40 duplicate references. This was almost certainly due to merge issues between our many branches. Visual Studio does not complain about this, and builds perfectly happily, but the dups made a significant difference to our build times in TFS, which of course uses MSBUILD.

Finding Duplicate References in sqlproj files

Here's a powershell script I used to find duplicates in our sqlproj files (change $root to point to the folder containing your .sqlproj files):

[Reflection.Assembly]::LoadWithPartialName("System.Xml.Linq") | Out-Null

$root = "C:\TFS"

$sqlprojFiles = Get-ChildItem -Path $root -Filter "*.sqlproj" -Recurse
$numFilesChecked=0

foreach ($f in $sqlprojFiles)
{
    $numFilesChecked++
    $fullName = $f.FullName

    $xml = [System.Xml.Linq.XDocument]::Load($fullName)
    $ns = $xml.Root.Name.Namespace;

    $itemGroups = $xml.Descendants() | Where-Object { $_.Parent.Name -eq $ns + "ItemGroup" };

    $dict = @{}
    $itemGroups.Attributes() | Where-Object { $_.Name -eq "Include"} `
        | % { if ($dict.ContainsKey($_.Value)) {$dict[$_.Value]++} else {$dict.Add($_.Value,1)} }

    $duplicates = $dict.Keys | Where-Object { $dict[$_] -gt 1} 

    $fullName

    if ($duplicates.Count -gt 0)
    {
        Write-Output "--------------------------------------------------------"
        $duplicates | % {$_+", "+$dict[$_]}
    }
    else
    {
        Write-Output "No duplicates found"
    }
}

"Files checked: $numFilesChecked"

I used the output of this to manually edit the .sqlproj files textually (i.e. "Unload Project", then "Edit .sqlproj" in VS)

Build Time Differences Explained

Now that our large sqlproj builds in 9 minutes on our build server, I think the build time differences between VS and MSBuild can be explained.

I've been told VS takes advantage of precompiled information about the SSDT project that's not used by MSBuild from the command line. I can't confirm this, and I certainly didn't see any evidence of this in the "/verbosity:diagnostic" output from VS (I may have missed it). However it takes 2 mins to build the large project in VS after VS has finished analysing the project, which suggests it is using cached information in the build process. It takes several minutes for VS to finish analysing the large project.

MSBuild build times (either from the command line or TFS build definition) takes a consistent 9 mins now. The extra 7 mins is almost certainly taken up by the MSBuild process analysing the project on every build - something the VS seems to do once (or at least periodically), then takes advantage of the cached information at build time.

Tearful answered 5/9, 2016 at 10:15 Comment(1)
Thanks, this was really helpful. It would be nice if the PS script actually stated what the duplicate references were.Recipient
D
5

For those who experiences slow SSDT builds of projects with many dacpac dependencies there is one more option: simplify those dacpacs and make them smaller.

The key assumption was made with realizing that referencing a dacpac is not the same as deploying it: it is not supposed to require as much meta data as the "normal" sqlproj.

Here are possible steps that can be done with dacpac-references to reduce their sizes and build time drastically (in addition to options mentioned by Mark and others).

Shown code tips are in Powershell.

Inside DACPAC

DACPAC is a ZIP archive containing XML files. So you can open it and review model.xml which is the project itself expressed as a single file of metadata and sources. This file can be very big (100MB+) and as you know size of XML file influences processing performance of it. SSDT generates very verbose description of different sql elements.

To start processing it unpack and load source XML:

Expand-Archive -Path $dacpac -DestinationPath $expandedFolder | Out-Null;
[xml] $dacpacXml = Get-Content "$expandedFolder/model.xml";

$ns = New-Object System.Xml.XmlNamespaceManager($dacpacXml.value.NameTable);
$ns.AddNamespace("dac", $dacpacXml.value.DocumentElement.NamespaceURI);

Metadata not used for referencing

authorization

I believe your TSQL code is not supposed to need hard references to such objects in a different database as:

  • user, login, role
  • permissions, rolemembership, authorization and execution context (user/login) info

so removing them

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlRoleMembership']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlPermissionStatement']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlLogin']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlUser']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlRole']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'Login']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'User']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'Authorizer']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'Schema']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

and if you keep all of it in your sqlproj (users, logins), I'd suggest to review this approach and remove exploitation level data from source code.

plus sensitive data and signatures

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'EncryptionPassword' or @Name = 'Password']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlCertificate' or @Type = 'SqlSymmetricKey']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlSignature']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

file storage and partitions info

If you have partitioned tables with many partitions then your dacpacs contain hundreds or thousands of elements for each partition.

elements

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'DataCompressionOptions' or @Name = 'BoundaryValues']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'Filegroups']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'Filegroup']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'FilegroupForTextImage']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'PartitionScheme' or @Name = 'PartitionColumn']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

DML and DDL objects

Objects not supposed to be referenced from outside a database as well

  • triggers
  • indicies
  • default constraints

so removing them

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlIndex']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlDmlTrigger']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlIndexedColumnSpecification']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlDatabaseDdlTrigger']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlDefaultConstraint']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

sources

Whilst a SP can be referenced, it's sources are not required for this - we already have all the metadata for it: params are described verbosely as separate Elements with type info and so on. So removing all the sources

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlView']/dac:Property[@Name = 'QueryScript']", $ns) | % { write-verbose $_.ParentNode.Name; $_.ParentNode.removechild($_); } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlProcedure']/dac:Property[@Name = 'BodyScript']", $ns) | % { write-verbose $_.ParentNode.Name; $_.ParentNode.removechild($_); } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlScriptFunctionImplementation']/dac:Property[@Name = 'BodyScript']", $ns) | % { write-verbose $_.ParentNode.Name; $_.ParentNode.removechild($_); } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlAssemblyFile']/dac:Property[@Name = 'Source']", $ns) | % { write-verbose $_.ParentNode.Name; $_.ParentNode.removechild($_); } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlComputedColumn']/dac:Property[@Name = 'ExpressionScript']", $ns) | % { write-verbose $_.ParentNode.Name; $_.ParentNode.removechild($_); } | Out-Null;

and dependencies produced by build from these sources

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'BodyDependencies']", $ns) | % { write-verbose $_.ParentNode.Name; $_.ParentNode.removechild($_); } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'QueryDependencies' or @Name = 'ExpressionDependencies']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'DynamicObjects']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'AssemblySources']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlQueue']/dac:Relationship", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlContract']/dac:Relationship[@Name = 'Messages']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

annotations

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Annotation[@Type = 'SysCommentsObjectAnnotation']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'BoundTargets']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Annotation[@Type = 'PersistedResolvableAnnotation']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlExtendedProperty']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlSimpleColumn']/dac:AttachedAnnotation", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

update of november 2022

Some time later I decided to go deeper, reviewed our cleaned dacpacs and found some more options for cleanup.

A little more sources/expressions

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'CheckExpressionScript']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'FunctionBody']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlSubroutineParameter']/dac:Property[@Name = 'DefaultExpressionScript']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

"private" relations

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlContract']/dac:Relationship[@Name = 'Messages']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlService']/dac:Relationship[@Name = 'Contracts' or @Name = 'Queue']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlSynonym']/dac:Property[@Name = 'ForObjectScript']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlSynonym']/dac:Relationship[@Name = 'ForObject']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

properties which have no effect on build or reference resolution

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'IsActivationOn' or @Name = 'MaxQueueReaders']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'IsMemoryOptimized' or @Name = 'IsNativelyCompiled' or @Name = 'IsSchemaBound']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'FillFactor' or @Name = 'IsPadded' or @Name = 'IsClustered']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'LockEscalation' or @Name = 'RetentionUnit' or @Name = 'Collation']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'IsAnsiNullsOn' or @Name = 'IsQuotedIdentifierOn']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'IsNullable' or @Name = 'IsReadOnly' or @Name = 'IsHidden']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'IsIdentity' or @Name = 'IdentitySeed' or @Name = 'IsRowGuidColumn']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Property[@Name = 'IsCaller' or @Name = 'IsOwner']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

Proc/func param and column type definitions are terribly verbose and somehow it comes SSDT never uses any of it. Which looks strange to me. It felt dangerous and a bit unreal when removing this metadata but after removing it from 70+ dacpacs SSDT haven't said anything about it. Builds are the same (just faster).

Here is an example of saying "this column is of type DECIMAL(18,8)":

<Relationship Name="TypeSpecifier">
    <Entry>
        <Element Type="SqlTypeSpecifier">
            <Property Name="Scale" Value="8"/>
            <Property Name="Precision" Value="18"/>
            <Relationship Name="Type">
                <Entry>
                    <References ExternalSource="BuiltIns" Name="[decimal]"/>
                </Entry>
            </Relationship>
        </Element>
    </Entry>
</Relationship>

making our ref-dacpac structure much more flat

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlScalarFunction']/dac:Relationship[@Name = 'Type']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlPartitionFunction']/dac:Relationship[@Name = 'ParameterType']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlSubroutineParameter']/dac:Relationship[@Name = 'Type']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlSimpleColumn' or @Type = 'SqlTableTypeSimpleColumn']/dac:Relationship[@Name = 'TypeSpecifier']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'ColumnSpecifications']/dac:Entry", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

Removing constraints and links to them. Note, these must be removed all at once! or none. Earlier I tried to remove some of constraints, some of annotations but internal cross-references from metadata became broken and such dacpacs did not work. Finally found out how to remove it "consistently":

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlDefaultConstraint' or @Type = 'SqlTableTypeDefaultConstraint' or @Type = 'SqlTableTypePrimaryKeyConstraint' or @Type = 'SqlPrimaryKeyConstraint']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlForeignKeyConstraint' or @Type = 'SqlCheckConstraint' or @Type = 'SqlUniqueConstraint']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Annotation[@Type = 'SqlInlineConstraintAnnotation' or @Type = 'SqlInlineIndexAnnotation']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Relationship[@Name = 'Constraints']", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlTable' or @Type = 'SqlTableType']/dac:AttachedAnnotation", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;
# here only Disambiguator attr getting removed, not the whole element
$dacpacXml.value.DocumentElement.SelectNodes("//dac:Element[@Type = 'SqlTableType' or @Type = 'SqlSubroutineParameter']", $ns) | % { $_.RemoveAttribute("Disambiguator") } | Out-Null;

Changes from this update reduced dacpacs more by 1.5 times. Build gained boost by ~20%

// end of update

cleanup

Deletions have produced number of empty elements with no children. Removing them

$dacpacXml.value.DocumentElement.SelectNodes("//dac:Entry[not(child::*)]", $ns) | % { $_.ParentNode.RemoveChild($_) } | Out-Null;

save changes

$dacpacXML.Save("$expandedFolder/model.xml");

Recompress DACPAC

After editing we need to turn our xml files back into dacpac. Before that it's required to recompute and update checksum in Origin.xml which is a part of dacpac archive as well. DACPACs with mismatching checksum won't work.

$newModelHash = (Get-FileHash -Path "$expandedFolder/model.xml").Hash;

[xml] $originXML = Get-Content "$expandedFolder/Origin.xml";
$originXML.DacOrigin.Checksums.Checksum.InnerText = $newModelHash;
$originXML.Save("$expandedFolder/Origin.xml");

Now we are ready, so compressing it back (note /* in the end of path to compress all the files but not the containing folder). Copy, rename afterwards.

Compress-Archive -Path "$expandedFolder/*" -DestinationPath "$dacpac.zip" -CompressionLevel Optimal -Force;

Build

Now put your recompressed references (dacpacs) into appropriate folder and rerun build process. As tested, edited dacpacs can become smaller 10-50 times and are fine for references resolution in VS or MSBuild. Build time can be halfed by this dirty trick.

Example SQLPROJ build (cli msbuild) statistics with ~3K procs, hundreds of tables and ~20 dacpac dependencies:

  • naive: ~15min
  • p:CmdLineInMemoryStorage: 5-6min
  • hacked dacpacs: 2min

Note! Obtained reduced dacpac is fine for referencing but not for deploying!

Dogmatic answered 4/2, 2021 at 14:51 Comment(1)
Here is a feature request for optimization of dacpac internals github.com/microsoft/DacFx/issues/360Dogmatic
L
0

Thanks to all the other posters here for some great ideas on how to improve our database build times

I significantly modified the PS script to automatically cleanup the sqlproj file of duplicate include entries. We just run it automatically before we run msbuild to set and forget.

In case it can help anyone else:

-- Remove-DbprojDuplicateIncludeNode.ps1
<#
.SYNOPSIS
    Will remove any duplicate references from all sqlproj files
#>
[CmdletBinding(PositionalBinding = $true, SupportsShouldProcess)]
Param(
    $Root = "$PSScriptRoot\..\..\src\"
)

$sqlprojFiles = Get-ChildItem -Path $Root -Filter "*.sqlproj" -Recurse
$numFilesChecked = 0

foreach ($f in $sqlprojFiles) {
    Write-Host "Checking file $($f.FullName)"
    $dbProjPath = $f
    $projXml = [xml](Get-Content $dbProjPath -raw)
    $nsm = [Xml.XmlNamespaceManager]($projXml.NameTable)
    $nsm.AddNamespace('ns', $projXml.Project.xmlns)

    $sqlProjNodes = $projXml.SelectNodes('//ns:*[@Include]', $nsm)

    $numFilesChecked++

    $includedFileSet = [System.Collections.Generic.HashSet[string]]::new()
    $changed = $false

    foreach ($includeNode in $sqlProjNodes) {
        if (-not $includedFileSet.Add($includeNode.Include)) {
            Write-Host "Removing duplicate include node '$($includeNode.Include)'"
            $includeNode.ParentNode.RemoveChild($includeNode)
            $changed = $true
        }
    }

    if ($changed) {
        if ($PSCmdlet.ShouldProcess($dbProjPath)) {
            $projXml.Save($dbProjPath)
        }
    }
}

"Files checked: $numFilesChecked"
Leilanileininger answered 8/5, 2022 at 21:27 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.