Monday, 16 September 2013

Continuous Integration with Database Projects (SQL Server)

Introduction

In addition to the daily-grind of DBA activities, our DBAs also participate in the software production activities of the platform team. 

Continuous integration patterns

The business of creating data schemas and alike is left to the Developers with oversight and governance from the DBAs. The eventual deployment to production is handled exclusively by the DBAs with assistance from the developers.

Most new code features and fixes will involve modifications to the DB schema, the reference data and the gold-data sets. Without automation, the DBA team would probably need to provide a dedicated resource assisting the development teams with deployments to dev and test environments.

However, Microsoft do provide some very sophisticated tools for deployment and patching databases, and this repetitive and resource intensive task can be entirely automated an exploited in a CI process.

Software prerequisites

Visual Studio 2008 and 2010 support the "Database Project (.dbproj) which are compatible with SQL 2008. Database Projects will not be supported by SQL 2012, so you will have to move sooner or later.

Visual Studio 2010 and 2012 support the newer Data-Tier Applications (DacPac), which is supported by SQL 2008 R2 or better.

Compatibility issues

Database Project (.dbproj)

The Database Project (.dbproj) are not supported by SQL 2012 or newer.

Data-Tier Applications (DacPac)

The Data-Tier applications (DACPAC) arrived some time ago, supported by SQL 2008 R2 or newer.

MSBuild

MSBuild handles both types of database project in exactly the same way, and neither are too dissimilar to building a regular code assembly.

There are some subtle command line differences from the usual library builds.

Parameters

$msbuild = @{
performanceParameters = "/nologo", "/noconsolelogger", "/p:WarningLevel=0", "/clp:ErrorsOnly", "/m:1"

loggingParameters = "/l:FileLogger,Microsoft.Build.Engine;logfile=c:\log.txt"

packageParameters = ,"/property:outdir=$($data.output)/","/p:configuration=release"

targets = "/t:rebuild"

}

cmd /c "C:\Windows\Microsoft.NET\Framework\v4.0.30319\MSBuild.exe" `
$msbuild.performanceParameters `
$msbuild.packageParameters `
$msbuild.loggingParameters `
$msbuild.targets `
$databaseProjectFile

Output

Whilst MSBuild is invoked in identical fashion irrespective, the output is entirely different between the two projects.

Database Project (.dbproj) - Last choice

The database project (dbproj) emits a manifest file, which effectively defines the intended DB schema. Using VsDbCmd the manifest is compared to a real instance of the database, and a diff-SQL script is generated.

The idea is then to run the generate diff-sql against the same target SQL instance, however I have found it to contain a serious amount of detritus that typically renders the actual target database unusable. To work around this some fairly hideous post-processing is performed to remove all the destructive actions. 

To date, I have not heard of any reports that the post-processed file has done anything that was unexpected, but I still do not have the confidence to use this on our Live systems. 

Database projects are compatible out of the box with Visual Studio.

Data-Tier Applications (DacPac) -1st Choice

However, things have got better. The Data-Tier applications (DACPAC) arrived some time ago, supported since SQL 2008 R2.

The DacPac's produced by the .SqlProj are quite a bit easier to deploy, and do not involve any post processing. With DacPac's you are very much "syncing" the compiled schema with the database instance. With a little bedding down, we may trust DacPacs with our production databases.

If you want to use the DacPac's, you just need to get the appropriate version of the Sql Server Data Tools first.

TFS and Build Agents

The database projects can be compiled by having the desired version of Visual Studio installed on your build agents. 

In order to work with the new Data-Tier applications (DacPac), you'll need the correct version of Sql Server Data Tools on all machines involved with building.

Packaging and Archiving

MSBuild has parameters which allow you to specify a custom output location. With this parameter, we can ensure our database output is correctly archived after the build has completed.

Deployment of Database Projects (.dbproj)

As I might have previously hinted, things have got better, but I had to support these projects once-upon-a-time, and maybe you still have too. So, here's a script example that covers the following actions:
  1. Generating the Sql-Diff file
  2. Post-Processing the SQL file to remove the destructive modifications
  3. Patching the SQL instance
function GenerateSqlScripts
{
param(
[string]$dbManifestFile,
[string]$dbOutputFile,
[string]$sqlDeploymentFile,
[string]$dbName,
[string]$sqlConnectionString
)
Write-Debug "****************************"
$debugPreference = $debugMode;
$start = Get-Date

if (Test-Path $dbManifestFile) 
{
$drop = $false;

$VsDbCmd = @()
$VsDbCmd += "/dd:-"  #Note the dd:- means DO NOT RUN AT SERVER
$VsDbCmd += "/a:deploy"
$VsDbCmd += "/manifest:$dbManifestFile"
$VsDbCmd += "/cs:$sqlConnectionString"
$VsDbCmd += "/p:TreatVerificationErrorsAsWarnings=True " #This prevents failure on replicated tables
$VsDbCmd += "/p:DeployDatabaseProperties=True " #This prevents file resizing from being viewed as something to undo
$VsDbCmd += "/p:IgnoreColumnOrder=True " #This prevents table create/copy/drop/rename for the sake of columns being in the same order
$VsDbCmd += "/p:GenerateDropsIfNotInProject=True " #Drop objects not present(Dangerous)
$VsDbCmd += "/p:AlwaysCreateNewDatabase=False " #Don't drop the database first
$VsDbCmd += "/p:IgnoreLoginSids=False "
$VsDbCmd += "/p:IgnorePermissions=False "
$VsDbCmd += "/p:IgnoreRoleMembership=False "
$VsDbCmd += "/DeploymentScriptFile:$dbOutputFile"
 
$cmd = Join-Path $workingDirectory "..\BuildTools\VSDBCMD\VSDBCMD.exe"
$output = & $cmd $VsDbCmd
$output | Out-File "$dbOutputFile.compare.log.txt"


# This is the repugnant Post-Process part
# I did only what needed to be done, this is not pretty.
# -----------------------------------------------------

(Get-Content $dbOutputFile) -inotmatch "(:on error exit|:setvar|REVOKE CONNECT|EXECUTE sp_droprolemember|DROP ROLE|DROP USER|AS \[dbo\])" -replace "USE \[\$\(DatabaseName\)\]","USE [$dbName]" |  Out-File $sqlDeploymentFile

$sql = Get-Content "$sqlDeploymentFile" | out-string

$keywords = @("USE [$dbName]")
$index = $sql.length;

foreach ($keyword in $keywords)
{
$foundAt =  $sql.ToLower().IndexOf($keyword.ToLower())
if ($foundAt -lt $index -and ($foundAt -gt 0))
{
$index = $foundAt
};
}
$sqlScript = $sql | % {$_.substring($index)} 
if ($sql -imatch "SET READ_COMMITTED_SNAPSHOT ON")
{
$sqlScript = "
IF EXISTS (SELECT 1
FROM [master].[dbo].[sysdatabases]
WHERE [name] = N'$dbName')
BEGIN
ALTER DATABASE [$dbName]
SET READ_COMMITTED_SNAPSHOT ON;
END
END
"  + $sqlScript 
}
$sqlScript | Out-File $sqlDeploymentFile
}
else
{
Write-Debug "Manifest file was not found!"
}
}

GenerateSqlScripts $ManifestFile $RawSqlOutputFile $ParsedFile $DbName $ConnectionString

$ParsedSql = Get-Content $ParsedFile | Out-String

Invoke-Sqlcmd -Query $ParsedSql -ServerInstance $server -Database $database -U $username -P $password 

Deployment of the Data-Tier Application (DacPac)

The terminology with the DacPacs is "Sync", and syncing the compiled dacpac manifest is a much simpler, much safer process.

function SyncDatabase
{
param(
[string]$dbManifestFile,
[string]$dbName,
[string]$sqlConnectionString
)

Write-Debug "****************************"

$debugPreference = $debugMode;
$start = Get-Date

if (Test-Path $dbManifestFile) 
{
#There is a deployment manifest file, so let's process it.
Write-Debug "Manifest file was found!"

& "C:\Program Files (x86)\Microsoft SQL Server\110\DAC\bin\sqlpackage.exe" /a:publish /sf:"$dbManifestFile"  /tcs:"$sqlConnectionString" /p:IncludeCompositeObjects=true /p:BlockOnPossibleDataLoss=false | Out-File "$dbManifestFile.txt"
}

}