Wednesday, 10 April 2013

Continuous Integration with PowerShell

I am the Build and Release manager for a rapidly growing online gaming company. My clients are the product manager, the 12 developers and the two testers who form the Platform Team.

Up to now, this blog has focussed almost entirely on building our platform, but those artefacts need to go somewhere, so in my next set of posts I'll start describing how we take the build output and push it to an environment.

The approach we've taken fits rather neatly with the Continuous Integration paradigm, so if this is how I got your attention, please read on.

The Platform

Our platform is a fairly large and complex piece of software, which roughly comprises:
  • 50 C#.NET solutions
  • 130 C#.NET code projects
  • 30 Databases
  • 10 ASP.NET websites
  • 10 Scheduled Tasks
  • 8 Windows Services
  • 1 CMS
  • An API consisting of over 100 WCF endpoints
It's an ever-changing landscape, and every feature branch is unique in its own way.

The environments

We have a fairly typical environment setup for a software development company:
  1. A development environment per feature (typically 15)
  2. A set of external environments for 3rd party vendors
  3. A UAT environment
  4. A pre-live staging environment
  5. And a live environment
Every environment is as "like live" as possible, at least in terms of software and configuration. The closer an environment sits to live in the development pipeline, the more closely its composition resembles live too.

So, for instance, the developer environments are single-host solutions, designed to be quickly torn down and rebuilt. The UAT environment consists of 6 hosts, largely for performance reasons. The Staging environment is a virtually identical, scaled-down representation of the Live environment.

The challenges

The scale of the platform makes deployment challenging already, but there were additional complications arising from how we have chosen to deploy and use it:
  • Some of the websites participate in load-balanced arrays
  • Some of the WCF applications are deployed multiple times, with different identities. 
  • Some of the websites are deployed multiple times, with different identities.
  • Some of the databases can only be "restored", whereas others are "patched". 
  • Some of the sites have multiple web bindings and SSL certificates
  • Some of the WCF applications have custom security models
And some of the challenges were baked in by the nature of SOA:
  • WCF applications need to locate each other, wherever they've been installed.
The end result is that there are over 100 packages involved in a platform deployment, many of them deployed several times. Speed, as ever, was going to be the final challenge.

Supporting the development workflow

Deploying the build output is a critical part of the development process. Developers are continually deploying their code to a development environment, so just like the build process, it needs to be fast; as fast as it can be. It also needs to be totally dependable, otherwise what's the point?

Testers also need to be able to take any build from TFS and deploy it to their test environments. Whilst they deploy less often than developers, it remains a frequent operation that needs to be fast and easy to use.

Essentially, PowerShell is the driving force of our continuous integration pipeline.

The end result

This output list has been altered, to protect the internal identity of the platform.
Over 50 entries relating to websites, databases, tasks and services have been removed.

DDDDDDDDDDDDD                                                lllllll
D::::::::::::DDD                                             l:::::l
D:::::::::::::::DD                                           l:::::l
DDD:::::DDDDD:::::D                                          l:::::l
  D:::::D    D:::::D     eeeeeeeeeeee    ppppp   ppppppppp    l::::l    ooooooooooo yyyyyyy           yyyyyyy
  D:::::D     D:::::D  ee::::::::::::ee  p::::ppp:::::::::p   l::::l  oo:::::::::::ooy:::::y         y:::::y
  D:::::D     D:::::D e::::::eeeee:::::eep:::::::::::::::::p  l::::l o:::::::::::::::oy:::::y       y:::::y
  D:::::D     D:::::De::::::e     e:::::epp::::::ppppp::::::p l::::l o:::::ooooo:::::o y:::::y     y:::::y
  D:::::D     D:::::De:::::::eeeee::::::e p:::::p     p:::::p l::::l o::::o     o::::o  y:::::y   y:::::y
  D:::::D     D:::::De:::::::::::::::::e  p:::::p     p:::::p l::::l o::::o     o::::o   y:::::y y:::::y
  D:::::D     D:::::De::::::eeeeeeeeeee   p:::::p     p:::::p l::::l o::::o     o::::o    y:::::y:::::y
  D:::::D    D:::::D e:::::::e            p:::::p    p::::::p l::::l o::::o     o::::o     y:::::::::y
DDD:::::DDDDD:::::D  e::::::::e           p:::::ppppp:::::::pl::::::lo:::::ooooo:::::o      y:::::::y
D:::::::::::::::DD    e::::::::eeeeeeee   p::::::::::::::::p l::::::lo:::::::::::::::o       y:::::y
D::::::::::::DDD       ee:::::::::::::e   p::::::::::::::pp  l::::::l oo:::::::::::oo       y:::::y
DDDDDDDDDDDDD            eeeeeeeeeeeeee   p::::::pppppppp    llllllll   ooooooooooo        y:::::y
                                          p:::::p                                         y:::::y
                                          p:::::p                                        y:::::y
                                         p:::::::p                                      y:::::y
                                         p:::::::p                                     y:::::y
                                         p:::::::p                                    yyyyyyy

                    Target Environment ->                                   Port HTTP      Duration    Elapsed Time
dev-server01 [app_host]     powershell -> Prepare-TaskHost.ps1              0    0         [5.00s]         [9.26s]
dev-server01 [app_host]     powershell -> Prepare-SqlHost.ps1               0    0         [2.65s]        [11.00s]
dev-server01 [app_host]     powershell -> Prepare-WebHost.ps1               0    0        [11.64s]        [21.16s]
dev-server01 [app_host]     powershell -> Prepare-WindowsServiceHost.ps1    0    0        [18.65s]        [24.66s]
dev-server01 [app_host]            sql -> Sample database                   0    0         [2.88s]        [32.36s]
dev-server01 [app_host]     powershell -> Initialise-SQL.ps1                0    0         [2.47s]     [2m 53.19s]
dev-server01 [app_host]     powershell -> Reset-CacheCluster.ps1            0    0         [2.42s]     [2m 54.93s]
dev-server01 [app_host]        service -> Sample windows service         7778    0        [59.77s]     [3m 58.09s]
dev-server01 [app_host]     powershell -> Prepare-WindowsServiceHost.ps1    0    0         [9.07s]     [4m 10.82s]
dev-server01 [app_host]            web ->      80  200        [40.53s]     [6m 14.39s]
dev-server01 [app_host]           task -> Imported SampleTask.xml           0   -1        [24.18s]     [6m 17.33s]
dev-server01 [app_host]     powershell -> Finalise-WindowsServices.ps1      0    0        [46.88s]      [7m 7.86s]
dev-server01 [app_host]     powershell -> Finalise-WebHost.ps1              0    0     [2m 55.73s]     [9m 18.42s]
dev-server01 [app_host]     powershell -> Finalise-TaskHost.ps1             0    0        [53.82s]    [10m 56.81s]
dev-server01 [app_host]     powershell -> Initialise-TestUsers.ps1          0    0        [51.92s]    [10m 56.86s]
dev-server01 [app_host]     powershell -> Finalise-MSMQ.ps1                 0    0     [3m 54.80s]    [13m 56.07s]
dev-server01 [app_host]     powershell -> Validation-TestRoutes.ps1         0    0         [3.89s]     [14m 6.36s]
dev-server01 [app_host]     powershell -> Validation-PlatformLogin.ps1      0    0        [22.92s]    [14m 23.54s]
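
With a listing like this, the Duration column is the quickest way to see where the time goes. What follows is only a minimal sketch, written in Python for illustration rather than taken from the pipeline itself: the regular expressions are assumptions about the line layout shown above, and the helper names (to_seconds, slowest_steps) are hypothetical.

```python
import re

# One row of the deployment output: "... -> name  port  http  [duration]  [elapsed]".
# The layout is inferred from the listing above and is an assumption.
ROW = re.compile(
    r"->\s*(?P<name>.*?)\s+(?P<port>-?\d+)\s+(?P<http>-?\d+)"
    r"\s+\[(?P<duration>[^\]]+)\]\s+\[(?P<elapsed>[^\]]+)\]\s*$"
)
# A bracketed time such as "5.00s" or "2m 55.73s".
TIME = re.compile(r"(?:(\d+)m\s+)?(\d+(?:\.\d+)?)s")


def to_seconds(text):
    """Convert '2m 55.73s' or '5.00s' to a number of seconds."""
    match = TIME.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes or 0) * 60 + float(seconds)


def slowest_steps(lines, top=3):
    """Return the `top` (name, seconds) pairs with the longest Duration."""
    steps = []
    for line in lines:
        row = ROW.search(line)
        if row:  # header and blank lines simply do not match
            steps.append((row["name"], to_seconds(row["duration"])))
    return sorted(steps, key=lambda step: step[1], reverse=True)[:top]


# Three rows copied from the listing above.
sample = [
    "dev-server01 [app_host]     powershell -> Prepare-TaskHost.ps1              0    0         [5.00s]         [9.26s]",
    "dev-server01 [app_host]     powershell -> Finalise-WebHost.ps1              0    0     [2m 55.73s]     [9m 18.42s]",
    "dev-server01 [app_host]     powershell -> Finalise-MSMQ.ps1                 0    0     [3m 54.80s]    [13m 56.07s]",
]
for name, seconds in slowest_steps(sample, top=2):
    print(f"{seconds:8.2f}s  {name}")
```

Run against the full listing, a helper along these lines would put Finalise-MSMQ.ps1 and Finalise-WebHost.ps1 at the top, which is where any effort to speed up the deployment would probably start.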

What's next?

Hopefully, I've framed the deployment landscape nicely, so in my next series of posts I'll deconstruct our deployment pipeline into its constituent parts and discuss each in greater detail.

I'm thinking these could make for interesting posts:
  1. Building a deployment map
  2. Synchronising websites using MSDeploy
  3. Synchronising databases with VBSQLCMD
  4. Using XDTs to configure your applications
  5. Using REST to configure your applications
  6. Automated testing

Tuesday, 9 April 2013

TFS Build agents not picking up the next job in the queue

We have two build agents, and from time to time a situation occurs where only one build agent collects jobs from the queue. The other build agent just sits idle.

In essence, normal priority builds are ignored in TFS.

What didn't work?

  • Restarting the build controller
  • Restarting both build agents

What was it?

There was a job that had never completed in TFS, and was still marked as "In Progress".
This job was assigned to the build agent that wasn't picking up any new jobs.

Why TFS was no longer showing this job in Visual Studio is another matter entirely, but we were confident that this job was dead and buried. It was over a month old, and the build agents had been rebooted many times since then. It was definitely not still running.

Querying [tbl_BuildQueue] in our TFS database, we found the old job with a status of "1". The status values are:

0  – None
1  – In progress
2  – Queued
4  – Postponed
8  – Completed
16 – Cancelled

So, we used SQL to manually mark this job as cancelled.

Taking a note of the QueueId, we set the status of the stuck job to "16" and this released the build agent to start picking up new jobs from the queue.

UPDATE tbl_BuildQueue SET status = 16 WHERE QueueId = ______
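
Because this is a hand edit to a live database, it's worth looking before updating. The sketch below illustrates that find-then-cancel pattern in Python, using an in-memory SQLite table as a stand-in for the TFS database so it can run anywhere. Only the table name, the QueueId and status columns and the status values come from this post; the rows, the helper names and the extra "still in progress" guard on the UPDATE are my own assumptions, and the real TFS schema will have more to it.

```python
import sqlite3

IN_PROGRESS = 1   # status values taken from the list above
CANCELLED = 16

# In-memory SQLite stands in for the TFS database (an assumption for this sketch).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tbl_BuildQueue (QueueId INTEGER PRIMARY KEY, status INTEGER)")
db.executemany(
    "INSERT INTO tbl_BuildQueue (QueueId, status) VALUES (?, ?)",
    [(101, 8), (102, IN_PROGRESS), (103, 2)],  # made-up rows; 102 plays the stuck job
)


def in_progress_jobs(conn):
    """List the QueueIds currently marked as in progress."""
    rows = conn.execute(
        "SELECT QueueId FROM tbl_BuildQueue WHERE status = ? ORDER BY QueueId",
        (IN_PROGRESS,),
    )
    return [queue_id for (queue_id,) in rows]


def cancel_job(conn, queue_id):
    """Mark one job as cancelled, but only if it is still in progress."""
    with conn:  # commits on success, rolls back if the statement raises
        cursor = conn.execute(
            "UPDATE tbl_BuildQueue SET status = ? WHERE QueueId = ? AND status = ?",
            (CANCELLED, queue_id, IN_PROGRESS),
        )
    return cursor.rowcount  # 1 if the job was updated, 0 otherwise


print(in_progress_jobs(db))  # [102]
print(cancel_job(db, 102))   # 1
print(in_progress_jobs(db))  # []
```

Checking the affected row count is the useful habit here: anything other than exactly one row means the QueueId was wrong or the job had already moved on.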

This is an action I need to perform once every few months.