Wednesday, 29 August 2012

Building Blocks


Before I continue with my distributed tests project, I thought I'd pause for a moment and better explain how my scripts are composed: how I manage parallel job execution, and how I gather the results of completed jobs.

Most scripts of this nature (asynchronous PSJobs, multiple hosts, etc.) are composed in a fairly standard fashion.

So, here I go. First, the basics.

Using hash-tables to store and share complex data structures.

I pass a lot of information around in my scripts. The flow is bi-directional and multi-hop: information may pass between two or three scripts, spanning several hosts, with the resulting feedback enriched at every stage.

My preferred mechanism for sharing and representing complex data structures, or aggregated sets of information, is to use a Powershell Hash-table object.

The hash-table is a thing of great beauty; its declaration and usage are elegant, readable and compact. There are other well-documented techniques for creating custom objects in Powershell, but in most cases I default to the hash-table.

I generally lean towards 'chunky' interfaces over verbose ones, aggregating related sets of information into hash-tables to be passed as parameters rather than a long serial list of parameters. I find this improves the readability and maintainability of the scripts. It's much simpler to add a property to a hash-table than to be continually modifying parameter lists.

Take this example for instance....

Create-Database.ps1
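
The original snippet isn't reproduced here, so here's a minimal, hypothetical Create-Database.ps1 that takes a single hash-table; the parameter and key names are mine, purely for illustration.

    param(
        [hashtable] $databaseSettings
    )

    # Everything the script needs arrives in one bundle
    Write-Host ('Creating database {0} on {1}' -f $databaseSettings.Name, $databaseSettings.Server)
    # ... create the database using $databaseSettings.Name, .Server, .DataPath and .LogPath ...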

usage
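
Calling it then looks something like this (the values are placeholders):

    $databaseSettings = @{
        Name     = 'Trading'
        Server   = 'SQL01'
        DataPath = 'D:\Data'
        LogPath  = 'E:\Logs'
    }

    .\Create-Database.ps1 -databaseSettings $databaseSettings
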
Or, you could be very granular
usage
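
That is, a version of the script that exposes each value as an individual parameter; again, a hypothetical sketch:

    .\Create-Database.ps1 -Name 'Trading' -Server 'SQL01' -DataPath 'D:\Data' -LogPath 'E:\Logs'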

You can choose which you'd rather have, but I will always go for hash-tables.

PSJobs, Queues and Abstraction

Let's take a simple hash-table declared as $myJob, and use it to create a new PSJob.
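
Something along these lines; the script path is a placeholder:

    $myJob = @{
        Name   = 'MyFirstJob'
        Script = 'C:\Scripts\Do-Something.ps1'
    }

    Start-Job -Name $myJob.Name -FilePath $myJob.Script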

Did you see that? Not sure what I'm getting at? Ok, what if I expand a little further?
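
The same hash-table, but now carrying the parameters the handler script needs (illustrative names only):

    $myJob = @{
        Name       = 'MyFirstJob'
        Script     = 'C:\Scripts\Do-Something.ps1'
        Parameters = @{ Environment = 'SystemTest'; Version = '1.2.3.4' }
    }

    # The nested hash-table is handed to the handler script as a single argument
    Start-Job -Name $myJob.Name -FilePath $myJob.Script -ArgumentList $myJob.Parameters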

Now do you see it??? No?!? Ok...
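
One last nudge, then: a whole collection of them (the names and scripts are made up):

    $jobs = @(
        @{ Name = 'JobOne';   Script = 'C:\Scripts\Do-Something.ps1';     Parameters = @{ Environment = 'SystemTest' } }
        @{ Name = 'JobTwo';   Script = 'C:\Scripts\Do-SomethingElse.ps1'; Parameters = @{ Environment = 'SystemTest' } }
        @{ Name = 'JobThree'; Script = 'C:\Scripts\Do-AnotherThing.ps1';  Parameters = @{ Environment = 'SystemTest' } }
    )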

What you're seeing here is a collection of objects, each representing a specific job. Each object specifies the name of the job, the handler script, and the parameters that the handler script needs.

Then, with a very simple foreach loop, the collection can be used to spawn a multitude of asynchronous remote PSJobs.
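
Roughly like this; I've used a single placeholder workstation name here, though in practice the target host can come from the job object or a pool:

    foreach ($job in $jobs)
    {
        # -AsJob returns immediately, leaving the work running remotely in the background
        Invoke-Command -ComputerName 'workstation01' `
                       -FilePath      $job.Script `
                       -ArgumentList  $job.Parameters `
                       -JobName       $job.Name `
                       -AsJob
    }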


Still not with me? Ok, consider this more real world example.
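
The deployment scripts and parameters below are stand-ins for my real ones, but the shape is identical:

    $deploymentJobs = @(
        @{ Name       = 'Deploy website'
           Script     = '\\buildagent01\scripts\DeployWebsite.ps1'
           Parameters = @{ Package = 'Website.zip'; SiteName = 'Trading'; AppPool = 'TradingPool' } }

        @{ Name       = 'Deploy windows service'
           Script     = '\\buildagent01\scripts\DeployService.ps1'
           Parameters = @{ Package = 'PricingService.zip'; ServiceName = 'PricingService'; RunAs = 'DOMAIN\svc_pricing' } }

        @{ Name       = 'Deploy database'
           Script     = '\\buildagent01\scripts\DeployDatabase.ps1'
           Parameters = @{ Package = 'TradingDb.zip'; Server = 'SQL01'; Database = 'Trading' } }
    )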


Three jobs, defining deployments of three very different types of package, handled by three different scripts, each with a unique set of parameter requirements. 

It gets even more powerful when you consider that $deploymentJobs could be dynamic. It just so happens that a mechanism not too dissimilar to this is what enables me to push Platform-A in its entirety. And it's not much more complicated than the example above!

Getting results

The problem with asynchronous jobs is not so much knowing when they're done, but knowing what they did whilst doing it.

Getting the results is much simpler than you might think.

Remember the hash-table? What about this then...

DeployWebsite.ps1
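
A cut-down, hypothetical DeployWebsite.ps1; the deployment itself is elided, because the interesting bit is the hash-table it writes to the pipeline at the end:

    param(
        [hashtable] $settings
    )

    $stopwatch = [System.Diagnostics.Stopwatch]::StartNew()

    # ... deploy the website described by $settings ...

    # Whatever the script writes to the pipeline becomes the job's output
    @{
        Name     = $settings.SiteName
        Host     = $env:COMPUTERNAME
        Success  = $true
        Duration = $stopwatch.Elapsed
    }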


And back in the script that created the job..
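
Something like this: wait for the PSJob, and the very same hash-table pops out of Receive-Job.

    $psJob = Get-Job -Name 'Deploy website'
    Wait-Job $psJob | Out-Null

    $result = Receive-Job $psJob
    '{0} on {1} succeeded: {2} in {3}' -f $result.Name, $result.Host, $result.Success, $result.Duration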

How hard is that?! Vive la Powershell!

Coming up? Scaling up! 

With the basic blocks covered, in my next post we'll be ready to tackle some big questions:
  • How to manage a list of pending jobs
  • How to capture the results of these jobs as they complete
  • How to display the results back to the user's console host


Friday, 24 August 2012

Security Changes

Build machine configuration

This post relates to the general series of posts on configuring an MSBuild farm using Powershell.

So that my build agents can delegate build tasks to the workstations, I need to make a few slight configuration changes to all the participants.

Let's get a few terms sorted

For the avoidance of doubt, here's a few key terms that I'll be using throughout this series of posts.

Build agents

When I say build agents, I am referring to the PCs that are running the TFS build agent service. These are the machines that TFS relies upon to orchestrate the builds.

In terms of CredSSP, the build agents act only as a client (originators).

Workstations

When I say workstations, I am referring to the desktop computers that I'll be pushing remote jobs to.

In terms of CredSSP, the workstations are both clients (originators) and servers (receivers). 

CredSSP

CredSSP is a multi-hop security mechanism; its full name is "Credential Security Support Provider".
Essentially, CredSSP works like a passport in that it permits scripts to make multiple hops across a network from their point of origin. Without CredSSP, scripts can only make one hop from the originator and are subject to strict sand-boxing on the remote host.
To give a more real-world example, CredSSP allows a build agent to run a PSJob that goes on to invoke a remote session on one or more build workstations, and those remote sessions are able to access network locations beyond the workstation host.
Just FYI: CredSSP doesn't work on XP, Server 2003 or earlier incarnations of Windows. It's only available on Vista, Windows 7, Server 2008 and beyond.

Getting started

These are basic steps, just enough to allow my pool of machines to talk to each other. I am by no means claiming that this is secure, or even best practice. I strongly suggest reading this in-depth article about setting up WinRM and CredSSP from Microsoft.

With the disclaimer and warnings out of the way, I'm going to quickly configure our developer workstations to accept remote connections using CredSSP.

Configuring the "build agents"

The build agents are configured to work both ways.
  • They make outbound connections to the developer workstations
  • They receive inbound connections from the workstations for file access.
In a Powershell console running with elevated permissions:
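
The gist of it is below; the workstation names are placeholders for my own pool:

    # Allow this machine to delegate (forward) credentials to the workstations it will call
    Enable-WSManCredSSP -Role Client -DelegateComputer 'workstation01','workstation02' -Force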

Configuring the "workstations" in my nascent build farm

The workstations are also configured to work both ways.
  • They receive inbound connections from the build agent.
  • They will make outbound connections to other network resources for access to files; in my case, this will be the build agent, which holds the TFS workspace.
In a Powershell console running with elevated permissions, type:
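
Again, indicative commands only; adjust the delegate list to your own build agents:

    Enable-PSRemoting -Force

    # Accept delegated credentials from the build agents...
    Enable-WSManCredSSP -Role Server -Force

    # ...and allow the onward hop back out to network resources (the build agent's workspace share)
    Enable-WSManCredSSP -Role Client -DelegateComputer 'buildagent01' -Force
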
The CredSSP services can be disabled at any time.
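
For the record, the corresponding cmdlets to back it all out again:

    Disable-WSManCredSSP -Role Client
    Disable-WSManCredSSP -Role Server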

Testing

From my own computer (acting as build agent), I should now be able to connect to one of my new build workstations, and then inspect the file system back on my local computer.
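
In practice the round-trip test looks something like this; the host names, share and credentials are placeholders:

    $cred = Get-Credential 'DOMAIN\builduser'
    Enter-PSSession -ComputerName 'workstation01' -Authentication Credssp -Credential $cred

    # Now inside the remote session: the second hop back to the build agent's share works
    Get-ChildItem '\\buildagent01\builds'
    Exit-PSSession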

What really happened?

Well, nothing ever goes quite to plan, does it? :) The information in my post above got me 90% there, but there were a few extra steps I'd kind of skipped over, and/or forgotten. Fortunately, someone invented the Internet, and here's what else I had to do.

So, the new developer's PC wasn't configured at all for any form of Powershell Remoting, and it's likely that most PCs won't be.
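
The original snippet isn't reproduced below, so here is a rough reconstruction of the sort of commands involved; treat it as an assumed sketch rather than the exact script I ran.

    Enable-PSRemoting -Force                                         # 1: WinRM listener, firewall rule for port 5985
    Set-Service -Name WinRM -StartupType Automatic                   # 2: make sure the service survives a reboot
    Set-Item WSMan:\localhost\Client\TrustedHosts -Value '*' -Force  # 3: trust the other machines in the pool
    Restart-Service -Name WinRM                                      # 4: pick up the new settings
    Enable-WSManCredSSP -Role Server -Force                          # 5: accept delegated credentials
    Set-ExecutionPolicy RemoteSigned -Force                          # 6: the policy change that lets the scripts run at all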

The first 4 lines simply set up the machine for remoting, which includes opening port 5985 in your firewall. The 6th line is a critical step that modifies the computer's policy settings.

Job done! Next up, job scheduling: how I'm going to orchestrate my jobs for remote execution on the developer PCs. I have already started this, but it needs a bit of refining before I embarrass myself any further in a public forum :)

Thursday, 23 August 2012

Distributed testing

Cost effective scaling


As part of Project A, we purchased some fairly hefty liquid-cooled i7 workstations to plough through some very lengthy build and test cycles.

Both of these workstations are performing fantastically, as you might expect for a 4.3GHz i7. Each machine nearly halved the length of time it takes to complete the build and test cycle phases of the pipeline.

However, not even the combined super-awesomeness of these two liquid-cooled monsters can handle the amount of work we're now throwing their way. As we slowly increase the size of the development team, both build frequency and queue lengths are slowly increasing. 

I want to tackle this niggle before it becomes a burning issue, which won't be too long given how quickly we're expanding. 

The easy answer is to get another liquid-cooled i7 and add it to our existing build farm, but before I approach the IT guy and ask him for "another" monster, I thought I'd explore an idea I've been mulling over for a good while.

Surplus power

In most organisations, developer workstations are typically above average specification. 

What I'm wondering is, can I use Powershell and its background jobs to take advantage of all that spare capacity sitting under each developer's desk? When you think about it, all those i7 workstations dotted throughout the office, just sitting there idling away, are a colossal reservoir of power.

Powershell is the key to unlocking this unparalleled processing power.

The potential is enormous and the cost savings equally so.
  • Rapid build and deployment cycles
    The quicker a build completes, the quicker the feedback for the developer. 
  • Tools and licenses
    Developer workstations already have all the software a build agent would ever need installed and licensed. And, in some cases, this bypasses the need to get a *special* license for build servers!
  • Infinitely scalable
    If the build pool is derived from the number of developer workstations, then as each new developer joins the team, there will be another powerful workstation joining the pool! All the tools installed, licensed and ready to go.

What's my plan?

My deployment scripts are already capable of farming-out and load-balancing the deployment packages. So, I want to reuse this mechanism for pushing out parts of the build cycle to each developer machine, taking a small slice of their redundant power for the greater good.

There are several stages in the current build pipeline:
  1. Clean
  2. Build
  3. Unit tests
  4. Code coverage
  5. Quality analysis
  6. Static analysis
  7. Databases
  8. Packaging
As a proof of concept, I'm going to take the 3rd step first, and see how the performance improves by farming out the unit tests.

Two reasons for this: firstly, the unit tests have no inter-dependencies, and secondly, they are very easy to orchestrate, being nothing more than a linear list of jobs to process. The clean and build cycles are far more complex as they have to navigate the minefield of build dependencies.

How will it work?

Right now, I only have a rough idea in my head of how this might work, and it's largely based upon how the deployment mechanism works.

My intention will be to clean and build the workspace on the build agents first. Then, each test container will be packaged up and farmed out to the available pool of developer machines. 

There are 120 test containers at the moment, and the build agents take just under 2 minutes to run all the tests in the current parallel execution model. I'm hoping that by pushing out 5 test containers to each available developer workstation, I'll be able to reduce the testing cycle by a factor roughly equal to the number of machines in the pool.

I've currently got about 12 workstations at my disposal, so I'm hoping to reduce unit tests from 2 minutes to 20 seconds! High hopes, and we shall see about that :) 

More specifically, the compiled output and test containers will remain on the build agent, and the Powershell jobs that are remotely executing on the developer workstations will make UNC connections to the build agent. I think (although I will compare and contrast) that this will be quicker than first copying the files to the developer workstation and pushing the results back over. 

There will be issues of multi-hop security, so CredSSP policies will need to be enabled on the developer workstations, and I'll have to include the network guy to make sure we're not leaving ourselves wide open to abuse. 

This should be fun! 

How will I know if this has been a success? 

I've been keeping some quite detailed statistics on the build and deployment refinements over the past year.


The illustration above demonstrates how each evolution of the build and deployment pipeline has contributed to an overall reduction in the cost of creating and deploying packages. 

So, the plan will be to farm out the Unit Tests to the developer workstations, and then compare the results against the current times.

I'd rather not speculate that something was *just* better, I'd like it to actually be better, and I'd seriously want to know by how much it's better. I have high hopes for this distributed build project, but I'll wait for the results before I get too excited.

I shall post regularly about this project; hope you enjoy.


Wednesday, 22 August 2012

PoSH

Powershell Rocks!

At the very heart of my entire build, deployment and DevOps work is Powershell. I can't honestly imagine life without this absolute Gem. 

It enables me to build, deploy and manage extremely complex solutions, and it's free! What's not to love!

Parallel processing

The single biggest advantage of PoSH in my opinion is its ability to run background jobs. It's by this inherent feature that I've managed to decimate the running times of some very long running processes.

For instance
  • Build times from 45 minutes to 4
  • Deployment times from 35 minutes to 3
There is a standard limitation of 20 concurrent processes per host, but this is easily modified. However, I've discovered there comes a point where you can overwork your local computer, so how many jobs you choose to run is very much determined by just how much CPU power each process will consume.

MSBuild for instance, even when set to run in multi-processor mode, doesn't really push the CPU to its limits. I've found that I can run approximately twice as many MSBuild processes as cores available, and my CPU then averages around 90-95%. 

MSTest is another matter. MSTest can, and frequently does, run a core up to 100%, so with MSTest I run as many jobs as I can, leaving a single core free to idle and take care of the system.
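
As a rough sketch of how I size the job count: the multipliers below are just my own rules of thumb, not gospel.

    $cores = [Environment]::ProcessorCount

    $maxBuildJobs = $cores * 2      # MSBuild rarely saturates a core, so oversubscribe
    $maxTestJobs  = $cores - 1      # MSTest will happily pin a core, so leave one for the OS

    # Simple throttle: don't start another job until we're back under the limit
    while (@(Get-Job -State Running).Count -ge $maxTestJobs)
    {
        Start-Sleep -Seconds 1
    }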

Remote processing

Another fantastic ability is running processes remotely. Again, this is another feature that I have used extensively to scale up the processing power available to my long running tasks. What could be better than running 20 jobs on a single host? Running 100 over a network! 

Not every action is suitable for remote processing. MSBuild, for instance, would require the source code to be copied to the remote host, or made available over the network, and the overhead of this typically takes away the advantages of remote processing.

MSTest on the other hand, works on test containers and dependent assemblies. Quick to copy, and can take several minutes to process, which makes MSTest an ideal candidate for remote processing.
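
To make that concrete, a remote test run boils down to something like this; the MSTest path, host name and container are placeholders:

    $mstest = 'C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\MSTest.exe'
    $cred   = Get-Credential 'DOMAIN\builduser'

    Invoke-Command -ComputerName 'workstation01' -Authentication Credssp -Credential $cred -AsJob -ScriptBlock {
        param($exe, $container)
        & $exe /testcontainer:$container
    } -ArgumentList $mstest, '\\buildagent01\drops\MyProduct.Tests.dll'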

XML

Handling XML is so ridiculously easy with PoSH that I've almost forgotten what XML looks like. Many of my tasks and processes consume, create and share XML objects without ever realising that they're dealing with XML. 

A good example was NCover's XML reports. Hideous to the casual reader, but so very easily processed using PoSH and transformed into very meaningful radar diagrams using MSCharts.
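
For illustration only — the element names below are made up, not the real NCover schema:

    # Casting the file content to [xml] gives a fully navigable document
    [xml]$report = Get-Content 'coverage.xml'

    # Elements become properties, so the report can be walked with plain dot notation
    $report.coverageReport.module |
        ForEach-Object { '{0} : {1}%' -f $_.name, $_.coverage }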

Equally, PoSH can consume XML via web services, which opens up a whole world of possibilities when interacting with APIs and cloud services. 

.NET

If I were to describe PoSH to a programmer, I would basically say that it's a "Scripted .NET" engine. Without any real effort, my PoSH scripts and modules frequently call methods within .NET assemblies to perform tasks.

An example of this would be the .NET MS Chart Controls. By taking XML from NCover, I was able to transform it into some very effective radar diagrams all within PoSH. 
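
A stripped-down sketch of the idea; it assumes the .NET chart controls are available on the machine, and the data points are invented:

    Add-Type -AssemblyName System.Windows.Forms.DataVisualization

    $chart = New-Object System.Windows.Forms.DataVisualization.Charting.Chart
    $chart.Width  = 600
    $chart.Height = 600
    $chart.ChartAreas.Add((New-Object System.Windows.Forms.DataVisualization.Charting.ChartArea))

    $series = New-Object System.Windows.Forms.DataVisualization.Charting.Series
    $series.ChartType = [System.Windows.Forms.DataVisualization.Charting.SeriesChartType]::Radar
    [void]$series.Points.AddXY('Module A', 85)   # invented coverage figures
    [void]$series.Points.AddXY('Module B', 72)
    $chart.Series.Add($series)

    $chart.SaveImage('coverage-radar.png', 'Png')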

Cmdlets

Nearly every Microsoft product or service now comes with Cmdlets designed to make interaction from PoSH that little bit easier, using native PoSH syntax and conventions.

I frequently use Cmdlets to interact with
  • SQL Server
  • SQL Analysis Services
  • App Fabric Caching
  • IIS
  • Windows
  • Windows Remote Management (WinRM)
  • Azure services
  • And loads more.... 
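
A couple of quick, illustrative examples; which modules and snap-ins you actually have depends on what's installed locally (these are the IIS 7.5 module and the SQL Server 2008 snap-in):

    Import-Module WebAdministration               # IIS cmdlets
    Get-Website | Select-Object Name, State       # list the sites on this host

    Add-PSSnapin SqlServerCmdletSnapin100         # SQL Server 2008 cmdlets
    Invoke-Sqlcmd -ServerInstance '.' -Query 'SELECT @@VERSION'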

Why I love PoSH

Over the years, I've come to be very mistrusting of proprietary frameworks and cloud services.

In the past few years, I've chosen to develop my own build and deployment framework using PoSH, rather than using CCNet, TeamCity, TFS etc. And I've never looked back.

What I like about PoSH is that it works like developer glue. Whatever products and services my dev teams decide to use, I know that with PoSH I can bring them all together. I am not locked into any one approach, service or framework. I can also be choosy, taking only what I need from any given application or service, and ignoring the bloat that I would otherwise be forced to use.

Using PoSH, I could substitute my source control repository tomorrow without having any effect on any other part of my build and deployment pipeline. Equally, I could change test suites, code coverage tools, even deployment hosts. I don't have to wait for updates to use the latest tool sets, and I'm free to choose products from competing vendors.

PoSH is freedom, flexibility and power.





Project A

A high availability financial platform

I've been working on the build and delivery aspects of 'Project A' for about 9 months. It's been quite a challenge as there are so many inter-dependent components in this high-availability platform.

This platform delivers an online gaming experience with over 4000 transactions per minute. Most of the transactions have a financial aspect which places extra emphasis on stability.

A high-level overview of this platform, from a perspective of build and release breaks down as follows:

Delivery stack

  • Microsoft Windows Server 2008 R2
  • Microsoft AppFabric 1
  • Microsoft Web Farm Framework 2
  • ASP.NET Framework 3.5 & 4
  • Apache + Tomcat
  • RabbitMQ
  • Microsoft MSMQ
  • Microsoft SQL Server 2008 Enterprise
  • Citrix Xen Server
  • Windows Scheduler
  • Microsoft Web Deploy 2
  • Powershell 2

Development stack

  • Visual Studio 2010
  • ASP.NET Framework 3.5 & 4
  • MSBuild
  • NCover
  • MSTest
  • SpecFlow
  • Microsoft Web Deploy 2
  • Powershell 2

Platform composition

  • 100 x WCF applications
  • 10 x Web applications
  • 1 x Apache Tomcat application
  • 14 x MSMQs
  • 1 x RabbitMQ
  • 6 x Windows Services
  • 10 x Scheduled Tasks
  • 1 x HTML5 client

Continuous delivery pipeline

This has to be the most complex project that I've had to deliver an automated build and deployment pipeline for. They say that necessity is the mother of invention and this project has certainly demanded a lot of innovation.

Parallel build pipeline

The build pipeline essentially produces the packaged artefacts for deployment. As part of the pipeline, code is compiled, tested and analysed. The products of successful builds are uploaded to an artefacts repository for later deployment.

The platform consists of over 80 solutions. The solutions are built in a specific sequence and parallelised where possible to minimise compilation times. Running sequentially, the platform would take 45 minutes to compile; in the parallelised build pipeline, this takes a mere 4 minutes.

Unit tests and code coverage analysis are also performed in parallel.

Parallel deployment pipeline

Artefacts generated by the build pipeline need to be deployed to an environment. 

The build artefacts are packages for Websites, WCF applications, Windows Services, Scheduled Tasks and Databases.

There are 5 environments for Project A, each different in composition and optimised for a specific purpose. 
  • Developer testing
  • System testing
  • User acceptance testing
  • Operations testing
  • Live
Developer, System and UAT test environments are single host solutions provided by Citrix Xen Server. Small, cheap, disposable and automatically provisioned. Their purpose is primarily to aid the software development process across the development teams. At any given time, we can have over 10 of these environments running different versions of the platform.

Operations testing is a scaled back version of the live environment. OAT as we call it, is a multi-host solution that mimics the live environment in terms of composition, and its purpose is primarily for DevOps and other specialist teams to perform tests in a "like-live" environment.

Live is the production environment: a monstrous array of load-balanced, high-availability servers serving many thousands of transactions per minute.

The deployment pipeline takes packages from the build pipeline and pushes them to any one of the above environments. The packages and the deployment mechanisms are platform agnostic; what varies are the "service maps" that define the specifics of each target environment.

The service maps allow the deployment mechanism to deploy, configure and test each package.

As with the build pipeline, this many packages would take a long time to deploy if performed sequentially. Deployed sequentially, the entire platform takes about 30 minutes; using a parallel solution, deployment times can be as low as 3 minutes.

DevOps toolset

Supporting the platform outside of the routine deployment deserves automation just as much as the build and deployment processes.

These tools use the same service maps used by the deployment mechanism. 
  • Automatic environment provisioning
  • Web Farm Framework (WFF) configuration and Application Request Routing (ARR) rules
  • App Fabric configuration
  • Database restores and patch upgrades

Continuous Delivery, Continuous Integration, Continuous Improvement

This platform is continually evolving; each week new features are introduced. Occasionally, these new features will require additional support from the build and deployment pipeline, and I'm hoping to journal these extensions through this blog.

The approach

Initially most of my posts will be a catch-up of how Project A has already evolved, and then I'll move into less frequent posts as and when the pipeline evolves. 

Happy reading!

Hello iWorld

Just testing my new CI blog is accessible from iOS!