This is part of a series of posts on how to make software deployments boring, because Boring is good! Today I’ll start with an old nursery rhyme I learned as a kid:
There was an old woman who lived in a shoe.
She had so many children, she didn’t know what to do;
She gave them some broth without any bread;
Then whipped them all soundly and put them to bed.
Sometimes it comes to mind when I’m preparing to deploy an update of our software. I’ll tell you why.
We have 17 primary components in our software, plus about a dozen (potentially a hundred) subcomponents. We deploy each of these independently from the others, although version dependencies often require us to deploy a set of components together.
Developers can create a dozen versions of each of these components each day, although usually it’s far fewer. But in our two-week development cycles, we have between ten and one hundred different versions of each component to keep track of.
We have five test environments plus our production environment. It’s my job to ensure that the correct version (of possibly a hundred) of each component gets to each environment at the right time.
Can’t I just put them to bed? No?
Well, then, I need a way to know what’s supposed to go where.
My team uses two tracking mechanisms, one supplied by Microsoft Team Foundation Server (TFS) and one that I created.
When a developer checks in a code change, TFS automatically builds the software, assigning the build a unique identifier (the "build number") for tracking. You can list all your builds in Team Explorer, which is an add-on to Visual Studio for working with TFS. Each build in the list can be assigned a "build quality". You can create your own list of build qualities, then choose from that list to assign a particular quality to a particular build.
We use build qualities such as "Move to QA", "QA", "Move to Production", "Production". You’ll see that some of these express an intent: we want to move version X of component Y to Production. Other qualities express the current state: version W of component Y is currently in Production.
The build quality gives us half of what we need. We use it to communicate what "should be" in each environment. It’s great: it’s quick and easy, and it can give you a single, at-a-glance view of which version of what component should be in which environment. We especially like the qualities that express intent (move version X to Production). They’re our primary mechanism for communicating between our quality assurance team and our deployment team.
There’s only one problem: build qualities are set manually. I can’t tell you the number of times I’ve forgotten to update the build quality to indicate that version X HAS BEEN moved to production.
That’s why we have a second mechanism. At the beginning of each deployment, the automated process updates a file on each server, indicating which version of which component is being deployed to that server. When deployment is complete, the automated process updates the status and also indicates when the process finished. This enables us to see, dependably, what really IS on each server in each environment, not just what we intend to have in each environment.
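A minimal sketch of this kind of status file in Python, for illustration only. The JSON layout, the file name deployed.json, the state names, and the component name OrderService are all assumptions made for the example, not the format our process actually writes.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_status(status_file, component, version, state, finished_at=None):
    """Record one component's deployment state in a per-server JSON file."""
    path = Path(status_file)
    # Load existing records so other components on this server are preserved.
    records = json.loads(path.read_text()) if path.exists() else {}
    records[component] = {
        "version": version,
        "state": state,              # hypothetical states: "deploying", "deployed"
        "finished_at": finished_at,  # stays None until the deployment completes
    }
    path.write_text(json.dumps(records, indent=2))
    return records


def deploy(status_file, component, version, install):
    """Mark the deployment as started, run it, then mark it complete."""
    write_status(status_file, component, version, "deploying")
    install()  # hypothetical callable that does the real deployment work
    finished = datetime.now(timezone.utc).isoformat()
    return write_status(status_file, component, version, "deployed", finished)


# Usage: a temporary directory stands in for a server's file system.
status_path = Path(tempfile.mkdtemp()) / "deployed.json"
result = deploy(status_path, "OrderService", "2.4.17", install=lambda: None)
print(result["OrderService"]["state"])
```

Writing the "deploying" state before the work starts means that a deployment which dies halfway leaves visible evidence, instead of a file that still claims the previous version is in place.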
You might think it odd to have both mechanisms, but I like it. One quickly and easily expresses intent and can be updated by our quality assurance and operations teams from within Visual Studio. It’s a nice communication tool. The other is a highly controlled indicator of exactly which version of any component is installed on a particular server in a given environment. It’s quickly accessible (read-only) and it is always right.
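Having both views also makes it easy to compare them. Here is a rough sketch of such a comparison, assuming the intended versions (from the build qualities) and the actual versions (from the per-server files) have already been read into plain dictionaries. The server names, component names, and version strings are made up for the example.

```python
def find_mismatches(intended, actual):
    """Return (server, component, intended_version, actual_version) tuples
    for every component whose deployed version differs from the intent."""
    mismatches = []
    for server, components in actual.items():
        for component, wanted in intended.items():
            found = components.get(component)  # None if never deployed there
            if found != wanted:
                mismatches.append((server, component, wanted, found))
    return mismatches


# Hypothetical data: what "should be" in the environment...
intended = {"OrderService": "2.4.17", "Billing": "1.9.3"}

# ...and what each server's status file says really is there.
actual = {
    "web01": {"OrderService": "2.4.17", "Billing": "1.9.3"},
    "web02": {"OrderService": "2.4.16", "Billing": "1.9.3"},
}

mismatches = find_mismatches(intended, actual)
for server, component, wanted, found in mismatches:
    print(f"{server}: {component} should be {wanted}, found {found}")
```

With this sample data the only mismatch reported is OrderService on web02, which is exactly the kind of skipped server the second mechanism is meant to expose.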
These two methods combined make it possible to have conversations like this:
|OK, we’re ready to go to Production!|
|They’re all marked "Move to Production."|
|Hmm…hey, version X for component Y has a newer build than the one you’ve marked. Did you mean to include it?|
|No, we’re holding that one until a developer checks in another fix.|
|Did that bug fix get out to our Acceptance Test environment?|
|Yes, it was deployed last Tuesday.|
|Well, the bug is still there, but it’s intermittent. Are you sure you didn’t skip a web server?|
|Our process should prevent that, but let me check…no, all the servers are at exactly the same build.|
|Are you sure that’s the right build for the fix?|
|The bug was fixed in version W (you can get this from TFS), and version X is in that environment, so yes, the fix is there.|
|How was it tested?|
|You’ll have to ask our QA group about that, but it was deployed to the test environment a week ago.|
These kinds of conversations help ensure that the right version of each component gets to the right environment at the right time. And when there’s a problem, they help us put our finger on it a little more quickly. We can quickly determine whether or not it’s a versioning or deployment problem.
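The two checks in the second conversation (are all servers at the same build, and does that build include the fix?) can be sketched in a few lines of Python. This assumes versions are dotted numbers such as 2.4.17, which is a simplification made for the example. Real TFS build numbers may follow a different format and would need their own parsing.

```python
def parse_version(text):
    """Turn '2.4.17' into (2, 4, 17) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def servers_agree(server_versions):
    """True when every server reports exactly the same version."""
    return len(set(server_versions.values())) == 1


def fix_is_deployed(fixed_in, server_versions):
    """True when every server runs the fix version or a later one."""
    minimum = parse_version(fixed_in)
    return all(parse_version(v) >= minimum for v in server_versions.values())


# Hypothetical data: the fix landed in 2.4.9 and the environment runs 2.4.17.
acceptance = {"web01": "2.4.17", "web02": "2.4.17"}
same_build = servers_agree(acceptance)
has_fix = fix_is_deployed("2.4.9", acceptance)
print(same_build, has_fix)
```

Comparing parsed tuples rather than raw strings matters here: as plain text, "2.4.9" sorts after "2.4.17", which would wrongly suggest the fix is missing.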
One more thing to make deployments boring.