Release management

Introduction

In small companies, software is usually released manually (or half manually), straight from a developer's machine to production. Usually there is one developer (the go-to guy) who knows the process by heart, and everything relies solely on him. That go-to guy is the one you call when you need to patch, fix, or investigate something, or to add an ad-hoc feature. He develops and tests on his own machine, there is no test system, and he uses small "personal" scripts to verify that some happy-path functionality works. Sounds familiar?

What happens when the go-to guy leaves (temporarily or permanently)? How do you spread his work process and knowledge to other developers? How many issues were caused by a quick fix rolled out straight to production? And what about security: does everyone have production access, or just a few people?

Everybody in the industry knows this is bad practice, so how is it still a thing? Because it is easy. Most companies live off their product and its features, and often off how fast they can roll them out. When the company starts growing, the go-to developer superstar becomes the critical path for all important work.

It is critical to recognise this early on. Any solution that does not reduce the dependency on the bottleneck is wasted effort. What matters is creating software environments that let developers develop, let testers check features before deployment, and deliver the product to customers without leaking internal processes.

To quote Wikipedia, very originally:

Release management is the process of managing, planning, scheduling and controlling a software build through different stages and environments; including testing and deploying software releases.

Identifying problems

Let's imagine a simple scenario: a company that runs an e-commerce system. In the past the company had two developers: a back-end developer and a mobile app developer. They split the responsibilities and did everything themselves, from feature specification and planning to implementation and management. When a new feature was implemented in the back end, the back-end developer would spin up a test server so that the mobile developer could connect and integrate with the new feature. Once the feature was implemented in the mobile app, the back-end developer would deploy the new back end to production and expose it to customers.

There are a few open questions here:

Is the test environment the same as production? (If not, unexpected bugs can arise.)
How do you roll back if something goes wrong?
Is the process of deploying error-prone? Or manual? (Can a developer cause problems while deploying?)
Can a new back-end developer repeat the process?
Do testers test in development or in production?

Staging and automation

In order to give everyone a suitable environment to work in, the different builds must be fully separated from one another. The most common practice is to use at least three stages: a development stage (dev-stage), an integration stage (i-stage) and a production stage (p-stage). Deployments to these stages should be automated to reduce manual work and to encourage small releases, which in the end limit the possibility of error. Long release cycles usually turn into a vicious cycle: lots of changes introduce lots of potential problems, which have to be solved before the next release. This is often a painful process, which leads to even longer release cycles and even more problems. Note that this article does not cover automation itself; that is a topic for another article (or a series). Builds and releases must always flow one way: from development, to integration, and finally to production.
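Without going into automation itself, the one-way rule can be made concrete with a few lines of Python. This is only a minimal sketch: the Build class, the promote function and the short stage names are hypothetical, invented here for illustration, and a real pipeline would attach tests and deployment steps to each promotion.

```python
from dataclasses import dataclass, field

# Stage names follow the article; the list order encodes the one-way flow.
STAGES = ["dev", "integration", "production"]


@dataclass
class Build:
    """A hypothetical build record (all names are illustrative)."""
    version: str
    stage: str = "dev"
    history: list = field(default_factory=list)

    def __post_init__(self):
        # Record the starting stage so the history is complete.
        if not self.history:
            self.history.append(self.stage)


def promote(build: Build) -> Build:
    """Move a build exactly one stage forward; never skip a stage or go back."""
    position = STAGES.index(build.stage)
    if position == len(STAGES) - 1:
        raise ValueError(f"{build.version} is already in production")
    build.stage = STAGES[position + 1]
    build.history.append(build.stage)
    return build


if __name__ == "__main__":
    build = Build(version="1.4.0")
    promote(build)  # dev -> integration
    promote(build)  # integration -> production
    print(build.history)  # ['dev', 'integration', 'production']
```

Because promote is the only way to change a stage, a build cannot reach production without having passed through integration first.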

Development stage, also known as the developers' playground. Developers need to experiment and, don't get me wrong, they love it. During feature development it is not uncommon to take the whole project apart, tear it down and break it altogether. To let developers do so, total recovery must be as simple as possible. Only then can they "play" as much as they need. Even for companies that use on-premise servers, this can be easier to achieve with a cloud service subscription (thanks to automated provisioning).

Integration stage, which you can also call the test stage. A more stable environment with the latest pre-release version of the software, including all modules and all necessary dependencies. This is where both your external partners and your mobile application developers can connect. It should be a rather steady environment that is fully automated and mimics the production system. Manual interventions should be minimal and limited to troubleshooting; any changes must end up in the automated scripts.
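One way to check that the integration stage really mimics production is to compare the settings of the two stages and list whatever differs. The sketch below assumes each stage can be described as a flat dictionary of settings; the config_drift function and the example keys and values are made up for illustration.

```python
def config_drift(integration: dict, production: dict, ignore=frozenset()):
    """Return the sorted keys whose values differ between the two stages.

    Keys listed in `ignore` are expected to differ (e.g. host names).
    """
    keys = (set(integration) | set(production)) - set(ignore)
    missing = object()  # sentinel: a key absent on one side counts as drift
    return sorted(
        key
        for key in keys
        if integration.get(key, missing) != production.get(key, missing)
    )


if __name__ == "__main__":
    # Hypothetical settings, for illustration only.
    i_stage = {"python": "3.11", "database": "postgres-15", "db_host": "i-stage-db"}
    p_stage = {"python": "3.11", "database": "postgres-14", "db_host": "p-stage-db"}
    print(config_drift(i_stage, p_stage, ignore={"db_host"}))  # ['database']
```

An empty result means the two stages agree on everything that is supposed to match; anything else is a candidate for the "unexpected bugs" mentioned earlier.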

Production stage, aka "the real thing". Production should be set up automatically if possible, to limit the possibility of error and remove repetitive deployment work. It is also very important that access to production is limited to the people who really need it (e.g. operations). Not every developer (read: your regular IT guy) needs access to p-stage. Sometimes it is hard to explain to people that this is not about trust but about avoiding accidents and keeping a clean environment, and that in the end it is also better for them. At first I had mixed feelings about the limited access, but being always on call is not that fun (I was called twice for an emergency while on vacation on a different continent, with the laptop in the hotel, of course).
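The access rule can be written down explicitly instead of living in people's heads. The mapping below is a sketch under assumptions: only "production is limited to operations" comes from the text above, while the role names and the permissions for the other two stages are invented for illustration.

```python
# Which roles may deploy to which stage. Only the production row reflects
# the article; the dev and integration rows are illustrative assumptions.
DEPLOY_ROLES = {
    "dev": {"developer", "operations"},
    "integration": {"developer", "operations"},
    "production": {"operations"},
}


def may_deploy(role: str, stage: str) -> bool:
    """Return True if `role` is allowed to deploy to `stage`."""
    # Unknown stages grant nothing, so a typo cannot open access by accident.
    return role in DEPLOY_ROLES.get(stage, set())


if __name__ == "__main__":
    print(may_deploy("developer", "production"))   # False
    print(may_deploy("operations", "production"))  # True
```

Keeping the rule in one small table makes it easy to review, and it frames the restriction as a property of the stage rather than a judgment about any individual developer.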

Summary

I live by the rule that if you are not going forward, you are going backward. The sooner you start changing your deployment process, the sooner you will start to reap the benefits. But I must say that implementing changes is hard and full of resistance and doubt. Best practices are a good guide, but each company must find its own way of working with its own environment and processes. The most important thing I have learned from implementing changes is that nothing is black and white, and that this is a process that takes time and team effort.