You are shipping the code safely. Are you shipping the value?

Even though steps are quite wide, the only distance travelled that matters is the vertical one.

Shipping code. Safely.

Joining Intercom taught me how to ship changes quickly but safely. It’s almost impossible to work there and not learn these things. Some things are built in at the infrastructure level, protecting me by design.

Most new hires get super excited when they figure out that merging to the master branch actually kicks off a new deployment. It’s quite smooth. Nobody has to supervise. We are constantly shipping (more than 200 times a day) with zero negative impact on production, and if something goes terribly wrong, the deployment is reverted automatically.

Shipping with such a setup changes how you think about building software. You get into the mode of shipping ambitious changes as a series of small, safe steps. Some changes are very straightforward (a simple code change), some are moderately complex (adding a nullable column), and others are a little tricky and need to be deliberately split up and handled separately.

Introducing new functionality into the system while making sure I can turn it off if something goes wrong? No problem, the mechanism was in place: I just added a feature flag as a killswitch, wrote tests for both code branches, and I was ready to roll safely once again.
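To make the killswitch idea concrete, here is a minimal sketch in Python. It is an illustration only, not Intercom’s actual tooling: the in-memory FLAGS store, the "new_pricing" flag and the pricing functions are all hypothetical names I made up for the example.

```python
from typing import Dict

# Hypothetical in-memory flag store. A real system would read flags from a
# config service or database so they can be flipped without a deploy.
FLAGS: Dict[str, bool] = {"new_pricing": True}


def is_enabled(flag: str) -> bool:
    """Unknown flags default to off, so the old code path is the fallback."""
    return FLAGS.get(flag, False)


def legacy_price(amount_cents: int) -> int:
    # Existing behaviour: charge the full amount.
    return amount_cents


def new_price(amount_cents: int) -> int:
    # Illustrative new behaviour: 10% discount, rounded down to whole cents.
    return amount_cents * 90 // 100


def price(amount_cents: int) -> int:
    # The flag is the killswitch: turning it off restores the old path.
    if is_enabled("new_pricing"):
        return new_price(amount_cents)
    return legacy_price(amount_cents)
```

Both branches need a test, which is exactly the extra cost this kind of safety carries: one assertion with the flag on, one with it off.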

Some time later, I had to refactor the whole system. I changed it bit by bit, chopping bigger changes into smaller ones to lower the risk, in a series of surgical, safe moves. It took a while, but the app never went down.

Breaking production to ship the value faster.

All these techniques help us ship all the time, in small increments, with no fear of breaking production. They get so ingrained in the culture that they become the only way people are rolling.

The mistake I caught myself making is conflating shipping the code safely with shipping business value. I find great comfort in these surgical moves, but I’ve recently started asking myself what they cost.

Is it worth it? Can I trade some safety for speed? Is safety even important to what I am doing right now? These abstract questions helped me uncover tradeoffs I’ve been making unconsciously:

Chopping up a PR into 10 smaller ones costs time, both for me and for the colleagues reviewing them. It could be cheaper to roll out the whole change and revert if something goes wrong (even though reverts can hurt your reputation).
Adding a feature flag to turn off a feature nobody’s using yet requires unnecessary code and tests. Very likely a premature optimisation.
Scheduling a 5-minute partial outage could be cheaper than performing a series of small, safe schema changes.
Heck, I’ll even challenge writing some tests (good that Uncle Bob is not reading this, he’d say I am not a professional).

As with everything in engineering, things are not black and white. How we measure the cost and benefit here is often very loosely defined, but the truth is that businesses often win by execution. As engineers, we are (often) a huge part of it, and every bit of speed we gain helps the company beat the competition.

Learning the extent to which we can break things is hard. It needs some business context. It is definitely harder than just sitting in the code editor, but it makes you a better engineer. It’s the art of finding the balance between speed and safety. First you’ll get burned and become more defensive. Eventually, you’ll become more aggressive again until you find that balance.