Effective team culture: Five Whys & Stop-the-line

When Google set out to measure what makes its best software development teams successful, it found a strong correlation with psychological safety:

A shared belief held by members of a team that the team is safe for interpersonal risk-taking... A sense of confidence that the team will not embarrass, reject or punish someone for speaking up.

I find it interesting that this lends explanatory power to two much earlier ideas that came out of the famous Toyota Production System: "Five Whys" and "Stop-the-line". Both of them are directly relevant to building a software team with more psychological safety.

Five Whys

When a problem happens, it's tempting to accept the first explanation.

Q. Why is our site down!?

A. Because Ed accidentally deleted the production database!

The idea of Five Whys is to force yourself to keep asking questions until you uncover deeper causes.

Q2. Why did he delete it?

A. Because he typed the wrong command.

Q3. Why was he typing that command at all?

A. He was trying to re-deploy the staging environment, and accidentally hit production instead. He was typing commands because our deploy process is manual.

Q4. Why is it manual?

A. Because we're under-prioritizing site-reliability work. Let's use this incident to explain to the business why we need to spend time on automated deployment.

Q5. Why was he using credentials sufficient to delete the production database?

A. Because we use the same credentials in staging and production. I guess we should change that.
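The last two answers point at fixes that can live in code. What follows is only a minimal Python sketch of one possible guard, not a description of any real deploy tool: the names `Credentials`, `check_destructive`, and `UnsafeOperation` are hypothetical, and it assumes environments are identified by plain strings such as "staging" and "production".

```python
from dataclasses import dataclass


class UnsafeOperation(Exception):
    """Raised when a destructive command is blocked by the guard."""


@dataclass(frozen=True)
class Credentials:
    # Hypothetical credential record: a name plus the environments it may touch.
    name: str
    allowed_envs: frozenset


def check_destructive(env: str, creds: Credentials, confirm: str = "") -> None:
    """Refuse a destructive operation unless it is clearly intended.

    - Credentials must be scoped to the target environment.
    - Production additionally requires typing the environment name back.
    """
    if env not in creds.allowed_envs:
        raise UnsafeOperation(f"{creds.name} credentials may not touch {env}")
    if env == "production" and confirm != "production":
        raise UnsafeOperation("production requires explicit confirmation")


# Staging credentials are scoped to staging only (an assumption of this sketch).
staging_creds = Credentials("staging-deployer", frozenset({"staging"}))

check_destructive("staging", staging_creds)  # allowed, returns silently

try:
    # Ed's mistake: a staging task pointed at production.
    check_destructive("production", staging_creds)
except UnsafeOperation as err:
    print(f"blocked: {err}")
```

With a guard like this, the wrong command fails loudly instead of deleting data, and the conversation can stay focused on the process rather than on Ed.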

Five Whys has a direct benefit (finding and fixing deeper problems). But it has an even more powerful indirect benefit of improving your team's psychological safety. Mistakes are treated as gaps in process, technology, or training, not as personal failings.

Stop-the-line

Another widely admired part of the Toyota Production System is the rule that any person on the assembly line is obligated to push the Big Red Button that stops the entire assembly line if they see anything wrong. The whole team then gathers to identify the root cause before starting things up again.

When it was first introduced, this ran directly counter to dogma. Stopping the line is incredibly expensive, so you're supposed to do everything you can to keep it running.

But iterative process improvements are like compound interest — over time they make an overwhelming difference that more than pays for stopped assembly lines (or paused feature development work).

It is your obligation to push the button. You never get punished for it, and you never get told you shouldn't have. Even if you turn out to be wrong, the fact that you perceived a problem where there wasn't one was itself a problem, pointing to the need for better design or better training.

In both software and manufacturing, stop-the-line leads to improvements that are otherwise hard to make because it isolates specific problems precisely when they are occurring. There is no better time to fix the underlying cause.

In software, stop-the-line doesn't involve pausing an assembly line. Instead, it means pausing work on your currently assigned feature in order to solve another problem that you hit along the way. If you hit the bug, others are likely to hit it too. So it's worth fixing, or at least worth talking to the rest of the team about it.

A culture of stop-the-line directly fosters psychological safety. Everyone — even the most junior members of the team — should feel safe pointing out problems, without fearing that they might be ridiculed for being wrong or accused of criticizing their peers.

Necessary but not sufficient

There's more to psychologically safe teams than just these two principles. But I think they're worth talking about because they're relatively easy to teach and understand, and they may catalyze introspection that uncovers other cultural/interpersonal problems.

They're also a litmus test for a junior team member who's wondering, "Am I bad at this, or is this a toxic environment?" If your organization can't or won't honestly engage in this kind of self-examination, it's not you, it's them.

Image credit: Toyota UK, Creative Commons BY-NC-ND