The New Normal: Team Scale Autonomy

Taking the Two Pizza Team to a Higher Level

You’re probably familiar with the concept of the two pizza team. This is Amazon founder and CEO Jeff Bezos’ rule that every team should be sized no bigger than you can feed with two large pizzas. At Cognitect, we take this concept one step further: the one chateaubriand team—with sharper tools you can afford to spend the extra money on better food.

The two pizza team is an important concept. The idea is that smaller teams are more self-sufficient and typically have better communication and a greater focus on getting things done because they eliminate dependencies. Every dependency is like one of the Lilliputian ropes that ties Gulliver to the beach. Each one may be simple to deal with on its own, but the collection of 1,000 tiny threads keeps you from breaking free. Eliminate dependencies and your teams move faster. Encourage autonomy and you allow innovation.

Rube Goldberg Machine, Jeff Kubina

The idea of eliminating dependencies is central to the Cognitect philosophy. It’s also a concept promoted by Chad Fowler, then CTO of Wunderlist (purchased by Microsoft in June 2015). Fowler promotes the idea of impermanence, where everything is written to avoid interdependencies, deliberately avoiding shared or reused code. According to Fowler, if a piece of code has been around too long, it’s a candidate for rewrite because it means it can’t change very easily. It represents a point of risk.

The idea that you should deliberately destroy your code rather than preserving and reusing it is a radical perspective. Typically, people think of code reuse as a good thing. There are entire teams inside of enterprises just trying to make reusable components—pieces of code that are untouchable, treated with fear and reverence. That's like deliberately churning out inventory and making everyone depend on it! In contrast, Fowler’s idea is to prevent any of those untouchable pieces of code from creeping in. The minute you start to fear a piece of code you should rewrite it (and delete the old one). This approach goes straight to the idea of antifragility, because it deliberately creates variation in the code base. Variation plus selection equals evolution.

Dependencies can creep into projects in many different ways.

The Ripple Effect: But It’s Just a Button!

In a typical IT infrastructure, dependencies across different systems exist where there is a provider of information and a consumer of information. A good example of this is a retailer’s point-of-sale system (the provider), which feeds into their general ledger system (the consumer).

These two systems must agree on the format of the messages or files they pass between them. If you want to change something on one side, you have to change the other side too: implementing a feature on the provider side likely means altering the consumer side as well, and vice versa.

When you extend this out across the entire enterprise and its mesh of interdependent systems, you get a ripple effect. A seemingly simple feature in one system requires you to change something in the company that appears to be unrelated—it may reside under a different management chain, have different stakeholders or different schedules—but is affected by the cascade of changes through the interfaces. 

Someone somewhere will say, “What’s the problem? It’s just a button on the website!” Well yes, it’s a button on the website. But it’s a button we can only add in five minutes after you change every other back-end system in the whole damn company!

Dependency Hell

Over time, we have built this wonderful open source ecosystem that has billions of lines of code, all of which are available for us to use at no cost. It sounds like a great resource. But here’s the thing. When you reuse a piece of code, you typically have to pull in a library. When you pull in a library, sometimes that library pulls in others, and then those pull in others… 

This scenario reflects the day-to-day experience of most developers. All you need is a single feature from a common open source library. Before you know it, you end up depending on 40 different JAR files. A change in any one of those files can cause bugs in your applications. It’s dependency hell.
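To make the fan-out concrete, here is a minimal sketch that walks a dependency graph and collects everything a single direct dependency drags in. The library names and the graph itself are hypothetical, invented for illustration; a real build tool resolves versions and conflicts as well, which this sketch ignores.

```python
from collections import deque

# Hypothetical graph: each library maps to the libraries it pulls in directly.
DIRECT_DEPS = {
    "my-app": ["http-client"],
    "http-client": ["json-parser", "logging-facade"],
    "json-parser": ["string-utils"],
    "logging-facade": ["string-utils"],
    "string-utils": [],
}


def transitive_deps(root, graph):
    """Return every library reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph.get(root, []))
    while queue:
        lib = queue.popleft()
        if lib in seen:
            continue  # already visited via another path
        seen.add(lib)
        queue.extend(graph.get(lib, []))
    seen.discard(root)  # guard against cycles that lead back to root
    return seen


direct = DIRECT_DEPS["my-app"]
everything = transitive_deps("my-app", DIRECT_DEPS)
print(f"declared {len(direct)} dependency, actually depend on {len(everything)}")
```

In this toy graph the application declares one dependency but is exposed to changes in four libraries. The same walk over a real project's graph is how one feature turns into dozens of JAR files.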

Another kind of dependency is between the components at runtime. For example, if you have a component that’s calling another component, they must both be running at the same time and have the same availability characteristics. They have a shared protocol, so if one of them changes they can only change in ways that still satisfy the other end of that dependency.

Many enterprises have run centralized library teams over time. They’ll have an enterprise data dictionary team, or an enterprise architecture team that’s trying to build a common framework or common library of components. Running a centralized library team is exactly backwards. It means you’re increasing the number of dependencies, which is going to decrease velocity and your ability to maneuver. Centralized library teams make you more fragile, not less.

Dependencies Among People and Teams

If I’m on a development team, I may have the freedom to do my own work, but when it’s complete, I need to hand it off to different groups for review. It may be a QA team that checks for bugs, or a DBA team that ensures that database schema changes make sense. Now we’ve introduced another dependency across teams.

The reason I need the DBA to make the schema change is that the technology is fragile. Bad schema changes cause performance problems and can break other applications. Part of the current move towards microservices, where each service has its own database, is about limiting the effect of changes so that the schema change only affects one service rather than all database users simultaneously.

Dependencies among people and teams also raise timing and queuing issues. Anytime you say “I can’t make my change until someone else does his or her change,” you have a dependency and everyone gets slowed down. If you need a DBA to make a change in a schema before you can write your code, it means you have to wait until that DBA is done with other tasks and is available to work on yours. How high you are on the priority list determines when the DBA will get to your task. Tack on another review process and you have yet another dependency.

The Cognitect View: Safety Enables Team Scale Autonomy

At Cognitect, we have a strong bias against dependencies. Our approach is to reduce these dependencies among systems, people and teams. We do this in a few different ways:

  • Work toward safe technology so that development teams can be trusted to work on their own without the risk of harming other teams
  • Use decoupled architecture wherever possible
  • Build cloud-native applications with autoscaling
  • Use strong consistency where it is needed
  • Make context and metadata explicit
  • Isolate failure zones in our architecture

At the source code level, we try to avoid a large stack of library dependencies. If you view code as an expensive asset, you will want to reuse it. Pursue both reuse and "don't repeat yourself" and you get a lot of library dependencies. Suppose you just need a function that you wrote somewhere before. If it’s a pure function, you don’t need a whole JAR file. Just copy the function! 

In our world, most things are pure functions. So we don’t have to bring along base classes, utility classes and whole frameworks to get one function. This lets us manage our dependencies much more carefully so we are less vulnerable to changes in the upstream libraries.
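As a small illustration of why a pure function is cheap to copy, consider the sketch below. The function name and the SKU format are hypothetical, made up for this example. Its result depends only on its argument, so it needs no base class, framework, or shared state, and it can be pasted into another codebase as-is.

```python
def normalize_sku(raw):
    """Pure function: the output depends only on the argument (no I/O, no shared state)."""
    cleaned = raw.strip().upper().replace("_", "-")
    # Drop empty segments left behind by repeated separators.
    return "-".join(part for part in cleaned.split("-") if part)


print(normalize_sku(" ab_12--x "))  # AB-12-X
```

Because nothing outside the function body is involved, the copy carries no upstream library with it. The trade-off is that a later fix to the original does not reach the copy automatically.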

"Safety" also means that one team can change without harming others. This requires careful thought about protocols and representations. If you also describe your protocols with metadata, then you can apply logic to ensure that you only make safe changes:

  • Never require something you didn't require before
  • Never reject something you accepted before
  • Never return less than you returned before

These are like Postel's Law for protocols.
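The three rules above can be checked mechanically once a protocol is described as data. The following is a minimal sketch under stated assumptions: the `ProtocolSpec` shape, its field names, and the example versions are all hypothetical, and a real protocol description would also cover types and value ranges rather than just field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtocolSpec:
    """Hypothetical metadata for one version of a request/response protocol."""
    requires: frozenset  # fields a caller must send
    accepts: frozenset   # fields a caller may send (includes the required ones)
    returns: frozenset   # fields guaranteed in the response


def unsafe_changes(old, new):
    """List every way that `new` breaks a client written against `old`."""
    problems = []
    newly_required = new.requires - old.requires
    if newly_required:
        problems.append(f"requires something new: {sorted(newly_required)}")
    no_longer_accepted = old.accepts - new.accepts
    if no_longer_accepted:
        problems.append(f"rejects something accepted before: {sorted(no_longer_accepted)}")
    no_longer_returned = old.returns - new.returns
    if no_longer_returned:
        problems.append(f"returns less than before: {sorted(no_longer_returned)}")
    return problems


v1 = ProtocolSpec(frozenset({"sku"}), frozenset({"sku", "qty"}), frozenset({"price"}))
# Accepting and returning more is safe.
v2 = ProtocolSpec(frozenset({"sku"}), frozenset({"sku", "qty", "currency"}), frozenset({"price", "tax"}))
# Requiring a new field is not.
v3 = ProtocolSpec(frozenset({"sku", "store_id"}), frozenset({"sku", "qty", "store_id"}), frozenset({"price"}))

print(unsafe_changes(v1, v2))  # []
print(unsafe_changes(v1, v3))  # ["requires something new: ['store_id']"]
```

A check like this could run in a build before a new protocol version ships, so the providing team learns about a breaking change without waiting on a review by every consuming team.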

Safe Technologies Give You Greater Leverage

Eliminating dependencies between systems, components and teams requires safe technologies and sharp tools. We must go to war against these dependencies and the queues that go along with them. We need a revolution, where working with data and in a pure functional style becomes the new normal, helping teams move faster, encouraging autonomy and driving innovation.

Read all of Michael Nygard's The New Normal series here.