Technical challenges of working in autonomous product teams and how to overcome them

Engineering

Written by: Helin Ece Akgul

At TransferWise, we work in independent, autonomous teams. Nilan, our VP of Growth, describes it like this: “Each team is a learning organisation – trying to understand what matters to a customer, how to change behaviour and drive growth”. The benefits of this structure are clear: small, independent teams that are empowered to make their own decisions can maintain focus and ship product value faster.

But when you have tens of independent teams, how do you stay aligned with and informed about other teams? With teams focused on their own products and codebases, how do you create unity around a common vision and goal?

During my three years at TransferWise, we’ve more than doubled in size, from under a thousand employees to over 2,200 of us today. We now have over 50 Product teams, with over 400 Engineers and Tech Leads embedded in them. We’ve had to adapt our ways of working as we’ve scaled. In this post, I’ll dive into some of our challenges and the learnings that have helped us scale.

The benefits and challenges of the autonomous teams structure

In an autonomous team structure, decision-making is done by teams who are closest to the problem and customers. It allows teams to work on and solve problems throughout the full product life-cycle. 

Some of the key benefits of autonomous teams:

  • Trusting teams with their own decision-making and the full product life-cycle empowers them, makes them more efficient and reinforces customer-focused thinking
  • Creating product teams with a clear, shared vision and goals, in contrast to the usual separate Engineering, Design and Analytics teams, helps members of a team stay on the same page and deeply understand the challenges at hand
  • Teams made up of different expertise areas allow members to learn from each other and increase diversity both in tech and in ways of working

And here are some of the key challenges:

  • Teams fully owning their area means lots of accountability. If and when something goes wrong, it’s up to the team to fix it, and there should be clear owners and procedures in place. If not managed well, this can overburden the team and lead to burnout.
  • With teams laser-focused on their own product, it’s easy to lose track of what other teams are doing. This can result in duplicated work and conflicting visions for product and tech.
  • With teams working independently in different codebases and services, it’s important to have technical standards to enable centralized tooling and set expectations for each component.
  • Working in small, independent, autonomous teams, it can be easy to lose alignment on the bigger picture, like the overall scalability of our tech and the overall vision and direction of each team.

What’s helped us scale the way our autonomous teams work

I wanted to talk through the key learnings from where we are today with autonomous teams, and how we make them work while growing our customer base and product quickly. Here are a few things that have helped us:

1. Our teams are responsible and accountable for the full life-cycle of their ownership areas

At TransferWise, teams own the whole life-cycle of their domain. We operate in a microservice architecture with modularised web frontend and mobile apps and have code and service owners looking after each repository and service. 

Ownership shows as accountability in our day-to-day work in multiple ways. When planning what to build, teams are responsible for making sure that each feature release adds value to our customers. Our Engineers work with Analysts and Product Managers to make sure we’re measuring the impact of each feature and that as a team, we’re focusing on the most impactful things. 

Sometimes a feature might require changes in other teams’ codebases. When that happens, we work together with the owning team on the design and agree early on the approach for the extension. Owners are also expected to support new functionality in their domains by building extension points, participating in discussions, providing guidance and doing code reviews.
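As a sketch of what an extension point can look like (all names here are invented for this post, not our actual code), the owning team defines an interface and a registration hook, and an extending team plugs in its own implementation instead of modifying the core flow:

```java
import java.util.ArrayList;
import java.util.List;

// Owned by a hypothetical "transfer" team: a deliberate extension point
// so other teams can add behaviour without editing the core service.
interface TransferCreatedListener {
    void onTransferCreated(String transferId);
}

class TransferService {
    private final List<TransferCreatedListener> listeners = new ArrayList<>();

    // Extending teams register here rather than changing TransferService itself.
    public void register(TransferCreatedListener listener) {
        listeners.add(listener);
    }

    public void createTransfer(String transferId) {
        // ... core transfer logic owned by the transfer team ...
        listeners.forEach(l -> l.onTransferCreated(transferId));
    }
}

public class ExtensionPointDemo {
    public static void main(String[] args) {
        TransferService service = new TransferService();
        List<String> notified = new ArrayList<>();
        // A hypothetical "rewards" team extends the flow via the interface.
        service.register(notified::add);
        service.createTransfer("transfer-42");
        System.out.println(notified); // prints [transfer-42]
    }
}
```

The owning team stays accountable for `TransferService`, while extensions land behind a reviewed, stable interface.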

If and when things go wrong, teams are also accountable for fixing issues in their product. Rather than just shipping code, passing it on and moving to the next task, our Engineers’ role also includes ongoing end-to-end support for their product. When necessary, our Engineers participate in on-call rotations and escalation policies. The need for on-call is determined by the criticality of the product, and being on-call is always planned in advance and compensated.

2. We aim to solve recurring problems effectively to avoid duplication of work

When teams are focused on their own codebase, it’s easy to end up duplicating work by solving the same problem in multiple places across the organisation. That’s why we have forums where we discuss engineering-wide problems and share knowledge on challenges and learnings.

We use internal libraries to solve recurring problems across different systems. One example is our observability base, developed by our Site Reliability Engineering team, which makes it easy to instrument our services. Another is our open-source asynchronous task executor, which we use for many async processes like the automatic reconciliation of accounting statements with Xero. These internal tools help us avoid reinventing the wheel and enable us to quickly iterate on already-solved problems.
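To illustrate the pattern such a library captures (this is a toy sketch, not the actual API of our task executor), an async task framework decouples submitting work from executing it, running tasks in the background and retrying on transient failure:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

// A toy illustration of the async-task pattern (not the real library's
// interface): submit a task once, let a pool run it and retry failures.
public class TinyTaskExecutor {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    public <T> Future<T> submitWithRetry(Supplier<T> task, int maxAttempts) {
        return pool.submit(() -> {
            RuntimeException last = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return task.get();
                } catch (RuntimeException e) {
                    last = e; // transient failure: try again
                }
            }
            throw last; // all attempts exhausted
        });
    }

    public void shutdown() { pool.shutdown(); }

    public static void main(String[] args) throws Exception {
        TinyTaskExecutor executor = new TinyTaskExecutor();
        int[] calls = {0};
        // Fails twice, then succeeds on the third attempt.
        Future<String> result = executor.submitWithRetry(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("transient");
            return "reconciled";
        }, 5);
        System.out.println(result.get()); // prints reconciled
        executor.shutdown();
    }
}
```

The real library adds durability and scheduling on top of this idea, so a crashed process doesn’t lose submitted tasks.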

3. We’ve built and constantly iterate on technical standards for teams to follow

Something we’ve been learning and iterating on is creating standards and expectations for our technical components.

We have simple service tiers for our deployables. A service’s tier depends on its criticality and its ability to degrade gracefully. Depending on its tier, each service is expected to fulfill a set of automated or manual checks. We use service tiers to set up on-call, monitoring and availability requirements for our services. We started implementing this idea last year, and it’s proving to be a robust way of communicating technical standards.
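As a minimal sketch of the idea (the tier names, thresholds and service names here are invented for illustration), a tier bundles operational requirements, and central tooling can check each service against the tier it declares:

```java
import java.util.Map;

// Illustrative service tiers: a tier bundles operational requirements
// that tooling enforces per service. Names and thresholds are made up.
enum ServiceTier {
    TIER_1(true,  99.99), // customer-critical: on-call plus strict availability
    TIER_2(true,  99.9),  // important, but can degrade gracefully
    TIER_3(false, 99.0);  // internal tooling: no on-call rotation

    final boolean requiresOnCall;
    final double availabilityTargetPercent;

    ServiceTier(boolean requiresOnCall, double availabilityTargetPercent) {
        this.requiresOnCall = requiresOnCall;
        this.availabilityTargetPercent = availabilityTargetPercent;
    }
}

public class TierCheck {
    // A central registry of which tier each deployable declares.
    static final Map<String, ServiceTier> SERVICES = Map.of(
            "payments-service", ServiceTier.TIER_1,
            "email-previewer",  ServiceTier.TIER_3);

    public static void main(String[] args) {
        SERVICES.forEach((name, tier) -> System.out.printf(
                "%s: on-call=%b, availability target=%.2f%%%n",
                name, tier.requiresOnCall, tier.availabilityTargetPercent));
    }
}
```

Encoding the tier in one place means on-call, monitoring and availability expectations follow automatically from a single declaration.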

Besides platform requirements, we have guilds setting up standards and vision for the future of our tech. As an example, our Android guild has decided on a modern architecture based on MVVM, coroutines and Kotlin Flows, while our front-end guild has actively built a new stack for creating better apps. For more insights, take a look at Yurii’s great post about how we’ve evolved our stack.

4. We come together for Engineering-wide initiatives

Embedded in Product teams, we spend most of our time on our team’s domain. But in the past few years we’ve also had a couple of engineering-wide initiatives to help ensure our tech can continue to scale with the growth of our product and customer base.

An interesting example was our move to the cloud. As this change impacted all Engineering teams, it was important to involve every team in the discussion and planning before implementation. This meant discussing alternatives and collecting feedback from Engineers across the board to understand the different perspectives.

But true to our autonomous nature, as with any other project, the decisions were made by the team closest to the problem. In this case, that was our Platform team. Throughout the process there was open two-way communication to ensure all perspectives were heard. Once implementation began, the Platform team communicated all decisions openly, gathered feedback and assisted every team along the migration process.

5. We stay aligned globally and ensure visibility of projects through quarterly planning

At a global company level, we do quarterly planning. Every quarter, each team gets together to plan, and presents to the company a reflection on the previous quarter along with focus areas for the upcoming one. In planning, we set our OKRs (Objectives and Key Results) for the quarter, which we track and reflect upon at the end of the quarter.

In Engineering, we have specific Engineering OKRs to make sure we prioritise scaling our technical systems and reducing technical risk. Most of the time these technical OKRs are team-specific and depend heavily on the context of the team. As an example, this quarter our technical OKRs in the Small and Medium Business Product include increasing observability around one of our critical flows, business onboarding. This will help us detect and assess our risks better, and thus provide a higher-quality service for our customers.

6. We make time for learning from each other

In a fast-growing environment, where there are problems to tackle left, right and center, it can often feel like there’s no time to stop, reflect and learn outside of your day-to-day projects. That’s why we make a conscious effort to take time for learning. We often do ‘side-by-sides’ with each other to learn about a specific topic from a colleague. On top of that, each of our employees has a dedicated budget to spend on personal development.

Another method we’ve been iterating on to learn from each other is technical coaching. Since it’s no longer scalable at our size to go over every team’s plans, we have dedicated coaches who give each team high-quality, in-depth feedback on their plans. Technical coaches spend time understanding the context of the team, follow them from the design to the implementation phase and provide continuous feedback.

We also make an effort to learn from our mistakes. When something goes wrong, we write blameless postmortems to avoid similar hiccups happening in the future. We discuss every high-severity incident and its learnings in our weekly cross-team Engineering meetings, and all other postmortems are shared with all Engineers.

We also host weekly internal learning sessions called ‘TEX talks’. These are for our Engineers to share insights into technical topics of interest, projects they’re working on, as well as technical problems they’re facing. We’ve had topics ranging from how a team uses contract testing to deep dives into how specific components are architected.

7. We’re building a holistic vision for the future of TransferWise Engineering – together

We often talk about the vision for our architecture. Growing our customer base, product offering and Engineering organisation exponentially, we’ve learned the need to build a long-term vision, especially for our core domains.

To stay aligned, we have weekly cross-team Engineering meetings to discuss any changes to our core domains. This helps us ensure everything fits together and lets us collect feedback from the wider team. We’ve found that open forums and communication lines between teams and Engineers help us maintain a holistic understanding of where TransferWise’s architecture is going.

We get together with our global Engineering team twice a year to stay aligned and shape the vision for the Engineering organisation. Our internal tech conferences, called ‘Tech Days’, are a chance for Engineers to demo what’s been built, talk about where we’re going with our tech and what we’re doing for scalability, and align on upcoming technical challenges. These ways of keeping in touch with the wider Engineering organisation help us stay on track and organised around the same vision.


We’ve learned a lot throughout the years, but still have a long way to go. I’m sure that in the future, we’ll keep iterating on how we work and the future of our collaboration will look a bit different than today. I’m excited to see what that might look like!

 

P.S. Interested in working with us? We’re hiring! Check out our open Engineering roles here.