Why Every Engineering Manager Should Learn System Design
As managerial demands grow, so does the need to keep technical skills sharp. Here's how to stay up to date with System Design.
Over the course of my career, I have talked with many engineering leaders who fear becoming irrelevant in a rapidly evolving tech landscape.
If you are like most engineering leaders, you probably face a constant dilemma: juggling your time between managerial tasks and keeping your technical skills sharp. Whether it’s struggling to catch up on new programming languages and frameworks or learning how other companies approach scaling their web applications, some engineering leaders are falling behind. It’s a dilemma I know all too well as a founder of a tech company.
Some might argue that it’s only natural for engineering managers/leaders to stay focused on their managerial tasks for team productivity. But as the modern tech landscape evolves, we have to grow as well.
Today’s world depends on distributed systems. A single failure in a distributed system can bring down thousands of servers and crucial operations. Facebook’s 6-hour outage in October 2021 affected Messenger, WhatsApp, Mapillary, Instagram, and Oculus, and cost the company an estimated $100 million in lost revenue. With so much at stake, we need to design systems that scale and degrade gracefully.
It's more important than ever for engineering leaders to learn System Design — but it's also never been easier to start learning. Here’s what we’ll cover today:
The growing demand for System Design
Why you could be at a disadvantage
How to start learning System Design today
Designing TinyURL using the RESHADED pattern
Closing thoughts: Breadth over depth
The growing demand for System Design
Over the past decade, the industry has shifted strongly toward cloud computing and large-scale distributed applications, and engineers are increasingly expected to understand both.
Engineering leaders from the early 2000s, now directors, CTOs, and beyond, never had the chance to design large-scale cloud applications or learn some of these new technologies during their early education and careers. It’s nobody's fault; it’s just timing. After all, the Google File System paper came out in 2003, and the Amazon Dynamo paper followed in 2007. Before then, most people were unaware of distributed systems built on commodity hardware. It was an industry niche that Google and Amazon went on to pioneer.
Early developers who advanced their careers into engineering management never had the chance to learn or work with these new technological advancements. Instead, they had to set aside coding to better support their teams.
Skill gaps widen as engineering managers devote more time to management responsibilities and have less time left to catch up on technological advancements.
In the past decade, the world has changed a lot. Every professional engineer today interacts with these systems in one way or another. Even programmers who write the software for your refrigerator build instrumentation so your fridge can talk with the cloud. Self-driving cars have to talk with the cloud. Even hardware programming jobs now require distributed systems and cloud computing skills.
Moreover, the computer science education landscape has evolved with this trend. Recent graduates have a large technical edge as undergraduate courses now teach distributed systems and cloud computing.
As you grow in your career, it becomes harder and harder to keep up with your current responsibilities while still sharpening your technical skills. However, I’ve learned from personal experience that you have to start somewhere or risk getting left behind.
Before I go into how to fix this growing problem, I want to share my personal experience with System Design and engineering leadership.
Why you could be at a disadvantage
I graduated from university in 2005. At that time, System Design and distributed systems were novel concepts. I had taken a theoretical course on distributed databases, but my primary knowledge was in basic compilers and algorithms.
In 2008, I joined a team at Microsoft called “Red Dog.” As one of the earlier engineers working on Azure, I had my first reality check: when I stepped into one of my first team meetings, I realized I had no idea what the other engineers on my team were talking about.
Topics around building and scaling a distributed database were completely outside the scope of my prior education and experience. So I did the only thing I could: I started the long journey of reading through all the available Distributed Systems books and papers of that time.
Fast forward to today: if you’re an engineering leader who hasn’t upgraded your technical skills, you may be facing any of these scenarios:
1. Feeling stuck in your current position
2. Receiving a reality check when you’re in the market interviewing for a new role
3. Sitting quietly in meetings and doubting your insights
4. Dreading review meetings when having to provide critical feedback
5. Making critical decisions about long-term direction without the technical context to back them
It’s only a matter of time before people realize their leader isn’t providing meaningful feedback or insights. So how do you catch up on your skill gaps?
While developers have the luxury of learning as they build, engineering leaders don’t have enough time to actually build these systems to gain knowledge.
And while you could go the route I took and read 20-40 papers, there have been critical advancements and investments in System Design learning resources.
How to start learning System Design today
1. Learn the building blocks of System Design
There are similar foundational layers that most modern System Design problems share, though specific details are often unique. Think of them as the Lego bricks of System Design that we use to construct more effective, capable systems.
Many of the building blocks we discuss are available for actual use in the public clouds, such as Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP).
As an example, you can read about the building block “Key-value store” here, which covers how to:
Design a Key-value Store: Define the requirements of a key-value store and design its API.
Ensure Scalability and Replication: Achieve scalability using consistent hashing and replicate the partitioned data.
Version Data and Achieve Configurability: Resolve conflicts that occur when more than one entity changes the same data, and make the system configurable for different use cases.
Enable Fault Tolerance and Failure Detection: Make the key-value store fault tolerant and detect failures in the system.
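To make the consistent hashing and replication ideas a little more concrete, here is a minimal Python sketch. The class name `ConsistentHashRing`, the use of MD5 for placement, and the default of 100 virtual nodes per server are assumptions made for illustration; they are not taken from any particular key-value store.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Toy consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes  # assumed number of virtual nodes per server
        self._ring = []       # sorted list of (ring position, node name)
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(key):
        # MD5 is used only to spread keys evenly, not for security.
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        # Each server is placed on the ring at many positions.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get_nodes(self, key, replicas=1):
        """Return up to `replicas` distinct nodes clockwise from the key."""
        if not self._ring:
            return []
        start = bisect.bisect_left(self._ring, (self._hash(key), ""))
        owners = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == replicas:
                    break
        return owners


ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
primary_and_backups = ring.get_nodes("user:42", replicas=3)
```

The property worth noticing is that adding a server only moves the keys that the new server takes over; every other key stays where it was. That is what lets a store built this way scale out without reshuffling all of its data.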
2. Learn the RESHADED method for modern System Design problem-solving
The RESHADED acronym stands for:
Requirements
Estimation
Storage schema (optional)
High-level design
APIs
Detailed design
Evaluation
Distinctive component/feature
It’s a comprehensive approach for you to apply standard patterns to come up with a solution to a problem you’ve never seen before.
Keep in mind that there isn’t a one-size-fits-all solution for every system that needs designing.
3. Practice
You can then apply RESHADED to various modern-day examples such as:
How to design a URL shortening application
How to design YouTube
How to design Google Maps
How to design Uber
Designing TinyURL using the RESHADED pattern
I’ll cover the first few steps of applying RESHADED to design a URL shortening service.
You first segment your requirements into functional and non-functional requirements. Next, you’ll want to estimate your resources. Starting with some assumptions, you’ll walk through storage estimation, query estimation, bandwidth estimation, memory estimation, and number of servers estimation.
By the end of your estimations, you’ll end up with a summary that looks similar to this:
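As a rough illustration of how such estimates can be derived, here is a minimal back-of-the-envelope sketch in Python. Every input figure (monthly write volume, read-to-write ratio, record size, retention period, cache share, and per-server capacity) is an assumption made up for this example, not a number from a real service; the function name `estimate_resources` is likewise hypothetical.

```python
import math


def estimate_resources(
    new_urls_per_month=200_000_000,  # assumption: shortening requests per month
    reads_per_write=100,             # assumption: redirects per shortened URL
    bytes_per_record=500,            # assumption: size of one stored mapping
    retention_years=5,               # assumption: how long mappings are kept
    cache_fraction=0.2,              # assumption: share of daily reads cached
    requests_per_server=1_000,       # assumption: requests/sec one server handles
):
    seconds_per_month = 30 * 24 * 3600

    # Query estimation
    writes_per_sec = new_urls_per_month / seconds_per_month
    reads_per_sec = writes_per_sec * reads_per_write

    # Storage estimation
    total_records = new_urls_per_month * 12 * retention_years
    storage_tb = total_records * bytes_per_record / 1e12

    # Bandwidth estimation (megabits per second)
    incoming_mbps = writes_per_sec * bytes_per_record * 8 / 1e6
    outgoing_mbps = reads_per_sec * bytes_per_record * 8 / 1e6

    # Memory estimation: cache a fraction of one day's read traffic
    daily_reads = reads_per_sec * 24 * 3600
    cache_gb = daily_reads * cache_fraction * bytes_per_record / 1e9

    # Number of servers estimation
    servers = math.ceil((reads_per_sec + writes_per_sec) / requests_per_server)

    return {
        "writes_per_sec": writes_per_sec,
        "reads_per_sec": reads_per_sec,
        "storage_tb": storage_tb,
        "incoming_mbps": incoming_mbps,
        "outgoing_mbps": outgoing_mbps,
        "cache_gb": cache_gb,
        "servers": servers,
    }


for name, value in estimate_resources().items():
    print(f"{name}: {value:,.2f}")
```

With these assumed inputs, the sketch works out to roughly 77 writes and 7,700 reads per second, 6 TB of storage over five years, and 8 servers. The exact figures matter less than the habit of stating your assumptions and seeing which ones dominate the result.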
With your estimations complete, you can identify key building blocks for your design. These building blocks include:
Database(s) to store the mapping of long URLs and the corresponding short URLs.
Sequencer to provide unique IDs that will serve as a starting point for each short URL generation.
Load balancers at various layers to distribute requests smoothly among available servers.
Caches to store the most frequently requested short URLs.
Rate limiters to avoid system exploitation.
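To show how two of these building blocks might fit together, here is a minimal Python sketch in which a sequencer hands out unique IDs and each ID is encoded into a short key. The base-62 alphabet, the in-memory dictionaries standing in for the database, the starting ID, and the names `encode_base62` and `UrlShortener` are all assumptions for illustration; a real design would use a distributed sequencer and a persistent store.

```python
import itertools
import string

# Assumed alphabet: 0-9, a-z, A-Z (62 characters).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def encode_base62(number):
    """Encode a non-negative integer as a base-62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return ALPHABET[0]
    chars = []
    while number:
        number, remainder = divmod(number, len(ALPHABET))
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


class UrlShortener:
    """Single-process stand-in for the sequencer plus database."""

    def __init__(self, start_id=1_000_000):
        self._sequencer = itertools.count(start_id)  # unique, increasing IDs
        self._long_by_short = {}  # short key -> long URL
        self._short_by_long = {}  # long URL -> short key

    def shorten(self, long_url):
        # Reuse the existing key if this URL was shortened before.
        if long_url in self._short_by_long:
            return self._short_by_long[long_url]
        short_key = encode_base62(next(self._sequencer))
        self._long_by_short[short_key] = long_url
        self._short_by_long[long_url] = short_key
        return short_key

    def resolve(self, short_key):
        # Returns None when the key is unknown.
        return self._long_by_short.get(short_key)


shortener = UrlShortener()
key = shortener.shorten("https://example.com/some/very/long/path")
original = shortener.resolve(key)
```

Because every ID from the sequencer is unique, the generated keys never collide, which avoids the lookups a random or hash-based scheme would need to check for duplicates.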
The remaining steps of RESHADED (APIs through distinctive components/features) are covered in more detail here:
Closing thoughts: Breadth over depth
Understanding System Design is becoming more important for engineering managers. Still, remember that your primary job is to manage your team and help people deliver software.
While technical expertise is an important part of your job, as a manager, you’re not on the hook to understand every tiny detail — instead, you need to have a broad overview of System Design.
Rather than focusing on the depth of your knowledge in System Design, your time is better spent skimming courses and documentation that can quickly get you up to speed.
There are a lot of accessible resources available today that would’ve made learning a much easier process during my time as an engineer. (We built our newest System Design course with exactly this use case in mind.)
As always, I’m curious to hear what you’d like me to cover next, so feel free to comment with your suggestions and subscribe.