The Anatomy of a Good Emergency Procedure [Yearn Finance Case Study]
Yearn Finance has a structured way to deal with worst-case scenarios in Web3. Learn more about essential war room roles and how to set up your action plan following the Yearn Finance checklist.
Hacks, bugs, and exploits have plagued software development since its inception. In the blockchain industry, the stakes are usually even higher because of the value of the assets targeted. These emergencies differ in severity, and there are countless ways things can go wrong, so it’s essential to have a contingency plan.
No one can predict the future, but the team at Yearn Finance knows how to prepare for worst-case scenarios. They introduced well-structured procedures for handling time-sensitive situations.
That’s why we took a page from their playbook (for detailed information, read the full Yearn Finance Emergency Procedures guide) to give you an overview of the steps you can take to create your own war room protocol.
What are the essential roles & responsibilities in a war room?
When organizing a war room in Web3, the first step to avoid panic is to assign roles. This will help people understand what’s expected of them and how they can contribute to finding a solution. Aim to limit involvement to essential staff members only.
However, there are a few key roles you need to include regardless of the problem:
- To make sure everything goes as planned, you’ll need someone with previous experience in similar situations to facilitate the whole process. The facilitator vets and directs the flow of information and ensures the procedures are followed. You also need someone ticking off the boxes on the checklist so that no steps are overlooked.
- Protocol dev leads and core developers are typically the largest part of the war room team. Why? You need people who understand the exploit, as well as any protocols involved. Their combined experience should cover every area the issue could affect, even theoretically. This team also needs a web developer to change the UI to reflect the protocol changes.
- A community lead should also be part of the war room team. Their focus is on transparency and damage control, but they can also help filter incoming information.
- Speed is of the essence in a war room, so you’ll need a reliable person to keep track of everything and handle note-taking, post-mortem preparation, and outcome analysis. To avoid repeating the same mistake, you need someone with strong analytical and operational skills to document everything, draw conclusions, and implement the next steps.
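The roles above can be written down as data so that gaps are obvious before the war room starts. The following is a minimal sketch in Python. The role keys (`facilitator`, `protocol_dev_lead`, `web_developer`, `community_lead`, `ops`) are shorthand labels chosen here for the bullets above, and the team member names are hypothetical; none of them come from the Yearn Finance guide.

```python
from typing import Dict, List

# Shorthand labels for the roles described above (assumed names, not Yearn's).
REQUIRED_ROLES = (
    "facilitator",
    "protocol_dev_lead",
    "web_developer",
    "community_lead",
    "ops",  # note-taking, post-mortem preparation, outcome analysis
)


def missing_roles(assignments: Dict[str, str]) -> List[str]:
    """Return the required roles that have no person assigned, in list order."""
    return [
        role
        for role in REQUIRED_ROLES
        if not assignments.get(role, "").strip()
    ]


if __name__ == "__main__":
    # Hypothetical team members.
    team = {"facilitator": "alice", "protocol_dev_lead": "bob", "ops": "carol"}
    print(missing_roles(team))
```

Running the check when the private chatroom is created gives the facilitator a quick list of who still needs to be invited.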
How to create an action plan
It’s essential to set things in motion as soon as possible. This might seem obvious, but people struggle to choose the right course of action under the pressure of a security breach. That’s why it’s helpful to foster a culture of speaking up if you notice something is off.
Across your company, the message should be that it’s always better to raise a concern than sit on information for too long while fact-checking. In time-sensitive situations, when a team member isn’t certain whether a threat is valid, sharing information and starting a war room to investigate as a team is typically best:
- If there’s a real threat, it allows the team to react quickly, potentially saving your company a lot of money.
- If there’s no problem, it will give your team members more information to quickly debunk suspicious incidents.
To help you create your own action plan, here is a multistep guideline you can follow:
Step 0: Assemble your war room
Create a private chatroom and invite essential team members only. Doing so allows you to quickly assess the problem and decide who else to invite to cover the problem from all angles. During this step, keep in mind the background and skill set of the people involved.
Step 1: Confirm the issue & take immediate action
With a team assembled, you’ll need a designated spokesperson to present the problem, knowledge gathered about the situation, and necessary tooling to all participants. It’s a good idea to create a shared document where everyone can crowdsource information for late-comers. Such a document can include information like:
- Important EOAs and contract addresses involved in the incident
- Links to key transactions in a sequence
- Scripts used to check data from past and current blocks
- Documentation of the root cause
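One way to keep such a shared document consistent is to give it a fixed shape. The sketch below is an illustration in Python, assuming a simple in-memory record; the class name `IncidentLog`, its fields, and the placeholder values are hypothetical and are not taken from the Yearn Finance guide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class IncidentLog:
    """Shared record that latecomers can read to catch up."""

    title: str
    addresses: Dict[str, str] = field(default_factory=dict)  # label -> EOA or contract address
    transactions: List[str] = field(default_factory=list)    # links or hashes, in sequence
    scripts: List[str] = field(default_factory=list)         # paths or links to helper scripts
    root_cause: str = "undetermined"

    def add_address(self, label: str, address: str) -> None:
        self.addresses[label] = address

    def add_transaction(self, link: str) -> None:
        # Append only, and skip duplicates, so the list keeps the order of events.
        if link not in self.transactions:
            self.transactions.append(link)

    def summary(self) -> str:
        lines = [f"Incident: {self.title}", f"Root cause: {self.root_cause}", "Addresses:"]
        lines += [f"  {label}: {addr}" for label, addr in self.addresses.items()]
        lines.append("Key transactions (in order):")
        lines += [f"  {i}. {tx}" for i, tx in enumerate(self.transactions, start=1)]
        lines.append("Scripts:")
        lines += [f"  {script}" for script in self.scripts]
        return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder values only; real entries come from the investigation.
    log = IncidentLog(title="Example incident")
    log.add_address("suspected attacker EOA", "0xPLACEHOLDER")
    log.add_transaction("tx-link-1")
    log.add_transaction("tx-link-2")
    print(log.summary())
```

Pasting the output of `summary()` into the chatroom at regular intervals gives everyone the same view of what is known so far.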
The first critical task is to vet all the information coming in and correctly diagnose the problem. Keep in mind that new details will pour in from different sources. So, think critically and double-check all data and decisions.
At this point, the protocol devs will start investigating a transaction or a set of transactions that led to the exploitation. Then, they will suggest measures you can implement to mitigate further attacks, including preemptively pausing the protocols that could be vulnerable. Once you have dealt with the immediate threat, you can move on to the next step.
Step 2: Identify the source of the problem & implement a solution
Once your funds are secure, it’s time to deal with the root cause of the issue. The core dev team needs to dive deep into the execution to understand the attack vector and pin down the problem.
The best-case scenario is to recover the funds, but if that’s not possible, the next best step is to propose a solution that will permanently prevent this type of attack. This discussion is crucial, so the team facilitator needs to make sure that the team weighs all the options before implementing a solution.
While your team is hard at work solving the problem, don’t forget about your community of users. You should communicate the information from the war room to your customers in a timely manner. Your reputation is also at stake, so you’ll need your best community liaison on deck.
Step 3: Manage the aftermath
The worst is behind you, but there’s a bit more work to be done. So, before calling it quits, schedule a post mortem, analyze the outcomes, and (if necessary) communicate with the attacker.
It’s important to distinguish between a public post mortem and an internal one. Public incident disclosure is the best way to communicate any issues to your community. It shows your commitment to the safety of their funds and reflects transparency, and the Web3 community can be forgiving if you simply own up to your mistakes.
Internally, assessing the steps taken and their results shows you what can and should be improved for future reference. Highlight what went well and what didn’t, and formalize these conclusions so this can be a learning experience for the entire team.
If you didn’t manage to recover the funds, you will most likely try to contact the attacker to make a deal. If that doesn’t work, contact the authorities and work with them to find and recover the lost funds.
The emergency checklist by Yearn Finance
To make sure you don’t overlook an important step, it’s best to have a checklist on hand. This one from Yearn is an excellent example, outlining a list of crucial steps:
- Create a war room with audio
- Assign key roles to war room members
- Add a Strategist or some other Expert (or their backup) to the war room
- Clear related multi-sig queues
- Disable deposits and/or withdrawals as needed in the web UI
- If share price has been artificially lowered, then call
- Confirm and identify the issue
- Take immediate corrective/preventative actions to prevent (further) loss of funds
- Communicate the current situation internally and externally (as appropriate)
- Determine the root cause
- Propose workable solutions
- Implement and validate solutions
- Prioritize solutions
- Reach an agreement in your team on the best solution
- Execute the solution
- Confirm the incident has been resolved
- Assign ownership of the security disclosure report
- Disband the war room
- Conduct an immediate debrief
- Schedule a post mortem
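A checklist like the one above is easier to follow under pressure if someone can see at a glance what is still open. Below is a minimal sketch in Python of a tracker that records who ticked each box. The `Step` and `Checklist` names and the person names are hypothetical, and only the first few checklist items are included to keep the example short.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    text: str
    done_by: Optional[str] = None  # who ticked the box; None means still open


class Checklist:
    def __init__(self, steps: List[str]) -> None:
        self.steps = [Step(text) for text in steps]

    def tick(self, text: str, person: str) -> None:
        """Mark a step as done and record who completed it."""
        for step in self.steps:
            if step.text == text:
                step.done_by = person
                return
        raise KeyError(f"unknown step: {text}")

    def open_steps(self) -> List[str]:
        """Return all steps that are still open, in checklist order."""
        return [step.text for step in self.steps if step.done_by is None]

    def next_open(self) -> Optional[str]:
        """Return the first open step, or None when everything is done."""
        remaining = self.open_steps()
        return remaining[0] if remaining else None


if __name__ == "__main__":
    checklist = Checklist([
        "Create a war room with audio",
        "Assign key roles to war room members",
        "Clear related multi-sig queues",
        "Confirm and identify the issue",
    ])
    checklist.tick("Create a war room with audio", "alice")  # hypothetical person
    print(checklist.next_open())
```

The tracker deliberately does not force steps to be completed in order, since some items may happen in parallel; `next_open()` simply points at the earliest box nobody has ticked yet.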
Handling Web3 emergencies with the right support
The Yearn Finance case study can be a great starting point for establishing a structured and strong procedure for Web3 war room situations. However, in addition to assembling a team of experts and creating a checklist for them to follow, you also need to provide them with advanced tooling for troubleshooting potential issues. This can further facilitate the entire process and help you find a solution quickly before a situation gets out of hand.