Emergency Workflow

Workflow for an ASE when their account submits an emergency

An account with an Assigned Support Engineer (ASE) can submit an emergency either while you’re available or when you’re not.

In either case, you are not permanently on-call, so you are not required to take the emergency unless you’re the on-call engineer. However, you are the Support Engineer with the most context on the customer’s situation, problems, and objectives, and your involvement might save several hours of troubleshooting during the emergency.

Process for the on-call engineer

Regardless of when the emergency comes in, its DRI continues to be the on-call Support Engineer. The only change to their process is that they should notify the ASE when the ASE’s account submits an emergency. They can do this by pinging the ASE’s Slack handle in the Slack thread where the emergency is being discussed.

Once an emergency comes in, the on-call engineer creates its Slack thread and notifies the ASE of what’s going on. The on-call engineer can then continue working with the customer as they would on any other emergency.
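If the ping is ever scripted rather than typed by hand, the notification step could look roughly like the sketch below. It only builds the arguments for Slack's `chat.postMessage` Web API method (`channel`, `thread_ts`, `text`) and uses Slack's `<@USER_ID>` mention syntax; the function name, the message wording, and all IDs are hypothetical and not part of any existing tooling.

```python
def build_ase_notification(channel_id: str, thread_ts: str,
                           ase_user_id: str, summary: str) -> dict:
    """Build chat.postMessage arguments that ping the ASE in the emergency thread.

    Hypothetical helper: the name and message wording are illustrative only.
    """
    return {
        # Channel that contains the emergency thread.
        "channel": channel_id,
        # Timestamp of the thread's parent message, so the reply stays in-thread.
        "thread_ts": thread_ts,
        # <@USER_ID> is Slack's mention syntax and triggers a notification.
        "text": f"<@{ase_user_id}> your account has submitted an emergency: {summary}",
    }


# Example with placeholder IDs (not real Slack identifiers).
payload = build_ase_notification(
    channel_id="C0000000000",
    thread_ts="1700000000.000100",
    ase_user_id="U0000000000",
    summary="instance unreachable after upgrade",
)
```

The payload is built separately from sending so it can be reviewed or tested without posting to Slack; pinging the ASE manually in the thread remains the documented process.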

Process for the ASE

When available

If the on-call engineer notifies you of an emergency and you are available, or can make yourself available, work alongside the on-call engineer to resolve the ticket. This may involve any of the following:

  • Taking over as DRI for the emergency
  • Shadowing the emergency for any amount of time
  • Troubleshooting the emergency in Slack, asynchronously
  • Updating the on-call engineer with any required context

Any help you can provide will be appreciated.

When unavailable

If you come back to work and see that an emergency took place while you weren’t available, then catch up on what happened and reach out to the customer to discuss any next steps.

If a big task is planned for a time when you know you’ll be unavailable, be proactive and prepare the on-call engineer.

Be proactive

If a task that may lead to an emergency (an upcoming upgrade, migration, etc.) is planned for after hours, create a summary of what’s planned, any problems you know could arise, and suggested solutions. If you have other account-specific information that the on-call engineer might need while troubleshooting (architecture details, known problems, etc.), put it into an internal note in the existing ticket leading up to the task so the on-call engineer can see it.
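One way to keep these summaries consistent is to fill in a small template before pasting the result into the internal note. The sketch below is only an illustration: the `HandoverNote` class, its field names, and the sample values are assumptions, not an established format.

```python
from dataclasses import dataclass, field


@dataclass
class HandoverNote:
    """Hypothetical template for the pre-task summary left for the on-call engineer."""
    planned_task: str
    scheduled_for: str
    possible_problems: list[str] = field(default_factory=list)
    suggested_solutions: list[str] = field(default_factory=list)
    account_context: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the note as plain text, ready to paste into an internal note."""
        lines = [
            f"Planned task: {self.planned_task}",
            f"Scheduled for: {self.scheduled_for}",
        ]
        sections = (
            ("Possible problems", self.possible_problems),
            ("Suggested solutions", self.suggested_solutions),
            ("Account context", self.account_context),
        )
        for title, items in sections:
            lines.append(f"{title}:")
            # Show an explicit placeholder so an empty section is not mistaken for an omission.
            lines.extend(f"  - {item}" for item in (items or ["(none known)"]))
        return "\n".join(lines)


# Example with made-up values.
note = HandoverNote(
    planned_task="Version upgrade",
    scheduled_for="Saturday, after hours",
    possible_problems=["Database migration may time out"],
    suggested_solutions=["Re-run the migration with a longer timeout"],
)
text = note.render()
```

Sections left empty render as "(none known)", which tells the on-call engineer that nothing was identified rather than that the section was skipped.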

Last modified January 17, 2024: Update the link to the DRI page (5189085f)