Handoffs and Continuity

The end of your on-call shift is as important as the beginning. A good handoff means the next engineer is set up to succeed.

What is a Handoff?

A handoff is a formal transfer of on-call responsibility from you to the next person.

It happens at the boundary between shifts — usually:

Daily if you’re doing 24x5 rotations (end of each day)
At any time if an incident is ongoing when your shift ends

Why Handoffs Matter

Poor handoffs cause:

Lost context and confusion
Issues being forgotten
Repeated work
Missed critical details
Stressed out engineers

Good handoffs:

Give the next person what they need to succeed
Prevent duplication of effort
Ensure nothing falls through the cracks
Build trust in the rotation

What to Include in a Handoff

1. Current Incidents (Active Issues)

If you have any ongoing issues:

Details needed:

What is the issue? — Clear description
What’s the impact? — How many users/services affected?
What have you tried? — What investigation steps?
Where are you stuck? — What’s the blocker?
What should they do next? — Your recommendation

Example:

“High memory usage on Payment Service. I restarted it twice; memory dropped to normal, but came back up within 30 minutes. Latest deployment was 4 hours ago. I think there’s a memory leak in the new transaction handler. Look at the deployment details and consider rolling back. If it gets worse, escalate to Marcus.”

2. Recent Issues You Resolved

What did you fix during your shift?

What was the issue?
What was the root cause?
How did you fix it?
Is there follow-up work needed? (ticket, monitoring change, etc.)

Example:

“Resolved: API Rate Limit errors around 3 PM. The cron job that resets daily limits wasn’t running due to a scheduling error. I manually reset the counters. Fixed the cron schedule. No further action needed.”

3. Alerts That Fired (Noise)

Which alerts went off but weren’t real issues?

What alert fired?
Why wasn’t it a real issue?
What should be done about it?

Example:

“DiskSpaceWarning fired 6 times but was always a false alarm. We cleared old logs and the alert kept going off within an hour. We should either increase the threshold or implement log rotation. Not urgent, but worth noting.”

4. Pending Changes or Deployments

Did anything get deployed or changed during your shift?

What changed?
When?
Why? (hotfix, planned change, etc.)
What should they watch for?

Example:

“Deployed version 2.1.3 at 8 PM to fix connection pool issues. Seems stable so far but monitor Database Connection Count and PoolSize metrics closely for the next hour.”

5. Context You Have

Things they might not know:

Is the customer aware of any ongoing issues?
Is leadership tracking something?
Are there open incidents you escalated?
Was there a meeting or decision they should know about?

Example:

“The big API redesign is tentatively going out tomorrow at 10 AM. There’s an all-hands at 9:30 AM. Leadership is watching for issues, so if anything unusual happens overnight, don’t wait to escalate.”

How to Do a Handoff

Before Your Shift Ends

30 minutes before:

Start writing your handoff notes
Summarize any active issues
Gather details and links

Synchronous Handoff (Preferred)

If possible, talk to the next person:

Reach out: “Hey, I’m ending my shift now. Do you have 10 minutes for a quick handoff?”
Walk through: Go through each active issue
Answer questions: Let them ask clarifying questions
Verify understanding: Make sure they get it
Leave your contact: “You can Slack me if issues come up”

This takes 15-20 minutes but is worth it.

Asynchronous Handoff (When Needed)

If you can’t connect directly:

Write detailed notes in Slack or a shared doc
Post them in the on-call channel so the next person sees them
Send a direct message to the next person: “@NextEngineer, check the on-call channel for my handoff notes”
Be specific so they don’t have to guess or search for context

Example Handoff Message

@Sarah, I'm handing off. Here's what you need to know:

**Active Issue:**

- HighMemoryUsage on Payment Service. Restarted twice; memory is back up.
  Likely memory leak in new transaction handler from deployment 4 hours ago.
  Recommend checking the deploy; consider rolling back if it keeps happening.

**Resolved During My Shift:**

- Fixed cron job for Rate Limit reset (it wasn't running)
- Manually reset counters at 3 PM

**False Alarms:**

- DiskSpaceWarning fired 6 times. Not real, just log buildup.

**Deployments:**

- v2.1.3 went out at 8 PM for connection pool fix. Stable so far but monitor metrics closely.

**FYI:**

- Big API redesign tentatively going out tomorrow at 10 AM
- Leadership is tracking issues, so escalate early if anything seems wrong

Let me know if you have questions. I'm around for a bit if you need clarification.

What If You’re Still on an Incident?

If an issue isn’t resolved when your shift ends:

Tell the next person immediately
Don’t just disappear and leave it for them
Stay for a quick handoff conversation to explain context
Offer to help if they get stuck: “Let me know if it gets worse, I can jump in”

The next person shouldn’t inherit a mystery.

After the Handoff

The next person acknowledges they have the information
You’re officially off-call (sort of — stay near your laptop for 30 min)
If they have questions, they can reach out
You’re available if they need help (but not required to jump back in)

What Happens If You Don’t Get a Good Handoff?

If you inherit a shift with no context:

Ask Slack — “Can someone give me context on this issue?”
Check recent PRs — What changed?
Look at Incident.io — Is there a ticket or log?
Reach out to whoever was on-call — Even if they’re off, they can help quickly

This is frustrating, so don’t do this to others.

Common Handoff Mistakes to Avoid

❌ Disappearing Immediately

Don’t just go offline at the end of your shift. Give context first.

❌ Being Too Vague

“Everything was fine” is not helpful. Be specific about what you checked and what happened.

❌ Assuming They Know the Context

They might not know about ongoing projects, recent issues, or team priorities. Explain.

❌ Leaving Active Issues Unaddressed

If something is still breaking, don’t ignore it in the handoff. Explain what you’ve tried and what you recommend.

❌ Being Defensive

If someone asks questions, answer them clearly. “I didn’t debug that far” is fine. “It’s probably not the issue” is not helpful.

Communication and Culture — Learn clear communication during handoffs
Your First Shift — Receive good handoffs when you take over
Coverage and Scheduling — Understand when handoffs occur

Last modified October 19, 2025: drs-add-landing-page-tier2 (6f2cba79)

View page source - Edit this page - please contribute.

Handoffs and Continuity

What is a Handoff?

Why Handoffs Matter

What to Include in a Handoff

1. Current Incidents (Active Issues)

2. Recent Issues You Resolved

3. Alerts That Fired (Noise)

4. Pending Changes or Deployments

5. Context You Have

How to Do a Handoff

Before Your Shift Ends

Synchronous Handoff (Preferred)

Asynchronous Handoff (When Needed)

Example Handoff Message

What If You’re Still on an Incident?

After the Handoff

What Happens If You Don’t Get a Good Handoff?

Common Handoff Mistakes to Avoid

❌ Disappearing Immediately

❌ Being Too Vague

❌ Assuming They Know the Context

❌ Leaving Active Issues Unaddressed

❌ Being Defensive

Related Pages