The 411 on focus trapping in modals

It's a (focus) trap!

Admiral Ackbar from Star Wars yelling it's a trap in complete shock

For today's entry on focus trapping, I hope to keep things simple and illustrative to highlight its importance for modals.

Light technical discussion included. Hopefully not a cure for insomnia.

Before we dive into what focus trapping is, let's talk about keyboard trapping. If a user relies on a keyboard to navigate websites with Tab and Shift plus Tab, keyboard trapping can occur in widgets that weren't built with keyboard users in mind.

Imagine you're using your mouse to move up and down a webpage. After you scroll past a certain point, prior content on the page vanishes. You're now only able to view and interact with a subset of the page, no matter where you click or how rapidly you spin your mouse wheel in righteous fury.

That's the mouse equivalent of a keyboard trapping experience.

Frustrating? A WCAG (Web Content Accessibility Guidelines) level A violation? Why not both?

Why?

Keyboard trapping happens when focus is programmatically shifted into a widget and the logic for shifting it back out is neglected for keyboard interactions. I don't often see this in modals, but when it does happen, a likely cause is the close or cancel button being built from a bare HTML <svg> element instead of a <button> element.

In the following example, <button> will receive focus when pressing Tab or Shift plus Tab but <svg> won't.

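<!DOCTYPE html>
<html lang="en">
<body>
  <!-- Tab skips this: <svg> isn't keyboard focusable by default -->
  <svg><text y="20">Close Button</text></svg>
  <!-- Tab and Shift plus Tab land here: <button> is focusable by default -->
  <button>Close Button</button>
</body>
</html>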

Unlike <button>, <svg> doesn't ordinarily receive focus through keyboard interactions. Even so, <svg> is a very popular choice for decorative icons on the modern web. If someone uses an unmodified <svg> as a modal's close button, they'd unintentionally keyboard trap their users.
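
One quick way to check whether a close control is reachable by keyboard is to look at its DOM tabIndex property. Here's a minimal sketch. The helper name isInTabOrder is made up for illustration, and it's a rough heuristic rather than a full focusability test (it ignores hidden elements, for example).

```javascript
// Rough heuristic: an element takes part in the Tab order when its
// tabIndex is 0 or higher and it isn't disabled.
// Hidden elements are NOT accounted for in this sketch.
function isInTabOrder(element) {
  return element.tabIndex >= 0 && !element.disabled;
}

// Browser console usage (the selectors are hypothetical):
//   isInTabOrder(document.querySelector('svg'));
//   isInTabOrder(document.querySelector('button'));
// In current browsers a bare <svg> should report false and an
// enabled <button> should report true.
```

If your close control reports false, wrapping the icon in a <button> is usually the simplest fix.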

Without further ado: focus trapping

Focus trapping means altering all webpage content outside a widget (in this case, our modal) so that it isn't focusable while the widget is open. This is a good thing. And yet, keyboard trapping is bad. What's the difference?

Remember our mouse equivalent of keyboard trapping: once the user scrolled and hit their slop-coded content lockout, nothing they did made a difference. Proper focus trapping is always escapable: closing the modal hands focus back to the page.

If we don't focus trap, pressing Tab or Shift plus Tab within a modal could move focus outside of the modal and back into external webpage content while the modal remains open. That's a potentially confusing or harmful outcome. Ask yourself the following to understand why:

As a non-visual user, am I supposed to know that I just exited my modal's interface via keyboard and yet it's still open?
As a non-visual user, if I keyboard navigate back to the still-open modal, will it surprise me if I expect other content in its place?
As a keyboard user, do I find your modal's nonstandard behavior disruptive compared to other applications?

Try keyboard navigating in the W3C's accessible dialog example. Notice how, once the modal is open and you press Tab or Shift plus Tab, focus can't shift outside of it to anywhere else on the webpage that was formerly interactable?
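
If you'd rather not rely on eyeballing it, a small debugging aid can report when focus leaves an open modal. This is a minimal sketch under assumptions: focusEscaped and watchForEscapes are made-up names, modal is whatever element wraps your dialog, and doc is the page's document object.

```javascript
// Returns true when the given element sits outside the modal.
// Node.contains() also returns true for the modal element itself.
function focusEscaped(modal, activeElement) {
  return !modal.contains(activeElement);
}

// Warns in the console whenever focus lands outside the modal.
// Returns a function that stops the logging again.
function watchForEscapes(modal, doc) {
  const onFocusIn = (event) => {
    if (focusEscaped(modal, event.target)) {
      console.warn('Focus left the modal:', event.target);
    }
  };
  doc.addEventListener('focusin', onFocusIn);
  return () => doc.removeEventListener('focusin', onFocusIn);
}

// Browser console usage (the selector is hypothetical):
//   const stop = watchForEscapes(document.querySelector('[role="dialog"]'), document);
//   ...press Tab and Shift plus Tab a few times, then call stop();
```

Focus moving into the browser's own interface, such as the address bar, won't show up here because that happens outside the page.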

Interestingly, in GitHub's Settings modals you can keyboard navigate into the address bar while the modal is open.

Access to user controls (the browser's own interface, such as the address bar) is the only exception to outside content that can be exposed in a focus trapped modal. However, I generally don't see accessible component libraries default to that behavior in their modals. Even the W3C's example doesn't feature it.

When making my own modals, I opt for the behavior in the W3C's implementation since it seems to be a normalized expectation for keyboard users. And yet, if my users preferred access to user controls in open modals, I'd be happy to oblige. WCAG should help people out, after all.

Wrapping up: the process

Before concluding, I'll provide an overview of each step in a robust modal focus trap. Maybe it'll help you debug or write your own code?

1. The last focused element on the webpage - almost always an HTML <button> element - is stored to be referenced later.
2. All webpage content outside the modal is programmatically hidden from focus. This means everything that'd normally be accessed with Tab or Shift plus Tab, such as HTML <input> and <a> elements, has tabIndex=-1 applied to it.
3. An array of the modal's focusable elements like <input> and <button> is created. Some modals programmatically focus their title, usually an HTML <h2> element. Since text doesn't ordinarily receive keyboard focus, tabIndex=0 is applied to the title so it's the first thing keyboard users access before the modal's controls.
4. When Tab or Shift plus Tab is pressed, some JavaScript checks against the array of the modal's focusable elements. If the user is at the last index in the array and presses Tab, their focus is programmatically shifted to the first array element. If the user is at the first index in the array and presses Shift plus Tab, their focus is programmatically shifted to the last array element.
5. When the modal's close button is clicked or selected by keyboard, focus is returned to the last focused element from step 1 and tabIndex=-1 is removed from the elements that were hidden in step 2.
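
The steps above can be sketched roughly as follows. Treat it as a sketch under assumptions rather than a definitive implementation: the names FOCUSABLE, wrapIndex, and trapFocus are made up for illustration, the selector list is deliberately short, and modal and closeButton stand for whatever elements your own markup uses. Showing and hiding the modal itself is left to the caller.

```javascript
// Elements that normally take part in the Tab order (an assumed,
// non-exhaustive list; extend it for your own markup).
const FOCUSABLE = 'a[href], button, input, select, textarea, [tabindex]';

// Step 4 as a pure function: which array index should receive focus?
// Returns null when the browser's default Tab behavior is fine.
function wrapIndex(currentIndex, length, shiftKey) {
  if (length === 0) return null;
  if (!shiftKey && currentIndex === length - 1) return 0; // Tab on last -> first
  if (shiftKey && currentIndex <= 0) return length - 1;   // Shift plus Tab on first -> last
  return null;
}

function trapFocus(modal, closeButton) {
  // Step 1: remember what had focus before the modal opened.
  const previouslyFocused = document.activeElement;

  // Step 2: take outside content out of the Tab order, saving old values.
  const outside = Array.from(document.querySelectorAll(FOCUSABLE))
    .filter((el) => !modal.contains(el));
  const savedTabindex = outside.map((el) => el.getAttribute('tabindex'));
  outside.forEach((el) => el.setAttribute('tabindex', '-1'));

  // Step 3: collect the modal's focusable elements in DOM order and
  // move focus to the first one (a title with tabindex="0" would qualify).
  const focusable = Array.from(modal.querySelectorAll(FOCUSABLE))
    .filter((el) => !el.disabled && el.getAttribute('tabindex') !== '-1');
  if (focusable.length > 0) focusable[0].focus();

  // Step 4: wrap focus at both ends of the array.
  function onKeydown(event) {
    if (event.key !== 'Tab') return;
    const current = focusable.indexOf(document.activeElement);
    const target = wrapIndex(current, focusable.length, event.shiftKey);
    if (target === null) return;
    event.preventDefault();
    focusable[target].focus();
  }
  modal.addEventListener('keydown', onKeydown);

  // Step 5: restore the saved tabindex values and hand focus back.
  function release() {
    modal.removeEventListener('keydown', onKeydown);
    outside.forEach((el, i) => {
      if (savedTabindex[i] === null) el.removeAttribute('tabindex');
      else el.setAttribute('tabindex', savedTabindex[i]);
    });
    if (previouslyFocused && previouslyFocused.focus) previouslyFocused.focus();
  }
  // A native <button> fires click for mouse and keyboard activation alike.
  closeButton.addEventListener('click', release, { once: true });
  return release;
}
```

Keeping the wrap-around decision in its own small function makes step 4 easy to check without a browser. For example, wrapIndex(2, 3, false) returns 0 and wrapIndex(0, 3, true) returns 2.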

A lot goes on behind the scenes to make stuff accessible! That's why it's a great decision to use semantic HTML elements like <button>. They tend to have inbuilt keyboard functionality you may not be aware of.

Above all else, remember: focus on focus trapping your modals.

Patrick Star from Spongebob thumbing down and booing a cheesy joke

Note to self: write a better sounding outro next time.