Nine things automated accessibility tests can’t test

With Luro, I’ve found myself in the accessibility tooling space. I’m bullish on the need for automated accessibility testing to help designers and developers do a minimum viable good job, establish a baseline experience, and diagnose problems before they are giant problems. Even though automated tests cover only 20-30% of WCAG Success Criteria, Deque data suggests those criteria account for 57.38% of the issues that actually get found. In my mind, automated accessibility tests have four key tangible benefits:

“Passing” puts you in a state of being more accessible than not accessible
Objective numbers raise organizational awareness and create SMART goals
Reduces surface area of attack from robo-lawyers (Disclaimer: I am not a lawyer)
Knocks out low-hanging fruit so actual accessibility audits can be more impactful

Can you trick an automated test and score a 100 with false positives? Yes, anyone can commit a fraud. Assuming in good faith that most people aren’t trying to trick the machine and want to do a decent job, that’s where I think automated tests have a valid place. However, there are weaknesses to automated testing and I think it’s good to be candid about them. My brain doesn’t map Success Criteria well, but here’s a list of jobs I know automated tests can’t do well.

Focus states. Can you tab through your site and not get lost? An automated test tool almost certainly can’t do that.

Although...

…Microsoft’s Accessibility Insights has a cool feature where it does tab through your site to show tab order. Focus states might be scriptable if we had a tool to diff styles of the focus and blur states of document.activeElement, then look for something other than color.
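
A rough sketch of that style-diffing idea, in TypeScript. It assumes two snapshots of computed styles for the same element have already been captured, one blurred and one focused. The `StyleSnapshot` type, the list of color-only properties, and both function names are made up for illustration and are not part of any existing tool.

```typescript
// Hypothetical shape: computed style values keyed by CSS property name.
type StyleSnapshot = Record<string, string>;

// Assumption: a focus indicator that only changes these properties is "color only".
const COLOR_ONLY_PROPERTIES = [
  "color",
  "background-color",
  "border-color",
  "outline-color",
];

// Return every property whose value differs between the two snapshots.
function diffStyles(blurred: StyleSnapshot, focused: StyleSnapshot): string[] {
  const changed: string[] = [];
  const seen: Record<string, boolean> = {};
  const properties = Object.keys(blurred).concat(Object.keys(focused));
  for (const property of properties) {
    if (seen[property]) {
      continue;
    }
    seen[property] = true;
    if (blurred[property] !== focused[property]) {
      changed.push(property);
    }
  }
  return changed;
}

// True when focus changes at least one property other than a color.
function hasNonColorFocusIndicator(
  blurred: StyleSnapshot,
  focused: StyleSnapshot
): boolean {
  return diffStyles(blurred, focused).some(
    (property) => COLOR_ONLY_PROPERTIES.indexOf(property) === -1
  );
}
```

In a browser, the snapshots could be built from `getComputedStyle(document.activeElement)` before and after focusing. Deciding which properties to capture is the hard part, and this sketch leaves it out.
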
Captions and transcripts. Automated tests aren’t going to watch your videos or listen to your podcasts and do quality assurance on your (automated) captions and transcripts, which the deaf community lovingly refer to as “crap-tions”.

Although...

…if you used the <video> element and the <track> element, you could at least ensure the site made an attempt to add captions, the same fidelity as testing for alt text.
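
As a sketch of that "did they at least try" check, assuming the page's `<video>` elements and their `<track>` children have already been collected into plain objects. `VideoInfo`, `TrackInfo`, and `videosMissingCaptions` are hypothetical names. Like an alt text check, this only proves a captions file is referenced, not that the captions are any good.

```typescript
// Hypothetical summary of a <track> element.
interface TrackInfo {
  kind: string; // value of the kind attribute, e.g. "captions" or "subtitles"
  src: string; // value of the src attribute
}

// Hypothetical summary of a <video> element and its tracks.
interface VideoInfo {
  src: string;
  tracks: TrackInfo[];
}

// Return the videos that have no captions or subtitles track with a source.
function videosMissingCaptions(videos: VideoInfo[]): VideoInfo[] {
  return videos.filter(
    (video) =>
      !video.tracks.some(
        (track) =>
          (track.kind === "captions" || track.kind === "subtitles") &&
          track.src.trim() !== ""
      )
  );
}
```
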
Date pickers and typeahead autocompletes. Automated tests are pretty good at detecting if you wired up your forms and labels. But if you’re using any form control popularized in the last decade (like typeahead autocomplete), automated tests won’t be able to tell you if the ARIA attributes and announcements are all functioning as expected.

Although...

…if more people used input[type="datetime-local"] and <input list> + <datalist> (and you were able to style those 🤞) it’d be easier to support and test.
Test Interactions. Automated tests aren’t going to open all the dropdown menus and modals on your site and try to use them, so they can’t guarantee your ARIA is set up right, your light dismiss actions work, and your keyboard traps all function.

Although...

…if more people start using the new HTML popover and popovertarget attributes, this would be easier to test. Using the hidden attribute instead of leveraging lots of DOM rewrites would make code more scannable. And if you use <dialog>, the predictability/testability goes up considerably.
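
One example of what declarative markup makes statically checkable: every popovertarget should point at an element that carries the popover attribute. This is a minimal sketch over plain objects, assuming the attribute values have already been extracted from the page. `Invoker`, `PopoverCandidate`, and `brokenPopoverTargets` are hypothetical names.

```typescript
// Hypothetical summary of a button with a popovertarget attribute.
interface Invoker {
  label: string; // accessible name or visible text, for reporting
  popoverTarget: string; // value of the popovertarget attribute (an element id)
}

// Hypothetical summary of an element with an id.
interface PopoverCandidate {
  id: string;
  hasPopoverAttribute: boolean;
}

// Return invokers whose target id is missing or is not actually a popover.
function brokenPopoverTargets(
  invokers: Invoker[],
  candidates: PopoverCandidate[]
): Invoker[] {
  return invokers.filter(
    (invoker) =>
      !candidates.some(
        (candidate) =>
          candidate.id === invoker.popoverTarget &&
          candidate.hasPopoverAttribute
      )
  );
}
```

This says nothing about whether the popover behaves well once it is open. It only catches wiring mistakes before a human ever gets involved.
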
Big tap targets. WCAG 2.5.5 wants at least a 44px ⨉ 44px tap target to help people without fine motor control. Automated tests I’ve run (currently) don’t expose this issue.

Although...

…I know Lighthouse’s UX report flags small tap areas, so it seems possible to measure interactive elements and flag this. Also browser math may allow for smaller targets because they split the difference.
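
Measuring is the easy half. Here is a minimal sketch, assuming the rendered size of each interactive element has already been gathered (for example with `getBoundingClientRect()` in a browser). `TargetBox` and `undersizedTargets` are made-up names, and the sketch ignores spacing between neighboring targets entirely.

```typescript
// Hypothetical summary of one interactive element's rendered size in CSS pixels.
interface TargetBox {
  label: string;
  width: number;
  height: number;
}

// The 44 by 44 CSS pixel minimum mentioned above.
const MIN_TARGET_SIZE = 44;

// Return the targets that are smaller than the minimum in either dimension.
function undersizedTargets(
  targets: TargetBox[],
  minSize: number = MIN_TARGET_SIZE
): TargetBox[] {
  return targets.filter(
    (target) => target.width < minSize || target.height < minSize
  );
}
```
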
200%-400% zoom. Automated tests aren’t going to zoom your site in and see what broke. That’s a you job.

Although...

…a test might be able to detect use of fluid type and relative CSS. Or coming from the opposite angle, is it possible to generate a list of common CSS code smells (e.g. declaring width) and issue warnings for those?
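
A crude sketch of that code-smell warning, assuming the stylesheet is available as text. It flags lines that set width, height, or font-size to a fixed pixel value and nothing else. The regular expression, the `CssSmell` type, and `findFixedPixelSizes` are illustrative assumptions, and a real tool would want a proper CSS parser rather than line matching.

```typescript
// Hypothetical warning record for one suspicious line.
interface CssSmell {
  line: number; // 1-based line number in the stylesheet text
  declaration: string; // the offending line, trimmed
}

// Matches declarations such as "width: 320px" or "font-size: 14px".
// Properties like max-width or line-height are deliberately not matched.
const FIXED_PIXEL_SIZE =
  /(?:^|[;{\s])(?:width|height|font-size)\s*:\s*\d+(?:\.\d+)?px\b/;

// Return one warning per line that contains a fixed pixel size.
function findFixedPixelSizes(css: string): CssSmell[] {
  const smells: CssSmell[] = [];
  css.split("\n").forEach((text, index) => {
    if (FIXED_PIXEL_SIZE.test(text)) {
      smells.push({ line: index + 1, declaration: text.trim() });
    }
  });
  return smells;
}
```

A warning here is a prompt to go zoom the page, not a verdict. Plenty of fixed widths are harmless.
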
Seizure-inducing animations. WCAG 2.3 is all about animations triggering seizures. I personally don’t get seizures (I have friends that do) but over-use of animation does give me a day-ruining migraine. If you’re going to play in the seas of motion, you need to understand sea sickness.

Although...

…this isn’t super testable, but a tool could scan CSS transitions and animations to provide some guardrails or sniff out prefers-reduced-motion alternatives. Or it could detect risky animations and at least provide a warning.
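
The "sniff out prefers-reduced-motion" part could look something like this. It is a minimal sketch, assuming the CSS is available as one string. It only checks that a stylesheet using animation or transition properties also contains a prefers-reduced-motion media query somewhere. The function names are hypothetical, and it cannot judge whether an animation is actually risky.

```typescript
// True when the CSS declares any animation or transition property.
function usesMotion(css: string): boolean {
  return /(?:^|[;{\s])(?:animation|transition)(?:-[a-z-]+)?\s*:/.test(css);
}

// True when the CSS contains a media query that mentions prefers-reduced-motion.
function mentionsReducedMotion(css: string): boolean {
  return /@media[^{]*prefers-reduced-motion/.test(css);
}

// Warn when motion is used but no reduced-motion alternative exists anywhere.
function needsReducedMotionWarning(css: string): boolean {
  return usesMotion(css) && !mentionsReducedMotion(css);
}
```
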
Confusing UX. Do you have a button that controls a thing above it but it’s unannounced and unfocused? Automated tests can’t detect bad aural or braille UX, nor can they detect bad visual UX that could frustrate a user with a cognitive disability.

Although...

…with more AT support for aria-controls it might make it easier to detect if an element controls an element before itself in the DOM and raise a flag. Not perfect, but a start. More HTML elements and controls would help this as well.
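
That flag could be sketched like this, assuming the page's elements have been flattened into a list in DOM order with their id and aria-controls values. `ElementInfo` and `controlsPointingBackward` are made-up names. It treats aria-controls as a space-separated list of ids.

```typescript
// Hypothetical summary of an element, listed in DOM order.
interface ElementInfo {
  id: string; // empty string when the element has no id
  ariaControls?: string; // raw aria-controls value: space-separated ids
}

// Return the ids of elements that control something earlier in the DOM.
function controlsPointingBackward(elementsInDomOrder: ElementInfo[]): string[] {
  const flagged: string[] = [];
  const idsSeen: string[] = [];
  elementsInDomOrder.forEach((element) => {
    const controlled = (element.ariaControls || "")
      .split(/\s+/)
      .filter((id) => id !== "");
    if (controlled.some((id) => idsSeen.indexOf(id) !== -1)) {
      flagged.push(element.id);
    }
    if (element.id !== "") {
      idsSeen.push(element.id);
    }
  });
  return flagged;
}
```

A flagged element is not automatically broken. It is just a spot where a human should check what actually gets announced.
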
Ability to complete a task. One issue that’s most frustrating to blind folks I’ve talked with is when you get real far into a process like checking out and then –for no clear reason– get stopped towards the end. This is frustrating when booking a flight, but imagine it happens every day for even mundane tasks like ordering a pizza or important tasks like registering for college. Automated tests won’t do an end-to-end test by loading up a shopping cart and running a credit card unless you configure a custom system to do that.

Although...

…if you stand page-by-page tests next to each other, you’ll have a better idea of where you’re starting from.

This is a short, incomplete list. I could probably do a handful of posts like this to cover other jobs automated tests can’t test. But did you notice a trend? If you have good, declarative HTML and CSS as a foundation, the number of tasks and success criteria you can statically analyze goes up considerably. Probably not to a degree you could sign off on a site being 100% accessible (I don’t think automated tests will ever give you that confidence) but enough to tell you if you messed something obvious up. And that’s what most people need.

In my mind, automated tests are a first step in the journey to creating accessible experiences. They are also the first line of defense in detecting regressions. With the low-hanging fruit managed and out of the way, you’re able to apply more time, attention, and education towards harder problems. Let’s leverage computers at the jobs they’re good at and acknowledge where humans need to step in and right the ship.

I will wrap this post up with my standard appeal for more native elements like <tabs> to make web development even more idiot-proof (for people like me).

Follow-up: Talked about this on ShopTalk Ep585.