Does vibecoding improve user experiences?

A simple question

Do LLMs (large language models) produce web products that are inherently performant and accessible? You know. Things that build user trust alongside good security? Significant drivers of revenue?

Note: this is a technically oriented piece that critically evaluates current LLMs. If that doesn't interest you, feel free to huck it into the virtual dustbin.

I'm writing to you, mister or miss prestigious business owner. You're the CEO, CTO and licensed forklift operator managing SquidCoffee - a billion dollar franchise that's cornered the caffeinated seafood beverage market.

The "Calamocha?" ☕🦑 All you.

Your business is profitable, but you have a decision to make. One of your shareholders won't be able to afford his seventh yacht if you don't fire your entire dev team and replace them with an LLM.

Since the representative from the LLM company you're negotiating with has told you it's fluent in Esperanto and can juggle chainsaws, you're a little skeptical of its true capabilities.

Sure, the marketer regales you with tales of the utopian technological future. But if the present reality it delivers is a subpar user experience, is it worth the cost of your customers' goodwill?

And if you're somehow sociopathic enough to despise your users, is the loss of their business acceptable?

Say - in the case of a financially devastating security breach, or when an AI solution doesn't automate accessibility as advertised? That's less money for you.

Talk is cheap. Or expensive in the case of LLMs. But the point stands. We need evidence. Numbers. Data.

You mean there's a better way?

But of course! We can evaluate the effectiveness of developerless LLM output on March 3rd, 2026 by visiting the websites of AI-oriented companies and giving them a performance and accessibility audit.

After all, in a "frontend solved" world we'd expect to see astounding results.

We could then extrapolate that something much more complex - like a vibecoded SaaS (Software as a Service) solution - would be of comparable quality. If not magnitudes more so.

I'll review seven advertised services that appeared in my LinkedIn feed. Each gets an individualized, expert-identified accessibility or performance assessment. Because I care.

We'll be doing a Google Lighthouse audit. The scores have real-world value since the Core Web Vitals behind them factor into where your site appears in Google search rankings, and they indicate how many people with varying degrees of internet access can interact with your service and whether people with accessibility needs can perceive it.

A score of ninety or greater across all fronts is desirable, but we'll only include performance and accessibility scores in our list. Mind you, even services with a good accessibility score need manual auditing. Lighthouse flags obvious a11y (accessibility) issues but isn't a great indicator of whether something truly works on a keyboard or with screenreaders.

In the interest of fairness, if I received varying scores from multiple Lighthouse scans I opted for the highest one.
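For the curious, that best-of-runs rule is easy to sketch. This is a minimal illustration in TypeScript, not the tooling I used for the audit. The `AuditRun` shape and both function names are hypothetical, and I'm assuming "highest" means the highest score per category rather than the single best scan.

```typescript
// Hypothetical shape for one Lighthouse scan; scores use the 0-100 scale.
interface AuditRun {
  accessibility: number;
  performance: number;
}

// The score this article treats as desirable.
const DESIRABLE_SCORE = 90;

// Keep the highest score per category across repeated scans of one site.
// Assumption: categories are maximized independently of each other.
function bestOfRuns(runs: AuditRun[]): AuditRun {
  if (runs.length === 0) {
    throw new Error("need at least one scan");
  }
  return {
    accessibility: Math.max(...runs.map((run) => run.accessibility)),
    performance: Math.max(...runs.map((run) => run.performance)),
  };
}

// True when a category score meets the desirable threshold.
function isDesirable(score: number): boolean {
  return score >= DESIRABLE_SCORE;
}
```

Taking the best run flatters every site on the list, which is the point: nobody can say the scans were cherry-picked against them.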

Bubsy Bobcat shrugging as he asks, "what could possibly go wrong?"

⚠️ Warning: hazardous levels of slop inbound ⚠️

Base44

URL	Accessibility	Performance
Base44	"100"	46

Flaw: accessibility.

Keyboard users on displays less than 1100 pixels wide get treated to a hamburger menu that isn't visually operative in the main navigation. One that lacks focus styles for its navigation items.

They did better than a lot of vibecoded websites by using landmark roles, but then you have <li> elements incorrectly nested in <div> elements.

HTML div element erroneously containing list item children elements

LLMs really, really love <div> elements. The contractual functionality of semantic HTML with assistive JavaScript? Not so much.
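This particular mistake is mechanical enough to catch in code. Here's a rough sketch, assuming a stripped-down `MarkupNode` tree that I made up for illustration (it is not a real DOM API). It leans on the HTML rule that an <li> belongs inside a <ul>, <ol> or <menu>.

```typescript
// Hypothetical stand-in for a DOM node: just a tag name and its children.
interface MarkupNode {
  tag: string;
  children: MarkupNode[];
}

// Parents the HTML spec allows for <li>.
const LIST_PARENTS: readonly string[] = ["ul", "ol", "menu"];

// Walk the tree and collect the parent tag of every misplaced <li>.
function findOrphanedListItems(node: MarkupNode): string[] {
  const offenders: string[] = [];
  for (const child of node.children) {
    if (child.tag === "li" && LIST_PARENTS.indexOf(node.tag) === -1) {
      offenders.push(node.tag);
    }
    // Recurse so nested markup gets checked too.
    offenders.push(...findOrphanedListItems(child));
  }
  return offenders;
}
```

A <div> holding two <li> children yields two "div" entries, while the same items inside a <ul> yield none.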

Cal

URL	Accessibility	Performance
Cal	86	42

Flaw: accessibility.

The Solutions, Developer and Resources navigation items fall into a <div> souppot, making them unfocusable and invisible to keyboard users.

main navigation item "Enterprise" highlighted as the first item "Solutions" was skipped over when pressing Tab
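Why does a <div> vanish for keyboard users? Because the Tab key only stops on elements the browser considers focusable. Below is a deliberately simplified sketch of that rule. The `ElementInfo` type and `isTabbable` function are hypothetical, and real browsers handle more cases than this (contenteditable, <summary>, media controls and so on).

```typescript
// Hypothetical, simplified element description; not a real DOM node.
interface ElementInfo {
  tag: string;
  attributes: Record<string, string>;
}

// Form controls that take keyboard focus by default (simplified list).
const FOCUSABLE_CONTROLS: readonly string[] = ["button", "input", "select", "textarea"];

// Rough approximation of whether pressing Tab can land on an element.
function isTabbable(element: ElementInfo): boolean {
  const attrs = element.attributes;
  const isControl = FOCUSABLE_CONTROLS.indexOf(element.tag) !== -1;
  // Disabled form controls drop out of the tab order.
  if (isControl && "disabled" in attrs) return false;
  // A valid tabindex decides the matter: negative values opt out of Tab.
  if ("tabindex" in attrs) {
    const parsed = parseInt(attrs["tabindex"], 10);
    if (!isNaN(parsed)) return parsed >= 0;
  }
  // Links are only focusable when they actually go somewhere.
  if (element.tag === "a") return "href" in attrs;
  return isControl;
}
```

A bare <div> with a click handler returns false here. Swap it for a <button> and you get focus, Enter and Space handling for free.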

The motion sickness inducing automated animations last longer than five seconds and lack a way to toggle them.

Lots of images convey meaning only to visual users due to them being hidden with aria-hidden=true. Screenreaders will never announce that AngelList, Coinbase and other companies use the product since they'll get crickets after reading "Trusted by fast-growing companies around the world."

Most egregiously, the language picker in the <div> impersonating a <footer> reloads the entire website - meaning anyone keyboard navigating has to refocus their old content by pressing Tab thirty or more times.

And if they thought they'd get a status update reading "All Systems Operational" afterward? Surprise! It's an <img> element nested in an <a> element with an alt text of "logo."

Cassidy.ai

URL	Accessibility	Performance
Cassidy.ai	93	33

Flaw: performance.

A nearly four second first contentful paint. I saw three network requests for CSS that could likely be condensed into a lone file with extraneous rules removed. To their credit, everything's minified.

Unfortunately, their largest contentful paint was over four times the value of a good experience per Google's standards.

Cassidy.ai largest contentful paint value of 11.6 seconds
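To put a number on "over four times," here's a small sketch using Google's published Web Vitals thresholds for largest contentful paint: 2.5 seconds or less is good, and anything over 4 seconds is poor. The function and constant names are mine, not part of any Lighthouse API.

```typescript
type Rating = "good" | "needs improvement" | "poor";

// Google's Web Vitals thresholds for largest contentful paint, in seconds.
const LCP_GOOD_SECONDS = 2.5;
const LCP_POOR_SECONDS = 4.0;

// Bucket a measurement the way the Web Vitals guidance does.
function rateLcp(seconds: number): Rating {
  if (seconds <= LCP_GOOD_SECONDS) return "good";
  if (seconds <= LCP_POOR_SECONDS) return "needs improvement";
  return "poor";
}

// How many multiples of the "good" ceiling a measurement represents.
function timesOverGood(seconds: number): number {
  return seconds / LCP_GOOD_SECONDS;
}
```

Feed in the 11.6 seconds measured above and `rateLcp` returns "poor," with `timesOverGood` landing at roughly 4.6.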

Glowstep.io

URL	Accessibility	Performance
Glowstep.io	94	39

Flaw: performance.

Roughly five seconds to first contentful paint. An easy win would be to compress some of the half megabyte plus images found on the site. People on 3G and 4G internet will thank you for it.

Glowstep.io first contentful paint value of 4.8 seconds and largest contentful paint value of 13.8 seconds

Liatro

URL	Accessibility	Performance
Liatro	"100"	63

Flaw: accessibility.

Three identical <nav> elements with "Main" aria labels, making the site's main navigation incomprehensible to non-visual users.

div element with three children nav elements each with an aria label of "Main"
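Landmarks of the same type need distinct labels so a screenreader user can tell them apart in a landmark list. A check for that could look something like the sketch below. The `Landmark` shape is a hypothetical summary you'd have to extract from the page yourself.

```typescript
// Hypothetical summary of a landmark: its role plus its accessible label.
interface Landmark {
  role: string;
  label: string;
}

// Return each "role:label" pair that more than one landmark shares.
function findDuplicateLandmarks(landmarks: Landmark[]): string[] {
  const counts: Record<string, number> = {};
  for (const landmark of landmarks) {
    // Compare labels case-insensitively and ignore stray whitespace.
    const key = landmark.role + ":" + landmark.label.trim().toLowerCase();
    counts[key] = (counts[key] || 0) + 1;
  }
  return Object.keys(counts).filter((key) => counts[key] > 1);
}
```

Three navigation landmarks all labelled "Main" come back as a single `navigation:main` entry, which is one entry too many.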

OdeCloud

URL	Accessibility	Performance
OdeCloud	92	34

Flaw: accessibility.

Plenty of a11y atrocities. The main navigation inexplicably duplicates its submenus for keyboard users, doubling the effort to navigate for anyone with a motor disability.

main navigation with duplicate submenu arrow button highlighted

A single, functionless "Trek Travel" <div> cosplaying as a tablist misinforms screenreader users as to what it's even doing. Which is nothing.

div element with tablist role but no sibling tab elements or functionality with aria label: "Tabs. Open items with Enter or Space, close with Escape and navigate using the Arrow keys."

Quarterzip

URL	Accessibility	Performance
Quarterzip	86	27

Flaws: accessibility and performance.

I doubled up on this one because it's got a 9-ish second first contentful paint, which is mindbendingly awful performance. There are likely pages that don't need whole script inclusions of things like GSAP and Swiper for animations.

Quarterzip first contentful paint value of 9.6 seconds and largest contentful paint value of 23.3 seconds

Does accessibility fare better? Well, sometimes there's decent alt text on individual <img> elements.

It's all downhill from there though. Product, Solutions and Resources in the main navigation don't exist for keyboard users. Looks like we're <div>-maxxing as the kids say.

We also get nonsensical mixes of aria attributes and semantic <h1>-<h6> elements.

h5 element with an aria-level attribute value of 2
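That mismatch is another one a linter could flag. As a minimal sketch, assuming you already have the tag name and any `aria-level` value in hand (the `headingLevelConflict` helper is hypothetical):

```typescript
// Describe the conflict, or return null when the tag and aria-level agree.
function headingLevelConflict(tag: string, ariaLevel?: number): string | null {
  // Only native heading tags carry an implicit level.
  const match = /^h([1-6])$/i.exec(tag);
  if (match === null || ariaLevel === undefined) return null;
  const nativeLevel = Number(match[1]);
  if (nativeLevel === ariaLevel) return null;
  return `<${tag.toLowerCase()}> implies level ${nativeLevel} but aria-level says ${ariaLevel}`;
}
```

Calling it with `"h5"` and `2` reports that the tag implies level 5 while the attribute claims level 2. If the heading is really a second-level heading, an <h2> says so without the contradiction.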

The results are in

Squidward Tentacles from Spongebob with a shocked expression and shrunken nose

Bleak, isn't it?

Every one of these AI-first websites received a failing performance grade. Several earned aesthetically "good" accessibility scores yet did not actually function for keyboard or screenreader users.

Remember: this represents unsupervised LLM output for a "simple" and "solved" problem - frontend code. It's being produced en masse.

And the same sellers creating websites broken for disabled users and people on slower internet connections are now promising monumentally more complex SaaS-scale products of "comparable quality."

Well, SquidCoffee?

Based on the evidence, I'd say you're ready to make your decision. Aren't you, mister or miss prestigious SquidCoffee CEO, CTO and part-time interpretative dance instructor? ☕🦑💃

You're going to make a level-headed, empirical assessment of the financial and productivity gains of LLMs, aren't you?

And in the process, you plan to acknowledge that experts are required to manage the technology in a way that's conducive to amazing user experiences?

And as a show of respect for your developers - who are all decrying the marketing meme statement "code was never the bottleneck" - you'll affirm that their competency is vital to ensuring customers are respected and valued?

Because we're both in agreement that the end goal of automation is to produce technically impressive and universally beneficial software at scale, meeting the needs of-

...what's that?

You just laid off the dev team and told the unpaid intern to orchestrate agents to replace them?

You signed on with OpenFraud.ai? A subsidiary of Knotta-s.cam?

I'll see myself out.