The 20-minute demo is real. I've watched a non-developer spin up a functioning web app from a handful of sentences in the time it takes to finish a coffee. Vibe coding — prompting an AI to build something directly, without planning, architecture, or specialized tools — has gotten genuinely impressive.

So when small business owners ask "why would I pay a consultant when AI can just build it?" I understand the question. The demos are fast. The tools are free. The barrier to entry is basically a working internet connection.

Here's what I tell them.

The gap between vibe coding and a hybrid AI workflow isn't about the AI. It's about what you bring to the AI before you ever type a prompt. Experienced operators don't just use AI faster — they use it differently. They arrive with a plan. They know which tool belongs at which stage. They know what a good output looks like, and crucially, what a broken one is hiding.

That layer is invisible in the demo. And it's where everything compounds.

Key Takeaways

- McKinsey found that highly skilled developers see 50–80% productivity gains with AI, while junior developers are actually 7–10% *slower* (McKinsey & Company, 2023).

- AI-generated code produces 1.7x more issues per pull request than human-written code, with security vulnerabilities 2.74x more common (CodeRabbit, 2025).

- Only 11.9% of professional developers actively vibe code, and just 32.7% trust AI output accuracy — down from 43% the year before (Stack Overflow Developer Survey, 2025).

What Is Vibe Coding — and Why Does It Actually Work?

AI tools now write 46% of all code committed by active GitHub Copilot users, and 62% of professional developers report using AI in their daily workflow (Stack Overflow Developer Survey, 2024). Vibe coding takes this further: instead of using AI to assist development, you use it to drive development entirely — describing what you want in plain language, iterating on the output, shipping the result.

And at the right stage, it works. For prototypes. For internal tools. For MVPs that need to exist in 48 hours to test a hypothesis. For showing a stakeholder something tangible before anyone commits budget. These are real wins, and I won't pretend otherwise.

I use scaffolding tools like Google Stitch and Replit myself when I'm generating a starting point quickly. That isn't vibe coding in disguise — it's appropriate tool use for the ideation stage. The goal is speed, and AI delivers it.

The problem isn't vibe coding. The problem is treating it as a complete workflow when it's actually the beginning of one. What's missing isn't tool capability. It's the judgment layer that decides what to build, how to build it, and whether the output actually solves the problem.

Vibe coding hands that judgment to the AI. Hybrid orchestration keeps it with the operator.

According to a 2025 Stack Overflow survey of 49,000+ developers across 177 countries, 72% of professional developers say vibe coding is not part of their professional workflow, and 45% cite "AI solutions that are almost right, but not quite" as their top frustration (Stack Overflow, 2025). That "almost right" gap is exactly what experience closes.

Where the Gap Opens Up

[Image: Close-up of a laptop screen displaying colorful code in a dark-theme editor — the kind of output that looks complete but requires expert review to validate]

The productivity advantage from AI isn't evenly distributed — and the variance tracks directly with experience. McKinsey's research on generative AI in software development found that junior developers working with AI tools were 7–10% *slower* than without them, while highly skilled developers saw productivity gains of 50–80% (McKinsey & Company, 2023). Same tools. Wildly different results.

McKinsey's explanation is worth quoting closely: AI output requires engineers to critique, validate, and improve the code — "which inexperienced software engineers struggle to do."

That's the mechanism. The tool doesn't know which code is good. *You* have to know. And if you don't, the errors pass undetected until they're expensive.

[Chart: AI Productivity Impact by Developer Experience Level — junior devs with AI: −10%; average (GitHub Copilot): +56%; highly skilled devs: +75%. Source: McKinsey & Company (2023); GitHub Research / arXiv:2302.06590 (2023)]
The productivity gap widens with experience. The tool is the same; the judgment layer is not.

There's a more recent finding that sharpens this further. A 2025 METR study tracked 16 experienced open-source developers working on 246 real tasks from high-star GitHub repositories using Cursor Pro and Claude 3.5/3.7 Sonnet. The result: experienced developers were 19% *slower* with AI tools than without them — on real, complex work (METR, 2025). The remarkable part wasn't the slowdown. It was that those developers *estimated* they'd been 20% *faster*. Their perception was the exact opposite of reality.

That gap — between what the AI produces and what you can accurately evaluate — is where expertise lives. And it's invisible to vibe coders precisely because they don't have the baseline to measure against.

The three failure modes I see most often:

  1. Wrong problem definition. The project looks impressive but doesn't solve the actual bottleneck.
  2. Wrong architecture. The AI chose a technology stack that doesn't connect to the existing system, the hosting environment, or the team's maintenance capacity.
  3. Wrong output evaluation. The result looked right but contained logic errors, security gaps, or missing edge cases that only surface under real-world load.
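The third failure mode is the easiest to miss, because the code reads cleanly. Here's a hypothetical sketch of the kind of "almost right" output AI tools produce — the function, names, and scenario are invented for illustration, not taken from any client project:

```python
# Hypothetical AI-generated helper: apply a percentage discount to an order.
# It reads cleanly and passes the happy-path demo, but hides two problems.

def apply_discount(total: float, percent: float) -> float:
    """Return the order total after a percentage discount."""
    return total - total * percent / 100

# Problem 1 (logic gap): nothing stops percent > 100 or percent < 0, so a
# bad input silently produces a negative or inflated total.
# Problem 2 (missing edge case): floating-point money math accumulates
# rounding error; money should use Decimal, and inputs should be validated.

from decimal import Decimal, ROUND_HALF_UP

def apply_discount_checked(total: Decimal, percent: Decimal) -> Decimal:
    """Same calculation with the guards an experienced reviewer would add."""
    if not (Decimal("0") <= percent <= Decimal("100")):
        raise ValueError(f"discount percent out of range: {percent}")
    discounted = total * (Decimal("100") - percent) / Decimal("100")
    return discounted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Both versions return 90 for a 10% discount on a 100-dollar order, so a quick demo can't tell them apart. Only review catches the difference — which is exactly the evaluation skill the McKinsey finding points at.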

McKinsey's 2023 research established a clear experience multiplier for AI productivity: junior developers working with AI tools are 7–10% slower than without them, while highly skilled developers see 50–80% gains. The mechanism is evaluation — AI generates code that requires expert critique to validate. Without that skill, errors pass through undetected until they become expensive production failures.

What Happens Before Any Tool Gets Opened?


This is the stage that vibe coding skips entirely.

Before I open Stitch or Replit or Claude, I spend time on what I'd call the problem audit. Not the solution — the problem. What are we actually trying to fix? For whom? What does "working" look like in six months? What systems does this need to connect to? What's the existing tech debt? What would make this fail in ways the demo won't reveal?

On a recent client project — a small Colorado retailer who needed to automate parts of their customer onboarding — the initial brief was "we need a chatbot." Thirty minutes of problem-definition work revealed the real issue: their intake form was collecting the wrong data, creating hours of manual cleanup work every week downstream. The right solution was a form redesign and one email automation. No AI build. No chatbot. Total cost: a fraction of the original quote. Total impact: meaningful, measurable, and it actually stuck.

A vibe coder would have built the chatbot. It would have looked impressive in a demo. It would have made the downstream problem worse, not better.

The planning layer isn't glamorous. It doesn't show up in demo videos. But it's where I earn the fee — by solving the right problem before any AI generates a single line of code.

What makes this stage hard is that it requires domain knowledge the AI doesn't have. It requires knowing what questions to ask, which constraints matter, and what "good" looks like for this specific client in this specific context. That knowledge is accumulated over years. It can't be prompted.

Developer surveys confirm this is where AI-assisted work breaks down. 66% of developers report spending more time fixing AI code that is "almost right, but not quite," with 65% citing missing context as the primary cause of poor AI output quality (Stack Overflow, 2025). The context that's missing is the problem-definition layer — the part only an experienced operator knows to provide.

The Hybrid Stack in Practice

A hybrid AI workflow uses specialized tools at each stage — not because the tools are magic, but because each stage requires a different kind of judgment.

Stage 1: Ideation and scaffolding (Stitch, Replit)

This is where AI earns its keep. I'll use Google Stitch to generate UI concepts or Replit to scaffold a working prototype quickly. The output is rough and that's intentional. I'm not shipping this. I'm using it to see what's possible and give the client something tangible to react to within hours.

Vibe coders often stop here. The demo looks done. It isn't.

Stage 2: Refinement (Figma)

The AI-generated layout goes into Figma. This is where design judgment enters the process. Does the hierarchy communicate the actual goal? Is the information architecture working for how users actually think? Does it match the brand? Will it convert, or just impress?

Figma isn't about aesthetics. It's where I make decisions about how a user will experience the product — decisions that require knowing what "good" looks like across hundreds of past projects. The AI hasn't built anything real. It's pattern-matching from training data. The judgment about whether that pattern fits *this* client's users is mine.

[Image: A MacBook Pro with a JavaScript codebase on a dark editor theme — the assembly stage where system context determines whether AI output works in production]

Stage 3: Assembly (Claude, Cursor, Antigravity)

The refined design goes into production using coding agents. Here's where knowledge of the client's existing systems matters enormously. What CMS are they running? What's the hosting environment? What does the database schema look like? What are the performance constraints the AI doesn't know to ask about?

I know those things because I asked them in Stage 0. The AI doesn't know them unless I supply them — and I know exactly what to supply because I've built dozens of systems like this one.
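What "supplying the right things" looks like in practice: a minimal sketch of the kind of context brief assembled in Stage 0 and prepended to a coding agent's prompt. The fields and example values here are hypothetical — the point is that constraints are captured explicitly rather than left for the AI to guess:

```python
# Hypothetical Stage-0 context brief handed to a coding agent. Every field
# answers a question the AI would not think to ask on its own.
from dataclasses import dataclass

@dataclass
class ProjectContext:
    cms: str                     # what the client already runs
    hosting: str                 # where the code must deploy
    db_schema_notes: list[str]   # constraints the existing schema imposes
    perf_constraints: list[str]  # limits the demo won't reveal
    maintainers: str             # who keeps this alive after handoff

    def as_prompt_preamble(self) -> str:
        """Render the brief as plain text to prepend to a coding prompt."""
        return "\n".join([
            f"CMS: {self.cms}",
            f"Hosting: {self.hosting}",
            "Schema constraints: " + "; ".join(self.db_schema_notes),
            "Performance constraints: " + "; ".join(self.perf_constraints),
            f"Maintained by: {self.maintainers}",
        ])

# Example values are invented for illustration.
ctx = ProjectContext(
    cms="WordPress 6.x",
    hosting="shared cPanel host, no long-running processes",
    db_schema_notes=["orders table has no index on customer_email"],
    perf_constraints=["page weight under 500 KB on mobile"],
    maintainers="one part-time admin, no dev team",
)
```

Whether you store this in a dataclass, a YAML file, or a project README matters less than that it exists before the first prompt is written.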

[Chart: AI Code Issues vs. Human-Written Code, per pull request — overall issues: 1.7×; logic errors: 1.75×; security issues: 2.74×; readability issues: 3.0× (vs. 1× human baseline). Source: CodeRabbit State of AI vs. Human Code Generation Report (2025) — 470 open-source GitHub PRs analyzed]
AI-generated code requires expert review to catch what the tool misses — especially security gaps and logic errors.

The handoff moments between stages are where expertise is most concentrated. Every transition — scaffolding to design, design to code — requires a judgment call about what to keep, what to discard, and what context to carry forward. Vibe coders make those calls by feel, or skip them entirely. Experienced operators make them by design, drawing on pattern recognition built across years of real projects. That difference doesn't show up in the demo. It shows up when something goes wrong in production.

A hybrid AI workflow uses specialized tools at each project stage. Scaffolding tools (Stitch, Replit) handle speed; design tools (Figma) handle judgment and user experience; coding agents (Claude, Cursor, Antigravity) handle production assembly with full knowledge of the client's existing architecture. AI-generated code has 1.7x more issues per pull request than human-written code, with security vulnerabilities 2.74x more common (CodeRabbit, 2025). Experienced operators catch these gaps at the handoff. Vibe coders ship them.

So Why Does Experience Still Matter If AI Does the Work?

[Chart: Who's Actually Vibe Coding Professionally? Not part of professional work: 72%; actively vibe code: 11.9%; use occasionally: ~11%; emphatically don't: 5%. Source: Stack Overflow Developer Survey 2025 — 49,000+ respondents, 177 countries]
Most professional developers aren't vibe coding. Those who try it most often report frustration with outputs that are "almost right."

Developer trust in AI accuracy has dropped for the third consecutive year. In 2025, only 32.7% of developers trust AI output accuracy — down from 43% in 2024 — while 45.7% actively distrust it (Stack Overflow, 2025). That's not a technology problem. That's a signal that the gap between what AI produces and what production actually requires is real and growing.

Why is this happening now, when AI tools have never been more capable? Because capability and judgment are different things. AI gets better at generating. The question of whether that generation is *right for this situation* is still answered by the human in the loop.

So what does experience actually buy you?

It buys you the ability to define the problem before any tool is opened. It buys you the judgment to know which tool belongs at which stage. It buys you the skill to evaluate AI output critically rather than ship whatever comes back. And when something breaks in production — not if, but when — it buys you the knowledge to understand *why*, not just that it broke. That's where client hours are saved. That's where the fee is earned.

The punchline is straightforward: AI amplifies what you bring to it. Bring twenty years of experience in defining real problems, selecting the right tools, and evaluating production-grade output — and you get that experience amplified. Your work moves faster. Your iterations are more accurate. Your debugging has direction.

Bring a vague prompt, and you get amplified vagueness.

AI amplifies what the operator brings to it. McKinsey's 2023 research confirms this: highly skilled developers see 50–80% productivity gains with AI, while junior developers slow down 7–10%. A METR study of experienced open-source developers found them taking 19% longer on real complex tasks with AI tools, while estimating they were 20% faster — the perception gap alone illustrates why domain expertise is required not just to generate, but to evaluate (METR, 2025). The gap isn't the tool. It's what the operator brings to it.


Not sure whether your project needs hybrid orchestration or a simpler approach? That's actually a good question to ask before committing to either.


The Short Version

The question isn't whether AI can build things. It clearly can. The question is whether what it builds will solve your actual problem, connect to your actual systems, and hold up under real-world conditions.

Vibe coding answers the first question. Hybrid orchestration answers all three.

If you're a small business owner evaluating whether to build something with AI yourself, here's the simplest test: can you clearly define the problem you're solving, the constraints you're working within, and what success looks like in six months? If yes, you might be ready to run with AI tools. If that definition is murky, AI will make the murkiness more expensive, not less.

That's not a knock on AI. It's a knock on skipping Stage 0.
