
AI Governance in the Vibe Coding Era: When Shadow IT Becomes Shadow AI

When AI lets marketing, HR, and sales each run their own backend, Shadow IT becomes Shadow AI. From the phenomenon and its causes to a governance framework you can actually use: the new challenge for AI governance in the vibe coding era.

Ling Wu

In May 2026, the Israeli security firm RedAccess scanned 380,000 vibe-coded apps sitting publicly on the internet, around 5,000 of which were holding real corporate secrets: a Brazilian bank's financial data, a UK clinical trial's patient records, a hospital's doctor-patient conversations and staff schedules, a school's lesson recordings and student data. These apps run on platforms like Lovable, Replit, Base44, and Netlify. According to Axios, most of them weren't hacked. The platforms default to public, nobody flipped them to private, and search engines indexed them straight away.

The people who built these apps mostly don't know they leaked anything. This is what Shadow IT looks like in the AI era. Hold that term; we'll come back to it.


Zoom in on a more everyday version.

Monday morning, 10 a.m. Sam, the IT director, picks up the phone.

The company's membership campaign page is down. Customers paid but never got a confirmation email. Sam opens his dashboard to grep the logs, but the service isn't there. After asking around, he finds out: marketing built it last week with Cursor, and it's running on someone's personal cloud account.

That phone call is the moment Sam's job changes.

This isn't a one-off. Shadow IT (employees bypassing IT to adopt their own tools or build their own systems) has always been around. But when AI tools let marketing, sales, and HR each ship a production app on their own, its scale jumps from "an employee quietly installs a SaaS" to "an employee quietly runs a backend," and the traditional IT governance toolkit can't keep up with that magnitude. This piece is about that new problem: when Shadow IT becomes Shadow AI, how the enterprise security boundary shifts with it, and how the front-line IT director (also called MIS or IT manager at SMBs and traditional firms, or CIO depending on company size) should respond.

First, let's mark the scope. AI governance is big: model risk, AI usage policy, human-in-the-loop, and accountability are all in it. This piece focuses on one slice, and the one surfacing fastest: how the enterprise security boundary moves once vibe coding turns Shadow IT into Shadow AI. It runs from the phenomenon and its causes through to a governance framework you can actually follow. The other slices are for another piece.

TL;DR

  • Framework: three layers of AI governance, visibility → boundaries → audit. Order matters; skip ahead and the bottom falls out.
  • Thesis: sandbox is the new enterprise perimeter. Don't ban vibe coding; pull it into a place IT can see.
  • Playbook: inventory what's running → open an approved workspace → set an audit cadence.

Shadow AI in the Vibe Coding Era: Why It's 10x Worse Than Old Shadow IT

"Shadow IT" isn't a new term; Gartner turned it into a named concept back in the 2010s. To IT it's a governance blind spot; to employees it's often the reasonable choice to "just get the work done." Both sides have a point. Over time, company data, processes, and permissions end up scattered where IT can't see them.

Ten years ago, shadow IT meant employees quietly using Notion or expensing their own Slack. The IT director's job was reviewing SaaS purchase orders, writing acceptable use policies, running periodic inventories. The pain was "data lives in the wrong places," but at least those places were well-known SaaS, backed by SOC 2 reports, SLAs, and a support line.

Shadow AI is the AI-era version of shadow IT, and the underlying problem is identical: employees putting sensitive data on a third party IT never vetted. What actually changed is the nature of that third party. It used to be off-the-shelf SaaS; now it's a vibe coding platform, or a backend and database the employee built themselves. Swap the container from "an audited, well-known service" to "an ungoverned self-built system" and the risk is no longer the same magnitude. Lay it side by side:

| Dimension | Shadow IT (past) | Shadow AI (now) |
| --- | --- | --- |
| Where sensitive data lives | Off-the-shelf SaaS (Notion, Slack) | A vibe coding platform, or an employee-built backend + database |
| What's behind that service | SOC 2, SLA, a support line | No governance record, nobody to take it over |
| Breach blast radius | One account, one dataset | Backend + database + API, the whole thing spills out |
| IT's old playbook | Review purchase orders, write usage policy, run inventories | The old toolkit can't handle this magnitude |
| Pain point | Data lives in the wrong places | Data, backend, and permissions all scattered, and you can't name them when it breaks |

The most important difference is controllability. Before, if data accidentally landed on Notion, permissions were left open, and an outsider clicked into the page, you'd close the permissions, delete the page, and more or less stop the bleeding. But Shadow AI wires sensitive data into a live system: one exposed endpoint and an attacker can follow the thread, pulling out the chained database, API, and other internal data behind it. Deleting a page won't get that back. Those 5,000 leaked apps from the opening were exactly that: defaulted to public, swept up wholesale by search engines.

This is more common than you'd think. Stack a few numbers together:

  • 40%+ of enterprises will hit a shadow AI security or compliance incident by 2030, per Gartner.
  • 71% of employees have used unapproved AI tools, and 51% use them every week, per a Microsoft UK survey from October 2025.
  • Only about half of frontline employees regularly use AI at all. Having access to GenAI is one thing (70% of employees in mature industries, under half in immature ones); actually using it once you have it is another. BCG's September 2025 report The Widening AI Value Gap (1,250 respondents) calls the gap between "can access" and "actually uses" the "silicon ceiling."

Even if only half the workforce regularly uses AI, that half is already accumulating shadow AI that IT can't see.

Vibe coding changed the game: what employees do leveled up a notch, from "install a SaaS" to "run a backend + database + API of their own." AI tools (Claude Code, Cursor, GitHub Copilot) spread the ability to write code from engineers to everyone, but the responsibility for running production didn't spread with it. Being able to build something doesn't mean you know that:

  • Secrets shouldn't live in the repo.
  • Backups don't run by themselves.
  • A 10x traffic spike has to be planned for before it arrives.

The result is a company quietly accumulating ten to thirty production services with zero governance record. They collect customer data, send email, and hit the internal ERP, and the IT director can't even name them. This tension between tool democratization and security is the underlying friction vibe coding governance can't get around.

Why the Old Security Perimeter Can't Hold Vibe-Coded Output

Shadow AI grew this big because three forces hit at once, and each one leaves the traditional IT perimeter (firewall, IDP) unable to contain what employees build.

The deployment threshold dropped to zero, and visibility went with it

Lovable, Bolt, v0, Replit Agent, and Claude Code all make "write code and it runs in the platform sandbox" the default action: no Dockerfile, no server setup, no account request. For employees this is a huge win; for the IT director it's total visibility collapse: he doesn't know how many production apps are running without process, whose personal accounts they run on, or whose card pays for them.

To build something usable, AI has to reach real company data

For vibe coding output to go from prototype to genuinely usable, the AI needs to be fed real context: customer lists, order records, solutions to past problems, the internal wiki. The more context you give, the more the output fits the scenario. But giving it data means facing redaction, privacy, and compliance liability; withholding it leaves the employee trying to make bricks without straw, with output stuck at a toy version that "looks like it works but doesn't fit the scenario."

Agents and employee-run systems bypass the old identity perimeter

Anthropic's Computer Use (October 2024), OpenAI's Operator, Devin, and others let agents move a keyboard and mouse to write and deploy code themselves; employees' AI apps also frequently spin up a system that never passed through an IDP (identity provider). The SaaS era relied on IDPs to manage "who can log into which system," and now that's bypassed by both a new actor (the agent) and a new surface (the production apps employees launch themselves).

Deloitte's State of AI in the Enterprise 2026 report notes that only 1 in 5 companies (20%) has a mature governance model for autonomous AI agents, and points directly at the direction: "Effective governance integrates with existing risk and oversight structures, not parallel 'shadow' functions." In practice, that means extending the risk structures you already have rather than standing up a separate function for AI.

These three forces rewrote the definition of the "enterprise security perimeter." Where the new boundary should sit, and why the answer points to the "sandbox," gets its own section later (hold that term too).

Three Signs Shadow AI Is Out of Control: Visibility, Boundary, Audit

When the IT director starts saying these three sentences out loud, the shadow AI problem has reached the point where you need to act:

Signal 1: The Visibility Crisis

"I don't know how many apps the company is running right now."

Employee AI apps live across different accounts, different payment cards, different PaaS providers. IT has no inventory. Every time something breaks, the IT director is the first one called, and he can't even name the thing.

Signal 2: The Boundary Crisis

"He's leaving next week, and I don't know what company data the stuff he built is using."

An employee pasted the customer database connection string into the AI, hardcoded it in the app, and runs the whole thing on a personal account. When that person leaves, the company has no mechanism to take over the system, and no way to revoke their access.

Signal 3: The Audit Crisis

"We have an audit at year-end, but these things aren't in my control."

Customers demand SOC 2, government tenders require ISO, the industry wants a security review. Auditors ask "Access logs? Backups? Change history? Who can modify it, who can read it?" The IT director has no answers.

The Enterprise AI Governance Dilemma: IT Caught Between Adoption and Risk

Start with a baseline number. McKinsey's 2025 State of AI global survey (1,993 respondents across 105 countries) found that 88% of organizations already use AI regularly in at least one business function, but only 18% have built an enterprise-wide AI governance council. In other words, nearly nine in ten organizations already use AI, while fewer than one in five has a governance structure to match. The IT director's dilemma grows out of that gap.

To understand the IT director's bind, look at the two layers of pressure above him first.

The view from the corner office: AI-first anxiety

Across 2025-2026, just about every CEO is AI-anxious: peers are doing it, competitors are doing it, the press is chasing it, employees ask about it. The instinctive response is top-down: "We're AI-first next year." "Every department ships an AI app." "Departments that can't use AI get reassessed." The point is to make the company look like it's moving, so it doesn't get washed out. But there's a chasm between pushing top-down and actually deploying.

Duolingo is the canonical case of that chasm going public. On April 28, 2025, CEO Luis von Ahn announced an "AI-first" strategy on LinkedIn: phasing out contractors AI could replace, putting "uses AI" into hiring criteria and performance reviews. TechCrunch reported that two days after the announcement the company shipped 148 AI-generated courses, twelve years of curriculum work compressed into less than one. The backlash erupted fast: users canceled subscriptions, deleted the app, flooded TikTok with negative reviews. Duolingo deleted every post on TikTok and Instagram and went quiet for a while. About four weeks later von Ahn publicly walked back his earlier comments on LinkedIn; Fortune quoted him directly: "I do not see AI as replacing what our employees do," and he stated that hiring would continue at its prior pace. On the Q2 2025 earnings call he also conceded the AI comments were part of why DAU growth came in at the low end of guidance (40% YoY, against 60% a year earlier).

This pattern repeats at a lot of companies pushing AI top-down. Fortune reported that Klarna replaced 700 customer service agents with AI, then rehired humans six months later: the same script.

The line I keep coming back to:

Adoption isn't deployment, and deployment isn't utilization. — Ling Wu

An employee opening Cursor every day (even asking it what's for lunch) doesn't mean the company actually wired AI into its workflow; and even if they do build a few small systems with it, most companies have no yardstick to measure whether AI is actually lifting productivity. Both gaps in the middle, "adoption vs deployment" and "deployment vs utilization," are where companies stall.

The view from the middle: risk vs flexibility

Inside this sandwich, the IT director has to answer the CEO's "we want AI-first" demand (push adoption) while managing the governance risk of vibe coding spreading (hold the line). That's enable + control at the same time: too much of either side and it flips.

The two extreme reactions are both versions of failing to hold both sides:

Reaction 1: Ban It

"From now on, employees can't build AI apps on their own: everything goes through IT's request process." The result is forcing everyone onto tools that tie their hands: IT's approved set usually lags well behind what builders use outside, and can't do what employees actually want. Employees won't stop just because IT objects (the CEO just said "use AI to boost productivity"), so they keep building, somewhere more hidden. The ban amplifies the visibility crisis into an underground crisis, and makes the IT director look like "the bottleneck blocking AI" in the CEO's eyes.

Reaction 2: Ignore It

"The audit isn't here yet, so don't worry about it." Leave it alone, and the problem grows on its own: colleagues unfamiliar with frontend frameworks and database choices each pick their own tools. Three people might end up on three different frameworks and four different databases; ten systems spring up at once, and the company is left holding a permutation of tech stacks waiting for someone to take over. When the auditor actually shows up, the IT director isn't just facing an audit nightmare; the tech-stack permutations from ignoring it all blow up at the same time.

The sandwich pain, said plainly: the CEO only watches whether adoption is moving, the employee only cares whether their own thing runs, while the risk (data leaks, compliance holes, audit failure) and the operational debt (who fixes it, who backs it up, who swaps tools) all land on one IT director, and nobody cares what he's carrying.

The right direction is the third path: pull the AI output into an environment IT can see, without stopping employees from building. Put another way: the sandbox becomes the new enterprise security perimeter ("sandbox is the new perimeter").

From this angle, the IT director does enable (employees keep using AI) + control (IT can see and manage it) at once: answering the CEO's top-down push, solving the governance risk, and not interrupting employees' daily workflow. What a sandbox actually is, and why it only became the default answer for the enterprise perimeter in the last year or two, gets its own section later.

A Vibe Coding Governance Framework: Visibility, Boundaries, Audit

Vibe coding governance is a serious enterprise-level issue, and the industry already has a few standards that address it:

  • NIST Cybersecurity Framework 2.0 (2024), six functions: Govern, Identify, Protect, Detect, Respond, Recover. Covers the full cyber governance lifecycle from risk identification to incident response.
  • NIST AI Risk Management Framework (NIST AI 100-1, 2023), four functions: Govern, Map, Measure, Manage. The baseline for governing AI risk as enterprise risk.
  • ISO/IEC 42001:2023, the first international AI management system standard, structured to align with existing management system standards such as ISO 9001 (quality) and ISO/IEC 27001 (information security).

How these standards align, where SMBs should start landing them, and the implementation tradeoffs (how AI governance terms like quality gate, operational debt, technical debt, and paved path fit into your own process) get a full breakdown in a separate article.

This piece distills those standards into three layers, a framework that lets you quickly spot problems in vibe coding scenarios: visibility, boundaries, audit. Order matters; each layer solves a different problem:

| Layer | Question it answers | Concrete artifact | Failure mode if skipped |
| --- | --- | --- | --- |
| 1. Visibility | What's running right now? | One workspace, one dashboard, one billing entry point | IT can't name the service when it breaks |
| 2. Boundaries | What can each thing touch? | Environment isolation, shared secret manager, seat-revocation automation | Data follows the employee out the door |
| 3. Audit | Who did what, when? | Audit logs, resource ownership, deployment history | "We don't know" is the auditor's finding |

Layer 1: Visibility

The most basic job is to pull all AI apps into one place to look at. One billing entry point, one dashboard for every deployment, one audit log of who changed what.

What this means concretely: give employees an approved environment (a company-paid workspace, a company-managed account) and encourage their AI apps to deploy here: employees keep vibe coding, IT gets to see. Without visibility, the next two layers can't happen.

The extreme version of visibility-first governance is GitLab's public handbook: every internal policy, every tool decision, every operating procedure published in a public Git repo readable by anyone on the internet. Most companies will never go that far, and don't need to. But the underlying principle holds in the inverse: ungoverned shadow systems can't survive in an environment built around "everything is visible by default." The IT director's version is smaller in scope (one workspace, one dashboard), but the discipline is the same. Have a visible environment first, and the enable that follows actually means something.

Here's what it looks like in practice: Sam opens the company's IT workspace and the first thing he sees is twelve services side by side: marketing-membership-page (built by marketing, env var edited last week by Chen), hr-interview-scheduler (built by HR, running in Asia, used 2.3 GB this month), sales-customer-cleanup (built by sales, no deploy in two weeks)... each with an owner, last deploy, region, and cost. Sam doesn't need to ask anyone; the dashboard directly answers "what's running right now."
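As a rough sketch of the data behind a dashboard like that, the inventory can be modeled as one record per service plus a check that flags anything unowned or idle. Everything here is an assumption for illustration: the `Service` fields, the `stale_or_unowned` helper, the 14-day threshold, and the sample owners, regions, dates, and costs are hypothetical and not taken from any particular platform.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Service:
    """One row of a hypothetical workspace inventory."""
    name: str
    team: str
    owner: Optional[str]      # None means nobody has claimed the service
    region: str
    last_deploy: date
    monthly_cost_usd: float


def stale_or_unowned(services, today, max_idle_days=14):
    """Return names of services with no owner or no deploy in max_idle_days."""
    flagged = []
    for svc in services:
        idle_days = (today - svc.last_deploy).days
        if svc.owner is None or idle_days >= max_idle_days:
            flagged.append(svc.name)
    return flagged


# Made-up sample data, loosely modeled on the services named in the text.
inventory = [
    Service("marketing-membership-page", "marketing", "marketing-lead",
            "us-east", date(2026, 6, 1), 12.0),
    Service("hr-interview-scheduler", "hr", "hr-lead",
            "asia", date(2026, 6, 3), 8.5),
    Service("sales-customer-cleanup", "sales", None,
            "us-east", date(2026, 5, 20), 3.0),
]

print(stale_or_unowned(inventory, today=date(2026, 6, 5)))
```

The point of the sketch is the shape of the record, not the tooling: once every service has an owner, a region, a last deploy, and a cost in one place, "what's running right now" becomes a query instead of a round of phone calls.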

Layer 2: Boundaries

Visibility solves "I don't know what's running"; boundaries solve "I don't know what it's using."

What this means concretely is three things:

  • Environment isolation: dev / staging / production each with independent permissions, so an experiment can't accidentally hit production data.
  • Shared secret manager: credentials live in one managed place; the AI sees the reference but can't reach the plaintext.
  • Seat-revocation automation: when an employee leaves, every access path closes in one click, no playing whack-a-mole across five PaaS dashboards.

Employees' AI apps can connect to internal data, but the credentials never get read into a prompt, and access cuts off automatically when the employee leaves.
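One way to picture "the AI sees the reference but can't reach the plaintext" is a config that only contains secret references, resolved by the platform at run time. This is a minimal sketch under stated assumptions: the `secret://NAME` syntax and the `resolve` helper are invented for illustration, and `os.environ` stands in for a real secret manager.

```python
import os
import re

# Hypothetical reference syntax: "secret://NAME". Generated code and prompts
# only ever contain the reference; the plaintext is looked up at run time.
SECRET_REF = re.compile(r"secret://([A-Z0-9_]+)")


def resolve(value, store):
    """Swap a secret reference for its plaintext; pass other values through."""
    match = SECRET_REF.fullmatch(value)
    if match is None:
        return value
    name = match.group(1)
    if name not in store:
        raise KeyError(f"secret {name!r} is not defined in the managed store")
    return store[name]


# What the employee (and the AI assistant) sees in the app config:
app_config = {"DATABASE_URL": "secret://CUSTOMER_DB_URL", "REGION": "asia"}

# What the platform injects at deploy time. The environment is a stand-in for
# a managed secret store, and the fallback URL is a placeholder value.
store = dict(os.environ)
store.setdefault("CUSTOMER_DB_URL", "postgres://example.invalid/customers")

runtime_config = {key: resolve(val, store) for key, val in app_config.items()}
print(sorted(runtime_config))
```

Because the connection string never appears in the repo or the prompt, rotating it or revoking it is a change in one place rather than a hunt through generated code.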

Here's what it looks like in practice: Chen, who left last week, once built customer-feedback-collector, running on his own GitHub OAuth account. Sam clicks "remove collaborator" in the workspace: the service's prod env automatically switches to "unclaimed," all access paths are revoked, and billing reverts to the company's main account. HR doesn't have to chase Chen for anything.
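The offboarding flow above can be sketched as a single function that walks the inventory, strips the departing user's access, and marks anything they owned as unclaimed. The `Service` record, the `offboard` function, and the account names are hypothetical; a real workspace would perform these steps through its own API.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    name: str
    owner: str                                   # "unclaimed" once the owner leaves
    collaborators: set = field(default_factory=set)
    billing_account: str = "company-main"


def offboard(user, services, company_account="company-main"):
    """Remove `user` from every service and return the names that changed."""
    touched = []
    for svc in services:
        changed = False
        if user in svc.collaborators:
            svc.collaborators.discard(user)      # revoke the access path
            changed = True
        if svc.owner == user:
            svc.owner = "unclaimed"              # surface it for reassignment
            svc.billing_account = company_account
            changed = True
        if changed:
            touched.append(svc.name)
    return touched


# Made-up sample data for illustration.
services = [
    Service("customer-feedback-collector", "chen", {"chen"}, "chen-personal"),
    Service("hr-interview-scheduler", "lee", {"lee", "chen"}),
]
print(offboard("chen", services))
```

Returning the list of touched services matters as much as the revocation itself: it gives IT a record of what needs a new owner.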

Layer 3: Audit

Audit logs tell you who deployed what, who changed which env var, who looked at which log; resource ownership gives "whose service is this" an answer; deployment history means employees don't have to keep change records themselves.

This layer only covers the governance surface, not compliance certification itself. When customers need SOC 2 / ISO 27001 / regional compliance sign-off, you have to map the platform's built-in tooling onto an audit framework. That usually takes a consultant or a governance review; it isn't something a PaaS can take responsibility for.

Here's a concrete scenario: an employee builds their own workflow (a small tool written with Claude Code + n8n hitting the internal wiki API) to automatically draft customer-facing responses. To make the AI answer more accurately, they stuff internal wiki data into the prompt, but the internal wiki data isn't fully redacted: personnel lists, deal amounts, and past complaint records are all in there. The AI won't proactively distinguish what's public from what isn't, so some private data gets exposed inside the "responses meant for customers."

Set aside the accountability and legal questions in this scenario: the real core of the audit layer is process design, whether there's a checkpoint at the node where AI touches sensitive data. Producing logs when the auditor arrives is the floor; process design is what's actually being defended. In the AI era, nobody knows when the next public governance incident will hit, and gradually landing all three governance layers is what turns this kind of risk from a "disaster" into a "small incident that can be caught early."
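A checkpoint at the node where AI touches sensitive data can be as small as a redaction step that runs before any internal text enters a prompt, and that reports what it removed so the audit log has something to record. The sketch below is illustrative only: the two patterns, the `redact` and `build_prompt` helpers, and the sample text are assumptions, and real data would need rules tuned to it (names, complaint IDs) plus human review of what the patterns miss.

```python
import re

# Illustrative patterns only; these catch email addresses and dollar amounts.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "AMOUNT": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
}


def redact(text):
    """Replace matches with placeholders and count how many were replaced."""
    counts = {}
    for label, pattern in PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        counts[label] = n
    return text, counts


def build_prompt(question, wiki_snippet):
    """Checkpoint: only redacted context is allowed into the prompt."""
    clean, counts = redact(wiki_snippet)
    prompt = f"Context:\n{clean}\n\nQuestion: {question}"
    return prompt, counts   # counts can be written to the audit log


snippet = "Contact amy@example.com about the $12,500.00 renewal."
prompt, counts = build_prompt("How do renewals work?", snippet)
print(prompt)
print(counts)
```

Pattern matching like this is a floor, not a guarantee. Its value is that the checkpoint exists in the process and leaves a trace every time it fires.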

One concrete way to do it is pushing the first checkpoint over to employees: before deploy, run a secret scan, a sensitive-data leak check, and dependency alerts, so employees fix it themselves and what IT inherits is a product that already cleared the first gate. The market already has plenty of services that automate this gate. How to set the checkpoints and how to build human-in-the-loop is the topic of another article.
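To make the secret-scan part of that first gate concrete, here is a minimal sketch of a line-by-line scanner. The three rules and the `scan_text` helper are assumptions chosen for illustration, the sample key is the placeholder value AWS uses in its own documentation, and dedicated scanners ship far larger rule sets than this.

```python
import re

# A few illustrative rules, not a complete rule set.
RULES = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "url-with-password": re.compile(r"\b[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@]+@"),
}


def scan_text(text):
    """Return (line_number, rule_name) for every line that matches a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append((lineno, rule))
    return findings


# Made-up file contents with placeholder credentials.
sample = "\n".join([
    'API_KEY = "AKIAIOSFODNN7EXAMPLE"',
    'REGION = "asia"',
    'DATABASE_URL = "postgres://app:hunter2@db.example.invalid/customers"',
])

for lineno, rule in scan_text(sample):
    print(f"line {lineno}: {rule}")
```

Run as a pre-deploy step, a non-empty result would block the deploy and send the finding back to the employee, which is the "fix it themselves" loop the paragraph describes.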

Put visibility → boundaries → audit together and they solve problems at different levels; but all three share one prerequisite: you need a container IT can see and manage for them to land in. The Enterprise AI Governance Dilemma above already named that answer: the sandbox becomes the new enterprise security perimeter. The next section unpacks why this answer only surfaced in the last year or two, and what changed.

Shadow IT's New Security Perimeter: the Sandbox

We've been treating "sandbox" as the answer this whole time. Here's what it actually is.

In one sentence

A sandbox is a fenced playground: employees build freely inside (write code, deploy, wire in company data, whatever they want), but the fence is IT's. IT can see what's running, control who gets in, and nothing escapes the boundary.

The concept itself isn't new: browsers have sandboxes (web pages can't freely touch your local files), iOS has sandboxes (each app can only touch its own data), and language runtimes have their own isolation designs too, all at least a decade old. What's new is that in the last year or two it got pushed to the center of IT governance discussion and became the default answer for the enterprise security perimeter, because the three forces above (deployment threshold to zero, AI needing real data, agents bypassing the IDP) punched through the old line.

The sandbox is the common answer to all three forces: employees still get zero-friction deployment, but it runs inside a sandbox IT can see; the AI only sees the prepared data slice inside the sandbox and sensitive fields get handled before they leave it, so IT doesn't have to review every prompt one by one; and the sandbox catches both agent actions and employee systems as a single execution boundary. Companies specializing in agent sandboxes (E2B, Daytona, CodeSandbox SDK, Modal, Beam) sprang up fast over the past 12 months precisely because they saw this gap.

Where Does the Sandbox Itself Run?

Agreeing the sandbox is the answer is only step one. Where the sandbox itself runs, who manages it, and who fixes it when it breaks is the multiple-choice question every IT director is evaluating in 2026. There are three options: use the platform's built-in sandbox, fully self-host, or use a managed sandbox (the PaaS model). Each path has tradeoffs:

  • Use the platform's built-in sandbox (the default for Lovable / Bolt / v0): simple, but the data and artifacts are bound to the vendor, IT can't get an inventory, and the bill is often still on an employee's personal card.
  • Fully self-host: technically controllable, but you have to staff an infra team to maintain it. Employees vibe code to dodge IT process, while IT runs an extra system just to corral vibe coding in a sandbox, so it's not necessarily cheaper.
  • Managed sandbox (the PaaS model): pick the cloud, region, and data-sovereignty scope you want, but hand operations to the platform.

This choice can't be deferred: employee vibe coding won't wait for you.

Pull the three forces together and the definition of the enterprise security perimeter has been shifting all along: before the 2000s it was the firewall era, guarding "who can get onto the company network"; from 2010 to 2024 it became IDP / SaaS, guarding "who can log into which system"; from 2024 onward, in the vibe coding era, what you have to guard becomes "which sandbox the employee's, and the agent's, output runs in."

This sandbox draws an IT-visible boundary around employee freedom. Employees (and the agents employees launch) keep vibe coding inside it; IT doesn't review code, doesn't ban tools, doesn't redo anyone's work; it just maintains the sandbox itself: where it is, who can get in, what's running, how much it's using.

For the IT director this angle is liberating: his role flips from "the person who bans employees from doing X" to "the person who gives employees a place to do X": the former makes him a blocker, the latter an enabler. Same IT director; the difference is only which line he chooses to defend.

Three Steps for IT to Operationalize Vibe Coding Governance

If you're the IT director and your company is just entering the vibe coding stage:

Step 1: Run an inventory. Ask each department, "have you built anything with AI recently that's running?" The answer usually shocks people. Write it down; this is your visibility baseline.

Step 2: Open an approved workspace: centralized deployment, shared billing, audit logs, and seat management all at once. This ecosystem roughly splits into three layers, with common options:

  • Identity & access layer: Okta, Entra ID (formerly Azure AD), Auth0, JumpCloud, GitHub Enterprise SSO, managing "who can log into which system."
  • Deploy + billing + audit integration layer: Fly.io organizations, Heroku Teams, Render Teams, Vercel Teams, Zeabur Team Plan, managing "where things run, who can see them, who pays."
  • Risk & compliance automation layer: Vanta, Drata, Secureframe, Datadog Cloud Security, managing "audit prep and anomaly alerting."

A full stack usually mixes two or three layers. The choice isn't the point: getting employees' existing AI apps to move in, and new ones to default to running here, is.

Step 3: Set an audit cadence. Even if you don't have to pass a formal ISO or SOC 2, run a quarterly audit-log review and confirm every production service has an owner. This habit will save your life when a real audit shows up.
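A quarterly review along those lines could be sketched as one function over two exports: the service list and the audit log. The dict shapes, the `env_var.update` action name, the 90-day window, and the sample records are all hypothetical; substitute whatever your workspace actually exports.

```python
from datetime import datetime, timedelta

# Hypothetical export formats: one dict per service and per audit event.
services = [
    {"name": "marketing-membership-page", "env": "production", "owner": "marketing"},
    {"name": "sales-customer-cleanup", "env": "production", "owner": None},
    {"name": "hr-scheduler-test", "env": "staging", "owner": None},
]
events = [
    {"at": datetime(2026, 4, 2), "actor": "chen", "action": "env_var.update",
     "service": "marketing-membership-page"},
    {"at": datetime(2026, 1, 15), "actor": "lee", "action": "deploy",
     "service": "sales-customer-cleanup"},
]


def quarterly_review(services, events, now, days=90):
    """Summarize ownership gaps and recent env var changes for the review."""
    since = now - timedelta(days=days)
    production = [s for s in services if s["env"] == "production"]
    # Production services nobody has claimed.
    unowned = [s["name"] for s in production if not s["owner"]]
    # Who changed environment variables inside the review window.
    env_changes = [(e["service"], e["actor"]) for e in events
                   if e["at"] >= since and e["action"] == "env_var.update"]
    # Production services with no logged activity at all in the window.
    active = {e["service"] for e in events if e["at"] >= since}
    silent = [s["name"] for s in production if s["name"] not in active]
    return {"unowned": unowned, "env_changes": env_changes, "silent": silent}


report = quarterly_review(services, events, now=datetime(2026, 6, 30))
print(report)
```

Three short lists per quarter is a small habit, but it is the same evidence an auditor will ask for later.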

Every industry arrives at its own version of the AI governance problem. A recent case is a two-person team in comedy entertainment: Sunny owns governance, Feng builds the systems, and they write all three layers above into their daily workflow.

Bringing Vibe Coding Systems Into Your IT Governance Framework

Back to Sam's story, three months later.

When that "membership campaign page is down" call came in, Sam couldn't even name the service. Now he opens the company's IT workspace and sees a single unified dashboard: marketing's membership campaign page here, sales' customer-data cleanup tool there, HR's interview scheduler too, each service's deployment status, who last changed which env var, how much resource it uses, and which region it runs in, all on one screen.

Employees keep building things with Cursor, and new AI apps default to deploying into this workspace. Sam doesn't have to keep asking each department "what have you built lately"; the dashboard answers that. When something breaks, he knows who to find, where the service runs, and where the data boundary is.

He didn't ban anyone from using AI. What he did was set up the sandbox, so employees' vibe coding naturally happens inside this approved space.

That's IT governance in the vibe coding era: design clearly where employees can run things, and who's allowed to build stops being the problem.

To get hands-on and stand up the sandbox directly, pulling employee vibe coding into an environment IT can see, Zeabur Team Plan is designed to do exactly this.