Agent-native infrastructure · Plamen Petrov · 7 min read

The MenuGen Problem: Why AI Agents Need Agent-Native Infrastructure

AI can write the app. ScaleMule is building the agent-native backend substrate that turns local prototypes into production systems.

Tags: agent-native infrastructure, AI infrastructure, backend infrastructure, vibe coding, agentic engineering, production systems

AI agents are getting better at writing application logic.

That is no longer the surprising part.

The harder question is what happens after the local prototype works.

The agent can write the app. The hard part is making the app real.

What ScaleMule provides

ScaleMule turns the MenuGen problem into a backend product surface: identity, organizations, tenant boundaries, storage, events, background work, policy, audit logs, and production configuration exposed through interfaces agents can use directly.

Instead of asking an AI agent to glue together auth, data, storage, queues, usage, environment state, and operational controls across many dashboards, ScaleMule gives the agent one production substrate to build against.

The goal is not to replace every specialized tool. The goal is to make the core backend layer coherent, inspectable, and safe enough for agentic engineering.

Andrej Karpathy's MenuGen story is one of the clearest public descriptions of this gap. MenuGen began as a vibe-coded app: take a photo of a restaurant menu and generate helpful visual explanations of the dishes. The first local version came together quickly. But making the app real meant dealing with much more than generated UI and model calls.

The work moved into the production substrate: deployment, environment variables, API keys, authentication, OAuth, DNS, payments, database state, work queues, rate limits, dev and prod configuration, and service-specific settings pages.

That is the MenuGen problem.

AI can increasingly write the app. The surrounding infrastructure is still designed for humans clicking through dashboards.

The app was not the hard part

The most important detail in the MenuGen story is not that an AI-assisted workflow produced a working application.

It is that the local demo created a false sense of completion.

The product felt close because the visible interface existed. But production software is more than interface and generated code. A real app needs durable identity, permission boundaries, data persistence, background work, billing, secure configuration, auditability, and deployment state.

Today, much of that state lives across separate services.

One dashboard owns deployment. Another owns authentication. Another owns OAuth. Another owns payments. Another owns storage. Another owns background jobs. Another owns rate limits. Another owns environment variables. Another owns production keys. Another owns team and security settings.

For a human developer, this is annoying but familiar.

For an AI agent, it is a broken surface.

The agent can modify files, run commands, and reason over text. But the critical production state often lives behind browser flows, nested settings pages, copied secrets, inconsistent documentation, and interfaces designed for humans in a pre-agent world.

Dashboard-native infrastructure does not compose well with agents

Most modern infrastructure was designed around a human operator.

That made sense for the last era of software. Developers read docs, opened tabs, clicked through settings, copied keys, configured DNS, pasted environment variables, and stitched together products across specialized SaaS tools.

Agentic engineering changes the interface requirement.

If an AI agent is expected to build, deploy, operate, and improve software, the infrastructure has to expose itself through surfaces agents can use reliably:

  • structured APIs
  • deterministic CLIs
  • Markdown-native documentation
  • clear configuration contracts
  • inspectable state
  • safe defaults
  • scoped permissions
  • audit trails
  • environment-aware provisioning
  • production guardrails

The issue is not that Vercel, Clerk, Stripe, Supabase, Google Cloud, or other services are weak products. They solve important pieces of the stack.

The issue is that the agent has to operate across too many fragmented human interfaces to make a product real.

The next infrastructure layer needs to reduce that fragmentation.

From vibe coding to production systems

Vibe coding lowers the cost of creating software-shaped artifacts.

Agentic engineering raises the question of how those artifacts become reliable systems.

That requires a different kind of backend layer.

Not just hosting.

Not just auth.

Not just a database.

Not just a queue.

Not just billing.

Not just logs.

The missing layer is a production substrate that combines the core backend primitives an AI-generated product needs, while exposing them through interfaces agents can operate directly.

That is the category ScaleMule is building toward.

ScaleMule is our answer to this infrastructure gap. We are building agent-native backend infrastructure for AI and API products: identity, tenant isolation, data, storage, events, functions, policy, auditability, and operational controls in one layer designed for the agent era.

The goal is simple:

The agent writes the product logic. ScaleMule carries the production substrate.

What an agent-native backend needs

An agent-native backend should not require the user to assemble production software through ten disconnected dashboards.

It should provide the primitives that most AI and API products need from the beginning.

Identity and access

Agents need a reliable way to create applications with users, organizations, roles, permissions, API keys, service accounts, and non-human actors.

Authentication is not enough. Agent-generated products need identity systems that understand tenants, teams, applications, and delegated actions.
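
To make the idea of tenant-aware identity with delegated actions concrete, here is a minimal sketch in Python. Every name in it (Actor, ActorKind, is_allowed, the "files:read" permission strings) is hypothetical and chosen for illustration; it is not ScaleMule's API. It assumes one simple rule: an actor acting on someone's behalf can never do more than the actor it acts for.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ActorKind(Enum):
    USER = "user"
    SERVICE_ACCOUNT = "service_account"
    AGENT = "agent"


@dataclass(frozen=True)
class Actor:
    actor_id: str
    kind: ActorKind
    tenant_id: str
    # Permissions are plain strings such as "files:read" (illustrative only).
    permissions: frozenset = frozenset()
    # For delegated actions: the actor this one is acting on behalf of.
    on_behalf_of: Optional["Actor"] = None


def is_allowed(actor: Actor, tenant_id: str, permission: str) -> bool:
    """Allow only if the actor belongs to the tenant and holds the permission.

    A delegated actor is also limited by the actor it acts for.
    """
    if actor.tenant_id != tenant_id or permission not in actor.permissions:
        return False
    if actor.on_behalf_of is not None:
        return is_allowed(actor.on_behalf_of, tenant_id, permission)
    return True


# A human user with read access, and an agent acting for that user.
alice = Actor("u_1", ActorKind.USER, "acme", frozenset({"files:read"}))
agent = Actor("a_1", ActorKind.AGENT, "acme",
              frozenset({"files:read", "files:write"}), on_behalf_of=alice)
```

In this sketch the agent holds a write permission of its own, but a write request is still refused because the user it acts for cannot write. A real system would likely also need roles, organizations, and key scopes, which are left out here.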

Tenant-aware data

Many AI products are multi-tenant from day one.

Customer data, organization boundaries, user permissions, usage records, generated content, and operational metadata need to be isolated by default. Tenant isolation should be a core primitive, not an afterthought.
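
One way to read "isolated by default" is that application code never gets an unscoped handle to the data. The sketch below assumes that interpretation; TenantScopedStore and TenantView are hypothetical names, and the in-memory dictionary stands in for a real database.

```python
from collections import defaultdict


class TenantView:
    """Handle that can only see one tenant's rows."""

    def __init__(self, rows: dict) -> None:
        self._rows = rows

    def put(self, key: str, value: object) -> None:
        self._rows[key] = value

    def get(self, key: str, default: object = None) -> object:
        return self._rows.get(key, default)

    def keys(self) -> list:
        return sorted(self._rows)


class TenantScopedStore:
    """In-memory store where every read and write is bound to one tenant."""

    def __init__(self) -> None:
        self._rows = defaultdict(dict)  # tenant_id -> {key: value}

    def for_tenant(self, tenant_id: str) -> TenantView:
        # There is deliberately no method that reads across tenants.
        if not tenant_id:
            raise ValueError("tenant_id is required")
        return TenantView(self._rows[tenant_id])


store = TenantScopedStore()
acme = store.for_tenant("acme")
globex = store.for_tenant("globex")
acme.put("menu:1", {"dish": "ramen"})
```

Because the only entry point is for_tenant, forgetting a tenant filter is not possible in this design: a query without a tenant cannot be expressed at all.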

Storage and media

Modern AI products often handle files, images, transcripts, generated media, user uploads, and derived assets.

Agents should not have to wire object storage, metadata records, permission checks, and lifecycle behavior from scratch every time.
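
As a rough illustration of what "metadata records, permission checks, and lifecycle behavior" could look like when they come bundled, here is a small Python sketch. MediaStore, StoredObject, and the ttl parameter are assumptions made for this example, and the tenant match is only the simplest possible permission check.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class StoredObject:
    key: str
    tenant_id: str
    owner_id: str
    content_type: str
    data: bytes
    expires_at: Optional[datetime] = None  # lifecycle: None means keep forever


class MediaStore:
    """Keeps bytes and their metadata together, keyed by tenant and object key."""

    def __init__(self) -> None:
        self._objects = {}

    def put(self, tenant_id, owner_id, key, data, content_type,
            ttl: Optional[timedelta] = None, now: Optional[datetime] = None):
        now = now or datetime.now(timezone.utc)
        expires_at = now + ttl if ttl is not None else None
        obj = StoredObject(key, tenant_id, owner_id, content_type, data, expires_at)
        self._objects[(tenant_id, key)] = obj
        return obj

    def get(self, tenant_id, key, now: Optional[datetime] = None):
        """Return the object only for the owning tenant and only before expiry."""
        now = now or datetime.now(timezone.utc)
        obj = self._objects.get((tenant_id, key))
        if obj is None or (obj.expires_at is not None and obj.expires_at <= now):
            return None
        return obj


t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
media = MediaStore()
media.put("acme", "u_1", "menu.jpg", b"...", "image/jpeg",
          ttl=timedelta(days=7), now=t0)
```

The now parameter exists only so that expiry can be exercised deterministically in the example.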

Events and background work

The MenuGen story points directly at the need for durable background workflows.

If an operation takes time, the system needs a job record, queue, state transitions, retries, progress updates, and failure handling. The user should not lose everything because a request timed out.
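
The pieces listed above (job record, state transitions, retries, failure handling) can be sketched in a few lines of Python. This is an illustrative in-process version under stated assumptions: Job, JobState, and run_job are hypothetical names, and a production system would persist the record, use a real queue, and add backoff between attempts.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Optional


class JobState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


@dataclass
class Job:
    job_id: str
    max_attempts: int = 3
    state: JobState = JobState.QUEUED
    attempts: int = 0
    result: object = None
    last_error: Optional[str] = None
    history: list = field(default_factory=list)  # ordered (from, to) transitions

    def move(self, new_state: JobState) -> None:
        self.history.append((self.state.value, new_state.value))
        self.state = new_state


def run_job(job: Job, work: Callable[[], object]) -> Job:
    """Run `work`, retrying until it succeeds or attempts run out."""
    while job.attempts < job.max_attempts:
        job.attempts += 1
        job.move(JobState.RUNNING)
        try:
            job.result = work()
        except Exception as exc:
            job.last_error = str(exc)
            # Requeue if attempts remain, otherwise record a terminal failure.
            if job.attempts < job.max_attempts:
                job.move(JobState.QUEUED)
            else:
                job.move(JobState.FAILED)
        else:
            job.move(JobState.SUCCEEDED)
            break
    return job
```

Because the job record outlives any single request, a timeout becomes one failed attempt in the history rather than lost work, and a client can poll the record for its current state.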

Policy and auditability

Agent-generated systems need strong operational boundaries.

Who created this key? Which user triggered this action? Which agent changed this configuration? What tenant did it affect? What happened before the failure?

Agent-native infrastructure needs audit events and policy controls by default.
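
The questions above map naturally onto the fields of an audit record. The following Python sketch shows one possible shape; AuditEvent, AuditLog, and action names like "config.update" are hypothetical and only meant to show that each question has a corresponding field an agent can query.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    actor_id: str     # who acted
    actor_kind: str   # "user", "service_account", or "agent"
    action: str       # what happened, e.g. "api_key.create" (illustrative)
    tenant_id: str    # which tenant was affected
    target: str       # what was changed
    occurred_at: str  # ISO 8601 timestamp in UTC


class AuditLog:
    """Append-only log: events are recorded and read, never edited."""

    def __init__(self) -> None:
        self._events = []

    def record(self, actor_id, actor_kind, action, tenant_id, target) -> AuditEvent:
        event = AuditEvent(actor_id, actor_kind, action, tenant_id, target,
                           datetime.now(timezone.utc).isoformat())
        self._events.append(event)
        return event

    def for_tenant(self, tenant_id: str) -> list:
        """Events affecting one tenant, in the order they happened."""
        return [e for e in self._events if e.tenant_id == tenant_id]

    def to_json_lines(self) -> str:
        """One JSON object per line, a format that is easy for agents to parse."""
        return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in self._events)


log = AuditLog()
log.record("a_1", "agent", "config.update", "acme", "rate_limit")
log.record("u_1", "user", "api_key.create", "globex", "key_1")
```

Recording the actor kind alongside the actor id is the design choice that lets "which agent changed this configuration?" be answered with a filter instead of guesswork.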

Production configuration

Environment variables, secrets, webhooks, domains, rate limits, API keys, and deploy configuration should not be scattered across hidden dashboard state.

They should be inspectable, scriptable, and understandable by agents.
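
A small Python sketch of what "inspectable and scriptable" might mean in practice: configuration exposed as plain data with secrets redacted, plus a diff between environments. Setting, describe, and diff are names invented for this example, and the setting names and values are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Setting:
    name: str
    value: str
    secret: bool = False


def describe(settings: list, reveal_secrets: bool = False) -> dict:
    """Return config as plain data an agent can read, with secrets redacted."""
    return {
        s.name: (s.value if (reveal_secrets or not s.secret) else "<redacted>")
        for s in settings
    }


def diff(dev: list, prod: list) -> dict:
    """List setting names that are missing or different between environments."""
    d = {s.name: s.value for s in dev}
    p = {s.name: s.value for s in prod}
    return {
        "missing_in_prod": sorted(d.keys() - p.keys()),
        "missing_in_dev": sorted(p.keys() - d.keys()),
        "changed": sorted(k for k in d.keys() & p.keys() if d[k] != p[k]),
    }


dev = [
    Setting("RATE_LIMIT_PER_MIN", "600"),
    Setting("WEBHOOK_URL", "http://localhost:3000/hook"),
    Setting("API_KEY", "dev-key", secret=True),
]
prod = [
    Setting("RATE_LIMIT_PER_MIN", "60"),
    Setting("API_KEY", "prod-key", secret=True),
    Setting("DOMAIN", "example.com"),
]
```

With this shape, an agent can notice that a webhook URL was never set in production by reading the diff, without ever seeing a secret value.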

MenuGen pain vs. ScaleMule primitives

| MenuGen pain | ScaleMule primitive |
| --- | --- |
| API keys and environment variables scattered across services | Scoped configuration and secret surfaces |
| Authentication and OAuth setup across dashboards | Tenant-aware identity and access |
| Database records and background queues wired manually | Durable data and event workflows |
| Timeout-prone processing | Background jobs with state transitions, retries, and failure handling |
| Hidden production state in service dashboards | Agent-readable API, CLI, and documentation surfaces |
| Debugging production behavior after deployment | Audit logs and operational state |
| Usage limits and commercial controls added late | Billing, metering, and policy primitives |

The opposite of dashboard glue

The MenuGen problem is not only a developer-experience problem.

It is a category signal.

As AI agents become more capable, the bottleneck shifts from generating code to safely connecting that code to the real world.

Production infrastructure has to become legible to agents.

That means fewer browser-only configuration flows. Fewer copied secrets. Fewer hidden production states. Fewer disconnected service contracts. Fewer brittle handoffs between local code and deployed systems.

It means infrastructure that an agent can inspect, provision, modify, and verify through explicit contracts.

Karpathy's MenuGen writeup points toward more than another marketplace of disconnected parts. It points toward a more opinionated, preconfigured layer with surfaces LLMs can actually operate: APIs, CLIs, curl examples, Markdown documentation, and explicit configuration. That is the direction ScaleMule is built around.

ScaleMule exists for that world.

Our thesis

We believe the next generation of software will be created by a mix of human intent and agent execution.

But those agents need a production layer they can rely on.

The winning infrastructure will not only make developers faster. It will make agents safer, more capable, and more operationally useful.

That is why ScaleMule is building agent-native backend infrastructure for AI and API products.

A product should not feel eighty percent done when it is really twenty percent production-ready.

The agent should be able to move from prompt to product against a backend substrate designed for deployment, identity, data, storage, events, policy, auditability, and operations from the start.

That is the MenuGen problem.

And it is exactly the kind of infrastructure gap ScaleMule was built to close.

Sources and note

This post references Andrej Karpathy's public MenuGen writeup and the public conversation "From Vibe Coding to Agentic Engineering". ScaleMule is not affiliated with, endorsed by, or sponsored by Andrej Karpathy, Sequoia Capital, Vercel, Clerk, Stripe, Supabase, Google, or any other third-party service mentioned.
