Marco Patzelt
January 2, 2026

The "Runtime Compiler": Using AI to generate "Last-Mile" Logic

Andrej Karpathy is right: Programming is being "refactored". If you are still writing static endpoints, you are already obsolete. Here is my implementation of an "Agentic Orchestration Layer" that generates code at runtime instead of casting it in concrete.

The "Runtime Compiler": Using AI to generate "Last-Mile" Logic

Bridging the Efficiency Gap

Andrej Karpathy recently noted that neglecting the new layer of AI abstraction feels like a "skill issue." For Stability-Focused Architects, however, this statement can feel at odds with our primary directive: reliability.

It is entirely reasonable for an infrastructure manager to reject the idea of non-deterministic code running in production. We spent the last decade building CI/CD pipelines to ensure that exactly what we wrote is exactly what runs. Introducing an LLM that "hallucinates" logic seems to violate the fundamental tenets of software engineering.

However, there is a strategic trade-off we must address. While our static architectures are robust, they are also rigid. By the time we scope, code, test, and deploy a new dashboard endpoint, the business question has often changed.

We do not need to replace our core logic. We need a hybrid model. We need to treat AI not as a replacement for the backend, but as a "Runtime Compiler" for the last mile of data analysis.

The Problem: Static Dashboards are Always Behind

Consider the standard development lifecycle for business intelligence:

  1. Requirement: Stakeholder asks, "What is the revenue per region adjusted for seasonal churn?"
  2. Implementation: Developer writes GET /revenue-adjusted.
  3. Execution: Developer writes complex SQL, adds a controller, and deploys.

This "Hardcoded Logic" approach is safe. It is also slow. Code is a liability; the more endpoints we maintain, the higher our technical debt. By strictly coupling business logic to compiled code, we create an Efficiency Gap. We are permanently one sprint behind the business's curiosity.

The Solution: The Read-Only Runtime Layer

I propose a "Strategic Separation of Concerns":

  • The Write Layer (Hard Code): Transactions, state changes, and core security remain in Java, Go, or Rust. We never let AI guess with the ledger.
  • The Read Layer (Runtime Logic): We use AI to generate the retrieval logic dynamically (see the routing sketch below).
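
To make the split concrete, here is a minimal sketch of dispatch at the API boundary. All names here (handle_request, post_transaction, and the read-path stub) are hypothetical:

```python
from typing import Any

def post_transaction(payload: dict[str, Any]) -> str:
    # Write layer: static, reviewed, versioned code. Never AI-generated.
    return "tx-accepted"

def answer_with_runtime_logic(question: str) -> str:
    # Read layer: logic is generated per request (sketched further below).
    return f"generated answer for: {question!r}"

def handle_request(kind: str, payload: dict[str, Any]) -> Any:
    if kind == "write":
        return post_transaction(payload)  # deterministic path
    if kind == "read":
        return answer_with_runtime_logic(payload["question"])
    raise ValueError(f"unknown request kind: {kind}")
```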

In this model, we architect an environment where an AI (specifically a reasoning model such as Gemini 3 Pro) acts as a Just-In-Time engineer. When a request comes in, the system decides dynamically (see the sketch after this list):

  1. Schema Analysis: What data structures are available?
  2. Logic Generation: Write the SQL or Python required to answer the specific question.
  3. Verification: Test the logic before returning the answer.
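
A minimal sketch of that Just-In-Time read path, with the model call and query execution stubbed out; call_llm, execute_readonly, and the schema string are assumptions for illustration, not the repository's actual API:

```python
def fetch_schema() -> str:
    # 1. Schema analysis: what data structures are available?
    return "TABLE orders(id INT, region TEXT, revenue NUMERIC)"

def call_llm(prompt: str) -> str:
    # Stand-in for a reasoning-model API call.
    return "SELECT region, SUM(revenue) FROM orders GROUP BY region"

def generate_logic(question: str, schema: str) -> str:
    # 2. Logic generation: ask the model for the SQL that answers
    #    this specific question against this specific schema.
    prompt = f"Schema:\n{schema}\n\nWrite read-only SQL for: {question}"
    return call_llm(prompt)

def verify(sql: str) -> bool:
    # 3. Verification: reject anything that is not a plain read.
    #    A real system would also dry-run the query in a sandbox.
    return sql.lstrip().upper().startswith("SELECT")

def execute_readonly(sql: str) -> list[tuple]:
    # Stand-in for a query against a read replica.
    return [("EMEA", 1_234_567.0)]

def answer(question: str) -> list[tuple]:
    schema = fetch_schema()
    sql = generate_logic(question, schema)
    if not verify(sql):
        raise RuntimeError("generated logic failed verification")
    return execute_readonly(sql)

print(answer("What is the revenue per region?"))
```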

Defining the Boundary: Architecture for Safety

To make this viable in an enterprise setting, we must define strict boundaries. "Hoping" the AI gets it right is not a strategy. I have implemented a reference architecture that addresses the reliability risk directly: Agentic Orchestration Layer Model on GitHub.

This system relies on three "CYA" (Cover Your Assets) protocols:

1. CAG over RAG (Deterministic Math)

Traditional RAG (Retrieval-Augmented Generation) searches for text and summarizes it. For business logic, this is insufficient: a summary of a spreadsheet is not a calculation. I utilize CAG (Code-Augmented Generation). When a stakeholder asks for "Average Basket Size," the system does not predict the next word. It writes a Python script, executes it in a secure, isolated sandbox (E2B), and returns the calculated output. We leverage the LLM for its reasoning, but we rely on the Python interpreter for the math.
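
A minimal sketch of the CAG flow, assuming a hypothetical run_in_sandbox helper in place of a real isolated executor such as E2B; the generated script and its data are illustrative:

```python
import contextlib
import io

def generate_metric_script(question: str) -> str:
    # Stand-in for the LLM call: it returns executable Python,
    # not a prose answer. The data load below is illustrative.
    return (
        "baskets = [42.0, 17.5, 63.25]\n"
        "print(sum(baskets) / len(baskets))\n"
    )

def run_in_sandbox(code: str) -> str:
    # Placeholder only: a production system executes this inside an
    # isolated sandbox such as E2B, never in the host process.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code, {})
    return buf.getvalue().strip()

script = generate_metric_script("What is the average basket size?")
print(run_in_sandbox(script))  # 40.91..., computed, not predicted
```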

2. System 2 Governance (The "Slow Down" Switch)

Fast answers are often wrong. We enforce a "System 2" thinking process using models with high reasoning capabilities. Before any code is generated, the model enters a mandatory "Thinking Loop": it validates the request against safety_policy.md and business_rules.md. If the user asks for data they are not authorized to see, logic generation is halted before it begins.
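
A sketch of that pre-flight gate, with a hard-coded authorization rule standing in for the model-driven validation against the policy files; all names here are illustrative:

```python
def passes_governance(user_role: str, question: str) -> bool:
    # Illustrative stand-in for the mandatory "Thinking Loop": the
    # real system has the model validate the request against
    # safety_policy.md and business_rules.md before generating code.
    if "salary" in question.lower() and user_role != "finance":
        return False
    return True

def generate_logic_for(question: str) -> str:
    # Stand-in for the actual logic-generation step.
    return f"SELECT ... -- answers {question!r}"

def gated_generation(user_role: str, question: str) -> str:
    if not passes_governance(user_role, question):
        # Halted *before* any logic generation begins.
        raise PermissionError("request violates governance policy")
    return generate_logic_for(question)
```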

3. Environments as Containers

We do not rely on "Personas" (e.g., "You are a helpful assistant"). We build Environments. An environment is a bounded context defined by files:

  • Physics: What tools are executable? (Read-only SQL, Sandbox Python).
  • Geography: The exact schema definitions.
  • Constitution: The governance rules.

This allows us to swap context safely. The model remains the same, but the boundaries of its operation are hard-coded into the environment files.
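
A sketch of what loading such an environment might look like; the directory layout, file names, and the Environment type are assumptions for illustration:

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class Environment:
    tools: str         # Physics: what is executable
    schema: str        # Geography: the exact schema definitions
    constitution: str  # Constitution: the governance rules

def load_environment(root: str) -> Environment:
    base = Path(root)
    return Environment(
        tools=(base / "tools.md").read_text(),
        schema=(base / "schema.sql").read_text(),
        constitution=(base / "governance.md").read_text(),
    )

# Swapping context is just pointing the same model at another directory:
# finance_env = load_environment("environments/finance")
# sales_env   = load_environment("environments/sales")
```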

Enforcing Trust: The Triangulation Protocol

A valid critique of AI in production is: "What if it writes bad SQL?" To mitigate this, I implemented a Consensus Loop (the Triangulation Protocol).

When the system calculates a critical metric, it creates a self-verifying architecture:

  1. Path A (Database): The model writes a SQL query to fetch aggregate data.
  2. Path B (Sandbox): The model pulls raw data into a Python environment and calculates the metric algorithmically.
  3. Comparison: If Result A and Result B deviate by more than 1%, the system throws an error and refuses to answer.

This transforms the AI from a liability into a verifiable component. We are not trusting the model; we are trusting the consensus of two distinct computational paths.
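
A minimal sketch of this consensus loop, with both data paths stubbed; the SQL result and raw rows are dummy values, and only the 1% tolerance comes from the protocol described above:

```python
def metric_via_sql() -> float:
    # Path A: the aggregate as computed by the database.
    return 40.92  # dummy value standing in for a read-only SQL result

def metric_via_sandbox() -> float:
    # Path B: the same metric recomputed from raw rows in Python.
    rows = [42.0, 17.5, 63.25]
    return sum(rows) / len(rows)

def triangulate(tolerance: float = 0.01) -> float:
    a, b = metric_via_sql(), metric_via_sandbox()
    # Refuse to answer if the two paths deviate by more than 1%.
    if abs(a - b) > tolerance * max(abs(a), abs(b)):
        raise RuntimeError(f"consensus failure: {a} vs {b}")
    return b

print(triangulate())  # both paths agree within 1%, so the answer ships
```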

The Strategic Pivot

Karpathy’s "alien tool" comment highlights that the manual for this new era hasn't been written yet. It is up to us to write it.

Legacy habits dictate that we must hard-code every interaction to ensure safety. However, the modern architect understands that safety comes from guardrails, not just static code.

The Path Forward:

  1. Maintain the Core: Keep your transaction layers boring, static, and secure.
  2. Open the Read Layer: Experiment with a Runtime Logic Layer for reporting and analysis.
  3. Verify via Code: Use sandboxed execution to verify AI outputs, rather than trusting the text stream.

You can examine the code patterns for this verification loop in my repository.

We are moving from being "Pipe Layers" to "Orchestrators." It is a shift in responsibility, but it is the only way to close the efficiency gap without sacrificing stability.

Let's connect.

I am always open to exciting discussions about frontend architecture, performance, and modern web stacks.

Email me