Mission Briefing: An LLM-Operated Application Platform

Objective
Design a platform where a user can reshape a live application to match their mental model — in plain English — and the system can safely execute, deploy, and verify those changes without tutorials, manual setup, or infrastructure wrangling.

This is not “vibe coding”. This is operational software with agency - a guardrailed architecture where LLMs can be the maintainer.

Mission Context

I’ve observed that AI coding tools fail at the same moment:

Here’s what you should do next.

Suggesting what you should do next instead of just doing it.

But I’m not ready to let AI have infinite control.

Instead, I’ll give it authority, environment, and paved paths.

This Mission Briefing describes an architecture where:

  • the LLM is an operator, not a suggestion engine
  • execution happens inside clear guardrails
  • infrastructure is owned, not improvised
  • changes are live, observable, and reversible

Mission Stack

This architecture is grounded in tools that already work well together:

  • Hubot — command router, policy enforcement, audit log
  • Juphjacs — server-rendered HTML with WebSocket hot reload (morphdom)
  • LLMs — planning, diff generation, decision-making
  • Kubernetes — execution substrate the LLM can operate
  • Git — source of truth, diff boundary, rollback mechanism

Core Systems & Roles

1. Mission Control (Human Interface)

Purpose

  • Capture intent in natural language
  • Show diffs, previews, and deployment state
  • Never mutate the system directly

Key Principle Humans express intent.
They delegate perform mechanical steps.

2. Hubot (Command & Policy Authority)

Hubot is the control plane.

Responsibilities

  • Accept structured change requests
  • Invoke the LLM
  • Enforce guardrails
  • Apply changes via tools
  • Record every action

Hubot is the authoritative arm for the Human.

3. LLM Operator (Planner + Patch Author)

The LLM acts as an operator, not a free-form coder.

It may

  • Read repository files
  • Propose unified diffs
  • Call approved tools
  • Verify outcomes

It may not

  • Write files directly
  • Run arbitrary shell commands
  • Access secrets
  • Modify infrastructure outside paved paths

Critical Constraint The LLM proposes changes.
Hubot applies them.

4. Juphjacs Runtime (Living Application Surface)

Juphjacs provides a crucial capability:

  • Server-side rendering
  • Persistent WebSocket connection
  • DOM updates via morphdom (for web sites)

Effect

  • Server changes are visible immediately
  • No rebuild-refresh cycles
  • The UI becomes a live preview environment

This collapses feedback loops from minutes to seconds.

5. Kubernetes Substrate (Owned Execution Environment)

Kubernetes is not “DevOps plumbing”.
It is part of the mission system.

Managed via tools

  • Deployments
  • Preview environments
  • Promotions
  • Rollbacks
  • Configuration changes

The LLM does not guess how to deploy.
It operates the cluster through bounded actions codified in the Hubot framework.

The Paved Path Framework

Agency requires constraint.

Rules

  • LLM may edit /pages, /components, /lib
  • LLM may propose data migrations via tools
  • LLM may not edit /ops, secrets, or raw infra

This is how safety is enforced.

Toolbelt (LLM-Callable Capabilities)

Tools are the difference between suggesting and doing.

Repository Tools

  • repo.list_files()
  • repo.read_file(path)
  • repo.apply_diff(diff)
  • repo.create_file(path, contents)

Verification Tools

  • check.run_tests()
  • check.render_smoke_test(route)
  • runtime.tail_logs()

Kubernetes Tools

  • k8s.deploy_preview()
  • k8s.promote_release()
  • k8s.rollback(version)
  • k8s.get_events()

The LLM does not explain how to do these things.
It invokes them.

Failure Modes & Containment

This system assumes failure is inevitable.
The architecture exists to contain it.

1. Incorrect Code Changes

Failure

  • The LLM generates a change that breaks rendering or logic.

Containment

  • Changes are applied in a sandbox/worktree.
  • Tests and render smoke checks must pass.
  • Preview environment is required before promotion.
  • One-click rollback restores last-known-good state.

2. Misinterpreted User Intent

Failure

  • The LLM implements something that technically works but doesn’t match the user’s mental model.

Containment

  • Diffs are visible before deployment.
  • Previews update live via Juphjacs.
  • Promotion requires explicit user approval.
  • Small diffs are encouraged; large diffs require confirmation.

3. Cascading or Repeated Regressions

Failure

  • The LLM repeatedly “fixes” the same issue or introduces oscillating changes.

Containment

  • Every change is logged with intent, diff, and outcome.
  • Repeated reversions surface a signal.
  • The system can pause autonomous changes and require manual approval.

4. Runtime or Infrastructure Instability

Failure

  • A deployment causes crashes, restarts, or degraded performance.

Containment

  • Kubernetes health checks and readiness gates.
  • Telemetry monitored post-promotion.
  • Automatic rollback on error-rate thresholds.

5. Overreach or Unsafe Actions

Failure

  • The LLM attempts to modify restricted areas or perform unsafe operations.

Containment

  • Tool-level permission enforcement.
  • Hard boundaries on editable paths.
  • Secrets and raw infra are never directly accessible.

6. Silent Failure (The Most Dangerous Case)

Failure

  • The system appears to work, but user intent is subtly violated.

Containment

  • Visible carry-forward behavior in UI (e.g., Yesterday / Today view in Mission Briefing app).
  • Operational memory of prior intent.
  • Preference for explicit action over implicit automation.

Why This Architecture Works

Most AI coding tools fail because:

  • they lack execution authority
  • they lack a runtime
  • they lack boundaries

This system works because:

  • responsibility is explicit
  • authority is constrained
  • feedback is immediate
  • rollback is cheap

The LLM becomes a maintainer, not a narrator.

First Mission: Yesterday / Today Mission Briefing App

The initial mission is deliberately small:

  • server-rendered UI
  • persistent state
  • visible carry-forward behavior
  • rapid iteration via natural language

It validates the architecture before scaling it.

Closing Note

Software should adapt to humans — not the other way around.
AI becomes useful only when it’s allowed to act responsibly.

This Mission Briefing is how that becomes real.