Apr 1, 2026 · Engineering

Architecture overview

How a sentence becomes a running app: the iOS app, the main server, the sandbox, and how they fit together.

Mana has three moving parts: the iOS app, the main server, and the sandbox. Knowing what each one owns makes everything else easier.

The iOS app

The thing on your phone. Built in Swift 6 / SwiftUI for iOS 18+. Owns the chat UI, the library of apps you've built, widgets, the Live Activity, and a WKWebView pool that renders Mana apps as if they were native.

The main server

Hono on Vercel. Owns:

  • Authentication (JWT + argon2-hashed refresh tokens)
  • App / skill / user CRUD via a Drizzle + Supabase Postgres database
  • Sandbox pool orchestration (claims pre-warmed Fly machines atomically)
  • The toolkit APIs your Mana apps call at runtime — AI, storage, push, cron, email, db, Composio integrations
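To make "claims pre-warmed Fly machines atomically" concrete, here is a minimal in-memory sketch. Every name in it (SandboxPool, Machine, claim) is hypothetical and not taken from the Mana codebase. In a Postgres-backed server the same guarantee would more likely come from the database, for example a single UPDATE guarded by a row lock such as FOR UPDATE SKIP LOCKED, but that is an assumption about the implementation.

```typescript
// Hypothetical types for illustration only.
type MachineState = "warm" | "claimed";

interface Machine {
  id: string;
  state: MachineState;
  sessionId?: string;
}

class SandboxPool {
  private machines = new Map<string, Machine>();

  // Register a pre-warmed machine that is ready to be claimed.
  addWarm(id: string): void {
    this.machines.set(id, { id, state: "warm" });
  }

  // Claim one warm machine for a session, or return null if none is left.
  // The check and the state change happen in one synchronous step, so two
  // callers in the same process can never be handed the same machine.
  claim(sessionId: string): Machine | null {
    for (const machine of this.machines.values()) {
      if (machine.state === "warm") {
        machine.state = "claimed";
        machine.sessionId = sessionId;
        return machine;
      }
    }
    return null;
  }
}
```

With one warm machine in the pool, the first claim returns it and a second claim returns null; the caller would then have to wait or boot a fresh machine.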

The sandbox

A Fly.io machine, one per active session. Drives Claude Code via the Anthropic Agent SDK. Compiles Mana apps with esbuild, persists workspace state to R2 as a tar.gz, and exposes MCP tools that the agent calls (init_app, update_app, run_code, check_app, generate_image, upload_file).
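One way to picture the tool surface is as a small dispatcher keyed by tool name. The six names below come from the list above; the handler signature, the dispatch function, and the string return type are assumptions made for illustration, not the sandbox's actual MCP wiring.

```typescript
// Tool names as listed in this post.
const TOOL_NAMES = [
  "init_app",
  "update_app",
  "run_code",
  "check_app",
  "generate_image",
  "upload_file",
] as const;

type ToolName = (typeof TOOL_NAMES)[number];

// Hypothetical handler shape: takes loosely typed arguments, returns text.
type ToolHandler = (args: Record<string, unknown>) => string;

// Narrow an arbitrary string to one of the known tool names.
function isToolName(name: string): name is ToolName {
  return (TOOL_NAMES as readonly string[]).indexOf(name) !== -1;
}

// Route a tool call from the agent to its handler, rejecting unknown names
// and known names that have no handler registered.
function dispatch(
  handlers: Partial<Record<ToolName, ToolHandler>>,
  name: string,
  args: Record<string, unknown>,
): string {
  if (!isToolName(name)) {
    throw new Error(`unknown tool: ${name}`);
  }
  const handler = handlers[name];
  if (!handler) {
    throw new Error(`no handler registered for ${name}`);
  }
  return handler(args);
}
```

Keeping the names in one const tuple means a typo in a handler map fails at compile time instead of at runtime.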

The data flow

  1. iOS asks the main server for a sandbox. The server claims a pre-warmed Fly machine.
  2. iOS opens a WebSocket directly to the sandbox. This is the chat stream — the main server is no longer in the request path.
  3. The sandbox drives the agent. Tool calls hit the main server via HTTP callbacks for persistence; the agent's reply streams back to iOS as SSE.
  4. After 3 minutes idle, the sandbox suspends and saves its workspace to R2 as a tar.gz. On the next session, the workspace is rehydrated from that archive.
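The idle rule in step 4 can be sketched as a small tracker with an injectable clock. The 3-minute limit is from the post; the IdleTracker class, its method names, and the choice to treat exactly 3 minutes as idle are assumptions for illustration. The sketch only decides when to suspend and does not archive anything.

```typescript
// 3 minutes, as stated in step 4.
const IDLE_LIMIT_MS = 3 * 60 * 1000;

// Hypothetical helper: records the last activity time and reports when the
// idle limit has been reached. The clock is injected so it can be faked.
class IdleTracker {
  private readonly now: () => number;
  private readonly limitMs: number;
  private lastActivityMs: number;

  constructor(now: () => number, limitMs: number = IDLE_LIMIT_MS) {
    this.now = now;
    this.limitMs = limitMs;
    this.lastActivityMs = now();
  }

  // Call on every chat message or tool call to reset the idle window.
  touch(): void {
    this.lastActivityMs = this.now();
  }

  // True once the time since the last activity reaches the limit.
  shouldSuspend(): boolean {
    return this.now() - this.lastActivityMs >= this.limitMs;
  }
}
```

In production the clock would be Date.now; a periodic check of shouldSuspend() would then trigger the archive-and-suspend path.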

Why this shape

  • iOS → sandbox direct WebSocket: one fewer hop in the chat stream, lower latency, cheaper at scale.
  • Per-session VMs: the agent has read/write/exec on its own machine; nothing is shared across users.
  • Main server as orchestrator: persistence, auth, billing, search, push, callbacks. Everything that needs durable state.