Architecture overview

Mana has three moving parts: the iOS app, the main server, and the sandbox. Understanding how they fit together makes everything else easier.

The iOS app

The thing on your phone. Built in Swift 6 / SwiftUI for iOS 18+. Owns the chat UI, the library of apps you've built, widgets, the Live Activity, and a WKWebView pool that renders Mana apps as if they were native.

The main server

Hono on Vercel. Owns:

Authentication (JWT + argon2-hashed refresh tokens)
App / skill / user CRUD via a Drizzle + Supabase Postgres database
Sandbox pool orchestration (claims pre-warmed Fly machines atomically)
The toolkit APIs your Mana apps call at runtime — AI, storage, push, cron, email, db, Composio integrations
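
To make "claims pre-warmed Fly machines atomically" concrete, here is a minimal in-memory sketch of the idea. It is not Mana's implementation: the `Machine` shape, the `claimMachine` function, and the state names are all hypothetical. Against Postgres, the same guarantee is usually obtained with a single `UPDATE` whose row is selected using `FOR UPDATE SKIP LOCKED`; whether the main server does it that way is not stated here.

```typescript
// Hypothetical in-memory model of a pre-warmed machine pool.
type MachineState = "warm" | "claimed";

interface Machine {
  id: string;
  state: MachineState;
  sessionId: string | null;
}

// Claim the first warm machine for a session, or return null if none is left.
// The lookup and the state flip happen in one synchronous step, so two callers
// in the same process can never be handed the same machine.
function claimMachine(pool: Machine[], sessionId: string): Machine | null {
  const machine = pool.find((m) => m.state === "warm");
  if (machine === undefined) {
    return null;
  }
  machine.state = "claimed";
  machine.sessionId = sessionId;
  return machine;
}

const pool: Machine[] = [
  { id: "m1", state: "warm", sessionId: null },
  { id: "m2", state: "warm", sessionId: null },
];

const first = claimMachine(pool, "session-a");
const second = claimMachine(pool, "session-b");
const third = claimMachine(pool, "session-c"); // pool is exhausted by now
```

The property that matters is that a machine goes from warm to claimed exactly once, and a caller that finds the pool empty gets a clear "nothing available" result rather than a machine someone else already holds.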

The sandbox

A Fly.io machine, one per active session. Drives Claude Code via the Anthropic Agent SDK. Compiles Mana apps with esbuild, persists workspace state to R2 as a tar.gz, and exposes MCP tools that the agent calls (init_app, update_app, run_code, check_app, generate_image, upload_file).
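
One way to picture the tool surface is as a name-to-handler table. The sketch below uses the six tool names listed above, but everything else is an assumption: the handler signature, the stub bodies, and the `dispatchTool` helper are illustrative only and do not use the real MCP SDK.

```typescript
// Tool names taken from the list above; the rest of this block is hypothetical.
const TOOL_NAMES = [
  "init_app",
  "update_app",
  "run_code",
  "check_app",
  "generate_image",
  "upload_file",
] as const;

type ToolName = (typeof TOOL_NAMES)[number];
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<string>;

// Narrow an arbitrary string to a known tool name.
function isToolName(name: string): name is ToolName {
  return (TOOL_NAMES as readonly string[]).includes(name);
}

// Placeholder handlers that only report how they were called.
// Real handlers would compile the app, run code, upload files, and so on.
function makeStubHandlers(): Record<ToolName, ToolHandler> {
  const handlers = {} as Record<ToolName, ToolHandler>;
  for (const name of TOOL_NAMES) {
    handlers[name] = async (args) =>
      `${name} called with ${Object.keys(args).length} argument(s)`;
  }
  return handlers;
}

// Route a tool call from the agent to its handler; unknown names are rejected.
async function dispatchTool(
  handlers: Record<ToolName, ToolHandler>,
  name: string,
  args: ToolArgs,
): Promise<string> {
  if (!isToolName(name)) {
    throw new Error(`Unknown tool: ${name}`);
  }
  return handlers[name](args);
}
```

Keeping the names in one `as const` tuple means the compiler flags a missing handler as soon as a tool is added to the list.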

The data flow

iOS asks the main server for a sandbox. The server claims a pre-warmed Fly machine.
iOS opens a WebSocket directly to the sandbox. This is the chat stream — the main server is no longer in the request path.
The sandbox drives the agent. Tool calls hit the main server via HTTP callbacks for persistence; the agent's reply streams back to iOS as SSE.
The sandbox suspends after 3 minutes of inactivity and archives its workspace to R2 as a tar.gz. On the next session, the workspace is rehydrated from that archive.
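
The idle rule in the last step can be sketched as a small tracker that the sandbox would consult on a timer. The class name, the method names, and the choice of "greater than or equal to" at the boundary are assumptions for illustration; only the 3-minute limit comes from the description above.

```typescript
// 3 minutes, per the description above.
const IDLE_LIMIT_MS = 3 * 60 * 1000;

// Hypothetical helper: records the last activity time and reports when the
// idle limit has been reached. Time is passed in explicitly so the logic is
// easy to test without real timers.
class IdleTracker {
  private lastActivityMs: number;
  private readonly limitMs: number;

  constructor(nowMs: number, limitMs: number = IDLE_LIMIT_MS) {
    this.lastActivityMs = nowMs;
    this.limitMs = limitMs;
  }

  // Call on every chat message or tool call.
  touch(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // True once the limit has elapsed since the last activity.
  shouldSuspend(nowMs: number): boolean {
    return nowMs - this.lastActivityMs >= this.limitMs;
  }
}

const tracker = new IdleTracker(0);
const beforeLimit = tracker.shouldSuspend(IDLE_LIMIT_MS - 1); // false
tracker.touch(120000); // activity at the 2-minute mark resets the clock
const afterTouch = tracker.shouldSuspend(IDLE_LIMIT_MS); // false: only 1 minute idle
const atLimit = tracker.shouldSuspend(120000 + IDLE_LIMIT_MS); // true
```

When `shouldSuspend` returns true, the sandbox would archive the workspace and suspend; that part is left out because it depends on the R2 and Fly APIs.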

Why this shape

iOS → sandbox direct WebSocket: one fewer hop in the chat stream, lower latency, cheaper at scale.
Per-session VMs: the agent has read/write/exec on its own machine; nothing is shared across users.
Main server as orchestrator: persistence, auth, billing, search, push, callbacks. Everything that needs durable state.