Mana has four moving parts. Understanding how they fit together makes everything else easier.
The iOS app
The thing on your phone. Built in Swift 6 / SwiftUI for iOS 18+. Owns the chat UI, the library of apps you've built, widgets, the Live Activity, and a WKWebView pool that renders Mana apps as if they were native.
The main server
Hono on Vercel. Owns:
- Authentication (JWT + argon2-hashed refresh tokens)
- App / skill / user CRUD via a Drizzle + Supabase Postgres database
- Sandbox pool orchestration (claims pre-warmed Fly machines atomically)
- The toolkit APIs your Mana apps call at runtime — AI, storage, push, cron, email, db, Composio integrations
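
To make the "claims pre-warmed Fly machines atomically" idea concrete, here is a minimal in-memory sketch. It is not Mana's implementation: `SandboxPool`, `PooledMachine`, and the machine and session ids are hypothetical names chosen for illustration. In a Postgres-backed pool the same guarantee would more likely come from a single conditional update (for example a row lock with `FOR UPDATE SKIP LOCKED`), but this document does not say which mechanism the server uses.

```typescript
// Hypothetical shapes; the real pool schema is not shown in this document.
type MachineState = "warm" | "claimed";

interface PooledMachine {
  id: string;
  state: MachineState;
  claimedBy: string | null;
}

class SandboxPool {
  private machines = new Map<string, PooledMachine>();

  // Register a pre-warmed machine as available.
  addWarm(id: string): void {
    this.machines.set(id, { id, state: "warm", claimedBy: null });
  }

  // The check and the write happen in one synchronous step with no await
  // in between, so two sessions can never be handed the same machine.
  claim(sessionId: string): PooledMachine | null {
    for (const machine of this.machines.values()) {
      if (machine.state === "warm") {
        machine.state = "claimed";
        machine.claimedBy = sessionId;
        return { ...machine }; // return a copy so callers cannot mutate the pool
      }
    }
    return null; // pool exhausted; the caller would have to wait or boot a new machine
  }
}

const pool = new SandboxPool();
pool.addWarm("machine-a");
pool.addWarm("machine-b");

const first = pool.claim("session-1");
const second = pool.claim("session-2");
const third = pool.claim("session-3"); // no warm machines left
```

The property that matters is that a claim either returns a machine nobody else holds or returns nothing; it never returns the same machine twice.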
The sandbox
A Fly.io machine, one per active session. Drives Claude Code via the Anthropic Agent SDK. Compiles Mana apps with esbuild, persists workspace state to R2 as a tar.gz, and exposes MCP tools that the agent calls (init_app, update_app, run_code, check_app, generate_image, upload_file).
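
As a rough illustration of how a sandbox might route the agent's tool calls by name, the sketch below keeps a small registry keyed by the six tool names listed above. It does not use the MCP SDK, and `ToolRegistry`, `ToolHandler`, and `isToolName` are hypothetical helpers rather than part of Mana or of MCP; only the tool names come from this page.

```typescript
// Tool names exactly as listed in the text above.
const TOOL_NAMES = [
  "init_app",
  "update_app",
  "run_code",
  "check_app",
  "generate_image",
  "upload_file",
] as const;

type ToolName = (typeof TOOL_NAMES)[number];

// Hypothetical handler signature: untyped arguments in, text result out.
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;

// Narrow an arbitrary string coming from the agent to a known tool name.
function isToolName(name: string): name is ToolName {
  return (TOOL_NAMES as readonly string[]).includes(name);
}

class ToolRegistry {
  private handlers = new Map<ToolName, ToolHandler>();

  register(name: ToolName, handler: ToolHandler): void {
    this.handlers.set(name, handler);
  }

  // Reject unknown names and unregistered tools before running anything.
  async call(name: string, args: Record<string, unknown>): Promise<string> {
    if (!isToolName(name)) {
      throw new Error(`Unknown tool: ${name}`);
    }
    const handler = this.handlers.get(name);
    if (!handler) {
      throw new Error(`No handler registered for: ${name}`);
    }
    return handler(args);
  }
}

const registry = new ToolRegistry();
// Placeholder handler for illustration only; a real one would compile the app.
registry.register("check_app", async () => "ok");
```

Validating the name up front keeps an unexpected tool request from reaching any code that touches the workspace.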
The data flow
- iOS asks the main server for a sandbox. The server claims a pre-warmed Fly machine.
- iOS opens a WebSocket directly to the sandbox. This is the chat stream — the main server is no longer in the request path.
- The sandbox drives the agent. Tool calls hit the main server via HTTP callbacks for persistence; the agent's reply streams back to iOS as SSE.
- The sandbox suspends after 3 minutes of inactivity and archives its workspace to R2 as a tar.gz. On the next session, the workspace is rehydrated from that archive.
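
The idle-suspend step in the flow above can be sketched as a small tracker with an injected clock. This is an illustration under assumptions, not the sandbox's actual code: `IdleTracker`, `touch`, `tick`, and the `persist` callback are hypothetical names, and the real archive upload to R2 is replaced by a counter. Only the 3-minute limit comes from the text.

```typescript
const IDLE_LIMIT_MS = 3 * 60 * 1000; // 3 minutes, as stated in the text

type SandboxStatus = "running" | "suspended";

class IdleTracker {
  status: SandboxStatus = "running";
  private lastActivityMs: number;
  private persist: () => void;

  // `persist` stands in for "archive the workspace to R2" (hypothetical hook).
  constructor(nowMs: number, persist: () => void) {
    this.lastActivityMs = nowMs;
    this.persist = persist;
  }

  // Call on every chat message or tool call; resumes a suspended sandbox.
  touch(nowMs: number): void {
    if (this.status === "suspended") {
      this.status = "running"; // a real system would rehydrate the workspace here
    }
    this.lastActivityMs = nowMs;
  }

  // Call periodically; suspends once the idle limit has elapsed.
  tick(nowMs: number): void {
    const idleFor = nowMs - this.lastActivityMs;
    if (this.status === "running" && idleFor >= IDLE_LIMIT_MS) {
      this.persist(); // save state before suspending so nothing is lost
      this.status = "suspended";
    }
  }
}

let persistCount = 0;
const tracker = new IdleTracker(0, () => {
  persistCount += 1;
});

tracker.touch(120_000); // activity at the 2-minute mark resets the timer
tracker.tick(299_999); // 179,999 ms idle: just under the limit
const statusBeforeLimit = tracker.status;
tracker.tick(300_000); // 180,000 ms idle: limit reached
const statusAtLimit = tracker.status;
```

Passing the time in as a parameter instead of reading the system clock keeps the logic deterministic and easy to test.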
Why this shape
- iOS → sandbox direct WebSocket: one fewer hop in the chat stream, lower latency, cheaper at scale.
- Per-session VMs: the agent has read/write/exec on its own machine; nothing is shared across users.
- Main server as orchestrator: persistence, auth, billing, search, push, and callbacks. In short, everything that needs durable state lives here.