System Architecture
Below is a summary of the Production VPS and Development Laptop architectures. Both environments use Docker containers for consistency, with near-identical stacks where practical.
```mermaid
flowchart LR
%% Client
A(Browser / PWA)
Y(iOS App / Android App)
subgraph User
A
Y
end
%% LLM / Realtime
B(OpenAI Realtime API)
Z(Gemini Live API)
subgraph Large Language Model
B
Z
end
%% Server-side
C(Caddy)
I(Gitea + Actions + Repositories)
J(Gitea Runner)
D(Next.js Frontend)
E(FastAPI Backend + Agent Runtime)
G(LiveKit Server)
H[(PostgreSQL + pgvector)]
%% Client ↔ VPS
A <-- "https://www.avaaz.ai" --> C
A <-- "https://app.avaaz.ai" --> C
A & Y <-- "https://api.avaaz.ai" --> C
A & Y <-- "wss://rtc.avaaz.ai" --> C
A & Y <-- "udp://rtc.avaaz.ai:50000-60000 (WebRTC Media)" --> G
%% Caddy ↔ App
C <-- "http://frontend:3000 (app)" --> D
C <-- "http://backend:8000 (api)" --> E
C <-- "ws://livekit:7880 (WebRTC signaling)" --> G
C <-- "http://gitea:3000 (git)" --> I
%% App internal
D <-- "http://backend:8000" --> E
E <-- "postgresql://postgres:5432" --> H
E <-- "http://livekit:7880 (control)" --> G
E <-- "Agent joins via WebRTC" --> G
%% Agent ↔ LLM
E <-- "WSS/WebRTC (realtime)" --> B
E <-- "WSS (streaming)" --> Z
%% CI/CD
I <-- "CI/CD triggers" --> J
subgraph VPS
subgraph Infra
C
I
J
end
subgraph App
D
E
G
H
end
end
%% Development Environment
L(VS Code + Git + Docker)
M(Local Docker Compose)
N(Local Browser)
O(Local Frontend)
P(Local Backend)
Q[(Local Postgres)]
R(Local LiveKit)
L <-- "https://git.avaaz.ai/...git" --> C
L <-- "ssh://git@git.avaaz.ai:2222/..." --> I
L -- "docker compose up" --> M
M -- "Build & Run" --> O & P & Q & R
N <-- HTTP --> O & P
N <-- WebRTC --> R
O <-- HTTP --> P
P <-- SQL --> Q
P <-- HTTP/WebRTC --> R
P <-- WSS/WebRTC --> B
P <-- WSS --> Z
subgraph Development Laptop
L
M
N
subgraph Local App
O
P
Q
R
end
end
```
1. Production VPS
1.1 Components
Infra Stack
Docker Compose: ./infra/docker-compose.yml.
| Container | Description |
|---|---|
| `caddy` | Caddy – Reverse proxy with automatic HTTPS (TLS termination via Let’s Encrypt). |
| `gitea` | Gitea + Actions – Git server using SQLite. Automated CI/CD workflows. |
| `gitea-runner` | Gitea Runner – Executes CI/CD jobs defined in Gitea Actions workflows. |
App Stack
Docker Compose: ./app/docker-compose.yml.
| Container | Description |
|---|---|
| `frontend` | Next.js Frontend – SPA/PWA interface served from a Node.js-based Next.js server. |
| `backend` | FastAPI + Uvicorn Backend – API, auth, business logic, LiveKit orchestration, agent. |
| `postgres` | PostgreSQL + pgvector – Persistent relational database with vector search. |
| `livekit` | LiveKit Server – WebRTC signaling plus UDP media for real-time audio and data. |
The backend is built on several Python packages, including uv, Ruff, FastAPI, FastAPI Users, fastapi-pagination, FastStream, FastMCP, Pydantic, PydanticAI, pydantic-settings, the LiveKit Agents SDK, client libraries for the Google Gemini Live API and the OpenAI Realtime API, SQLAlchemy, Alembic, Docling, Gunicorn, Uvicorn[standard], Pyright, Pytest, Hypothesis, and HTTPX.
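To make the wiring concrete, here is a minimal, hypothetical sketch of a backend entrypoint assembled with FastAPI and pydantic-settings. The module name, settings fields, and defaults are illustrative assumptions, not the actual project layout.

```python
# app/main.py - illustrative sketch only; the real module layout and settings differ.
from contextlib import asynccontextmanager

from fastapi import FastAPI
from pydantic_settings import BaseSettings


class Settings(BaseSettings):
    """Environment-driven configuration (field names are assumptions)."""
    database_url: str = "postgresql+asyncpg://avaaz:avaaz@postgres:5432/avaaz"
    livekit_url: str = "ws://livekit:7880"
    livekit_api_key: str = ""
    livekit_api_secret: str = ""
    openai_api_key: str = ""


settings = Settings()  # values come from the environment / .env file


@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: create the DB engine, warm caches, start background workers, etc.
    yield
    # Shutdown: dispose connections.


app = FastAPI(title="Avaaz Backend", lifespan=lifespan)


@app.get("/health")
async def health() -> dict[str, str]:
    """Liveness probe of the kind used by the post-deploy health checks."""
    return {"status": "ok"}
```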
1.2 Network
- All containers join a shared `proxy` Docker network.
- Caddy can route to any service by container name.
- App services communicate internally:
  - Frontend ↔ Backend
  - Backend ↔ Postgres
  - Backend ↔ LiveKit
  - Backend (agent) ↔ LiveKit & external LLM realtime APIs
1.3 Public DNS Records
| Hostname | Record Type | Target | Purpose |
|---|---|---|---|
| www.avaaz.ai | CNAME | avaaz.ai | Marketing / landing site |
| avaaz.ai | A | 217.154.51.242 | Root domain |
| app.avaaz.ai | A | 217.154.51.242 | Next.js frontend (SPA/PWA) |
| api.avaaz.ai | A | 217.154.51.242 | FastAPI backend |
| rtc.avaaz.ai | A | 217.154.51.242 | LiveKit signaling + media |
| git.avaaz.ai | A | 217.154.51.242 | Gitea (HTTPS + SSH) |
1.4 Public Inbound Firewall Ports & Protocols
| Port | Protocol | Purpose |
|---|---|---|
| 80 | TCP | HTTP, ACME HTTP-01 challenge |
| 443 | TCP | HTTPS, WSS (frontend, backend, LiveKit) |
| 2222 | TCP | Git SSH via Gitea |
| 2885 | TCP | VPS SSH access |
| 3478 | UDP | STUN/TURN |
| 5349 | TCP | TURN over TLS |
| 7881 | TCP | LiveKit TCP fallback |
| 50000–60000 | UDP | LiveKit WebRTC media |
1.5 Routing
Caddy
Caddy routes traffic from public ports 80 and 443 to internal services.
- `https://www.avaaz.ai` → `http://frontend:3000`
- `https://app.avaaz.ai` → `http://frontend:3000`
- `https://api.avaaz.ai` → `http://backend:8000`
- `wss://rtc.avaaz.ai` → `ws://livekit:7880`
- `https://git.avaaz.ai` → `http://gitea:3000`
Internal Container Network
- `frontend` → `http://backend:8000`
- `backend` → `postgres://postgres:5432`
- `backend` → `http://livekit:7880` (control)
- `backend` → `ws://livekit:7880` (signaling)
- `backend` → `udp://livekit:50000-60000` (media)
- `gitea-runner` → `/var/run/docker.sock` (Docker API on host)
Outgoing
- `backend` → `https://api.openai.com/v1/realtime/sessions`
- `backend` → `wss://api.openai.com/v1/realtime?model=gpt-realtime`
- `backend` → `wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent`
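As an illustration of the outgoing realtime connection, the sketch below opens the OpenAI Realtime WebSocket listed above using the `websockets` library. Bearer authentication is standard, but the session-negotiation messages are omitted and the exact headers and event payloads should be checked against the current OpenAI Realtime documentation; treat this as an assumption-laden sketch.

```python
# Hypothetical sketch of the backend's outgoing realtime connection.
import asyncio
import json
import os

import websockets  # keyword is `additional_headers` in websockets>=14, `extra_headers` before

REALTIME_URL = "wss://api.openai.com/v1/realtime?model=gpt-realtime"


async def connect_realtime() -> None:
    headers = {"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"}
    async with websockets.connect(REALTIME_URL, additional_headers=headers) as ws:
        # The first server event typically describes the session; the payload
        # shape is an assumption here - consult the Realtime API reference.
        event = json.loads(await ws.recv())
        print(event.get("type"))


if __name__ == "__main__":
    asyncio.run(connect_realtime())
```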
1.6 Functional Layers
Data Layer
Infra:

- SQLite (Gitea)
  - Gitea stores Git metadata (users, repos, issues, Actions metadata) in `/data/gitea/gitea.db`.
  - This is a file-backed SQLite database inside a persistent Docker volume.
  - Repository contents are stored under `/data/git/`, also volume-backed.
- Gitea Runner State
  - The Gitea Actions runner stores its registration information and job metadata under `/data/.runner`.

App:

- PostgreSQL with pgvector
  - Primary relational database for users, lessons, transcripts, embeddings, and conversational context.
  - Hosted in the `postgres` container with a persistent Docker volume.
  - Managed via SQLAlchemy and Alembic migrations in the backend (see the model sketch after this list).
- LiveKit Ephemeral State
  - Room metadata, participant states, and signaling information persist in memory within the `livekit` container.
  - LiveKit’s SFU media buffers and room state are not persisted across restarts.
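A minimal sketch of how a pgvector-backed table could be declared with SQLAlchemy; the table, columns, and embedding dimension are hypothetical, and the real schema lives in the backend's models and Alembic migrations.

```python
# Hypothetical embedding table; real models and migrations live in the backend repo.
from pgvector.sqlalchemy import Vector
from sqlalchemy import String, Text
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column


class Base(DeclarativeBase):
    pass


class TranscriptChunk(Base):
    """One chunk of a lesson transcript plus its embedding for similarity search."""
    __tablename__ = "transcript_chunks"

    id: Mapped[int] = mapped_column(primary_key=True)
    session_id: Mapped[str] = mapped_column(String(64), index=True)
    content: Mapped[str] = mapped_column(Text)
    # 1536 dimensions is an assumption; match the embedding model actually used.
    embedding: Mapped[list[float]] = mapped_column(Vector(1536))
```

Similarity queries can then use the pgvector comparators, e.g. ordering by `TranscriptChunk.embedding.cosine_distance(query_vec)` and taking the top-k rows.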
Control Layer
Infra:

- Caddy
  - TLS termination (Let’s Encrypt).
  - Reverse proxy and routing for all public domains.
  - ACME certificate renewal.
- Gitea
  - Git hosting, pull/clone over SSH and HTTPS.
  - CI/CD orchestration via Actions and internal APIs.
- Gitea Runner
  - Executes workflows and controls the Docker engine via `/var/run/docker.sock`.

App:

- FastAPI Backend
  - Authentication and authorization (`/auth/login`, `/auth/refresh`, `/auth/me`).
  - REST APIs for lessons, progress, documents, and file handling.
  - LiveKit session management (room mapping via `/sessions/default`, token minting via `/sessions/default/token`, agent configuration; see the token sketch after this list).
  - Calls out to OpenAI Realtime and Gemini Live APIs for AI-driven conversational behavior.
- LiveKit Server
  - Manages room signaling, participant permissions, and session state.
  - Exposes an HTTP control endpoint for room and participant management.
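For the token-minting responsibility above, here is a sketch using the `livekit-api` Python package. The identity, room name, TTL, and helper name are placeholders; in the real backend this logic sits behind the `POST /sessions/default/token` route.

```python
# Hypothetical token-minting helper behind POST /sessions/default/token.
from datetime import timedelta

from livekit import api


def mint_session_token(identity: str, room: str, api_key: str, api_secret: str) -> str:
    """Return a short-lived, room-scoped LiveKit access token (JWT)."""
    token = (
        api.AccessToken(api_key, api_secret)
        .with_identity(identity)
        .with_ttl(timedelta(minutes=15))  # short-lived; affects only the initial join
        .with_grants(
            api.VideoGrants(
                room_join=True,
                room=room,
                can_publish=True,
                can_subscribe=True,
            )
        )
    )
    return token.to_jwt()
```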
Media Layer
App:

- User Audio Path
  - Browser/mobile → LiveKit:
    - WSS signaling via `rtc.avaaz.ai` → Caddy → `livekit:7880`.
    - UDP audio and data channels via `rtc.avaaz.ai:50000–60000` directly to LiveKit on the VPS.
  - WebRTC handles ICE, STUN/TURN, jitter buffers, and Opus audio encoding.
- AI Agent Audio Path
  - The agent logic inside the backend uses the LiveKit Agents SDK to join rooms as a participant (see the worker sketch after this list).
  - Agent → LiveKit:
    - WS signaling over the internal Docker network (`ws://livekit:7880`).
    - UDP audio transport as part of its WebRTC session.
  - Agent → LLM realtime API:
    - Secure WSS/WebRTC connection to OpenAI Realtime or Gemini Live.
  - The agent transcribes, processes, and generates audio responses, publishing them into the LiveKit room so the user hears natural speech.
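A minimal sketch of an agent worker joining a room with the LiveKit Agents SDK. The entrypoint body is a placeholder: the real agent also bridges audio between the room and the OpenAI/Gemini realtime APIs.

```python
# Hypothetical agent worker skeleton; the real agent wires in the realtime LLM session.
from livekit import agents


async def entrypoint(ctx: agents.JobContext) -> None:
    """Called for each assigned job: connect to the room and start the agent logic."""
    await ctx.connect()  # joins the LiveKit room as a participant over WebRTC
    # ... subscribe to the user's audio track, stream it to the LLM realtime API,
    # and publish the synthesized response audio back into the room ...


if __name__ == "__main__":
    # LIVEKIT_URL / LIVEKIT_API_KEY / LIVEKIT_API_SECRET are read from the environment.
    agents.cli.run_app(agents.WorkerOptions(entrypoint_fnc=entrypoint))
```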
1.7 CI/CD Pipeline
Production CI/CD is handled by Gitea Actions running on the VPS. The gitea-runner container has access to the host Docker daemon and is responsible for both validation and deployment:
- `.gitea/workflows/ci.yml` – Continuous Integration (branch/PR validation, no deployment).
- `.gitea/workflows/cd.yml` – Continuous Deployment (tag-based releases to production).
Build Phase (CI Workflow: ci.yml)
Triggers
- `push` to:
  - `feature/**`
  - `bugfix/**`
- `pull_request` targeting `main`.
Runner & Environment
- Runs on the self-hosted runner labeled `linux_amd64`.
- Checks out the relevant branch or PR commit from the `avaaz-app` repository into the runner’s workspace.
Steps
- Checkout code
  - Uses `actions/checkout@v4` to fetch the branch or PR head commit.
- Report triggering context
  - Logs the event type (`push` or `pull_request`) and branches:
    - For `push`: the source branch (e.g., `feature/foo`).
    - For `pull_request`: source and target (`main`).
- Static analysis & tests
  - Run linters, type checkers, and unit tests for backend and frontend.
  - Ensure the application code compiles/builds.
- Build Docker images for CI
  - Build images (e.g., `frontend:ci` and `backend:ci`) to validate Dockerfiles and the build chain.
  - These images are tagged for CI only and not used for production.
- Cleanup CI images
  - Remove CI-tagged images at the end of the job (even on failure) to prevent disk usage from accumulating.
Outcome
- A green CI result on a branch/PR signals that:
  - The code compiles/builds.
  - Static checks and tests pass.
  - Docker images can be built successfully.
- CI does not modify the production stack and does not depend on tags.
Deploy Phase (CD Workflow: cd.yml)
Triggers
- Creation of a Git tag matching `v*` that points to a commit on the `main` branch in the `avaaz-app` repository.
Runner & Environment
- Runs on the same `linux_amd64` self-hosted runner.
- Checks out the exact commit referenced by the tag.
Steps
- Checkout tagged commit
  - Uses `actions/checkout@v4` with `ref: ${{ gitea.ref }}` to check out the tagged commit.
- Tag validation
  - Fetches `origin/main`.
  - Verifies that the tag commit is an ancestor of `origin/main` (i.e., the tag points to code that has been merged into `main`).
  - Fails the deployment if the commit is not in `main`’s history.
- Build & publish release
  - Builds production Docker images for frontend, backend, LiveKit, etc., tagged with the version (e.g., `v0.1.0`).
  - Applies database migrations (e.g., via Alembic) if required.
- Restart production stack
  - Restarts or recreates the app stack containers using the newly built/tagged images (e.g., via `docker compose -f docker-compose.yml up -d`).
- Health & readiness checks
  - Probes key endpoints with `curl -f` (a Python equivalent is sketched after this list), such as:
    - `https://app.avaaz.ai`
    - `https://api.avaaz.ai/health`
    - `wss://rtc.avaaz.ai` (signaling-level check)
  - If checks fail, marks the deployment as failed and automatically rolls back to previous images.
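For illustration, a Python equivalent of the HTTP probes using HTTPX; the endpoint list mirrors the examples above, and the `wss://` signaling check is deliberately left out because it needs a WebSocket client.

```python
# Hypothetical post-deploy smoke test, roughly equivalent to the `curl -f` probes.
import sys

import httpx

ENDPOINTS = [
    "https://app.avaaz.ai",
    "https://api.avaaz.ai/health",
]


def main() -> int:
    failed = False
    for url in ENDPOINTS:
        try:
            r = httpx.get(url, timeout=10.0, follow_redirects=True)
            r.raise_for_status()  # same spirit as curl -f: any non-2xx is a failure
            print(f"OK   {url} -> {r.status_code}")
        except httpx.HTTPError as exc:
            print(f"FAIL {url} -> {exc}")
            failed = True
    # A signaling-level check of wss://rtc.avaaz.ai would need a WebSocket client
    # (e.g. the websockets library) and is omitted here.
    return 1 if failed else 0


if __name__ == "__main__":
    sys.exit(main())
```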
Outcome
- Only tagged releases whose commits are on the `main` branch are deployed.
- Deployment is explicit (tag-based), separated from CI validation.
1.8 Typical Workflows
User Login
- Browser loads the frontend from `https://app.avaaz.ai`.
- Frontend submits credentials to `POST https://api.avaaz.ai/auth/login`.
- Backend validates credentials and returns:
  - A short-lived JWT access token
  - A long-lived opaque refresh token
  - A minimal user profile for immediate UI hydration
- Frontend stores tokens appropriately (access token in memory; refresh token in secure storage or an httpOnly cookie).
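A sketch of this exchange from a non-browser client using HTTPX; the JSON field names in the request and response are assumptions, and in production the frontend performs the same calls from the browser.

```python
# Hypothetical login call; exact request/response field names may differ.
import httpx

API = "https://api.avaaz.ai"

with httpx.Client(base_url=API, timeout=10.0) as client:
    resp = client.post(
        "/auth/login",
        json={"email": "user@example.com", "password": "password"},  # placeholder credentials
    )
    resp.raise_for_status()
    body = resp.json()
    access_token = body["access_token"]    # short-lived JWT (field name assumed)
    refresh_token = body["refresh_token"]  # long-lived opaque token (field name assumed)

    # Authenticated follow-up call using the bearer token:
    me = client.get("/auth/me", headers={"Authorization": f"Bearer {access_token}"})
    me.raise_for_status()
    print(me.json())
```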
Load Persistent Session
- Frontend calls `GET https://api.avaaz.ai/sessions/default`.
- Backend retrieves or creates the user’s persistent conversational session, which encapsulates:
  - Long-running conversation state
  - Lesson and progress context
  - Historical summary for LLM context initialization
- Backend prepares the session’s LLM context so that the agent can join with continuity.
Join the Live Conversation Session
- Frontend requests a LiveKit access token via `POST https://api.avaaz.ai/sessions/default/token`.
- Backend generates a new LiveKit token (short-lived, room-scoped) containing:
  - Identity
  - Publish/subscribe permissions
  - Expiration (affecting the initial join)
  - Room ID corresponding to the session
- Frontend connects to the LiveKit server:
  - WSS for signaling
  - UDP/SCTP for low-latency audio and file transfer
- If the user disconnects, the frontend requests a new LiveKit token before rejoining, ensuring seamless continuity.
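In the browser this join flow runs through the LiveKit JS client, but the same sequence can be sketched in Python with the `livekit` realtime SDK. The response field name and connection details are assumptions.

```python
# Hypothetical join flow: fetch a room-scoped token, then connect over WebRTC.
import asyncio

import httpx
from livekit import rtc


async def join_session(access_token: str) -> None:
    async with httpx.AsyncClient(base_url="https://api.avaaz.ai") as client:
        resp = await client.post(
            "/sessions/default/token",
            headers={"Authorization": f"Bearer {access_token}"},
        )
        resp.raise_for_status()
        lk_token = resp.json()["token"]  # field name assumed

    room = rtc.Room()
    # Signaling goes over wss://rtc.avaaz.ai; media flows over UDP 50000-60000.
    await room.connect("wss://rtc.avaaz.ai", lk_token)
    print("connected as", room.local_participant.identity)
    await room.disconnect()


if __name__ == "__main__":
    asyncio.run(join_session("ACCESS_TOKEN"))
```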
Conversation with AI Agent
- Backend configures the session’s AI agent using:
  - Historical summary
  - Current lesson state
  - Language settings and mode (lesson, mock exam, free talk)
- The agent joins the same LiveKit room as a participant.
- All media flows through LiveKit:
  - User → audio → LiveKit → Agent
  - Agent → LLM realtime API → synthesized audio → LiveKit → User
- The agent guides the user verbally: continuing lessons, revisiting material, running mock exams, or free conversation.
The user experiences this as a continuous, ongoing session with seamless reconnection and state persistence.
1.9 Hardware
| Class | Description |
|---|---|
| system | Standard PC (i440FX + PIIX, 1996) |
| bus | Motherboard |
| memory | 96KiB BIOS |
| processor | AMD EPYC-Milan Processor |
| memory | 8GiB System Memory |
| bridge | 440FX - 82441FX PMC [Natoma] |
| bridge | 82371SB PIIX3 ISA [Natoma/Triton II] |
| communication | PnP device PNP0501 |
| input | PnP device PNP0303 |
| input | PnP device PNP0f13 |
| storage | PnP device PNP0700 |
| system | PnP device PNP0b00 |
| storage | 82371SB PIIX3 IDE [Natoma/Triton II] |
| bus | 82371SB PIIX3 USB [Natoma/Triton II] |
| bus | UHCI Host Controller |
| input | QEMU USB Tablet |
| bridge | 82371AB/EB/MB PIIX4 ACPI |
| display | QXL paravirtual graphic card |
| generic | Virtio RNG |
| storage | Virtio block device |
| disk | 257GB Virtual I/O device |
| volume | 238GiB EXT4 volume |
| volume | 4095KiB BIOS Boot partition |
| volume | 105MiB Windows FAT volume |
| volume | 913MiB EXT4 volume |
| network | Virtio network device |
| network | Ethernet interface |
| input | Power Button |
| input | AT Translated Set 2 keyboard |
| input | VirtualPS/2 VMware VMMouse |
2. Development Laptop
2.1 Components
App Stack (local Docker)
- `frontend` (Next.js SPA)
- `backend` (FastAPI)
- `postgres` (PostgreSQL + pgvector)
- `livekit` (local LiveKit Server)
No Caddy is deployed locally; the browser talks directly to the mapped container ports on localhost.
2.2 Network
- All services run as Docker containers on a shared Docker network.
- Selected ports are published to `localhost` for direct access from the browser and local tools.
- No public domains are used in development; everything is addressed via `http://localhost/...`.
2.3 Domains & IP Addresses
Local development uses:
- `http://localhost:3000` → frontend (Next.js dev/server container)
- `http://localhost:8000` → backend API (FastAPI)
  - Example auth/session endpoints:
    - `POST http://localhost:8000/auth/login`
    - `GET http://localhost:8000/sessions/default`
    - `POST http://localhost:8000/sessions/default/token`
- `ws://localhost:7880` → LiveKit signaling (local LiveKit server)
- `udp://localhost:50000–60000` → LiveKit/WebRTC media
No /etc/hosts changes or TLS certificates are required; localhost acts as a secure origin for WebRTC.
2.4 Ports & Protocols
| Port | Protocol | Purpose |
|---|---|---|
| 3000 | TCP | Frontend (Next.js) |
| 8000 | TCP | Backend API (FastAPI) |
| 5432 | TCP | Postgres + pgvector |
| 7880 | TCP | LiveKit HTTP + WS signaling |
| 50000–60000 | UDP | LiveKit WebRTC media (audio, data) |
2.5 Routing
No local Caddy or reverse proxy layer is used; routing is direct via published ports.
Internal Container Routing (Docker network)
- Backend → Postgres: `postgres://postgres:5432`
- Backend → LiveKit: `http://livekit:7880`
- Frontend (server-side) → Backend: `http://backend:8000`
Browser → Containers (via localhost)
- Browser → Frontend: `http://localhost:3000`
- Browser → Backend API: `http://localhost:8000`
Outgoing (from Backend)
- `backend` → `https://api.openai.com/v1/realtime/sessions`
- `backend` → `wss://api.openai.com/v1/realtime?model=gpt-realtime`
- `backend` → `wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent`
These calls mirror production agent behavior while pointing to the same cloud LLM realtime endpoints.
2.6 Functional Layers
Data Layer
- Local Postgres instance mirrors the production schema (including pgvector).
- Database migrations are applied via backend tooling (e.g., Alembic) to keep schema in sync.
Control Layer
- Backend runs full application logic locally:
  - Authentication and authorization
  - Lesson and progress APIs
  - LiveKit session management (`/sessions/default`, `/sessions/default/token`) and agent control
- Frontend integrates against the same API surface as production, only with `localhost` URLs.
Media Layer
- Local LiveKit instance handles:
  - WS/HTTP signaling on port 7880
  - WebRTC media (audio + data channels) on UDP 50000–60000
- Agent traffic mirrors production logic:
  - LiveKit ↔ Backend ↔ LLM realtime APIs (OpenAI / Gemini).
2.7 Typical Workflows
Developer Pushes Code
- Developer pushes to `git.avaaz.ai` over HTTPS or SSH.
- CI runs automatically (linting, tests, build validation). No deployment occurs.
- When a release is ready, the developer creates a version tag (`v*`) on a commit in `main`.
- CD triggers: validates the tag, rebuilds from the tagged commit, deploys updated containers, then performs post-deploy health checks.
App Development
- Start the stack: `docker compose -f docker-compose.dev.yml up -d`
- Open the app in the browser: `http://localhost:3000`
- Frontend calls the local backend for:
  - `POST http://localhost:8000/auth/login`
  - `GET http://localhost:8000/sessions/default`
  - `POST http://localhost:8000/sessions/default/token`
API Testing
- Health check:

  ```bash
  curl http://localhost:8000/health
  ```

- Auth and session testing:

  ```bash
  curl -X POST http://localhost:8000/auth/login \
    -H "Content-Type: application/json" \
    -d '{"email": "user@example.com", "password": "password"}'

  curl http://localhost:8000/sessions/default \
    -H "Authorization: Bearer <access_token>"
  ```
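The same checks can be automated with the Pytest + HTTPX stack the backend already uses. A minimal sketch, assuming the login payload and response fields shown above:

```python
# test_smoke.py - hypothetical local smoke tests; run with `pytest -q`.
import httpx
import pytest

BASE_URL = "http://localhost:8000"


def test_health() -> None:
    resp = httpx.get(f"{BASE_URL}/health", timeout=5.0)
    assert resp.status_code == 200


def test_login_and_load_session() -> None:
    login = httpx.post(
        f"{BASE_URL}/auth/login",
        json={"email": "user@example.com", "password": "password"},
        timeout=5.0,
    )
    if login.status_code == 401:
        pytest.skip("test user not seeded in the local database")
    assert login.status_code == 200
    token = login.json()["access_token"]  # field name assumed

    session = httpx.get(
        f"{BASE_URL}/sessions/default",
        headers={"Authorization": f"Bearer {token}"},
        timeout=5.0,
    )
    assert session.status_code == 200
```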
LiveKit Testing
- Frontend connects to LiveKit via:
  - Signaling: `ws://localhost:7880`
  - WebRTC media: `udp://localhost:50000–60000`
- Backend issues local LiveKit tokens via `POST http://localhost:8000/sessions/default/token`, then connects the AI agent to the local room.
2.8 Hardware
| Class | Description |
|---|---|
| system | HP Laptop 14-em0xxx |
| bus | 8B27 motherboard bus |
| memory | 128KiB BIOS |
| processor | AMD Ryzen 3 7320U |
| memory | 256KiB L1 cache |
| memory | 2MiB L2 cache |
| memory | 4MiB L3 cache |
| memory | 8GiB System Memory |
| bridge | Family 17h-19h PCIe Root Complex |
| generic | Family 17h-19h IOMMU |
| storage | SK hynix BC901 HFS256GE SSD |
| disk | 256GB NVMe disk |
| volume | 299MiB Windows FAT volume |
| volume | 238GiB EXT4 volume |
| network | RTL8852BE PCIe 802.11ax Wi-Fi |
| display | Mendocino integrated graphics |
| multimedia | Rembrandt Radeon High Definition Audio |
| generic | Family 19h PSP/CCP |
| bus | AMD xHCI Host Controller |
| input | Logitech M705 Mouse |
| input | Logitech K370s/K375s Keyboard |
| multimedia | Jabra SPEAK 510 USB |
| multimedia | Logitech Webcam C925e |
| communication | Bluetooth Radio |
| multimedia | HP True Vision HD Camera |
| bus | FCH SMBus Controller |
| bridge | FCH LPC Bridge |
| power | AE03041 Battery |
| input | Power Button |
| input | Lid Switch |
| input | HP WMI Hotkeys |
| input | AT Translated Set 2 Keyboard |
| input | Video Bus |
| input | SYNA32D9:00 06CB:CE17 Mouse |
| input | SYNA32D9:00 06CB:CE17 Touchpad |
| network | Ethernet Interface |