# Product Description
<img src="img/logo.png" alt="avaaz.ai" height="90" align="right"/>
<p>
<b>avaaz.ai</b> is a mobile and web application featuring a motivating <b>conversational AI tutor</b> powered by advanced agentic capabilities. It teaches oral language skills through structured, interactive lessons that adapt to each student's pace and performance. The core goal is to help students speak new languages confidently and <b>pass the B2 oral proficiency exam</b>.
</p>
## 1. Features
1. **Voice-First Conversational Engine** — Students engage in ultra-low-latency speech-to-speech interaction with the AI Tutor, enabling natural dialogue and instant corrective feedback using speech, text, and visuals.
2. **CEFR-B2 Aligned Curriculum with Real-Time AI Practice** — A full CEFR-based speaking progression up to B2, seamlessly integrated with adaptive AI conversation to bridge passive knowledge and active speaking skills.
3. **Immigrant-Focused Real-Life Scenarios** — Lessons target real-world contexts relevant to immigrants, such as workplace, healthcare, school, or daily interactions, enhancing integration and confidence in practical use.
4. **Mock Oral Exam Mode** — Simulates B2 oral exams and citizenship interviews with timed prompts, rubrics, and examiner-style feedback to build test readiness.
5. **Multilingual Scaffolding and Integrated Translation** — Learners receive UI support, bilingual explanations, and on-demand translations in their native language, helping low-confidence speakers stay engaged.
6. **Comprehensive Speaking Feedback** — Beyond pronunciation and grammar, learners get targeted insights on fluency, phrasing, coherence, and vocabulary range, aligned with B2 standards.
7. **Accent and Cultural Adaptation** — Lessons reflect local dialects and cultural etiquette relevant to the learner's destination country, supporting realistic and socially appropriate speech.
8. **Immersive Role-Plays with Visual Cues** — Speaking simulations are enhanced with contextual images (e.g. menus, documents, locations) to deepen realism and task-based practice.
9. **Gamified Progress and Motivation** — Daily speaking challenges, streak rewards, and level badges reinforce habit formation and learning consistency.
10. **Post-Session Performance Summaries** — After each session, learners receive a concise report of strengths, errors, and suggested improvements to reinforce progress.
11. **Customizable AI Tutor Voices** — Learners can choose from various voice options with different genders or regional accents to diversify listening practice and personalize interaction.
12. **Cross-Device Learning Continuity** — Seamless synchronization ensures progress tracking and learning state remain consistent across mobile and web platforms.
## 2. Student & AI Tutor Interactions (English → Norwegian Bokmål)
### 2.1 Real-Time Conversational Flow
**Scenario:** A beginner is learning everyday greetings.
**AI Tutor (in Norwegian):** “Hei! Hvordan går det?”
**Student:** “Uh… I think that means, ‘How are you?’”
**AI Tutor:** “Exactly! You can answer: *Jeg har det bra* — that means ‘I’m doing well.’ Try it: *Jeg har det bra.*”
**Student:** “Jeg har det bra.”
**AI Tutor:** “Flott! Literally, it means ‘I have it good.’ Norwegians use that to say they’re fine.”
**Benefit:** The AI provides instant translation and cultural context, helping English speakers grasp meaning and tone through natural conversation.
### 2.2 Adaptive Pronunciation, Fluency, and Grammar Support
**Scenario:** The learner practices sentence structure and pronunciation.
**AI Tutor:** “Let’s say: *Jeg bor i Oslo* — ‘I live in Oslo.’”
**Student:** “Jeg bor på Oslo.”
**AI Tutor:** “Almost right! We say *i Oslo*, not *på Oslo*. *I* means ‘in’, while *på* means ‘on’. Try again: *Jeg bor i Oslo.*”
**Student:** “Jeg bor i Oslo.”
**AI Tutor:** “Perfekt! Now, about pronunciation — it’s *Oslo*, not *Osh-lo.* The *s* is soft. Listen: *Oslo.*”
**Student:** “Oslo.”
**AI Tutor:** “Bra! That sounded natural.”
**Benefit:** The AI corrects grammar and pronunciation simultaneously, giving immediate, native-level feedback tailored to English-speaking learners.
### 2.3 Scenario-Based Oral Practice
**Scenario:** The learner prepares for a B2-level job interview.
**AI Tutor:** “Du er på et jobbintervju. Spørsmålet er: *Hvorfor vil du jobbe her?* — ‘Why do you want to work here?’”
**Student:** “Fordi jeg liker din selskap.”
**AI Tutor:** “Good start! In Norwegian, we say *firmaet ditt* instead of *din selskap*. Try: *Fordi jeg liker firmaet ditt og arbeidsmiljøet her* — ‘Because I like your company and the work environment here.’”
**Student:** “Fordi jeg liker firmaet ditt og arbeidsmiljøet her.”
**AI Tutor:** “Veldig bra! That sounded confident and natural.”
**Benefit:** The learner rehearses authentic B2-level oral exam tasks and professional interactions with context-sensitive correction and adaptive feedback.
## 3. Pricing Plans
| **Plan** | **Hours** | **NOK** | **Purpose** |
| --------------------------------- | :-------: | ------------: | ------------------------------------------------------------------- |
| **First Light** | 1 | **Gratis** | A taste of the experience - no pressure, no card required. |
| **Spark** | 5 | **kr 999** | A short, flexible plan to get started or focus on specific skills. |
| **Glow** | 15 | **kr 1 999** | Great for consistent improvement and practicing real conversations. |
| **⭐ Shine** *(Recommended)* | 50 | **kr 5 999** | The sweet spot for building natural fluency and confidence. |
| **Radiance** | 200 | **kr 17 999** | Designed for dedicated learners seeking transformation. |
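As a rough comparison across plans, the effective hourly rate drops as the plan size grows; a quick shell sketch (integer NOK per hour, rounded down) using the figures from the table above:

```bash
# Effective hourly rate per paid plan, derived from the pricing table (rounded down)
for plan in "Spark:5:999" "Glow:15:1999" "Shine:50:5999" "Radiance:200:17999"; do
  name=${plan%%:*}; rest=${plan#*:}
  hours=${rest%%:*}; nok=${rest#*:}
  rate=$(( nok / hours ))
  echo "$name: $rate kr/hour"
done
```

Spark works out to 199 kr/hour, while Radiance drops to about 90 kr/hour, which makes the positioning of the larger plans explicit.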
## 4. Configuration
### 4.1 Configure the VPS
#### 4.1.1 Configure the firewall at the VPS host
| Public IP |
| :------------: |
| 217.154.51.242 |
| Action | Allowed IP | Protocol | Port(s) | Description |
| :-----: | :--------: | :------: | ----------: | :------------ |
| Allow | Any | TCP | 80 | HTTP |
| Allow | Any | TCP | 443 | HTTPS |
| Allow | Any | TCP | 2222 | Git SSH |
| Allow | Any | TCP | 2885 | VPS SSH |
| Allow | Any | UDP | 3478 | STUN/TURN |
| Allow | Any | TCP | 5349 | TURN/TLS |
| Allow | Any | TCP | 7881 | LiveKit TCP |
| Allow | Any | UDP | 50000-60000 | LiveKit Media |
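If the hosting provider's panel does not offer a firewall, roughly the same policy can be applied on the VPS itself with `ufw` (a sketch, assuming Ubuntu's ufw is installed; allow the SSH port first so you do not lock yourself out):

```bash
sudo ufw allow 2885/tcp          # VPS SSH (allow this FIRST)
sudo ufw allow 80/tcp            # HTTP
sudo ufw allow 443/tcp           # HTTPS
sudo ufw allow 2222/tcp          # Git SSH
sudo ufw allow 3478/udp          # STUN/TURN
sudo ufw allow 5349/tcp          # TURN/TLS
sudo ufw allow 7881/tcp          # LiveKit TCP
sudo ufw allow 50000:60000/udp   # LiveKit media
sudo ufw enable
```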
#### 4.1.2 Configure the DNS settings at domain registrar
| Host (avaaz.ai) | Type | Value |
| :-------------: | :---: | :------------: |
| @ | A | 217.154.51.242 |
| www | CNAME | avaaz.ai |
| app | A | 217.154.51.242 |
| api | A | 217.154.51.242 |
| rtc | A | 217.154.51.242 |
| git | A | 217.154.51.242 |
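Propagation of these records can be spot-checked from any machine with `dig` (from the dnsutils package); every host should eventually resolve to the public IP above:

```bash
# Print the resolved address for each public hostname
for host in avaaz.ai www.avaaz.ai app.avaaz.ai api.avaaz.ai rtc.avaaz.ai git.avaaz.ai; do
  echo "$host -> $(dig +short "$host" A | tail -n1)"
done
```

`www` resolves through the CNAME, so `dig +short` prints the chain; `tail -n1` keeps only the final IP.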
#### 4.1.3 Change the SSH port from 22 to 2885
1. Connect to the server.
```bash
ssh username@avaaz.ai
```
2. Edit the SSH configuration file.
```bash
sudo nano /etc/ssh/sshd_config
```
3. Add port 2885 to the file and comment out port 22.
```text
#Port 22
Port 2885
```
4. Save the file and exit the editor.
- Press `Ctrl+O`, then `Enter` to save, and `Ctrl+X` to exit.
5. Restart the SSH service.
```bash
sudo systemctl daemon-reload && sudo systemctl restart ssh.socket && sudo systemctl restart ssh.service
```
6. **Before closing the current session**, open a new terminal window and connect to the server to verify the changes work correctly.
```bash
ssh username@avaaz.ai # ssh: connect to host avaaz.ai port 22: Connection timed out
ssh username@avaaz.ai -p 2885
```
7. Once the connection is successful, close the original session safely.
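Optionally, record the non-standard port in the laptop's `~/.ssh/config` so later commands can omit `-p 2885` (a hypothetical host alias; adjust the user name):
```text
Host avaaz
    HostName avaaz.ai
    User username
    Port 2885
```
With this entry, `ssh avaaz` is equivalent to `ssh username@avaaz.ai -p 2885`.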
#### 4.1.4 Build and deploy the infrastructure
1. Check with `dig git.avaaz.ai +short` whether the DNS settings have propagated.
2. SSH into the VPS to install Docker & docker compose.
```bash
ssh username@avaaz.ai -p 2885
```
3. Update system packages.
```bash
sudo apt update && sudo apt upgrade -y
```
4. Install dependencies for Docker's official repo.
```bash
sudo apt install -y \
ca-certificates \
curl \
gnupg \
lsb-release
```
5. Add Docker's official APT repo.
```bash
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | \
sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) \
signed-by=/etc/apt/keyrings/docker.gpg] \
https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
```
6. Install Docker Engine + compose plugin.
```bash
sudo apt install -y \
docker-ce \
docker-ce-cli \
containerd.io \
docker-buildx-plugin \
docker-compose-plugin
```
7. Verify the installation.
```bash
sudo docker --version
sudo docker compose version
```
8. Create the `/etc/docker/daemon.json` file to keep container logs from filling the disk.
```bash
sudo nano /etc/docker/daemon.json
```
9. Paste the following.
```json
{
"log-driver": "local",
"log-opts": {
"max-size": "10m",
"max-file": "3"
}
}
```
10. Save the file and exit the editor.
- Press `Ctrl+O`, then `Enter` to save, and `Ctrl+X` to exit.
11. Restart the Docker service to apply the changes.
```bash
sudo systemctl daemon-reload
sudo systemctl restart docker
```
12. Create a directory for the infra stack at `/srv/infra`.
```bash
sudo mkdir -p /srv/infra
sudo chown -R $USER:$USER /srv/infra
cd /srv/infra
```
13. Create directories for Gitea (repos, config, etc.) and Runner persistent data. Gitea runs as UID/GID 1000 by default.
```bash
mkdir -p gitea-data gitea-runner-data
```
14. Create the `/srv/infra/docker-compose.yml` (Caddy + Gitea + Runner) file.
```bash
nano docker-compose.yml
```
15. Paste the following.
```yaml
services:
caddy:
# Use the latest official Caddy image
image: caddy:latest
# Docker Compose automatically generates container names: <folder>_<service>_<index>
container_name: caddy # Fixed name used by Docker engine
# Automatically restart unless manually stopped
restart: unless-stopped
ports:
# Expose HTTP (ACME + redirect)
- "80:80"
# Expose HTTPS/WSS (frontend, backend, LiveKit)
- "443:443"
volumes:
# Mount the Caddy config file read-only
- ./Caddyfile:/etc/caddy/Caddyfile:ro
# Caddy TLS certs (persistent Docker volume)
- caddy_data:/data
# Internal Caddy state/config
- caddy_config:/config
networks:
# Attach to the shared "proxy" network
- proxy
gitea:
# Official Gitea image with built-in Actions
image: gitea/gitea:latest
container_name: gitea # Fixed name used by Docker engine
# Auto-restart service
restart: unless-stopped
environment:
# Run Gitea as host user 1000 (prevents permission issues)
- USER_UID=1000
# Same for group
- USER_GID=1000
# Use SQLite (stored inside /data)
- GITEA__database__DB_TYPE=sqlite3
# Location of the SQLite DB
- GITEA__database__PATH=/data/gitea/gitea.db
# Custom config directory
- GITEA_CUSTOM=/data/gitea
volumes:
# Bind mount instead of Docker volume because:
# - We want repos, configs, SSH keys, and SQLite DB **visible and editable** on host
# - Easy backups (just copy `./gitea-data`)
# - Easy migration
# - Avoids losing data if Docker volumes are pruned
- ./gitea-data:/data
networks:
- proxy
ports:
# SSH for Git operations mapped to host 2222
- "2222:22"
gitea-runner:
# Official Gitea Actions Runner
image: gitea/act_runner:latest
container_name: gitea-runner # Fixed name used by Docker engine
restart: unless-stopped
depends_on:
# Runner requires Gitea to be available
- gitea
volumes:
# Runner uses host Docker daemon to spin up job containers (Docker-out-of-Docker)
- /var/run/docker.sock:/var/run/docker.sock
# Bind mount instead of volume because:
# - Runner identity is stored in /data/.runner
# - Must persist across container recreations
# - Prevents duplicated runner registrations in Gitea
# - Easy to inspect/reset via `./gitea-runner-data/.runner`
- ./gitea-runner-data:/data
environment:
# Base URL of your Gitea instance
- GITEA_INSTANCE_URL=${GITEA_INSTANCE_URL}
# One-time registration token
- GITEA_RUNNER_REGISTRATION_TOKEN=${GITEA_RUNNER_REGISTRATION_TOKEN}
# Human-readable name for the runner
- GITEA_RUNNER_NAME=${GITEA_RUNNER_NAME}
# Runner labels (e.g., ubuntu-latest)
- GITEA_RUNNER_LABELS=${GITEA_RUNNER_LABELS}
# Set container timezone to UTC for consistent logs
- TZ=Etc/UTC
networks:
- proxy
# Start runner using persisted config
command: ["act_runner", "daemon", "--config", "/data/.runner"]
networks:
proxy:
# Shared network for Caddy + Gitea (+ later app stack)
name: proxy
# Default Docker bridge network
driver: bridge
volumes:
# Docker volume for Caddy TLS data (safe to keep inside Docker)
caddy_data:
name: caddy_data
# Docker volume for internal Caddy configs/state
caddy_config:
name: caddy_config
```
16. Save the file and exit the editor.
- Press `Ctrl+O`, then `Enter` to save, and `Ctrl+X` to exit.
17. Create the `/srv/infra/.env` file with environment variables.
```bash
nano .env
```
18. Paste the following:
```env
# Base URL of your Gitea instance (used by the runner to register itself
# and to send/receive workflow job information).
GITEA_INSTANCE_URL=https://git.avaaz.ai
# One-time registration token generated in:
# Gitea → Site Administration → Actions → Runners → "Generate Token"
# This MUST be filled in once, so the runner can register.
# After registration, the runner stores its identity inside ./gitea-runner-data/.runner
# and this value is no longer needed (can be left blank).
GITEA_RUNNER_REGISTRATION_TOKEN=
# Human-readable name for this runner.
# This is shown in the Gitea UI so you can distinguish multiple runners:
# Example: "vps-runner", "staging-runner", "gpu-runner"
GITEA_RUNNER_NAME=gitea-runner
# Runner labels allow workflows to choose specific runners.
# The label format is: label[:schema[:args]]
# - "ubuntu-latest" is the <label> name that workflows request using runs-on: [ "ubuntu-latest" ].
# - "docker://" is the <schema> indicating the job runs inside a separate Docker container.
# - "catthehacker/ubuntu:act-latest" is the <args>, specifying the Docker image to use for the container.
# Workflows can target this using:
# runs-on: [ "ubuntu-latest" ]
GITEA_RUNNER_LABELS=ubuntu-latest:docker://catthehacker/ubuntu:act-latest
```
19. Save the file and exit the editor.
- Press `Ctrl+O`, then `Enter` to save, and `Ctrl+X` to exit.
20. Create `/srv/infra/Caddyfile` to configure Caddy.
```bash
nano Caddyfile
```
21. Paste the following:
```caddy
{
# Global Caddy options.
#
# auto_https on
# - Caddy listens on port 80 for every host (ACME + redirect).
# - Automatically issues HTTPS certificates.
# - Automatically redirects HTTP → HTTPS unless disabled.
#
}
# ------------------------------------------------------------
# Redirect www → root domain
# ------------------------------------------------------------
www.avaaz.ai {
# Permanent redirect to naked domain
redir https://avaaz.ai{uri} permanent
}
# ------------------------------------------------------------
# Marketing site (optional — if frontend handles it, remove this)
# Redirect root → app
# ------------------------------------------------------------
avaaz.ai {
# If you have a static marketing page, serve it here.
# If not, redirect visitors to the app.
redir https://app.avaaz.ai{uri}
}
# ------------------------------------------------------------
# Frontend (Next.js)
# Public URL: https://app.avaaz.ai
# Internal target: frontend:3000
# ------------------------------------------------------------
app.avaaz.ai {
# Reverse-proxy HTTPS traffic to the frontend container
reverse_proxy frontend:3000
# Access log for debugging frontend activity
log {
output file /data/app-access.log
}
# Compression for faster delivery of JS, HTML, etc.
encode gzip zstd
}
# ------------------------------------------------------------
# Backend (FastAPI)
# Public URL: https://api.avaaz.ai
# Internal target: backend:8000
# ------------------------------------------------------------
api.avaaz.ai {
# Reverse-proxy all API traffic to FastAPI
reverse_proxy backend:8000
# Access log — useful for monitoring API traffic and debugging issues
log {
output file /data/api-access.log
}
# Enable response compression (JSON, text, etc.)
encode gzip zstd
}
# ------------------------------------------------------------
# LiveKit (signaling only — media uses direct UDP)
# Public URL: wss://rtc.avaaz.ai
# Internal target: livekit:7880
# ------------------------------------------------------------
rtc.avaaz.ai {
# LiveKit uses WebSocket signaling, so we reverse-proxy WS → WS
reverse_proxy livekit:7880
# Access log — helps diagnose WebRTC connection failures
log {
output file /data/rtc-access.log
}
# Compression not needed for WS traffic, but harmless
encode gzip zstd
}
# ------------------------------------------------------------
# Gitea (Git server UI + HTTPS + SSH clone)
# Public URL: https://git.avaaz.ai
# Internal target: gitea:3000
# ------------------------------------------------------------
git.avaaz.ai {
    # Route all HTTPS traffic to Gitea's web UI
reverse_proxy gitea:3000
# Log all Git UI requests and API access
log {
output file /data/git-access.log
}
# Compress UI responses
encode gzip zstd
}
```
22. Save the file and exit the editor.
- Press `Ctrl+O`, then `Enter` to save, and `Ctrl+X` to exit.
23. Start the stack from `/srv/infra`.
```bash
sudo docker compose pull # fetch images: caddy, gitea, act_runner
sudo docker compose up -d # start all containers in the background
```
24. Verify that all containers show status `Up`.
```bash
sudo docker compose ps -a
```
25. Open `https://git.avaaz.ai` in your browser. Caddy should have already obtained a cert and you should see the Gitea installer.
26. Configure database settings.
- **Database Type:** `SQLite3`
- **Path:** `/data/gitea/gitea.db` *(matches `GITEA__database__PATH`)*
27. Configure general settings.
- **Site Title:** default *(`Gitea: Git with a cup of tea`)*
- **Repository Root Path:** default *(`/data/git/repositories`)*
- **LFS Root Path:** default *(`/data/git/lfs`)*
28. Configure server settings.
- **Domain:** `git.avaaz.ai` *(external HTTPS via Caddy)*
- **SSH Port:** `2222` *(external SSH port)*
- **HTTP Port:** `3000` *(internal HTTP port)*
- **Gitea Base URL / ROOT_URL:** `https://git.avaaz.ai/`
29. Create the admin account (username + password + email) and finish installation.
30. Edit Gitea `/data/gitea/conf/app.ini` at the host bind mount `/srv/infra/gitea-data/gitea/conf/app.ini`.
```bash
nano gitea-data/gitea/conf/app.ini
```
31. Add/verify the following sections.
```ini
[server]
; Gitea serves HTTP internally (Caddy handles HTTPS externally)
PROTOCOL = http
; External hostname used for links and redirects
DOMAIN = git.avaaz.ai
; Hostname embedded in SSH clone URLs
SSH_DOMAIN = git.avaaz.ai
; Internal container port Gitea listens on (Caddy reverse-proxies to this)
HTTP_PORT = 3000
; Public-facing base URL (MUST be HTTPS when behind Caddy)
ROOT_URL = https://git.avaaz.ai/
; Enable Gitea's built-in SSH server inside the container
DISABLE_SSH = false
; Host-side SSH port exposed by Docker (mapped to container:22)
SSH_PORT = 2222
; Container-side SSH port (always 22 inside the container)
SSH_LISTEN_PORT = 22
[database]
; SQLite database file stored in bind-mounted volume
PATH = /data/gitea/gitea.db
; Using SQLite (sufficient for single-node small/medium setups)
DB_TYPE = sqlite3
[security]
; Prevent web-based reinstallation (crucial for a secured instance)
INSTALL_LOCK = true
; Auto-generated on first startup; DO NOT change or delete
SECRET_KEY =
[actions]
; Enable Gitea Actions (CI/CD)
ENABLED = true
; Where to fetch action plugins from: github (https://github.com) or self (the current Gitea instance)
DEFAULT_ACTIONS_URL = github
```
32. Restart Gitea to apply changes.
```bash
sudo docker compose restart gitea
```
33. Check if Actions is enabled.
1. Log in as admin at `https://git.avaaz.ai`.
2. Go to **Site Administration**.
3. Look for a menu item **Actions**. If `[actions] ENABLED = true` in `app.ini`, there will be options related to **Runners**, allowing management of instance-level action runners. Otherwise, the Actions menu item in the Site Administration panel will not appear, indicating the feature is globally disabled.
34. Get a registration token to register the Gitea Actions runner, and create a *user* account.
1. Log in as admin at `https://git.avaaz.ai`.
2. Go to **Site Administration → Actions → Runners**.
3. Choose **Create new Runner**.
4. Copy the **Registration Token**.
5. Create a *user* account.
35. Edit `.env` to add the token.
```bash
nano .env
```
36. Paste the Registration Token after `=` without spaces.
```env
# One-time registration token generated in:
# Gitea → Site Administration → Actions → Runners → "Generate Token"
# This MUST be filled in once, so the runner can register.
# After registration, the runner stores its identity inside ./gitea-runner-data/.runner
# and this value is no longer needed (can be left blank).
GITEA_RUNNER_REGISTRATION_TOKEN=
```
37. Check for configuration changes and restart the container `gitea-runner`.
```bash
sudo docker compose up -d gitea-runner
```
38. Confirm that the Gitea instance URL, Runner name, and Runner labels in `gitea-runner-data/.runner` file are the same as the values in the `.env` file. Fix it using `nano gitea-runner-data/.runner` if different.
39. Verify that the Runner is connected to `https://git.avaaz.ai` and is polling for jobs.
```bash
sudo docker logs -f gitea-runner
```
40. Generate an SSH key on the laptop. Accept the defaults and optionally set a passphrase. The public key is written to `~/.ssh/id_ed25519.pub`.
```bash
ssh-keygen -t ed25519 -C "user@avaaz.ai"
```
41. Add the public key to Gitea.
1. Log into `https://git.avaaz.ai` as *user*.
2. Go to **Profile → Settings → SSH / GPG Keys → Add Key**.
    3. Paste the contents of `~/.ssh/id_ed25519.pub` (the line starting with `ssh-ed25519`).
4. Save.
42. Test the SSH remote from the laptop.
```bash
ssh -T -p 2222 git@git.avaaz.ai
```
43. Type `yes` to tell the SSH client to trust the fingerprint and press `Enter`. Enter the passphrase and verify the response *You've successfully authenticated..., but Gitea does not provide shell access.*
44. Confirm that Gitea's **clone URLs** for a repo show `ssh://git@git.avaaz.ai:2222/<user>/<repo>.git`.
45. Upgrade Docker images safely.
```bash
sudo docker compose pull # pull newer images
sudo docker compose up -d # recreate containers with new images
```
46. Restart the whole infra stack.
```bash
sudo docker compose restart # restart all containers
```
47. Check logs for troubleshooting.
```bash
sudo docker logs -f caddy # shows “obtaining certificate” or ACME errors if HTTPS fails.
sudo docker logs -f gitea # shows DB/permissions problems, config issues, etc.
sudo docker logs -f gitea-runner # shows registration/connection/job-execution issues.
```
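The comments in the compose file stress that the bind mounts make backups a plain file copy. A minimal backup sketch (stopping Gitea briefly so the SQLite database is copied in a consistent state; the filename and schedule are placeholders):

```bash
cd /srv/infra
sudo docker compose stop gitea    # quiesce SQLite before copying
sudo tar -czf "gitea-backup-$(date +%F).tar.gz" gitea-data gitea-runner-data
sudo docker compose start gitea
```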
#### 4.1.5 Validate the infrastructure
1. Confirm that all containers `caddy`, `gitea`, and `gitea-runner` are `Up`.
```bash
sudo docker compose ps -a
```
2. Confirm that `https://git.avaaz.ai` shows Gitea login page with a valid TLS cert (padlock icon) when opened in a browser.
3. Confirm the response *You've successfully authenticated..., but Gitea does not provide shell access.* when connecting to Gitea over SSH.
```bash
ssh -T -p 2222 git@git.avaaz.ai
```
4. Create a `test` repo in Gitea and confirm cloning it.
```bash
git clone ssh://git@git.avaaz.ai:2222/<your-user>/test.git
```
5. Confirm that the Actions runner `gitea-runner` is registered and online with status **Idle**.
1. Log in as admin at `https://git.avaaz.ai`.
2. Go to **Site Administration → Actions → Runners**.
6. Add `.gitea/workflows/test.yml` to the `test` repo, then commit and push.
```yaml
# Workflow Name
name: Test Workflow
# Trigger on a push event to any branch
on:
push:
branches:
# This means 'any branch'
- '**'
# Define the jobs to run
jobs:
hello:
# Specify the runner image to use
runs-on: [ "ubuntu-latest" ]
# Define the steps for this job
steps:
- name: Run a Test Script
run: echo "Hello from Gitea Actions!"
```
7. Confirm a workflow run appears in Gitea → test repo → **Actions** tab and progresses from queued → in progress → success.
8. Confirm the logs show the job picked up, container created, and the “Hello from Gitea Actions!” output.
```bash
sudo docker logs -f gitea-runner
```
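A quick reachability sweep of the public hostnames complements the checks above. Note that at this stage only `git.avaaz.ai` has a backend, so `app`, `api`, and `rtc` will return errors until the application stack is deployed:

```bash
# Report the HTTP status code for each public endpoint behind Caddy
for url in https://avaaz.ai https://app.avaaz.ai https://api.avaaz.ai https://rtc.avaaz.ai https://git.avaaz.ai; do
  code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")
  echo "$url -> HTTP $code"
done
```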
### 4.2 Configure the Development Laptop
#### 4.2.1 Run the Application
1. Remove all packages cached by pip, delete local Python cache files, clear the uv cache, and force-clear the npm cache.
```bash
uv tool install cleanpy
pip cache purge && cleanpy . && uv cache clean && npm cache clean --force
```
2. Re-resolve dependencies from *pyproject.toml* and upgrade all packages, then synchronize the virtual environment with *uv.lock*, including development dependencies.
```bash
cd backend
uv lock --upgrade
uv sync --dev
```
3. Lint the code for errors, style issues, and potential bugs, auto-fixing where possible, then discover and run the tests in *tests/*.
```bash
cd backend
uv run ruff check --fix && uv run pytest
```
4. Start a local development API server on port 8000 that automatically reloads as you change code.
```bash
cd backend
uv run uvicorn src.main:app --reload --port 8000
```
5. Scan dependencies for security vulnerabilities and attempt automatic fixes by force-updating to secure versions (`--force` may apply breaking changes).
```bash
cd frontend
npm audit fix --force
```
6. Install dependencies from *package.json*, update them to the latest versions allowed by their ranges, lint the source code against the configured rules, and build the application for production.
```bash
cd frontend
npm install && npm update && npm run lint && npm run build
```
7. Execute the start script in *package.json* to launch the Node.js application in production mode.
```bash
cd frontend
npm run start
```
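Once both dev servers are up, a quick smoke test confirms they respond (assumes the Next.js default port 3000; adjust the paths to routes your app actually serves):

```bash
# Probe the backend API docs page and the frontend root
curl -s -o /dev/null -w 'backend  http://localhost:8000/docs -> HTTP %{http_code}\n' http://localhost:8000/docs
curl -s -o /dev/null -w 'frontend http://localhost:3000/     -> HTTP %{http_code}\n' http://localhost:3000/
```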
## 5. Example Project Structure
```bash
avaaz.ai/
├── .dockerignore # Specifies files and directories to exclude from Docker builds, such as .git, node_modules, and build artifacts, to optimize image sizes.
├── .gitignore # Lists files and patterns to ignore in Git, including .env, __pycache__, node_modules, and logs, preventing sensitive or temporary files from being committed.
├── .gitattributes # Controls Git's handling of files across platforms (e.g. normalizing line endings with * text=auto), and can force certain files to be treated as binary or configure diff/merge drivers.
├── .env.example # Template for environment variables, showing required keys like DATABASE_URL, GEMINI_API_KEY, LIVEKIT_API_KEY without actual values.
├── docker-compose.dev.yml # Docker Compose file for development environment: defines services for local frontend, backend, postgres, livekit with volume mounts for hot-reloading.
├── docker-compose.prod.yml # Docker Compose file for production: defines services for caddy, gitea (if integrated), frontend, backend, postgres, livekit with optimized settings and no volumes for code.
├── README.md # Project overview: includes setup instructions, architecture diagram (embed the provided Mermaid), contribution guidelines, and deployment steps.
├── .gitea/ # Directory for Gitea-specific configurations, as the repo is hosted on Gitea.
│ └── workflows/ # Contains YAML files for Gitea Actions workflows, enabling CI/CD pipelines.
│ ├── ci.yml # Workflow for continuous integration: runs tests, linting (Ruff), type checks, and builds on pull requests or pushes.
│ └── cd.yml # Workflow for continuous deployment: triggers builds and deploys Docker images to the VPS on merges to main.
├── .vscode/ # Editor configuration for VS Code to standardize the development environment for all contributors.
│ ├── extensions.json # Recommends VS Code extensions (e.g. Python, ESLint, Docker, GitLens) so developers get linting, formatting, and container tooling out of the box.
│ └── settings.json # Workspace-level VS Code settings: formatter on save, path aliases, Python/TypeScript language server settings, lint integration (Ruff, ESLint), and file exclusions.
├── backend/ # Root for the FastAPI backend, following Python best practices for scalable applications (inspired by FastAPI's "Bigger Applications" guide).
│ ├── Dockerfile # Builds the backend container: installs UV, copies pyproject.toml, syncs dependencies, copies source code, sets entrypoint to Gunicorn/Uvicorn.
│ ├── pyproject.toml # Project metadata and dependencies: uses UV for dependency management, specifies FastAPI, SQLAlchemy, Pydantic, LiveKit-Agent, etc.; includes [tool.uv], [tool.ruff] sections for config.
│ ├── uv.lock # Lockfile generated by UV, ensuring reproducible dependencies across environments.
│ ├── ruff.toml # Configuration for Ruff linter and formatter (can be in pyproject.toml): sets rules for Python code style, ignoring certain errors if needed.
│ ├── alembic.ini # Configuration for Alembic migrations: points to SQLAlchemy URL and script location.
│ ├── alembic/ # Directory for database migrations using Alembic, integrated with SQLAlchemy.
│ │ ├── env.py # Alembic environment script: sets up the migration context with SQLAlchemy models and pgvector support.
│ │ ├── script.py.mako # Template for generating migration scripts.
│ │ └── versions/ # Auto-generated migration files: each represents a database schema change, e.g., create_tables.py.
│ ├── src/ # Source code package: keeps business logic isolated, importable as 'from src import ...'.
│ │ ├── __init__.py # Makes src a package.
│ │ ├── main.py # FastAPI app entrypoint: initializes app, includes routers, sets up middleware, connects to Gemini Live via prompts.
│ │ ├── config.py # Application settings: uses Pydantic-settings to load from .env, e.g., DB_URL, API keys for Gemini, LiveKit, Stripe (for pricing plans).
│ │ ├── api/ # API-related modules: organizes endpoints and dependencies.
│ │ │ ├── __init__.py # Package init.
│ │ │ ├── dependencies.py # Global dependencies: e.g., current_user via FastAPI Users, database session.
│ │ │ └── v1/ # Versioned API: allows future versioning without breaking changes.
│ │ │ └── routers/ # API routers: modular endpoints.
│ │ │ ├── auth.py # Handles authentication: uses FastAPI Users for JWT/OAuth, user registration/login.
│ │ │ ├── users.py # User management: progress tracking, plan subscriptions.
│ │ │ ├── lessons.py # Lesson endpoints: structured oral language lessons, progress tracking.
│ │ │ ├── chat.py # Integration with LiveKit and Gemini: handles conversational AI tutor sessions.
│ │ │ └── documents.py # Document upload and processing: endpoints for file uploads, using Docling for parsing and semantic search prep.
│ │ ├── core/ # Core utilities: shared across the app.
│ │ │ ├── __init__.py # Package init.
│ │ │ └── security.py # Security functions: hashing, JWT handling via FastAPI Users.
│ │ ├── db/ # Database layer: SQLAlchemy setup with pgvector for vector embeddings (e.g., for AI tutor memory).
│ │ │ ├── __init__.py # Package init.
│ │ │ ├── base.py # Base model class for SQLAlchemy declarative base.
│ │ │ ├── session.py # Database session management: async session maker.
│ │ │ └── models/ # SQLAlchemy models.
│ │ │ ├── __init__.py # Exports all models.
│ │ │ ├── user.py # User model: includes fields for progress, plan, proficiency.
│ │ │ ├── lesson.py # Lesson and session models: tracks user interactions, B2 exam prep.
│ │ │ └── document.py # Document chunk model: for semantic search, with text, metadata, embedding (pgvector).
│ │ ├── schemas/ # Pydantic schemas: for API validation and serialization.
│ │ │ ├── __init__.py # Exports schemas.
│ │ │ ├── user.py # User schemas: create, read, update.
│ │ │ ├── lesson.py # Lesson schemas: input/output for AI interactions.
│ │ │ └── document.py # Document schemas: for upload responses and search queries.
│ │ └── services/ # Business logic services: decoupled from API.
│ │ ├── __init__.py # Package init.
│ │ ├── llm.py # Gemini Live integration: prompt engineering for conversational tutor.
│ │ ├── payment.py # Handles pricing plans: integrates with Stripe for subscriptions (Spark, Glow, etc.).
│ │ └── document.py # Docling processing: parses files, chunks, embeds (via Gemini), stores for semantic search.
│ └── tests/ # Unit and integration tests: uses pytest, Hypothesis for property-based testing, httpx for API testing.
│ ├── __init__.py # Package init.
│ ├── conftest.py # Pytest fixtures: e.g., test database, mock Gemini.
│ └── test_users.py # Example test file: tests user endpoints.
├── frontend/ # Root for Next.js frontend and PWA, following Next.js app router best practices (2025 standards: improved SSR, layouts).
│ ├── Dockerfile # Builds the frontend container: installs dependencies, builds Next.js, serves with Node.
│ ├── .eslintrc.json # ESLint configuration: extends next/core-web-vitals, adds rules for code quality.
│ ├── next.config.js # Next.js config: enables PWA, images optimization, API routes if needed.
│ ├── package.json # Node dependencies: includes next, react, @livekit/client for WebRTC, axios or fetch for API calls.
│ ├── package-lock.json # Lockfile for reproducible npm installs.
│ ├── tsconfig.json # TypeScript config: targets ES2022, includes paths for components.
│ ├── app/ # App router directory: pages, layouts, loading states.
│ │ ├── globals.css # Global styles: Tailwind or CSS modules.
│ │ ├── layout.tsx # Root layout: includes providers, navigation.
│ │ ├── page.tsx # Home page: landing for avaaz.ai.
│ │ └── components/ # Reusable UI components.
│ │ ├── ChatInterface.tsx # Component for conversational tutor using LiveKit WebRTC.
│ │ └── ProgressTracker.tsx # Tracks user progress toward B2 exam.
│ ├── lib/ # Utility functions: API clients, hooks.
│ │ └── api.ts # API client: typed fetches to backend endpoints.
│ └── public/ # Static assets.
│ ├── favicon.ico # Site icon.
│ └── manifest.json # PWA manifest: for mobile app-like experience.
├── infra/ # Infrastructure configurations: Dockerfiles and configs for supporting services, keeping them separate for scalability.
│ ├── caddy/ # Caddy reverse proxy setup.
│ │ ├── Dockerfile # Extends official Caddy image, copies Caddyfile.
│ │ └── Caddyfile # Caddy config: routes www.avaaz.ai to frontend, api.avaaz.ai to backend, WSS to LiveKit; auto HTTPS.
│ ├── gitea/ # Gitea git server (added for customization if needed; otherwise use official image directly in Compose).
│ │ ├── Dockerfile # Optional: Extends official Gitea image, copies custom config for Actions integration.
│ │ └── app.ini # Gitea config: sets up server, database, Actions runner.
│ ├── livekit/ # LiveKit server for real-time audio/video in tutor sessions.
│ │ ├── Dockerfile # Extends official LiveKit image, copies config.
│ │ └── livekit.yaml # LiveKit config: API keys, room settings, agent integration for AI tutor.
│ └── postgres/ # PostgreSQL with pgvector.
│ ├── Dockerfile # Extends postgres image, installs pgvector extension.
│ └── init/ # Initialization scripts.
│ └── 00-pgvector.sql # SQL to create pgvector extension on db init.
└── docs/ # Documentation: architecture, APIs, etc.
└── architecture.md # Detailed system explanation, including the provided Mermaid diagram.
```
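As a rough sketch of what `services/document.py` does after Docling parsing, the chunking step can be illustrated with a small, dependency-free function (the chunk size, overlap, and function name are illustrative assumptions, not the actual implementation):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[str]:
    """Split parsed document text into overlapping chunks for embedding."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # overlap preserves context across chunk boundaries
    return chunks
```

Each chunk would then be embedded (e.g., via Gemini) and written to the document chunk model for pgvector-backed semantic search.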

docs/architecture.md Normal file
@@ -0,0 +1,613 @@
# System Architecture
Below is a summary of the **Production VPS** and **Development Laptop** architectures. Both environments use Docker containers for consistency, with near-identical stacks where practical.
```mermaid
flowchart LR
%% Client
A(Browser / PWA)
Y(iOS App / Android App)
subgraph User
A
Y
end
%% LLM / Realtime
B(OpenAI Realtime API)
Z(Gemini Live API)
subgraph Large Language Model
B
Z
end
%% Server-side
C(Caddy)
I(Gitea + Actions + Repositories)
J(Gitea Runner)
D(Next.js Frontend)
E(FastAPI Backend + Agent Runtime)
G(LiveKit Server)
H[(PostgreSQL + pgvector)]
%% Client ↔ VPS
A <-- https://www.avaaz.ai --> C
A <-- https://app.avaaz.ai --> C
A & Y <-- https://api.avaaz.ai --> C
A & Y <-- wss://rtc.avaaz.ai --> C
A & Y <-- "udp://rtc.avaaz.ai:50000-60000 (WebRTC Media)" --> G
%% Caddy ↔ App
C <-- "http://frontend:3000 (app)" --> D
C <-- "http://backend:8000 (api)" --> E
C <-- "ws://livekit:7880 (WebRTC signaling)" --> G
C <-- "http://gitea:3000 (git)" --> I
%% App internal
D <-- "http://backend:8000" --> E
E <-- "postgresql://postgres:5432" --> H
E <-- "http://livekit:7880 (control)" --> G
E <-- "Agent joins via WebRTC" --> G
%% Agent ↔ LLM
E <-- "WSS/WebRTC (realtime)" --> B
E <-- "WSS (streaming)" --> Z
%% CI/CD
I <-- "CI/CD triggers" --> J
subgraph VPS
subgraph Infra
C
I
J
end
subgraph App
D
E
G
H
end
end
%% Development Environment
L(VS Code + Git + Docker)
M(Local Docker Compose)
N(Local Browser)
O(Local Frontend)
P(Local Backend)
Q[(Local Postgres)]
R(Local LiveKit)
L <-- "https://git.avaaz.ai/...git" --> C
L <-- "ssh://git@git.avaaz.ai:2222/..." --> I
L -- "docker compose up" --> M
M -- "Build & Run" --> O & P & Q & R
N <-- HTTP --> O & P
N <-- WebRTC --> R
O <-- HTTP --> P
P <-- SQL --> Q
P <-- HTTP/WebRTC --> R
P <-- WSS/WebRTC --> B
P <-- WSS --> Z
subgraph Development Laptop
L
M
N
subgraph Local App
O
P
Q
R
end
end
```
## 1. Production VPS
### 1.1 Components
#### Infra Stack
The `avaaz-infra` Git repository is cloned to the VPS, providing the Docker Compose stack at `/srv/infra/docker-compose.yml`.
| Container | Description |
| -------------- | ----------------------------------------------------------------------------------- |
| `caddy` | **Caddy**: Reverse proxy with automatic HTTPS (TLS termination via Let's Encrypt). |
| `gitea` | **Gitea + Actions**: Git server using SQLite. Automated CI/CD workflows. |
| `gitea-runner` | **Gitea Runner**: Executes CI/CD jobs defined in Gitea Actions workflows. |
#### App Stack
The `avaaz-app` Git repository is cloned to the VPS, providing the Docker Compose stack at `/srv/app/docker-compose.yml`.
| Container | Description |
| ---------- | ----------------------------------------------------------------------------------------- |
| `frontend` | **Next.js Frontend**: SPA/PWA interface served from a Node.js-based Next.js server. |
| `backend` | **FastAPI + Uvicorn Backend**: API, auth, business logic, LiveKit orchestration, agent. |
| `postgres` | **PostgreSQL + pgvector**: Persistent relational database with vector search. |
| `livekit` | **LiveKit Server**: WebRTC signaling plus UDP media for real-time audio and data. |
The `backend` relies on Python packages such as uv, Ruff, FastAPI, FastAPI Users, fastapi-pagination, FastStream, Pydantic, PydanticAI, pydantic-settings, LiveKit Agents, the Google Gemini Live API, the OpenAI Realtime API, SQLAlchemy, Alembic, Docling, Gunicorn, Uvicorn[standard], Pyright, pytest, Hypothesis, and HTTPX to deliver these services.
### 1.2 Network
- All containers join a shared `proxy` Docker network.
- Caddy can route to any service by container name.
- App services communicate internally:
- Frontend ↔ Backend
- Backend ↔ Postgres
- Backend ↔ LiveKit
- Backend (agent) ↔ LiveKit & external LLM realtime APIs
### 1.3 Public DNS Records
| Hostname | Record Type | Target | Purpose |
| -------------------- | :---------: | -------------- | -------------------------------- |
| **www\.avaaz\.ai** | CNAME | avaaz.ai | Marketing / landing site |
| **avaaz.ai** | A | 217.154.51.242 | Root domain |
| **app.avaaz.ai** | A | 217.154.51.242 | Next.js frontend (SPA/PWA) |
| **api.avaaz.ai** | A | 217.154.51.242 | FastAPI backend |
| **rtc.avaaz.ai** | A | 217.154.51.242 | LiveKit signaling + media |
| **git.avaaz.ai** | A | 217.154.51.242 | Gitea (HTTPS + SSH) |
### 1.4 Public Inbound Firewall Ports & Protocols
| Port | Protocol | Purpose |
| -------------: | :------: | --------------------------------------- |
| **80** | TCP | HTTP, ACME HTTP-01 challenge |
| **443** | TCP | HTTPS, WSS (frontend, backend, LiveKit) |
| **2222** | TCP | Git SSH via Gitea |
| **2885** | TCP | VPS SSH access |
| **3478** | UDP | STUN/TURN |
| **5349** | TCP | TURN over TLS |
| **7881** | TCP | LiveKit TCP fallback |
| **50000–60000** | UDP | LiveKit WebRTC media |
### 1.5 Routing
#### Caddy
Caddy routes traffic from public ports 80 and 443 to internal services.
- `https://www.avaaz.ai``http://frontend:3000`
- `https://app.avaaz.ai``http://frontend:3000`
- `https://api.avaaz.ai``http://backend:8000`
- `wss://rtc.avaaz.ai``ws://livekit:7880`
- `https://git.avaaz.ai``http://gitea:3000`
#### Internal Container Network
- `frontend``http://backend:8000`
- `backend``postgresql://postgres:5432`
- `backend``http://livekit:7880` (control)
- `backend``ws://livekit:7880` (signaling)
- `backend``udp://livekit:50000-60000` (media)
- `gitea-runner``/var/run/docker.sock` (Docker API on host)
#### Outgoing
- `backend``https://api.openai.com/v1/realtime/sessions`
- `backend``wss://api.openai.com/v1/realtime?model=gpt-realtime`
- `backend``wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent`
### 1.6 Functional Layers
#### Data Layer
**Infra:**
- **SQLite (Gitea)**
- Gitea stores Git metadata (users, repos, issues, Actions metadata) in `/data/gitea/gitea.db`.
- This is a file-backed SQLite database inside a persistent Docker volume.
- Repository contents are stored under `/data/git/`, also volume-backed.
- **Gitea Runner State**
- Gitea Actions runner stores its registration information and job metadata under `/data/.runner`.
**App:**
- **PostgreSQL with pgvector**
- Primary relational database for users, lessons, transcripts, embeddings, and conversational context.
- Hosted in the `postgres` container with a persistent Docker volume.
- Managed via SQLAlchemy and Alembic migrations in the backend.
- **LiveKit Ephemeral State**
- Room metadata, participant state, and signaling information are held in memory within the `livekit` container.
- LiveKit's SFU media buffers and room state are **not** persisted across restarts.
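To make the role of pgvector concrete, the ranking it performs can be illustrated in pure Python; in production this is a SQL query using pgvector's `<=>` cosine-distance operator over the embedding column, not application code:

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as computed by pgvector's <=> operator: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm  # assumes non-zero vectors, as embeddings are

def top_k(query: list[float], chunks: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k stored chunks closest to the query embedding."""
    return sorted(chunks, key=lambda cid: cosine_distance(query, chunks[cid]))[:k]
```

The same ranking retrieves document chunks and conversational memory for the AI tutor's context window.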
#### Control Layer
**Infra:**
- **Caddy**
- TLS termination (Let's Encrypt).
- Reverse proxy and routing for all public domains.
- ACME certificate renewal.
- **Gitea**
- Git hosting, pull/clone over SSH and HTTPS.
- CI/CD orchestration via Actions and internal APIs.
- **Gitea Runner**
- Executes workflows and controls the Docker engine via `/var/run/docker.sock`.
**App:**
- **FastAPI Backend**
- Authentication and authorization (`/auth/login`, `/auth/refresh`, `/auth/me`).
- REST APIs for lessons, progress, documents, and file handling.
- LiveKit session management (room mapping `/sessions/default`, token minting `/sessions/default/token`, agent configuration).
- Calls out to OpenAI Realtime and Gemini Live APIs for AI-driven conversational behavior.
- **LiveKit Server**
- Manages room signaling, participant permissions, and session state.
- Exposes HTTP control endpoint for room and participant management.
#### Media Layer
**App:**
- **User Audio Path**
- Browser/mobile → LiveKit:
- WSS signaling via `rtc.avaaz.ai` → Caddy → `livekit:7880`.
- UDP audio and data channels via `rtc.avaaz.ai:50000-60000` directly to LiveKit on the VPS.
- WebRTC handles ICE, STUN/TURN, jitter buffers, and Opus audio encoding.
- **AI Agent Audio Path**
- The agent logic inside the backend uses LiveKit Agent SDK to join rooms as a participant.
- Agent → LiveKit:
- WS signaling over the internal Docker network (`ws://livekit:7880`).
- UDP audio transport as part of its WebRTC session.
- Agent → LLM realtime API:
- Secure WSS/WebRTC connection to OpenAI Realtime or Gemini Live.
- The agent transcribes, processes, and generates audio responses, publishing them into the LiveKit room so the user hears natural speech.
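A highly simplified sketch of the agent's media loop described above; every function here is a stub standing in for the LiveKit Agents SDK and the realtime LLM connection, so treat it as an illustration of the data flow rather than real integration code:

```python
import asyncio

async def transcribe(frame: bytes) -> str:
    """Stub STT step; in production this happens inside the realtime LLM API."""
    return frame.decode()

async def generate_reply(text: str) -> str:
    """Stub LLM step standing in for OpenAI Realtime / Gemini Live."""
    return f"tutor reply to: {text}"

async def agent_loop(incoming: asyncio.Queue, outgoing: asyncio.Queue) -> None:
    """Consume user audio frames, publish tutor responses back into the room."""
    while True:
        frame = await incoming.get()
        if frame is None:  # session ended
            break
        text = await transcribe(frame)
        reply = await generate_reply(text)
        await outgoing.put(reply.encode())  # would be Opus audio in production

async def demo() -> list[bytes]:
    incoming, outgoing = asyncio.Queue(), asyncio.Queue()
    for frame in [b"hello", b"how do I say train station?", None]:
        incoming.put_nowait(frame)
    await agent_loop(incoming, outgoing)
    return [outgoing.get_nowait() for _ in range(outgoing.qsize())]
```

In the real system the queues are WebRTC tracks inside the LiveKit room, and the transcribe/generate steps are a single bidirectional stream to the LLM realtime API.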
### 1.7 CI/CD Pipeline
Production CI/CD is handled by **Gitea Actions** running on the VPS. The `gitea-runner` container has access to the host Docker daemon and is responsible for both validation and deployment:
- `.gitea/workflows/ci.yml`: **Continuous Integration** (branch/PR validation, no deployment).
- `.gitea/workflows/cd.yml`: **Continuous Deployment** (tag-based releases to production).
#### Build Phase (CI Workflow: `ci.yml`)
**Triggers**
- `push` to:
- `feature/**`
- `bugfix/**`
- `pull_request` targeting `main`.
**Runner & Environment**
- Runs on the self-hosted runner labeled `linux_amd64`.
- Checks out the relevant branch or PR commit from the `avaaz-app` repository into the runner's workspace.
**Steps**
1. **Checkout code**
Uses `actions/checkout@v4` to fetch the branch or PR head commit.
2. **Report triggering context**
Logs the event type (`push` or `pull_request`) and branches:
- For `push`: the source branch (e.g., `feature/foo`).
- For `pull_request`: source and target (`main`).
3. **Static analysis & tests**
- Run linters, type checkers, and unit tests for backend and frontend.
- Ensure the application code compiles/builds.
4. **Build Docker images for CI**
- Build images (e.g., `frontend:ci` and `backend:ci`) to validate Dockerfiles and build chain.
- These images are tagged for CI only and not used for production.
5. **Cleanup CI images**
- Remove CI-tagged images at the end of the job (even on failure) to prevent disk usage from accumulating.
**Outcome**
- A green CI result on a branch/PR signals that:
- The code compiles/builds.
- Static checks and tests pass.
- Docker images can be built successfully.
- CI does **not** modify the production stack and does **not** depend on tags.
#### Deploy Phase (CD Workflow: `cd.yml`)
**Triggers**
- Creation of a Git tag matching `v*` that points to a commit on the `main` branch in the `avaaz-app` repository.
**Runner & Environment**
- Runs on the same `linux_amd64` self-hosted runner.
- Checks out the exact commit referenced by the tag.
**Steps**
1. **Checkout tagged commit**
- Uses `actions/checkout@v6` with `ref: ${{ gitea.ref }}` to check out the tagged commit.
2. **Tag validation**
- Fetches `origin/main`.
- Verifies that the tag commit is an ancestor of `origin/main` (i.e., the tag points to code that has been merged into `main`).
- Fails the deployment if the commit is not in `main`'s history.
3. **Build & publish release**
- Builds production Docker images for frontend, backend, LiveKit, etc., tagged with the version (e.g., `v0.1.0`).
- Applies database migrations (e.g., via Alembic) if required.
4. **Restart production stack**
- Restarts or recreates the app stack containers using the newly built/tagged images (e.g., via `docker compose -f docker-compose.yml up -d`).
5. **Health & readiness checks**
- Probes key endpoints with `curl -f`, such as:
- `https://app.avaaz.ai`
- `https://api.avaaz.ai/health`
- `https://rtc.avaaz.ai` (LiveKit signaling endpoint, HTTP-level check)
- If checks fail, marks the deployment as failed and automatically rolls back to previous images.
**Outcome**
- Only tagged releases whose commits are on the `main` branch are deployed.
- Deployment is explicit (tag-based), separated from CI validation.
### 1.8 Typical Workflows
#### User Login
1. Browser loads the frontend from `https://app.avaaz.ai`.
2. Frontend submits credentials to `POST https://api.avaaz.ai/auth/login`.
3. Backend validates credentials and returns:
- A short-lived JWT **access token**
- A long-lived opaque **refresh token**
- A minimal user profile for immediate UI hydration
4. Frontend stores tokens appropriately (access token in memory; refresh token in secure storage or an httpOnly cookie).
#### Load Persistent Session
1. Frontend calls `GET https://api.avaaz.ai/sessions/default`.
2. Backend retrieves or creates the user's **persistent conversational session**, which encapsulates:
- Long-running conversation state
- Lesson and progress context
- Historical summary for LLM context initialization
3. Backend prepares the session's LLM context so that the agent can join with continuity.
#### Join the Live Conversation Session
1. Frontend requests a LiveKit access token via `POST https://api.avaaz.ai/sessions/default/token`.
2. Backend generates a **new LiveKit token** (short-lived, room-scoped), containing:
- Identity
- Publish/subscribe permissions
- Expiration (short-lived; a fresh token is required for each join)
- Room ID corresponding to the session
3. Frontend connects to the LiveKit server:
- WSS for signaling
- UDP for low-latency audio (SRTP) and data channels (SCTP)
4. If the user disconnects, the frontend requests a new LiveKit token before rejoining, ensuring seamless continuity.
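For illustration, a short-lived, room-scoped token of this shape is an HS256 JWT and can be minted with only the standard library. The claim layout below follows LiveKit's documented token format, but in practice the backend would use the LiveKit server SDK; the key and secret values here are placeholders:

```python
import base64, hashlib, hmac, json, time

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def mint_livekit_token(api_key: str, api_secret: str, identity: str,
                       room: str, ttl_seconds: int = 600) -> str:
    """Create a short-lived, room-scoped LiveKit-style access token (HS256 JWT)."""
    header = {"alg": "HS256", "typ": "JWT"}
    now = int(time.time())
    claims = {
        "iss": api_key,            # which API key signed this token
        "sub": identity,           # participant identity
        "nbf": now,
        "exp": now + ttl_seconds,  # short expiry; a new token is minted on rejoin
        "video": {"roomJoin": True, "room": room,
                  "canPublish": True, "canSubscribe": True},
    }
    signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(claims).encode())
    sig = hmac.new(api_secret.encode(), signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url(sig)
```

The LiveKit server validates the signature with the shared API secret, so the backend never needs to call LiveKit just to authorize a join.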
#### Conversation with AI Agent
1. Backend configures the session's **AI agent** using:
- Historical summary
- Current lesson state
- Language settings and mode (lesson, mock exam, free talk)
2. The agent joins the same LiveKit room as a participant.
3. All media flows through LiveKit:
- User → audio → LiveKit → Agent
- Agent → LLM realtime API → synthesized audio → LiveKit → User
4. The agent guides the user verbally: continuing lessons, revisiting material, running mock exams, or free conversation.
The user experiences this as a **continuous, ongoing session** with seamless reconnection and state persistence.
### 1.9 Hardware
| Class | Description |
|----------------|-------------------------------------------|
| system | Standard PC (i440FX + PIIX, 1996) |
| bus | Motherboard |
| memory | 96KiB BIOS |
| processor | AMD EPYC-Milan Processor |
| memory | 8GiB System Memory |
| bridge | 440FX - 82441FX PMC [Natoma] |
| bridge | 82371SB PIIX3 ISA [Natoma/Triton II] |
| communication | PnP device PNP0501 |
| input | PnP device PNP0303 |
| input | PnP device PNP0f13 |
| storage | PnP device PNP0700 |
| system | PnP device PNP0b00 |
| storage | 82371SB PIIX3 IDE [Natoma/Triton II] |
| bus | 82371SB PIIX3 USB [Natoma/Triton II] |
| bus | UHCI Host Controller |
| input | QEMU USB Tablet |
| bridge | 82371AB/EB/MB PIIX4 ACPI |
| display | QXL paravirtual graphic card |
| generic | Virtio RNG |
| storage | Virtio block device |
| disk | 257GB Virtual I/O device |
| volume | 238GiB EXT4 volume |
| volume | 4095KiB BIOS Boot partition |
| volume | 105MiB Windows FAT volume |
| volume | 913MiB EXT4 volume |
| network | Virtio network device |
| network | Ethernet interface |
| input | Power Button |
| input | AT Translated Set 2 keyboard |
| input | VirtualPS/2 VMware VMMouse |
## 2. Development Laptop
### 2.1 Components
#### App Stack (local Docker)
- `frontend` (Next.js SPA)
- `backend` (FastAPI)
- `postgres` (PostgreSQL + pgvector)
- `livekit` (local LiveKit Server)
No Caddy is deployed locally; the browser talks directly to the mapped container ports on `localhost`.
### 2.2 Network
- All services run as Docker containers on a shared Docker network.
- Selected ports are published to `localhost` for direct access from the browser and local tools.
- No public domains are used in development; everything is addressed via `http://localhost/...`.
### 2.3 Domains & IP Addresses
Local development uses:
- `http://localhost:3000` → frontend (Next.js dev/server container)
- `http://localhost:8000` → backend API (FastAPI)
- Example auth/session endpoints:
- `POST http://localhost:8000/auth/login`
- `GET http://localhost:8000/sessions/default`
- `POST http://localhost:8000/sessions/default/token`
- `ws://localhost:7880` → LiveKit signaling (local LiveKit server)
- `udp://localhost:50000-60000` → LiveKit/WebRTC media
No `/etc/hosts` changes or TLS certificates are required; `localhost` acts as a secure origin for WebRTC.
### 2.4 Ports & Protocols
| Port | Protocol | Purpose |
|-------------:|:--------:|------------------------------------|
| 3000 | TCP | Frontend (Next.js) |
| 8000 | TCP | Backend API (FastAPI) |
| 5432 | TCP | Postgres + pgvector |
| 7880 | TCP | LiveKit HTTP + WS signaling |
| 50000–60000 | UDP | LiveKit WebRTC media (audio, data) |
### 2.5 Routing
No local Caddy or reverse proxy layer is used; routing is direct via published ports.
#### Internal Container Routing (Docker network)
- Backend → Postgres: `postgresql://postgres:5432`
- Backend → LiveKit: `http://livekit:7880`
- Frontend (server-side) → Backend: `http://backend:8000`
#### Browser → Containers (via localhost)
- Browser → Frontend: `http://localhost:3000`
- Browser → Backend API: `http://localhost:8000`
#### Outgoing (from Backend)
- `backend``https://api.openai.com/v1/realtime/sessions`
- `backend``wss://api.openai.com/v1/realtime?model=gpt-realtime`
- `backend``wss://generativelanguage.googleapis.com/ws/google.ai.generativelanguage.v1beta.GenerativeService.BidiGenerateContent`
These calls mirror production agent behavior while pointing to the same cloud LLM realtime endpoints.
### 2.6 Functional Layers
#### Data Layer
- Local Postgres instance mirrors the production schema (including pgvector).
- Database migrations are applied via backend tooling (e.g., Alembic) to keep schema in sync.
#### Control Layer
- Backend runs full application logic locally:
- Authentication and authorization
- Lesson and progress APIs
- LiveKit session management (`/sessions/default`, `/sessions/default/token`) and agent control
- Frontend integrates against the same API surface as production, only with `localhost` URLs.
#### Media Layer
- Local LiveKit instance handles:
- WS/HTTP signaling on port 7880
- WebRTC media (audio + data channels) on UDP `50000-60000`
- Agent traffic mirrors production logic:
- LiveKit ↔ Backend ↔ LLM realtime APIs (OpenAI / Gemini).
### 2.7 Typical Workflows
#### Developer Pushes Code
1. Developer pushes to `git.avaaz.ai` over HTTPS or SSH.
2. CI runs automatically (linting, tests, build validation). No deployment occurs.
3. When a release is ready, the developer creates a version tag (`v*`) on a commit in `main`.
4. CD triggers: validates the tag, rebuilds from the tagged commit, deploys updated containers, then performs post-deploy health checks.
#### App Development
- Start the stack: `docker compose -f docker-compose.dev.yml up -d`
- Open the app in the browser: `http://localhost:3000`
- Frontend calls the local backend for:
- `POST http://localhost:8000/auth/login`
- `GET http://localhost:8000/sessions/default`
- `POST http://localhost:8000/sessions/default/token`
#### API Testing
- Health check: `curl http://localhost:8000/health`
- Auth and session testing:
```bash
curl -X POST http://localhost:8000/auth/login \
-H "Content-Type: application/json" \
-d '{"email": "user@example.com", "password": "password"}'
curl http://localhost:8000/sessions/default \
-H "Authorization: Bearer <access_token>"
```
#### LiveKit Testing
- Frontend connects to LiveKit via:
- Signaling: `ws://localhost:7880`
- WebRTC media: `udp://localhost:50000-60000`
- Backend issues local LiveKit tokens via `POST http://localhost:8000/sessions/default/token`, then connects the AI agent to the local room.
### 2.8 Hardware
| Class | Description |
|----------------|--------------------------------------------|
| system | HP Laptop 14-em0xxx |
| bus | 8B27 motherboard bus |
| memory | 128KiB BIOS |
| processor | AMD Ryzen 3 7320U |
| memory | 256KiB L1 cache |
| memory | 2MiB L2 cache |
| memory | 4MiB L3 cache |
| memory | 8GiB System Memory |
| bridge | Family 17h-19h PCIe Root Complex |
| generic | Family 17h-19h IOMMU |
| storage | SK hynix BC901 HFS256GE SSD |
| disk | 256GB NVMe disk |
| volume | 299MiB Windows FAT volume |
| volume | 238GiB EXT4 volume |
| network | RTL8852BE PCIe 802.11ax Wi-Fi |
| display | Mendocino integrated graphics |
| multimedia | Rembrandt Radeon High Definition Audio |
| generic | Family 19h PSP/CCP |
| bus | AMD xHCI Host Controller |
| input | Logitech M705 Mouse |
| input | Logitech K370s/K375s Keyboard |
| multimedia | Jabra SPEAK 510 USB |
| multimedia | Logitech Webcam C925e |
| communication | Bluetooth Radio |
| multimedia | HP True Vision HD Camera |
| bus | FCH SMBus Controller |
| bridge | FCH LPC Bridge |
| power | AE03041 Battery |
| input | Power Button |
| input | Lid Switch |
| input | HP WMI Hotkeys |
| input | AT Translated Set 2 Keyboard |
| input | Video Bus |
| input | SYNA32D9:00 06CB:CE17 Mouse |
| input | SYNA32D9:00 06CB:CE17 Touchpad |
| network | Ethernet Interface |

docs/docker.md Normal file
@@ -0,0 +1,28 @@
# Docker
## Remove Docker Containers
⚠️ WARNING: Data Loss Imminent
The following command will delete ALL Docker data, including stopped containers, images, and volumes (which contain persistent application data like databases, configurations, and git repositories).
1. Permanently remove all containers, images, and volumes defined in `docker-compose.yml`.
```bash
cd /srv/infra
sudo docker compose down -v --rmi all --remove-orphans
```
2. Permanently delete bind-mount data.
```bash
sudo rm -rf ./gitea-data
sudo rm -rf ./gitea-runner-data
```
3. Verify that no components remain.
```bash
sudo docker ps -a
sudo docker images -a
sudo docker volume ls
```

docs/git.md Normal file
@@ -0,0 +1,105 @@
# Git
* [GitHub Flow](https://dev.to/karmpatel/git-branching-strategies-a-comprehensive-guide-24kh) branching strategy is used.
* Direct push to `main` branch is prohibited.
* Only merges to the `main` branch via Pull Requests from `feature/...` or `bugfix/...` branches are allowed.
* Tags are created for releases on the `main` branch.
## Pull Request
1. Ensure your main branch is protected, so that direct push is disabled.
2. Update the local main branch.
```bash
git checkout main
git pull origin main
```
3. Create a new branch, `feature/new-branch`, with a descriptive name and switch to it.
```bash
git checkout -b feature/new-branch
```
4. Add any changes.
```bash
git add .
```
5. Commit the changes with a message.
```bash
git commit -m "new branch: create all boilerplate files"
```
6. Push the new branch, `feature/new-branch`, to the remote Gitea repository.
```bash
git push origin feature/new-branch
```
7. Create a new Pull Request from `feature/new-branch` to the `main` branch.
8. Review and Merge the new branch into the `main` branch.
9. Switch back to the `main` branch.
```bash
git checkout main
```
10. Pull the latest changes that include the merged PR.
```bash
git pull origin main
```
11. Delete the local `feature/new-branch` branch.
```bash
git branch -d feature/new-branch
```
12. Delete the remote branch after merging if Gitea did _not_ delete it.
```bash
git push origin --delete feature/new-branch
```
13. Create a new branch for upcoming work, for example `feature/dev`.
```bash
git checkout -b feature/dev
```
## Troubleshooting
### Accidentally created a commit on local `main` branch instead of a feature branch
1. Create a new branch pointing to the extra commit.
```bash
git branch feature/new-branch
```
2. Reset the local `main` branch to match the remote `origin/main`.
```bash
git reset --hard origin/main
```
3. Switch over to the new feature branch.
```bash
git checkout feature/new-branch
```
4. Push the new branch.
```bash
git push origin feature/new-branch
```
5. Create a Pull Request.

img/favicon.png Normal file (binary, 800 KiB, not shown)
img/logo.png Normal file (binary, 984 KiB, not shown)

infra/.env Normal file
@@ -0,0 +1,24 @@
# Base URL of your Gitea instance (used by the runner to register itself
# and to send/receive workflow job information).
GITEA_INSTANCE_URL=https://git.avaaz.ai
# One-time registration token generated in:
# Gitea → Site Administration → Actions → Runners → "Generate Token"
# This MUST be filled in once, so the runner can register.
# After registration, the runner stores its identity inside ./gitea-runner-data/.runner
# and this value is no longer needed (can be left blank).
GITEA_RUNNER_REGISTRATION_TOKEN=
# Human-readable name for this runner.
# This is shown in the Gitea UI so you can distinguish multiple runners:
# Example: "vps-runner", "staging-runner", "gpu-runner"
GITEA_RUNNER_NAME=gitea-runner
# Runner labels allow workflows to choose specific runners.
# The label format is: label[:schema[:args]]
# - "ubuntu-latest" is the <label> name that workflows request using runs-on: [ "ubuntu-latest" ].
# - "docker://" is the <schema> indicating the job runs inside a separate Docker container.
# - "catthehacker/ubuntu:act-latest" is the <args>, specifying the Docker image to use for the container.
# Workflows can target this using:
# runs-on: [ "ubuntu-latest" ]
GITEA_RUNNER_LABELS=ubuntu-latest:docker://catthehacker/ubuntu:act-latest

@@ -0,0 +1,118 @@
name: Continuous Deployment # Name of this workflow as shown in the CI/CD UI
on: # Section defining which events trigger this workflow
push: # Trigger when a push event occurs
tags: # Limit triggers to pushes involving tags
- 'v*' # Only run for version tags that start with 'v' (e.g., v0.0.1, v1.2.3)
workflow_dispatch: # Allow this workflow to be triggered manually from the UI
inputs: # Optional inputs for manual deployments
version: # Input used to document which version is being deployed manually
description: "Version to deploy when triggering manually (informational only)" # Help text for the version input
required: false # This input is optional
default: "manual-trigger" # Default value when no explicit version is supplied
permissions: # Default permissions for the token used in this workflow
contents: read # Allow reading repository contents (needed for checkout)
packages: write # Allow pushing packages/images to registries (adjust or remove if not needed)
id-token: write # Allow issuing OIDC tokens for cloud providers (remove if not used)
jobs: # Collection of jobs defined in this workflow
deploy: # Job responsible for deploying tagged releases
name: Deploy tagged release # Human-readable name for this deployment job
runs-on: ubuntu-latest # Use the latest Ubuntu Linux runner image
timeout-minutes: 30 # Automatically fail the job if it runs longer than 30 minutes
concurrency: # Prevent overlapping deployments
group: deploy-main-tags # Group key: serialize all deployments that share this identifier
cancel-in-progress: true # Cancel any running deployment in this group when a new one starts
environment: # Associate this job with a deployment environment
name: production # Label the environment as 'production' for visibility and protections
steps: # Ordered list of steps in this job
- name: Checkout Code # Step to check out the repository at the tagged commit
uses: actions/checkout@v6 # Standard checkout action (Gitea-compatible)
with: # Options for configuring the checkout behavior
ref: ${{ gitea.ref }} # Check out the specific commit referenced by the pushed tag
fetch-depth: 0 # Fetch full history so ancestry checks and branch analysis are reliable
- name: Verify tag commit is on current or historical commit of 'origin/main' # Ensure the tag commit comes from the main branch
shell: bash # Explicitly use bash for this script
run: | # Begin multi-line bash script
set -euo pipefail # Enable strict mode: exit on error, unset var, or failed pipeline command
REMOTE_NAME="origin" # actions/checkout configures the clone's default remote as "origin"
MAIN_BRANCH_NAME="main" # Fixed assumption: the primary branch on the remote is named "main"
echo "Assuming remote name: ${REMOTE_NAME}" # Log the assumed remote name
echo "Assuming main branch name on remote: ${MAIN_BRANCH_NAME}" # Log the assumed main branch name
if ! git remote | grep -qx "${REMOTE_NAME}"; then # Check that the assumed remote actually exists
echo "❌ Expected remote '${REMOTE_NAME}' not found in repository remotes." # Explain missing remote
git remote -v # Show the actual configured remotes for debugging
exit 1 # Fail the job because we cannot safely validate without the expected remote
fi # End of remote existence check
TAG_COMMIT="$(git rev-parse HEAD)" # Determine the commit hash currently checked out (the tag target)
echo "Tag points to commit: ${TAG_COMMIT}" # Log which commit the tag references
echo "Fetching '${MAIN_BRANCH_NAME}' from remote '${REMOTE_NAME}'..." # Log the fetch action
if ! git fetch "${REMOTE_NAME}" "${MAIN_BRANCH_NAME}" > /dev/null 2>&1; then # Fetch remote/main silently and detect failure
echo "❌ Failed to fetch branch '${MAIN_BRANCH_NAME}' from remote '${REMOTE_NAME}'." # Explain fetch failure
echo " Ensure '${REMOTE_NAME}/${MAIN_BRANCH_NAME}' exists and is accessible." # Suggest verifying remote branch presence
exit 1 # Fail the job because we cannot validate against main
fi # End of fetch error check
MAIN_REF="${REMOTE_NAME}/${MAIN_BRANCH_NAME}" # Construct the fully qualified remote/main reference
echo "Discovering remote branches that contain the tag commit ${TAG_COMMIT}..." # Log start of branch discovery
BRANCHES_RAW="$(git branch -r --contains "${TAG_COMMIT}" || true)" # List remote branches whose history contains the tag commit (may be empty)
echo "Raw remote branches containing tag commit:" # Introductory log for raw remote-branch list
echo "${BRANCHES_RAW}" # Output the raw remote-tracking branches
BRANCHES_CLEANED="$(echo "${BRANCHES_RAW}" | sed 's|^[[:space:]]*||;s|.*/||')" # Trim leading spaces and strip everything up to the last '/' (note: this also flattens nested names like feature/x to just x)
echo "Branch names containing tag commit (remote prefixes stripped):" # Introductory log for cleaned branch names
echo "${BRANCHES_CLEANED}" # Output the cleaned list of branch names for human inspection
if echo "${BRANCHES_RAW}" | sed 's|^[[:space:]]*||' | grep -qx "${MAIN_REF}"; then # Check if remote/main itself contains the tag commit
echo "${MAIN_REF} is listed as containing the tag commit." # Log that remote/main is explicitly reported as containing the commit
else # Branch taken when remote/main is not listed among the containing branches
echo "Note: '${MAIN_REF}' is not explicitly listed among branches containing the tag commit;" # Note about absence in the listing
echo " proceeding to verify via merge-base ancestry check as the final source of truth." # Explain that merge-base will be used to decide
fi # End of explicit listing check
echo "Verifying that tag commit ${TAG_COMMIT} is an ancestor of '${MAIN_REF}'..." # Log the start of ancestry verification
if git merge-base --is-ancestor "${TAG_COMMIT}" "${MAIN_REF}"; then # Check if the tag commit is contained in the history of remote/main
echo "✅ Tag commit is part of '${MAIN_REF}' history." # Success: tag commit is in remote/main history
echo " This means the tag was created on the current or historical commit of the main branch on remote '${REMOTE_NAME}'." # Clarify the semantic meaning
else # Branch taken when the tag commit is not reachable from remote/main
echo "❌ Tag commit is NOT part of '${MAIN_REF}' history." # Failure: invalid tag source
echo " Deployment is only allowed for tags created on the current or historical commit of '${MAIN_REF}'." # Explain the policy being enforced
exit 1 # Fail the job to prevent deployment from a non-main branch
fi # End of ancestry validation conditional
- name: Build and Publish Release # Step that builds and deploys the validated tagged release
shell: bash # Use bash for this deployment script
env: # Environment variables used during build and deployment
TAG_NAME: ${{ gitea.ref_name }} # Tag name (e.g., v0.0.1, v1.2.3) from the workflow context
CI: "true" # Conventional flag signaling that commands run in a CI/CD environment
run: | # Begin multi-line bash script for build and deploy
set -euo pipefail # Enforce strict error handling during deployment
echo "Proceeding with building and deploying version ${TAG_NAME}..." # Log which version is being deployed
# --- Begin placeholder for production build and deployment logic --- # Marker for project-specific deployment implementation
# Example for container-based deployments: # Example of a containerized deployment sequence
# make test # Run tests one more time as a safeguard before deployment
# make build-release # Build application artifacts for production
# docker build -t myorg/myapp:${TAG_NAME} . # Build Docker image tagged with the version
# docker push myorg/myapp:${TAG_NAME} # Push Docker image to the container registry
# ./deploy_to_production.sh "${TAG_NAME}" # Run custom deployment script using the version tag
# Example for non-container deployments: # Example of a file-based or script-based deployment sequence
# ./scripts/package.sh "${TAG_NAME}" # Package application into production-ready artifacts
# ./scripts/deploy.sh "${TAG_NAME}" # Deploy artifacts to servers or hosting platform
# --- End placeholder for production build and deployment logic --- # End of deployment example section
echo "Build and deployment steps completed for version ${TAG_NAME} (assuming real commands are configured above)." # Summary log for successful deployment step
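The branch-listing cleanup used in the verification step above can be exercised in isolation. The sample branch names below are hypothetical, and the snippet also illustrates the caveat noted in the workflow: the `s|.*/||` substitution flattens nested branch names such as `feature/new-api`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output of `git branch -r --contains <commit>`
BRANCHES_RAW='  remote/main
  remote/feature/new-api'

MAIN_REF="remote/main"

# Same cleanup as the workflow: trim leading spaces, strip everything up to the last '/'
BRANCHES_CLEANED="$(echo "${BRANCHES_RAW}" | sed 's|^[[:space:]]*||;s|.*/||')"
echo "${BRANCHES_CLEANED}"   # "main" and "new-api" (the feature/ prefix is flattened)

# Same exact-match test the workflow applies to remote/main
if echo "${BRANCHES_RAW}" | sed 's|^[[:space:]]*||' | grep -qx "${MAIN_REF}"; then
  echo "remote/main contains the commit"
fi
```

This is why the workflow treats the cleaned list as informational logging only and relies on `git merge-base --is-ancestor` as the final source of truth.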


@@ -0,0 +1,90 @@
name: Continuous Integration # Name of this workflow as shown in the Actions/CI UI
on: # Section defining which events trigger this workflow
push: # Trigger when code is pushed
branches: # Branch patterns that should trigger this workflow on push
- 'feature/**' # Run CI for all branches under feature/ (e.g., feature/new-api)
- 'bugfix/**' # Run CI for all branches under bugfix/ (e.g., bugfix/fix-login)
pull_request: # Trigger when a pull request event occurs
branches: # Pull requests targeting these base branches will trigger this workflow
- main # Run CI for pull requests whose base (target) branch is main
types: # Specific pull request activity types that trigger this workflow
- opened # Trigger when a pull request is opened
- reopened # Trigger when a previously closed pull request is reopened
- synchronize # Trigger when new commits are pushed to the pull request source branch
workflow_dispatch: # Allow this workflow to be triggered manually from the UI
inputs: # Optional inputs for manual runs
reason: # Input describing why CI was triggered manually
description: "Reason for manually running Continuous Integration" # Help text for this input
required: false # This input is optional
default: "manual-trigger" # Default value when no explicit reason is provided
permissions: # Default permissions for the CI token used in this workflow
contents: read # Allow reading repository contents (required for checking out code)
# Add further permissions here if CI needs them (e.g., packages: read, issues: write, etc.)
jobs: # Collection of jobs in this workflow
validate: # Job responsible for validating changes (build, tests, etc.)
name: Validate and test changes # Human-readable name for this job
runs-on: ubuntu-latest # Use the latest Ubuntu Linux runner image
timeout-minutes: 20 # Fail the job automatically if it runs longer than 20 minutes
concurrency: # Prevent overlapping CI runs for the same ref
group: ci-${{ gitea.ref_name }}-validate # Group key: serialize CI runs per branch/tag name
cancel-in-progress: true # Cancel any in-progress job in this group when a new one starts
steps: # Ordered list of steps in this job
- name: Checkout Code # Step to fetch the repository contents
uses: actions/checkout@v6 # Standard checkout action (Gitea-compatible)
with: # Options configuring the checkout behavior
fetch-depth: 0 # Fetch full history so advanced git operations are possible if needed
- name: Report Triggering Event and Branches # Step to log which event and branches triggered CI
shell: bash # Explicitly use bash for this script
run: | # Begin multi-line bash script
set -euo pipefail # Enable strict mode: exit on error, unset var, or failed pipe
EVENT_TYPE="${{ gitea.event_name }}" # Capture the event type (push, pull_request, etc.)
echo "Workflow triggered by event type: ${EVENT_TYPE}" # Log the event type
if [ "${EVENT_TYPE}" = "push" ]; then # Branch for push events
echo "Pushed to branch: ${{ gitea.ref_name }}" # Log the branch name for push events
elif [ "${EVENT_TYPE}" = "pull_request" ]; then # Branch for pull request events
echo "Pull request source branch (head_ref): ${{ gitea.head_ref }}" # Log the PR source branch
echo "Pull request target branch (base_ref): ${{ gitea.base_ref }}" # Log the PR target branch
else # Branch for unexpected event types
echo "Unexpected event type: ${EVENT_TYPE}" # Log a warning for unknown events
fi # End of event-type conditional
# - name: Restore Dependency Cache # OPTIONAL: uncomment and configure for your language/toolchain
# uses: actions/cache@v4 # Cache action for speeding up dependency installation
# with: # Cache configuration
# path: | # Paths to cache (npm ci deletes node_modules on install, so cache npm's download cache instead)
# ~/.npm
# key: deps-${{ runner.os }}-${{ hashFiles('package-lock.json') }} # Cache key based on OS and lockfile
# restore-keys: | # Fallback keys for partial cache hits
# deps-${{ runner.os }}- # Broader prefix allowing reuse of older caches
- name: Run Build and Tests # Main CI step to build and test the project
shell: bash # Use bash shell for the build/test script
env: # Environment variables available to this step
CI: "true" # Conventional flag signaling that commands run in a CI environment
run: | # Begin multi-line bash script for build and tests
set -euo pipefail # Enforce strict error handling during build and tests
echo "Building and testing the branch..." # High-level log for build/test phase
# --- Placeholder for actual build and test commands --- # Marker for project-specific CI logic
# Example for a Node.js project: # Example showing a typical Node.js CI sequence
# npm ci # Clean, reproducible install of dependencies using package-lock.json
# npm test # Run unit tests
# npm run lint # Run linting/static analysis
# Example for a Go project: # Example showing a typical Go CI sequence
# go test ./... # Run all tests in all subpackages
# golangci-lint run # Run Go linters via golangci-lint
# ------------------------------------------------------ # End of placeholder examples
echo "Build and tests completed successfully (assuming real commands are configured above)." # Summary for successful CI run
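As a rough local preview of when this workflow fires on a push, the branch patterns in the `on.push.branches` section above can be mirrored in a small shell helper (the function name is ours, not part of the workflow; pull-request and manual triggers are out of scope here):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Returns 0 when a branch name matches the push triggers configured above
ci_runs_on_push() {
  case "$1" in
    feature/*|bugfix/*) return 0 ;;   # mirrors the 'feature/**' and 'bugfix/**' patterns
    *) return 1 ;;
  esac
}

ci_runs_on_push "feature/new-api" && echo "CI would run"
ci_runs_on_push "main" || echo "no CI on a direct push to main"
```

Note that in a shell `case` pattern `*` matches `/` as well, so `feature/*` here behaves like the workflow's `feature/**` glob.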

infra/Caddyfile Normal file

@@ -0,0 +1,99 @@
{
# Global Caddy options.
#
# auto_https on
# - Caddy listens on port 80 for every host (ACME + redirect).
# - Automatically issues HTTPS certificates.
# - Automatically redirects HTTP → HTTPS unless disabled.
#
}
# ------------------------------------------------------------
# Redirect www → root domain
# ------------------------------------------------------------
www.avaaz.ai {
# Permanent redirect to naked domain
redir https://avaaz.ai{uri} permanent
}
# ------------------------------------------------------------
# Marketing site (optional — if frontend handles it, remove this)
# Redirect root → app
# ------------------------------------------------------------
avaaz.ai {
# If you have a static marketing page, serve it here.
# If not, redirect visitors to the app.
redir https://app.avaaz.ai{uri}
}
# ------------------------------------------------------------
# Frontend (Next.js)
# Public URL: https://app.avaaz.ai
# Internal target: frontend:3000
# ------------------------------------------------------------
app.avaaz.ai {
# Reverse-proxy HTTPS traffic to the frontend container
reverse_proxy frontend:3000
# Access log for debugging frontend activity
log {
output file /data/app-access.log
}
# Compression for faster delivery of JS, HTML, etc.
encode gzip zstd
}
# ------------------------------------------------------------
# Backend (FastAPI)
# Public URL: https://api.avaaz.ai
# Internal target: backend:8000
# ------------------------------------------------------------
api.avaaz.ai {
# Reverse-proxy all API traffic to FastAPI
reverse_proxy backend:8000
# Access log — useful for monitoring API traffic and debugging issues
log {
output file /data/api-access.log
}
# Enable response compression (JSON, text, etc.)
encode gzip zstd
}
# ------------------------------------------------------------
# LiveKit (signaling only — media uses direct UDP)
# Public URL: wss://rtc.avaaz.ai
# Internal target: livekit:7880
# ------------------------------------------------------------
rtc.avaaz.ai {
# LiveKit uses WebSocket signaling, so we reverse-proxy WS → WS
reverse_proxy livekit:7880
# Access log — helps diagnose WebRTC connection failures
log {
output file /data/rtc-access.log
}
# Compression not needed for WS traffic, but harmless
encode gzip zstd
}
# ------------------------------------------------------------
# Gitea (Git server UI + HTTPS + SSH clone)
# Public URL: https://git.avaaz.ai
# Internal target: gitea:3000
# ------------------------------------------------------------
git.avaaz.ai {
# Route all HTTPS traffic to Giteas web UI
reverse_proxy gitea:3000
# Log all Git UI requests and API access
log {
output file /data/git-access.log
}
# Compress UI responses
encode gzip zstd
}

infra/docker-compose.yml Normal file

@@ -0,0 +1,102 @@
services:
caddy:
# Use the latest official Caddy image
image: caddy:latest
# By default Compose generates container names (<project>-<service>-<index> in Compose v2); a fixed name is pinned below
container_name: caddy # Fixed name used by Docker engine
# Automatically restart unless manually stopped
restart: unless-stopped
ports:
# Expose HTTP (ACME + redirect)
- "80:80"
# Expose HTTPS/WSS (frontend, backend, LiveKit)
- "443:443"
volumes:
# Mount the Caddy config file read-only
- ./Caddyfile:/etc/caddy/Caddyfile:ro
# Caddy TLS certs (persistent Docker volume)
- caddy_data:/data
# Internal Caddy state/config
- caddy_config:/config
networks:
# Attach to the shared "proxy" network
- proxy
gitea:
# Official Gitea image with built-in Actions
image: gitea/gitea:latest
container_name: gitea # Fixed name used by Docker engine
# Auto-restart service
restart: unless-stopped
environment:
# Run Gitea as host user 1000 (prevents permission issues)
- USER_UID=1000
# Same for group
- USER_GID=1000
# Use SQLite (stored inside /data)
- GITEA__database__DB_TYPE=sqlite3
# Location of the SQLite DB
- GITEA__database__PATH=/data/gitea/gitea.db
# Custom config directory
- GITEA_CUSTOM=/data/gitea
volumes:
# Bind mount instead of Docker volume because:
# - We want repos, configs, SSH keys, and SQLite DB **visible and editable** on host
# - Easy backups (just copy `./gitea-data`)
# - Easy migration
# - Avoids losing data if Docker volumes are pruned
- ./gitea-data:/data
networks:
- proxy
ports:
# SSH for Git operations mapped to host 2222
- "2222:22"
gitea-runner:
# Official Gitea Actions Runner
image: gitea/act_runner:latest
container_name: gitea-runner # Fixed name used by Docker engine
restart: unless-stopped
depends_on:
# Runner requires Gitea to be available
- gitea
volumes:
# Runner uses host Docker daemon to spin up job containers (Docker-out-of-Docker)
- /var/run/docker.sock:/var/run/docker.sock
# Bind mount instead of volume because:
# - Runner identity is stored in /data/.runner
# - Must persist across container recreations
# - Prevents duplicated runner registrations in Gitea
# - Easy to inspect/reset via `./gitea-runner-data/.runner`
- ./gitea-runner-data:/data
environment:
# Base URL of your Gitea instance
- GITEA_INSTANCE_URL=${GITEA_INSTANCE_URL}
# One-time registration token
- GITEA_RUNNER_REGISTRATION_TOKEN=${GITEA_RUNNER_REGISTRATION_TOKEN}
# Human-readable name for the runner
- GITEA_RUNNER_NAME=${GITEA_RUNNER_NAME}
# Runner labels (e.g., ubuntu-latest)
- GITEA_RUNNER_LABELS=${GITEA_RUNNER_LABELS}
# Set container timezone to UTC for consistent logs
- TZ=Etc/UTC
networks:
- proxy
# The image's default entrypoint registers the runner (using the env vars above, if not yet registered) and starts the daemon; state persists in /data/.runner
# Note: act_runner's --config flag expects a runner config YAML, not the .runner registration state file, so no command override is set here
networks:
proxy:
# Shared network for Caddy + Gitea (+ later app stack)
name: proxy
# Use the bridge driver (a user-defined bridge, not Docker's default bridge network)
driver: bridge
volumes:
# Docker volume for Caddy TLS data (safe to keep inside Docker)
caddy_data:
name: caddy_data
# Docker volume for internal Caddy configs/state
caddy_config:
name: caddy_config


@@ -0,0 +1,103 @@
APP_NAME = Gitea
RUN_MODE = prod
RUN_USER = git
WORK_PATH = /data/gitea
[repository]
ROOT = /data/git/
[repository.local]
LOCAL_COPY_PATH = /data/gitea/tmp/local-repo
[repository.upload]
TEMP_PATH = /data/gitea/uploads
[server]
PROTOCOL = http
APP_DATA_PATH = /data/gitea
DOMAIN = git.avaaz.ai
SSH_DOMAIN = git.avaaz.ai
HTTP_PORT = 3000
ROOT_URL = https://git.avaaz.ai/
DISABLE_SSH = false
SSH_PORT = 2222
SSH_LISTEN_PORT = 22
LFS_START_SERVER = true
LFS_JWT_SECRET = HbSrdK2xM1XsFwcX92OjA96s3X-L4H73Jhl0OPrLnEg
OFFLINE_MODE = true
[database]
PATH = /data/gitea/gitea.db
DB_TYPE = sqlite3
HOST = localhost:3306
NAME = gitea
USER = root
PASSWD =
LOG_SQL = false
SCHEMA =
SSL_MODE = disable
[indexer]
ISSUE_INDEXER_PATH = /data/gitea/indexers/issues.bleve
[session]
PROVIDER_CONFIG = /data/gitea/sessions
PROVIDER = file
[picture]
AVATAR_UPLOAD_PATH = /data/gitea/avatars
REPOSITORY_AVATAR_UPLOAD_PATH = /data/gitea/repo-avatars
[attachment]
PATH = /data/gitea/attachments
[log]
MODE = console
LEVEL = info
ROOT_PATH = /data/gitea/log
[security]
INSTALL_LOCK = true
SECRET_KEY =
REVERSE_PROXY_LIMIT = 1
; NOTE: '*' trusts every client as a reverse proxy; restrict this to the Caddy container/network in production
REVERSE_PROXY_TRUSTED_PROXIES = *
INTERNAL_TOKEN = eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJuYmYiOjE3NjMwMTg2Mjd9.O0B7VVK_TRiM8fkn8Jcw0K10ypWX-r6K_lmeFNhIlo4
PASSWORD_HASH_ALGO = pbkdf2
[service]
DISABLE_REGISTRATION = false
REQUIRE_SIGNIN_VIEW = true
REGISTER_EMAIL_CONFIRM = false
ENABLE_NOTIFY_MAIL = false
ALLOW_ONLY_EXTERNAL_REGISTRATION = false
ENABLE_CAPTCHA = false
DEFAULT_KEEP_EMAIL_PRIVATE = false
DEFAULT_ALLOW_CREATE_ORGANIZATION = true
DEFAULT_ENABLE_TIMETRACKING = true
NO_REPLY_ADDRESS = noreply.localhost
[lfs]
PATH = /data/git/lfs
[mailer]
ENABLED = false
[openid]
ENABLE_OPENID_SIGNIN = true
ENABLE_OPENID_SIGNUP = true
[cron.update_checker]
ENABLED = true
[repository.pull-request]
DEFAULT_MERGE_STYLE = merge
[repository.signing]
DEFAULT_TRUST_MODEL = committer
[oauth2]
JWT_SECRET = c0-Xl6vRyjNC9UPykpCWA_XtXC62fygtoPh2ZxJgQu4
[actions]
ENABLED = true
DEFAULT_ACTIONS_URL = github
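The secrets embedded in this file (LFS_JWT_SECRET, INTERNAL_TOKEN, JWT_SECRET) are effectively public once committed and should be rotated. `gitea generate secret <NAME>` (run inside the container) is the canonical tool; a `/dev/urandom` fallback looks roughly like this, a sketch rather than the official procedure:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Preferred, inside the container: gitea generate secret JWT_SECRET
# Fallback: 32 random bytes, base64url-encoded without padding
new_secret() {
  head -c 32 /dev/urandom | base64 | tr '+/' '-_' | tr -d '=\n'
}

echo "LFS_JWT_SECRET = $(new_secret)"
echo "JWT_SECRET     = $(new_secret)"
```

After replacing the values in app.ini, restart the gitea container so the new secrets take effect; existing LFS and OAuth2 tokens signed with the old secrets will be invalidated.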