Troubleshooting
Common issues when self-hosting Multica — symptoms, causes, how to diagnose, how to fix.
Look up issues by symptom. Each entry gives you symptom / likely causes / how to diagnose / how to fix. If your situation isn't listed, open an issue on GitHub.
Daemon can't connect to the server
Symptom: multica daemon's status command shows offline or connection refused; the server logs show no /api/daemon/register or /api/daemon/heartbeat requests. For how the daemon mechanism works, see Daemon and runtimes.
Likely causes:
- `MULTICA_SERVER_URL` points at the wrong address: default is `ws://localhost:8080/ws`; self-host must change it to your server address
- Network / firewall blocking: the daemon and server aren't on the same network, or outbound traffic is blocked
- Token expired or invalid: you never ran `multica login`, or the PAT was revoked
- Server rejected registration: the account you signed in with isn't in the target workspace (register returns 403)
- DNS resolution failure: the hostname doesn't resolve on the daemon machine
How to diagnose:
multica daemon logs --lines 100 # look for daemon-side errors
echo $MULTICA_SERVER_URL # confirm the address is set
curl -i http://<server-host>:8080/health # hit the server directly
curl -i http://<server-host>:8080/readyz # include DB + migration readiness
cat ~/.multica/config.json # verify api_token exists
multica workspace list # confirm you're a member of the target workspace
How to fix: address each cause above. The two most common fixes are (1) correcting `MULTICA_SERVER_URL` and restarting the daemon (`multica daemon restart`), and (2) signing in again (`multica logout && multica login`).
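The health checks above need the server's HTTP address, which typically differs from `MULTICA_SERVER_URL` only in scheme and path. The following is a minimal sketch, assuming `/health` and `/readyz` are served on the same host and port as the WebSocket endpoint. The helper name `ws_to_http_base` is hypothetical and not part of the Multica CLI.

```shell
# Hypothetical helper: derive the HTTP base URL from a WebSocket URL.
# Assumes /health and /readyz live on the same host:port as the WebSocket endpoint.
ws_to_http_base() {
  url=$1
  case "$url" in
    wss://*) scheme=https; rest=${url#wss://} ;;
    ws://*)  scheme=http;  rest=${url#ws://} ;;
    *) echo "not a ws:// or wss:// URL: $url" >&2; return 1 ;;
  esac
  # Keep only host[:port]; drop the path (for example /ws).
  printf '%s://%s\n' "$scheme" "${rest%%/*}"
}

# Usage (requires curl and a reachable server):
#   base=$(ws_to_http_base "$MULTICA_SERVER_URL") && curl -i "$base/health" && curl -i "$base/readyz"
```

If the derived address answers `/health` from the daemon machine but the daemon still shows offline, the address is probably fine and the token or workspace membership is the more likely culprit.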
Tasks stuck in queued
Symptom: after assigning an issue to an agent, the issue status flips to in_progress immediately, but a long time passes with no sign of agent execution on the page; multica daemon status shows the daemon online.
Likely causes (ordered by frequency):
- Agent concurrency limit reached: this agent's `max_concurrent_tasks` (default 6) is fully occupied by other running tasks
- Another task from the same agent is still running on the same issue: same agent × same issue is forced to run sequentially (prevents duplicate execution)
- Agent has been archived: after archival, new tasks still enqueue but can't be claimed, and they time out after 5 minutes (code-issue G-01)
- Daemon hasn't registered this runtime in the current workspace: restart the daemon or reselect the runtime in the UI
- Daemon disconnected: no heartbeat in the last 45 seconds. `daemon status` may still report `online` shortly after a very recent disconnect
How to diagnose:
multica daemon status --output json # runtime list + last_seen_at
multica agent list # check agent archived state
multica issue show <issue-id> # inspect task history
On the server side (self-host), grep for "no_tasks" / "no_capacity" to see the claim outcome.
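To see which claim outcome dominates, the two markers can be tallied rather than read line by line. This is a rough sketch: only the strings `no_tasks` and `no_capacity` come from the text above, while the function name and output format are illustrative. Reading `no_capacity` as "concurrency limit reached" and `no_tasks` as "nothing claimable for this runtime" is an assumption based on the names.

```shell
# Tally claim outcomes from server logs read on stdin.
# Only the two marker strings come from the docs; the output format is illustrative.
count_claim_outcomes() {
  awk '
    /no_capacity/ { capacity++ }
    /no_tasks/    { tasks++ }
    END { printf "no_capacity=%d no_tasks=%d\n", capacity, tasks }
  '
}

# Usage (Docker Compose self-host; adjust to however you collect server logs):
#   docker compose -f docker-compose.selfhost.yml logs backend | count_claim_outcomes
```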
How to fix:
- Concurrency full → wait for running tasks to finish, or `multica agent update <id> --max-concurrent-tasks 10` to raise the ceiling
- Same-issue serialization → wait for the previous task to finish, or reassign to a different agent
- Agent archived → `multica agent restore <id>`
- Runtime not registered → `multica daemon restart`, and the daemon will re-register
WebSocket can't connect
Symptom: the browser console logs WebSocket is closed; the page doesn't show real-time updates (task progress, comments, inbox), and a refresh is needed to see them; backend tasks still execute.
Likely causes:
- Origin check failure: your frontend domain isn't in the server's CORS allowlist. The default allowlist only includes `localhost:3000/5173/5174`; self-hosting on the public internet requires `FRONTEND_ORIGIN`
- Protocol mismatch: frontend on `https://` needs `wss://`; HTTP uses `ws://`
- Reverse proxy doesn't enable WebSocket upgrade: Nginx / Envoy / HAProxy don't forward the `Upgrade` header by default
- JWT cookie expired or missing: no re-sign-in after the 30-day expiry
How to diagnose:
- Browser DevTools → Network → filter by "WS" and check connection state and status code
- Grep server logs for `"rejected origin"` / `"websocket"`: an origin issue spells itself out
- `curl -i http://<server-host>:8080/ws`, sent with the WebSocket `Upgrade` request headers, should return `101 Switching Protocols`; a plain GET without those headers is not an upgrade request and will not produce a `101`
How to fix:
- Wrong origin → set `FRONTEND_ORIGIN=https://multica.yourdomain.com` in the server's `.env` (or comma-separated `CORS_ALLOWED_ORIGINS`) and restart the server
- Protocol mismatch → make sure `FRONTEND_ORIGIN`'s protocol matches the frontend's
- Reverse proxy → in Nginx, add `proxy_http_version 1.1; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "upgrade";`
- Cookie expired → refresh the page and sign in again
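To test the upgrade path from the command line, the request has to carry the WebSocket handshake headers and the `Origin` the browser would send. The sketch below uses standard curl options and the RFC 6455 handshake headers; the function names are hypothetical. The server may also require a session cookie on `/ws`, which is an assumption not verified here, so a non-`101` answer narrows the problem down but does not by itself prove an origin or proxy fault.

```shell
# Map a frontend origin to the WebSocket scheme the browser will need.
expected_ws_scheme() {
  case "$1" in
    https://*) echo wss ;;
    http://*)  echo ws ;;
    *) echo "unrecognized origin: $1" >&2; return 1 ;;
  esac
}

# Send a WebSocket upgrade request with curl and print the first response line.
# $1 = http(s) URL of the /ws endpoint, $2 = Origin header to present.
# The key is the sample nonce from RFC 6455; any base64-encoded 16-byte value works.
ws_probe() {
  curl -s -i --http1.1 --max-time 5 \
    -H 'Connection: Upgrade' \
    -H 'Upgrade: websocket' \
    -H 'Sec-WebSocket-Version: 13' \
    -H 'Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==' \
    -H "Origin: $2" \
    "$1" | head -n 1
}

# Usage:
#   ws_probe http://<server-host>:8080/ws https://multica.yourdomain.com
```

Probing the backend port directly and then the public URL through the reverse proxy separates the two layers: a `101` from the backend but not through the proxy points at the proxy's `Upgrade` handling.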
Emails not received
Symptom: after submitting an email during sign-in or invite acceptance, neither the inbox nor the spam folder has the verification code.
First, confirm which provider the server thinks is active. At startup the backend prints one of:
- `EmailService: SMTP relay <host>:<port> from=<addr>`: using SMTP (`SMTP_HOST` non-empty wins over Resend)
- `EmailService: Resend API from=<addr>`: using Resend
- `EmailService: DEV mode — codes printed to stdout …`: no provider configured
docker compose -f docker-compose.selfhost.yml logs backend | grep "EmailService:"
If the line you expected isn't there, the environment didn't reach the process. Check `.env` and `docker compose -f docker-compose.selfhost.yml exec backend env | grep -E 'RESEND_|SMTP_'`. Credentials are never logged on this startup line.
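The three startup lines can be reduced to a single word, which is handy in a deploy check. This is a small sketch that matches only the prefixes quoted above; the function name and the `unknown` fallback are illustrative.

```shell
# Print which email provider the backend announced at startup: smtp, resend, dev, or unknown.
# Reads log text on stdin and uses the last "EmailService:" line (the most recent start).
active_email_provider() {
  line=$(awk '/EmailService:/ { last = $0 } END { print last }')
  case "$line" in
    *'EmailService: SMTP relay'*) echo smtp ;;
    *'EmailService: Resend API'*) echo resend ;;
    *'EmailService: DEV mode'*)   echo dev ;;
    *)                            echo unknown ;;
  esac
}

# Usage:
#   docker compose -f docker-compose.selfhost.yml logs backend | active_email_provider
```

A result of `dev` on a production host means no provider is configured, and `unknown` means the startup line never appeared in the captured logs.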
When Resend is the active provider
Likely causes:
- `RESEND_API_KEY` not set: the server silently falls back and writes the code to its own stdout without error. Easy to trip over in production
- Resend API key invalid / out of quota: server logs show `"failed to send verification code"`
- `RESEND_FROM_EMAIL`'s domain not verified in Resend: Resend refuses to send
- Email was sent but flagged as spam by the recipient's ISP: check the Resend dashboard and the spam folder
How to diagnose:
- Grep server logs for `"[DEV] Verification code for"`: if present, Resend isn't configured and the code was written to stdout
- Resend dashboard → Emails for send history
- Confirm `RESEND_FROM_EMAIL`'s domain appears in the Resend console's "Verified Domains" list
How to fix:
- Missing API key → follow Sign-in and signup configuration → How email works to configure and restart the server
- Domain not verified → run the DNS verification flow in the Resend console (add SPF / DKIM records)
- In an emergency (internal testing) → copy the code printed under `[DEV]` from the server logs
When SMTP is the active provider
The SMTP path wraps every failure with the stage it failed at, so the server logs already tell you where the relay rejected the session. Grep for "failed to send verification email" / "failed to send invitation email" and check the wrapped error:
| Logged error | What it means | How to fix |
|---|---|---|
| `smtp dial <host>:<port>: dial tcp …: connect: connection refused` / `i/o timeout` | The backend container can't reach the relay: wrong host, wrong port, firewall, or the relay isn't listening | Verify `SMTP_HOST` / `SMTP_PORT` resolve from inside the container (`docker compose -f docker-compose.selfhost.yml exec backend nslookup <host>` and `nc -vz <host> <port>`); open the firewall from the host running Multica to the relay |
| `smtp starttls: x509: certificate signed by unknown authority` (or `certificate is not valid for any names`) | The relay uses a private CA / self-signed cert and the container's trust store rejects it | Either install the CA into the container, or set `SMTP_TLS_INSECURE=true` only after confirming the relay is reachable on a trusted segment |
| `smtp auth: 535 5.7.8 Authentication credentials invalid` (or `534` / `530`) | `SMTP_USERNAME` / `SMTP_PASSWORD` are wrong, or the relay requires a different auth mechanism than `PLAIN` | Re-confirm the service-account credentials with your mail admin; for Exchange anonymous internal relay leave both empty (`SMTP_USERNAME=`, `SMTP_PASSWORD=`) |
| `smtp MAIL FROM: 550 5.7.1 Client does not have permissions to send as this sender` | The relay won't accept `RESEND_FROM_EMAIL` as the envelope sender: typical Exchange "anonymous users not allowed" or DMARC alignment issue | Set `RESEND_FROM_EMAIL` to a domain the relay accepts; on Exchange, grant the source IP `ms-Exch-SMTP-Accept-Any-Sender` on the receive connector |
| `smtp RCPT TO <addr>: 550 5.7.1 Unable to relay` | The relay's receive connector doesn't allow your subnet to relay to external recipients (most common for anonymous internal relays talking to outside domains) | Either restrict invites to internal recipients, or add the Multica host's subnet to the Exchange "Anonymous Users → Relay" permission list |
| `smtp DATA` / `smtp write body` / `smtp end data` | Session was accepted but the relay dropped the body: usually message-size limits, content filtering, or a connection reset mid-stream | Check the relay's logs for the same `Message-ID` (logged as `<unixnano>@<host>`); raise the message size limit if needed |
MAIL FROM, RCPT TO, and DATA errors are always logged with the relay's response code so you can match them against Exchange / Postfix logs on the other side. Verification codes and invite tokens are never included in the wrapped error.
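Because every wrapped error starts with its stage, a log line can be bucketed mechanically before you open the table. This is a sketch under the assumption that the stage prefixes appear in the logs exactly as listed in the table; the function name and the short stage labels are made up for illustration.

```shell
# Print the SMTP stage named in a wrapped error line.
# Stage prefixes are the ones listed in the table; the labels are illustrative.
smtp_failure_stage() {
  case "$1" in
    *'smtp dial'*)      echo dial ;;
    *'smtp starttls'*)  echo starttls ;;
    *'smtp auth'*)      echo auth ;;
    *'smtp MAIL FROM'*) echo mail_from ;;
    *'smtp RCPT TO'*)   echo rcpt_to ;;
    *'smtp DATA'*|*'smtp write body'*|*'smtp end data'*) echo data ;;
    *)                  echo unknown ;;
  esac
}

# Usage:
#   smtp_failure_stage "$(docker compose -f docker-compose.selfhost.yml logs backend | grep 'failed to send' | tail -n 1)"
```

Roughly, `dial` and `starttls` point at network and TLS setup, `auth` at credentials, and `mail_from` / `rcpt_to` / `data` at relay policy on the mail server side.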
How to diagnose:
- Grep `"EmailService: SMTP relay"` once at startup, then `"failed to send"` for runtime failures
- From inside the backend container, sanity-check connectivity: `docker compose -f docker-compose.selfhost.yml exec backend sh -c 'nc -vz $SMTP_HOST $SMTP_PORT'`
- Confirm the env reached the process: `docker compose -f docker-compose.selfhost.yml exec backend env | grep SMTP_` (the password will be in the output, so only run this on a trusted shell)
How to fix:
- Wrong host / port → adjust `SMTP_HOST` / `SMTP_PORT` and restart the backend; for the supported relay modes see Auth setup → Option B: SMTP relay
- Cert mismatch → install the relay's CA into the container, or temporarily set `SMTP_TLS_INSECURE=true` on a trusted segment
- Auth failure → re-check credentials; for anonymous internal relay leave `SMTP_USERNAME` and `SMTP_PASSWORD` empty
- `Unable to relay` → either restrict to internal recipients or grant the Multica host's IP relay permission on the Exchange receive connector
Fixed local test code doesn't work
Symptom: on a self-hosted instance, you try to sign in with a fixed local test code such as 888888 and it's rejected with invalid or expired code.
Likely causes (mutually exclusive):
- `MULTICA_DEV_VERIFICATION_CODE` is empty: fixed codes are disabled by default
- `APP_ENV=production`: this is the correct production configuration; fixed local test codes are ignored in production
- The configured code is not 6 digits: the shortcut only accepts a 6-digit value
How to diagnose:
grep -E 'APP_ENV|MULTICA_DEV_VERIFICATION_CODE' .env
docker exec <container> env | grep -E 'APP_ENV|MULTICA_DEV_VERIFICATION_CODE'
Check your inbox (including spam) for the real verification code.
How to fix:
- In production, leave `MULTICA_DEV_VERIFICATION_CODE` empty, configure an email provider (Resend or an SMTP relay), and use real codes
- For local development or internal testing, either copy the generated code from server logs or set `APP_ENV=development` plus `MULTICA_DEV_VERIFICATION_CODE=888888`. Never enable a fixed code on a public instance (see Sign-in and signup configuration → Fixed local testing codes)
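The three conditions can be checked in one step against the values the process actually sees. The sketch below mirrors the documented rules only and is not the server's own validation code. It assumes any `APP_ENV` other than `production` permits the fixed code; the function name and messages are illustrative.

```shell
# Report whether a fixed test code would take effect, per the three documented conditions.
# $1 = APP_ENV value, $2 = MULTICA_DEV_VERIFICATION_CODE value.
# Assumption: any APP_ENV other than "production" permits the fixed code.
fixed_code_status() {
  app_env=$1
  code=$2
  if [ -z "$code" ]; then
    echo 'disabled: MULTICA_DEV_VERIFICATION_CODE is empty'
  elif [ "$app_env" = production ]; then
    echo 'ignored: APP_ENV=production'
  else
    case "$code" in
      [0-9][0-9][0-9][0-9][0-9][0-9]) echo active ;;
      *) echo 'rejected: code is not 6 digits' ;;
    esac
  fi
}

# Usage, run where the backend's environment is visible:
#   fixed_code_status "$APP_ENV" "$MULTICA_DEV_VERIFICATION_CODE"
```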
Usage dashboard stays at zero
Symptom: agents complete tasks, raw token usage is written to the database, but Settings → Usage and Settings → Runtime show 0 input / output / cost across the board. This is silent — there is no error in the backend logs.
Likely causes:
- `rollup_task_usage_hourly()` is never being claimed: the Usage / Runtime dashboards read from the derived `task_usage_hourly` table, populated by that function. Since MUL-2957 the backend runs the rollup in-process via the DB-backed scheduler (`sys_cron_executions`); a stale build, a missing migration `113`, or a sustained backend outage with no replicas left running can leave the scheduler's audit table without a recent SUCCESS row.
- `pg_cron` is configured for compatibility but pointing at the wrong database: pg_cron's `cron.database_name` setting defaults to `postgres`; if your Multica database has a different name, the scheduled job never sees `rollup_task_usage_hourly()`. The in-process scheduler does not depend on this, but if you removed the in-process scheduler and rely on `pg_cron`, the DB name must match.
- The handler is being claimed but silently erroring, e.g. the SQL function is missing because migrations were partially applied, or DB role / search_path is misconfigured. Check the FAILED audit rows in `sys_cron_executions`.
How to diagnose:
-- Confirm raw events exist but the hourly table is empty.
SELECT count(*) AS raw_rows FROM task_usage;
SELECT count(*) AS hourly_rows FROM task_usage_hourly;
-- Inspect the in-process scheduler's audit log.
SELECT plan_time, status, attempt, runner_id,
error_code, error_msg, started_at, finished_at
FROM sys_cron_executions
WHERE job_name = 'rollup_task_usage_hourly'
ORDER BY plan_time DESC
LIMIT 20;
-- Watermark — if this is 1970-01-01, the rollup has never run.
SELECT watermark_at FROM task_usage_hourly_rollup_state;
-- Compatibility path: if you previously registered pg_cron, confirm
-- it is (or isn't) available and pointing at the right database.
SELECT * FROM pg_available_extensions WHERE name = 'pg_cron';
SHOW shared_preload_libraries;
SELECT jobname, schedule, database, active FROM cron.job;
How to fix:
- Confirm the scheduler is actually running on at least one backend replica: every 30 seconds it should add a SUCCESS row to `sys_cron_executions` for `rollup_task_usage_hourly`.
- Call the rollup once by hand to verify the SQL path: run `SELECT rollup_task_usage_hourly();`, then refresh the dashboard. If numbers appear, the SQL function is fine and the issue is on the scheduler claim path.
- If migration `113_sys_cron_executions` has not been applied yet, restart the backend so migrations run, or invoke `migrate up` manually.
- If you have legacy `pg_cron` history that pre-dates the in-process scheduler, the SQL function still holds advisory lock 4246 internally and the two paths cannot double-write. See Self-host quickstart → Usage rollup for the optional `cron.unschedule` cleanup.
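The first three diagnostic values (raw row count, hourly row count, watermark) are enough for a first verdict. Here is a rough sketch that encodes the rules stated in the SQL comments above; the function name, the verdict strings, and the `DATABASE_URL` variable in the usage comment are all placeholders.

```shell
# Summarize the three diagnostic values into one verdict.
# $1 = count(*) from task_usage, $2 = count(*) from task_usage_hourly,
# $3 = watermark_at from task_usage_hourly_rollup_state (as text).
rollup_verdict() {
  raw=$1; hourly=$2; watermark=$3
  if [ "$raw" -eq 0 ]; then
    echo 'no raw usage recorded yet'
  else
    case "$watermark" in
      1970-01-01*) echo 'rollup has never run' ;;
      *)
        if [ "$hourly" -eq 0 ]; then
          echo 'rollup ran but hourly table is empty'
        else
          echo 'rollup has produced rows'
        fi ;;
    esac
  fi
}

# Usage with psql (-A unaligned, -t tuples only); DATABASE_URL is a placeholder:
#   rollup_verdict \
#     "$(psql "$DATABASE_URL" -Atc 'SELECT count(*) FROM task_usage')" \
#     "$(psql "$DATABASE_URL" -Atc 'SELECT count(*) FROM task_usage_hourly')" \
#     "$(psql "$DATABASE_URL" -Atc 'SELECT watermark_at FROM task_usage_hourly_rollup_state')"
```

A verdict of `rollup has never run` points at the scheduler claim path or a missing migration; the `sys_cron_executions` query then shows whether attempts are missing or failing.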
Migration 103 fails with refusing to drop legacy daily rollups
Symptom: upgrading from v0.3.4 to v0.3.5+, the backend container fails to start (or migrate up aborts) with:
ERROR: refusing to drop legacy daily rollups:
task_usage_hourly_rollup_state.watermark_at (1970-01-01 ...) trails
task_usage latest event (...) by more than 01:00:00 — backfill is
incomplete or pg_cron is not running. Run cmd/backfill_task_usage_hourly
(and let pg_cron catch up) before re-running migrate
Likely cause: this is migration 103's fail-closed guard. It refuses to drop the legacy daily rollups until task_usage_hourly has caught up with raw task_usage. The guard fires whenever existing rows are present and the rollup watermark still sits at the epoch, i.e. nothing has rolled history into the hourly table yet.
Since MUL-2957 the migrate command runs an idempotent monthly-slice backfill (under advisory lock 4246) automatically immediately before applying migration 103, so v0.3.4 → v0.3.5+ direct upgrades complete in a single migrate up invocation. If you are still seeing this error you are either on a pre-MUL-2957 binary or the hook itself failed — check the migrate logs for an earlier task_usage hourly rollup hook line.
How to fix:
- If you are on a pre-MUL-2957 binary and cannot upgrade the binary first, run the standalone backfill against the same database (idempotent, safe to interrupt, safe to re-run):

      # Docker Compose
      docker compose -f docker-compose.selfhost.yml exec backend \
        ./backfill_task_usage_hourly --sleep-between-slices=2s

      # Kubernetes
      kubectl -n multica exec deploy/multica-backend -- \
        ./backfill_task_usage_hourly --sleep-between-slices=2s

- Re-run the upgrade. Restarting the backend container is enough, because migrations run on startup. The guard now sees a current watermark and lets `103` apply.
- The in-process scheduler then keeps the watermark advancing; see Self-host quickstart → Usage rollup.
--sleep-between-slices=2s is a polite default on production databases with years of history. Use --months-back N --force-partial if you only want to keep the last N months and are willing to permanently abandon older buckets.
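Before re-running the migration, you can check whether the backfill has caught up by comparing the watermark against the latest raw event. This sketch implements only the condition quoted in the error message (a lag of more than one hour), not the migration's actual SQL. The function name and `DATABASE_URL` are placeholders, and the latest-event timestamp must be supplied by you, since the relevant `task_usage` column is not named in this page.

```shell
# Return success (0) when the watermark trails the latest event by more than one hour,
# which is the condition the guard's error message describes.
# $1 = watermark as Unix epoch seconds, $2 = latest task_usage event as Unix epoch seconds.
guard_would_fire() {
  lag=$(( $2 - $1 ))
  [ "$lag" -gt 3600 ]
}

# Usage; DATABASE_URL is a placeholder and <latest-event-epoch> is supplied by you:
#   wm=$(psql "$DATABASE_URL" -Atc \
#     'SELECT extract(epoch FROM watermark_at)::bigint FROM task_usage_hourly_rollup_state')
#   guard_would_fire "$wm" "<latest-event-epoch>" && echo 'backfill still needed'
```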
Port conflicts
Symptom: multica server or multica daemon start fails with address already in use.
Likely causes:
- Server port taken (default `8080`)
- Daemon health port taken (default `19514`, offset by a hash per profile)
- Web dev server port conflict (`3000` / `5173`)
- Insufficient privileges for the port (binding a privileged port `< 1024` requires sudo)
How to diagnose:
lsof -i :8080 # macOS / Linux
netstat -ano | findstr :8080 # Windows
How to fix:
- Kill the conflicting process (`kill -9 <PID>`), or change ports via `PORT=9000`
- To use 80 / 443 → don't bind directly; put a reverse proxy (Nginx / Caddy) in front, forwarding to a high port
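The two checks above can be wrapped as small helpers for macOS / Linux. This is a minimal sketch: the function names are hypothetical, and the 1024 threshold is the conventional default, which some Linux systems reconfigure.

```shell
# Succeed when binding the port normally requires elevated privileges (ports below 1024).
is_privileged_port() {
  [ "$1" -gt 0 ] && [ "$1" -lt 1024 ]
}

# Show the process listening on a TCP port (macOS / Linux, requires lsof).
# -n and -P skip host and port name lookups; -sTCP:LISTEN limits output to listeners.
port_owner() {
  lsof -nP -iTCP:"$1" -sTCP:LISTEN
}

# Usage:
#   port_owner 8080 || echo 'nothing is listening on 8080'
#   is_privileged_port 443 && echo 'put a reverse proxy in front instead of binding 443'
```

Limiting `lsof` to listeners avoids mistaking an outbound client connection on the same port number for the process that actually holds the bind.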
Where to find logs
| Component | Location | Command |
|---|---|---|
| Daemon | ~/.multica/daemon.log (background mode) or foreground stdout | multica daemon logs -f --lines 100 |
| Server (Docker) | Container stdout | docker logs -f <container> |
| Server (systemd) | journal | journalctl -u multica-server -f |
| Frontend (dev) | Terminal running pnpm dev | Read directly |
| Frontend (browser) | DevTools → Console | Press F12 |
For more detailed daemon logs, move it from background to foreground: multica daemon stop && multica daemon start --foreground.