Hi! Thanks for the detailed report. I dug into the service (paperclip, service-6a1e93028197c9aa0ae2d1c3, Tencent Singapore shared cluster). Findings:
Issue 1 — running the one-time setup commands
You're right that this is likely not a resource problem. It's almost certainly the dashboard web-terminal disconnect bug we're currently tracking: the in-browser terminal drops the session roughly every 2 minutes, so a command can get cut off mid-run and look like it "won't run." The container itself is fine. I exec'd into it just now: working dir /app, pnpm on PATH at /usr/local/bin/pnpm, Node v24.
The reliable workaround is to run it from the Zeabur CLI instead of the dashboard terminal — same kubectl exec backend, but not affected by the disconnect bug:
npx zeabur@latest service exec --id 6a1e93028197c9aa0ae2d1c3 -- pnpm paperclipai auth bootstrap-ceo
A few important notes for Shared Cluster (no SSH — this exec/terminal is the supported way in):
- It runs inside the live container, on the app's PATH — confirmed above.
- The container filesystem is ephemeral: it is wiped on every redeploy/restart. So:
  - The admin invite only survives a redeploy if Paperclip persists it to a database / mounted volume. If it writes to local disk in the container, you'll need to re-run bootstrap-ceo after each redeploy.
  - For the Claude token, don't export CLAUDE_CODE_OAUTH_TOKEN in the terminal. That only affects that one shell, isn't visible to the running app process, and is lost on redeploy. Set it as a Zeabur environment variable (service → Variables) and redeploy; that persists and is injected into the app.
- Dedicated Server required? No — Shared Cluster handles this fine. A dedicated server only matters if you want a persistent SSH/VM-style environment.
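To illustrate the point about the Claude token, here is a minimal local sketch of why an export in one terminal session never reaches a process started elsewhere. DEMO_TOKEN is a hypothetical stand-in for CLAUDE_CODE_OAUTH_TOKEN, and the second `sh -c` only plays the role of the already-running app; nothing here touches the real service.

```shell
#!/bin/sh
# DEMO_TOKEN is a hypothetical stand-in for CLAUDE_CODE_OAUTH_TOKEN.
unset DEMO_TOKEN

# Export inside a subshell: only that shell and its children see the value.
in_session=$( (export DEMO_TOKEN=abc; sh -c 'echo "${DEMO_TOKEN:-unset}"') )

# A process started outside that session (like the running app) sees nothing.
outside=$(sh -c 'echo "${DEMO_TOKEN:-unset}"')

echo "inside the exec session: $in_session"
echo "other process:           $outside"
```

The app process plays the part of the "other process" here, which is why the value has to come from the service's Variables and a redeploy.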
Issue 2 — intermittent Cloudflare 524
Based on what I can see, this is not OOM and 8 GB won't fix it:
- Container uptime is ~39h with no restarts, memory usage only ~214 MB — no OOM kills, the 4 GB limit isn't the bottleneck.
- Your logs show GET / 200 served fast. A 524 means Cloudflare opened the connection to the origin but got no response headers within ~100s, so a specific request hung, not the whole app.
- Side note: there's a steady flood of bot scanners hitting .env paths (/wp/.env, /laravel/.env…) returning 200. That's noise, not the 524 cause, but you may want a Cloudflare WAF rule to block it.
To pin it down, could you share:
- The exact path(s) that return 524 (always /, or an API/AI route doing long work?), plus a couple of precise timestamps, with timezone, of recent 524s.
- Whether 524s correlate with running an AI/agent task — Claude Code calls can run >100s and would trip Cloudflare's 100s limit.
(Tencent Singapore is a newer region with lighter monitoring on our side, so timestamps help us trace the origin.)
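If it helps with collecting those details, here is a rough sketch for timing a suspect route with curl and logging a UTC timestamp next to it. The function names, the example URL, and the "slow" threshold (half the limit) are all my assumptions for illustration; the 100-second figure is the Cloudflare limit mentioned above.

```shell
#!/bin/sh
# Cloudflare's origin response limit, about 100 seconds (per the note above).
LIMIT=100

# classify SECONDS -> prints "over-limit", "slow", or "ok".
# The "slow" cutoff (half the limit) is an arbitrary assumption.
classify() {
  awk -v t="$1" -v limit="$LIMIT" 'BEGIN {
    if (t >= limit) print "over-limit"
    else if (t >= limit / 2) print "slow"
    else print "ok"
  }'
}

# probe URL -> prints UTC timestamp, URL, HTTP status, total seconds, classification.
probe() {
  result=$(curl -s -o /dev/null --max-time 120 -w '%{http_code} %{time_total}' "$1")
  status=${result% *}    # text before the space: HTTP status code
  seconds=${result#* }   # text after the space: total request time
  printf '%s %s %s %ss %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$1" "$status" "$seconds" "$(classify "$seconds")"
}

# Example (hypothetical URL):
# probe "https://your-app.example.com/api/some-route"
```

A line ending in "over-limit" with status 524 gives us both the path and the timestamp in one go.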
Thanks!