One Proxy, Whole Team, Full Visibility
Individual Claude Code costs add up fast. At team scale, they compound, and nobody can see where the money goes. Here's how to fix both problems at once.
The Team Cost Problem
A single developer running Claude Code daily spends $100–600/month. That's already uncomfortable. Now multiply it by your team.
| Team size | Monthly without Prefex | Monthly with Prefex | Annual savings (low end) |
|---|---|---|---|
| 3 developers | $300–900 | $105–315 | ~$2,340/year |
| 8 developers | $800–2,400 | $280–840 | ~$6,240/year |
| 20 developers | $2,000–6,000 | $700–2,100 | ~$15,600/year |
| 50 developers | $5,000–15,000 | $1,750–5,250 | ~$39,000/year |
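The figures in the table above can be reproduced with simple arithmetic. The sketch below assumes what the table implies rather than what Prefex documents: $100–300 per developer per month, and a with-Prefex cost of 35% of the original. The function name `team_costs` and both defaults are illustrative. Note that the annual savings column matches the low end of each range.

```python
def team_costs(developers, per_dev_low=100, per_dev_high=300, remaining_pct=35):
    """Reproduce one row of the team cost table.

    Assumptions inferred from the table, not documented constants:
    $100-300 per developer per month, and 35% of the cost remaining with Prefex.
    """
    without = (developers * per_dev_low, developers * per_dev_high)
    with_prefex = tuple(cost * remaining_pct // 100 for cost in without)
    # The annual savings column corresponds to the low end of the monthly range.
    annual_savings_low = (without[0] - with_prefex[0]) * 12
    return without, with_prefex, annual_savings_low


for size in (3, 8, 20, 50):
    print(size, team_costs(size))
```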
The savings come from the same place they do for individuals: repeated context being billed at full price. But at team scale, there's a second problem, one that solo users never face.
Every developer on your team is running their own isolated proxy. Same project context, same CLAUDE.md, same tool definitions, cached separately on each machine, with no shared benefit. A cache hit for one developer saves nothing for anyone else. On a ten-developer team, you can end up paying ten times for the same tokens.
Flying Blind at Team Scale
Even if each developer is saving individually, you have a larger problem: no visibility.
- Finance asks for a breakdown: you have Anthropic's invoice total and nothing else. Which project? Which developer? Which model? Unknown.
- One developer runs a runaway agent: you find out when the invoice arrives. No alerts, no cap, no circuit breaker.
- Budget enforcement is manual: you can set account-level limits in the Anthropic console, but they're blunt. One developer blowing their budget can cut off everyone else.
- Onboarding is inconsistent: some developers configure their own proxy, some don't. Optimization quality varies. You have no idea what anyone is actually using.
- Audit trail is nonexistent: if something goes wrong, you can't trace which session, which developer, or which request caused it.
This isn't just a cost problem. It's a visibility problem, and at team scale the two are inseparable.
How Team Mode Works
Team mode is one shared proxy instance, typically running on a server or a dedicated machine on your network, that all developers route through. You install Prefex once. Every developer points their Claude Code at it with a single setting change.
// Each developer's ~/.claude/settings.json
{ "env": { "ANTHROPIC_BASE_URL": "http://your-proxy:8019" } }
From there, every Claude Code request from every developer flows through the shared proxy. The proxy handles auth, optimization, routing, and logging, then forwards to Anthropic with the right API key.
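For teams that script the settings change, a minimal Python sketch of the merge is shown here. It only relies on the `env.ANTHROPIC_BASE_URL` key shown above; the function names `point_at_proxy` and `patch_settings_file` are hypothetical and are not part of Prefex or Claude Code.

```python
import json
from pathlib import Path


def point_at_proxy(settings: dict, proxy_url: str) -> dict:
    """Return a copy of the settings with ANTHROPIC_BASE_URL set, keeping other keys."""
    updated = dict(settings)
    env = dict(updated.get("env", {}))
    env["ANTHROPIC_BASE_URL"] = proxy_url
    updated["env"] = env
    return updated


def patch_settings_file(path: Path, proxy_url: str) -> None:
    """Read, update, and rewrite a settings file (created if missing)."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(point_at_proxy(current, proxy_url), indent=2) + "\n")


# Example (commented out because it would modify a real file):
# patch_settings_file(Path.home() / ".claude" / "settings.json", "http://your-proxy:8019")
```

Merging into the existing `env` object, rather than overwriting the file, keeps any other settings a developer already has.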
What the shared proxy adds vs. individual installs
How developer auth works
Each developer gets a unique token when they join via invite. They set it as an env var once. The proxy uses it to tag every request. No VPN is required, no credentials are shared, and tokens can be revoked instantly.
# Developer sets their token once (added by prefex join automatically)
export PREFEX_DEV_TOKEN="dev-abc123..."
If the proxy is unreachable or returns an error, Claude Code falls back to Anthropic directly. Developer work is never blocked by proxy downtime.
Admin Setup (5 Minutes)
The proxy needs a home: a machine or server accessible to your team. It can be a Linux server, an EC2 instance, a Mac mini in your office, or a developer's machine if the team is co-located. It does not need a public IP if all developers are on the same network.
Step 1: Install on the proxy machine
curl -fsSL https://promptforce.ai/install.sh | bash
Step 2: Configure team mode
Edit ~/.prefex/config.yaml on the proxy machine:
prefex:
  upstream:
    anthropic_api_key: "${ANTHROPIC_API_KEY}"  # the shared API key
  team:
    enabled: true
    admin_token: "your-secret-admin-token"     # keep this private
    public_url: "http://your-proxy-host:8019"  # how devs reach the proxy
Step 3: Start the proxy
prefex start
Step 4: Open the team dashboard
From the proxy machine (or any machine with access to it):
open http://localhost:8019/dashboard/team
Enter your admin token. The dashboard is your control plane: members, budgets, the audit log, SSO, and the team knowledge view are all here.
Team mode requires a Pro or Team license. Run prefex renew or visit promptforce.ai/renew to activate one. The install script walks you through it automatically.
Onboarding Developers
Developers don't install a server; they just point their existing Claude Code at yours. The process takes under two minutes.
Admin side: generate an invite
In the team dashboard, enter the developer's email (for your records only; no email is sent automatically), click Generate Invite, and copy the join command:
prefex join abc123def456 --proxy http://your-proxy:8019
Send this command to the developer however you normally communicate (Slack, email, chat). The invite expires after 48 hours by default.
Developer side: run the join command
- Run the join command: this downloads the Prefex binary (if not already installed), authenticates with your proxy, and gets a dev token.
- Prefex patches Claude Code settings: the join command automatically updates ~/.claude/settings.json to route through the shared proxy. No manual editing.
- Done: the next Claude Code request flows through the shared proxy. The developer appears in the team dashboard within seconds.
If you have Google Workspace, Okta, or Azure AD, you can enable SSO so developers sign in with their existing identity instead of a join command. See Section 07.
Roles
Each developer gets a role that controls what they can see in the dashboard:
| Role | Can see | Can do |
|---|---|---|
| admin | Everything | All actions including config, member management, budget changes |
| manager | All spend, all members | Set budgets, manage members, generate invites |
| analyst | All spend, audit log | Read-only on all team data |
| developer | Own spend only | None |
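The role table above can be read as a simple permission lookup. The following is a sketch only: the action names (`edit_config`, `manage_members`, and so on) are hypothetical labels for the capabilities in the table, not identifiers from Prefex.

```python
# Hypothetical action names; the role-to-capability mapping follows the roles table.
ROLE_ACTIONS = {
    "admin": {"edit_config", "manage_members", "set_budgets", "generate_invites"},
    "manager": {"set_budgets", "manage_members", "generate_invites"},
    "analyst": set(),    # read-only on all team data
    "developer": set(),  # can only view own spend
}

ROLE_VISIBILITY = {
    "admin": "everything",
    "manager": "all_spend_and_members",
    "analyst": "all_spend_and_audit_log",
    "developer": "own_spend",
}


def can(role: str, action: str) -> bool:
    """True if the role may perform the action; unknown roles get nothing."""
    return action in ROLE_ACTIONS.get(role, set())
```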
Managing Spend & Access
Budget caps
Set a monthly cap per developer, a team-wide cap, or both. When a developer reaches their cap, requests return 429 until the period resets. If both are set, the team cap fires when aggregate spend crosses it, regardless of individual caps.
# Via API (or use the dashboard UI)
# Per-developer cap: $50/month
POST /api/team/budgets
{ "dev_id": "dev-abc123", "period": "monthly", "limit_usd": 50 }
# Team-wide cap: $500/month
POST /api/team/budget/team
{ "period": "monthly", "limit_usd": 500 }
A warning threshold (default 80%) fires a log entry and optional Slack notification before the hard cap, giving the developer a chance to wrap up before they're blocked.
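The cap and warning behavior described in this section can be sketched as a single decision function. This is an illustration under stated assumptions, not Prefex's implementation: the names `BudgetDecision` and `check_budget` are hypothetical, and applying the 80% warning to the team cap as well as the per-developer cap is an assumption.

```python
from dataclasses import dataclass


@dataclass
class BudgetDecision:
    status: int  # 200 = forward the request, 429 = a cap has been reached
    warn: bool   # True once spend crosses the warning threshold


def check_budget(dev_spend, dev_cap, team_spend, team_cap, warn_ratio=0.8):
    """Evaluate per-developer and team-wide caps (None means no cap is set)."""
    def reached(spend, cap):
        return cap is not None and spend >= cap

    def warning(spend, cap):
        return cap is not None and spend >= cap * warn_ratio

    # Either cap blocks the request, regardless of the other.
    if reached(dev_spend, dev_cap) or reached(team_spend, team_cap):
        return BudgetDecision(status=429, warn=True)
    return BudgetDecision(
        status=200,
        warn=warning(dev_spend, dev_cap) or warning(team_spend, team_cap),
    )
```

With a $50 developer cap and a $500 team cap, a developer at $40 is still served but flagged for a warning, while a developer at $10 is blocked if the team as a whole has already spent $500.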
Rate limits
Separate from cost caps, rate limits control velocity. They are useful for preventing runaway agents from hammering the API in a tight loop even when they're within budget.
# 200 requests/day, 500k tokens/day per developer
POST /api/team/rate-limits
{ "dev_id": "dev-abc123", "max_requests": 200, "max_tokens": 500000 }
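One way to picture a per-developer daily limit is a pair of counters that reset when the date changes. The sketch below assumes daily windows (as in the comment above) and rejects a request that would push either counter past its limit; the class name `DailyLimiter` and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DailyLimiter:
    max_requests: int
    max_tokens: int
    day: date = field(default_factory=date.today)
    requests: int = 0
    tokens: int = 0

    def allow(self, tokens_needed: int, today: date) -> bool:
        """Count the request if it fits within today's limits; reject otherwise."""
        if today != self.day:  # new day: reset both counters
            self.day, self.requests, self.tokens = today, 0, 0
        if self.requests + 1 > self.max_requests:
            return False
        if self.tokens + tokens_needed > self.max_tokens:
            return False
        self.requests += 1
        self.tokens += tokens_needed
        return True
```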
Billing modes
For each developer you can choose who pays for their API calls:
| Mode | Who pays | When to use |
|---|---|---|
| proxy | Team API key | Default, centralized billing, full visibility |
| passthrough | Developer's own API key | Contractors, personal projects, developers who prefer to pay themselves |
| proxy_then_own | Team key until cap, then developer's own key | Shared cost model, team covers baseline, developer pays for excess |
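The three billing modes amount to a choice of which API key is used upstream. A minimal sketch, assuming the mode names from the table and a hypothetical `select_api_key` helper:

```python
def select_api_key(mode, team_key, dev_key, cap_reached=False):
    """Pick the upstream API key for a request according to the billing mode."""
    if mode == "proxy":
        return team_key
    if mode == "passthrough":
        return dev_key
    if mode == "proxy_then_own":
        # Team pays until the developer's cap is reached, then the developer pays.
        return dev_key if cap_reached else team_key
    raise ValueError(f"unknown billing mode: {mode}")
```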
Revoking access
From the dashboard, click Revoke on any member. The token is invalidated immediately, without waiting for the 30-second cache window. The next request from that developer returns 401.
SSO & Compliance
For teams with an identity provider (Google Workspace, Okta, Azure AD, or any OIDC-compliant IdP), you can replace the invite/token flow entirely with SSO login.
How it works
Developers visit the team dashboard and click Sign in with SSO. They authenticate through your IdP. The proxy verifies the ID token, creates a member record automatically, and issues a short-lived session cookie. No token to manage, no invite to generate for returning members.
Configuring SSO
In the team dashboard under Single Sign-On (OIDC):
- Set Issuer URL: your IdP's discovery endpoint (e.g. https://accounts.google.com)
- Set Client ID and Client Secret from your IdP's app registration
- Set Redirect URL: http://your-proxy:8019/auth/callback
- Set Admin Emails: these addresses get admin role on first login
- Optionally set Allow Domains: any email from yourcompany.com gets developer role automatically
- Click Save, then Test Connection to verify the IdP discovery succeeds
| Provider | Issuer URL |
|---|---|
| Google Workspace | https://accounts.google.com |
| Okta | https://your-org.okta.com |
| Azure AD | https://login.microsoftonline.com/{tenant}/v2.0 |
| Any OIDC IdP | The IdP's /.well-known/openid-configuration base URL |
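The Admin Emails and Allow Domains settings suggest a simple rule for assigning a role on first login. The sketch below illustrates that rule with a hypothetical `role_on_first_login` function; returning `None` for addresses that match neither list is an assumption, since the behavior for other addresses is not described here.

```python
def role_on_first_login(email, admin_emails, allow_domains):
    """Return the role for a first-time SSO login, or None if not admitted."""
    address = email.strip().lower()
    if address in {a.lower() for a in admin_emails}:
        return "admin"
    # Compare the exact domain so look-alike domains do not match.
    domain = address.rpartition("@")[2]
    if domain in {d.lower() for d in allow_domains}:
        return "developer"
    return None  # assumption: other addresses are not auto-provisioned
```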
Audit log
Every admin action is recorded: who did it, what changed, when. The audit log is queryable by actor, target, and date range from the dashboard or API. Useful for compliance reviews and incident investigation.
# Export audit log for the last 30 days
GET /api/team/audit/export?format=csv&start_ts=1714521600
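The `start_ts` value in the export request appears to be a Unix timestamp in seconds (1714521600 is 2024-05-01 00:00 UTC). Assuming that, and assuming the endpoint is served from the proxy's base URL, a small helper can build the export URL for a rolling window. The function name `audit_export_url` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def audit_export_url(base_url, now, days=30, fmt="csv"):
    """Build the export URL with start_ts set to `days` before `now`."""
    start = now - timedelta(days=days)
    # Assumption: start_ts is a Unix timestamp in whole seconds.
    query = urlencode({"format": fmt, "start_ts": int(start.timestamp())})
    return f"{base_url}/api/team/audit/export?{query}"


print(audit_export_url("http://your-proxy:8019", datetime.now(timezone.utc)))
```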
Team Knowledge
Every Claude Code session your team runs produces observations that are worth keeping: API quirks, architecture decisions, debugging patterns, gotchas that took two hours to find. Normally these live in a transcript nobody re-reads.
Team mode extracts these automatically. After each session, a background job pulls decisions, recurring patterns, and notable activity from the conversation (PII-scrubbed before anything else touches it) and stores them in a shared knowledge base.
The knowledge tab
In the team dashboard under Team Knowledge, you get a live view of what your team has been learning, filterable by type, developer, and project.
| Type | What it captures | Example |
|---|---|---|
| Decision | Architectural choices made | "Used SQLite for team_budgets, avoids network dependency for team features" |
| Pattern | Recurring approaches observed | "Fail-open pattern: internal errors log and forward to upstream unchanged" |
| Activity | Files touched, tasks completed | "Updated handler.go, recorder.go, added billing_source field to request log" |
Every entry shows which developer it came from and which project it belongs to. A new team member can read three months of team knowledge in minutes and skip weeks of pairing to pick up context that's normally implicit.
Knowledge accumulates from work your team is already doing. No developer has to write anything down. The more your team uses Claude Code, the richer the knowledge base gets.
Secrets Vault: Credential Security for Agents
AI agents that call external APIs need credentials. The naive approach, putting real API keys in prompts or environment variables, means secrets appear in conversation logs, request bodies, and wherever your team's AI sessions are stored.
The secrets vault gives every agent a sealed token instead of a real credential. The proxy substitutes the real value in-flight, so agents and logs only ever see pfx_sealed_….
How the MITM proxy works
The secrets proxy runs on port 8020 alongside the main proxy. It holds a local CA certificate that you install once in the machine's trust store. When an agent makes an HTTPS request to an allowlisted host (e.g. api.github.com), the proxy:
- Terminates TLS using the local CA, so the agent's request is decrypted
- Scans headers and body for pfx_sealed_… tokens
- Replaces each token with its plaintext value from the encrypted vault
- Re-encrypts and forwards to the real upstream host
Non-allowlisted hosts are blocked at CONNECT with 403, so the proxy never sees their traffic at all.
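The scan-and-replace step can be illustrated in a few lines of Python. This is a sketch of the idea only, not the proxy's code: the token pattern (`pfx_sealed_` followed by hex characters) is an assumption based on the example token in this guide, the vault is modeled as a plain dictionary, and leaving unknown tokens untouched is also an assumption.

```python
import re

# Assumption: sealed tokens look like "pfx_sealed_" followed by hex characters.
SEALED = re.compile(r"pfx_sealed_[0-9a-f]+")


def host_allowed(host, allowlist):
    """Only allowlisted hosts are intercepted; others would be refused with 403."""
    return host.lower() in allowlist


def substitute(text, vault):
    """Replace every known sealed token in `text` with its plaintext value."""
    def swap(match):
        # Unknown tokens are left untouched rather than guessed at.
        return vault.get(match.group(0), match.group(0))
    return SEALED.sub(swap, text)
```

The same `substitute` call would be applied to each header value and to the request body before the request is re-encrypted and forwarded.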
Admin setup
Enable secrets in ~/.prefex/config.yaml:
prefex:
  secrets:
    enabled: true
    proxy_port: 8020
    allowlist:
      - api.anthropic.com
      - api.openai.com
      - api.github.com  # add any host your agents need
Then install the CA once on the machine running agents:
prefex secret install-ca # prompts for confirmation, then runs sudo security/update-ca-certificates
Managing secrets from the dashboard
In the team dashboard under Secrets Vault:
- Add a secret (name + value) → dashboard shows the sealed token to copy
- Click any sealed token to copy it to clipboard
- Delete secrets by name; the token becomes invalid immediately
- Manage the allowlist: add or remove interception hosts without restarting
- Download the CA cert or get per-platform install instructions
Developer workflow
Once an admin has stored credentials, developers use sealed tokens in prompts or CLAUDE.md:
# In CLAUDE.md or system prompt:
Use GitHub API with Authorization header: Bearer pfx_sealed_a3f2c8b1…
# Set HTTPS proxy so agents route through the secrets proxy:
export HTTPS_PROXY=http://127.0.0.1:8020
CLI reference
prefex secret add <name> <value>   # encrypt + store; prints sealed token
prefex secret list                 # show names + tokens (no plaintext)
prefex secret remove <name>        # delete by name
prefex secret show-ca              # print CA cert PEM to stdout
prefex secret install-ca           # install CA in system trust store
| Without secrets vault | With secrets vault |
|---|---|
| Real API keys in prompts and logs | Sealed tokens only, keys never in logs |
| Key rotation requires updating all prompts | Update vault once, all agents pick up immediately |
| Revoke = rotate the key everywhere | Delete secret, token instantly invalid |
| Audit: unknown who used which key | Last-used timestamp per secret |
The vault encryption key lives at ~/.prefex/keys/vault.key. Back it up securely: all stored secrets are unrecoverable without it. Only the machine running the proxy needs this file.
The Numbers at Scale
10-developer team, one month
| Metric | Without Prefex | With Prefex Team |
|---|---|---|
| Monthly API cost | ~$1,800 | ~$630 |
| Cost visibility | Invoice total only | Per-developer, per-project, per-model |
| Budget overruns | Discovered at month end | Blocked at threshold, alerted at 80% |
| Runaway agent incidents | No detection | Rate-limited before they compound |
| Onboarding a new developer | Self-configure, inconsistent | One join command, consistent optimization |
| Audit trail | None | Full log of all admin actions |
| Annual savings | | ~$14,000 |
Getting started
The fastest path: install on any machine your team can reach, enable team mode in config, and generate invites for your developers. Total time: under 10 minutes for the admin, under 2 minutes per developer.
- Install on your proxy machine: curl -fsSL https://promptforce.ai/install.sh | bash
- Enable team mode in ~/.prefex/config.yaml: set team.enabled: true, team.admin_token, and team.public_url
- Start the proxy: prefex start
- Open the team dashboard: open http://localhost:8019/dashboard/team, then enter your admin token
- Invite your first developer: enter their email label, click Generate Invite, and send them the prefex join command
Open an issue at github.com/PromptForcePrime/prefex or check the solo guide for the individual-user fundamentals that team mode builds on.