Team / Pro

One Proxy, Whole Team, Full Visibility

Individual Claude Code costs add up fast. At team scale, they compound, and nobody can see where the money goes. Here's how to fix both problems at once.

01

The Team Cost Problem

A single developer running Claude Code daily spends $100–600/month. That's already uncomfortable. Now multiply it by your team.

| Team size | Monthly without Prefex | Monthly with Prefex | Annual savings (low end) |
| --- | --- | --- | --- |
| 3 developers | $300–900 | $105–315 | ~$2,340/year |
| 8 developers | $800–2,400 | $280–840 | ~$6,240/year |
| 20 developers | $2,000–6,000 | $700–2,100 | ~$15,600/year |
| 50 developers | $5,000–15,000 | $1,750–5,250 | ~$39,000/year |
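
The figures in the table can be reproduced with simple arithmetic. The sketch below assumes what the rows imply rather than a stated Prefex formula: a baseline of $100–300 per developer per month, a bill with Prefex of 35% of that baseline, and annual savings computed from the low end of the range. The function name and parameters are illustrative.

```python
def team_costs(developers, per_dev_low=100, per_dev_high=300, remaining_pct=35):
    """Return (monthly without, monthly with, low-end annual savings) in whole dollars."""
    without = (developers * per_dev_low, developers * per_dev_high)
    # Integer math keeps the dollar figures exact.
    with_prefex = tuple(cost * remaining_pct // 100 for cost in without)
    annual_savings_low = (without[0] - with_prefex[0]) * 12
    return without, with_prefex, annual_savings_low


for size in (3, 8, 20, 50):
    print(size, team_costs(size))
```

For 3 developers this gives $300–900 without Prefex, $105–315 with it, and $2,340 per year saved at the low end.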

The savings come from the same place they do for individuals: repeated context being billed at full price. But at team scale, there's a second problem, one that solo users never face.

The team-specific problem

Every developer on your team is running their own isolated proxy. The same project context, the same CLAUDE.md, and the same tool definitions are cached separately on each machine, with no shared benefit. A cache hit for one developer saves nothing for anyone else. On a ten-developer team, you pay for the same tokens ten times.

02

Flying Blind at Team Scale

Even if each developer is saving individually, you have a larger problem: no visibility.

  • Finance asks for a breakdown: you have Anthropic's invoice total and nothing else. Which project? Which developer? Which model? Unknown.
  • One developer runs a runaway agent: you find out when the invoice arrives. No alerts, no cap, no circuit breaker.
  • Budget enforcement is manual: you can set account-level limits in the Anthropic console, but they're blunt. One developer blowing their budget can cut off everyone else.
  • Onboarding is inconsistent: some developers configure their own proxy, some don't. Optimization quality varies. You have no idea what anyone is actually using.
  • Audit trail is nonexistent: if something goes wrong, you can't trace which session, which developer, or which request caused it.

This isn't only a cost problem. It's a visibility problem too, and at team scale the two are inseparable.

03

How Team Mode Works

Team mode is one shared proxy instance, typically running on a server or a dedicated machine on your network, that all developers route through. You install Prefex once. Every developer points their Claude Code at it with a single setting change.

// Each developer's ~/.claude/settings.json
{ "env": { "ANTHROPIC_BASE_URL": "http://your-proxy:8019" } }

From there, every Claude Code request from every developer flows through the shared proxy. The proxy handles auth, optimization, routing, and logging, then forwards to Anthropic with the right API key.

What the shared proxy adds vs. individual installs

  • Per-developer cost tracking
    Every request is tagged with a developer ID. The dashboard shows spend, cache hit rate, and model usage broken down by person.
  • Budget caps with enforcement
    Set daily, weekly, or monthly spend limits per developer or team-wide. Requests over the cap return 429, so there are no surprise invoices.
  • Rate limiting
    Cap requests-per-minute and tokens-per-day per developer. Runaway agents hit the limit before they hit your bill.
  • Billing mode control
    Per developer: proxy (the team key pays), passthrough (the developer pays with their own key), or proxy_then_own (team key until the cap, then the developer's own key).
  • Audit log
    Every admin action (member add, token revoke, budget change, config update) is logged with timestamp and actor.
  • Team knowledge base
    Decisions, patterns, and observations from every developer's sessions are extracted and surfaced in a shared knowledge view.

How developer auth works

Each developer gets a unique token when they join via invite. They set it as an env var once, and the proxy uses it to tag every request. No VPN is required, no credentials are shared, and tokens can be revoked instantly.

# Developer sets their token once (added by prefex join automatically)
export PREFEX_DEV_TOKEN="dev-abc123..."

Fail-open guarantee

If the proxy is unreachable or returns an error, Claude Code falls back to Anthropic directly. Developer work is never blocked by proxy downtime.
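
As an illustration of the fallback behavior described above, here is a minimal client-side sketch. It is not Prefex's implementation: `send` is a hypothetical callable that performs the HTTP call and returns `(status, body)`, and treating only connection failures and 5xx responses as "proxy errors" is an assumption.

```python
import logging

log = logging.getLogger("failopen")


def send_with_fallback(request, send, proxy_url, direct_url="https://api.anthropic.com"):
    """Try the shared proxy first; on failure, send the same request directly."""
    try:
        status, body = send(proxy_url, request)
        if status < 500:  # assumption: only 5xx counts as a proxy error
            return status, body
        log.warning("proxy returned %s, falling back", status)
    except OSError as exc:  # connection refused, timeout, DNS failure
        log.warning("proxy unreachable (%s), falling back", exc)
    return send(direct_url, request)
```

Under this assumption a deliberate response such as a 429 from a budget cap is returned to the caller as-is, so falling back does not become a way around enforcement.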

04

Admin Setup (5 Minutes)

The proxy needs a home: a machine or server accessible to your team. It can be a Linux server, an EC2 instance, a Mac mini in your office, or a developer's machine if the team is co-located. It does not need a public IP if all developers are on the same network.

Step 1: Install on the proxy machine

curl -fsSL https://promptforce.ai/install.sh | bash

Step 2: Configure team mode

Edit ~/.prefex/config.yaml on the proxy machine:

prefex:
  upstream:
    anthropic_api_key: "${ANTHROPIC_API_KEY}"   # the shared API key

  team:
    enabled: true
    admin_token: "your-secret-admin-token"       # keep this private
    public_url: "http://your-proxy-host:8019"    # how devs reach the proxy

Step 3: Start the proxy

prefex start

Step 4: Open the team dashboard

From the proxy machine (or any machine with access to it):

open http://localhost:8019/dashboard/team

Enter your admin token. The dashboard is your control plane: members, budgets, audit log, SSO, and the team knowledge view are all here.

Pro license required

Team mode requires a Pro or Team license. Run prefex renew or visit promptforce.ai/renew to activate one. The install script walks you through it automatically.

05

Onboarding Developers

Developers don't install a server; they just point their existing Claude Code at yours. The process takes under two minutes.

Admin side: generate an invite

In the team dashboard, enter the developer's email (for your records only; no email is sent automatically), click Generate Invite, and copy the join command:

prefex join abc123def456 --proxy http://your-proxy:8019

Send this command to the developer however you normally communicate: Slack, email, chat. The invite expires after 48 hours by default.

Developer side: run the join command

  • Run the join command
    This downloads the Prefex binary (if not already installed), authenticates with your proxy, and gets a dev token.
  • Prefex patches Claude Code settings
    The join command automatically updates ~/.claude/settings.json to route through the shared proxy. No manual editing.
  • Done
    The next Claude Code request flows through the shared proxy. The developer appears in the team dashboard within seconds.
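
To make the settings patch concrete, the sketch below shows one way a tool could set `ANTHROPIC_BASE_URL` in the contents of `~/.claude/settings.json` while leaving every other key alone. It illustrates the idea only; `patch_settings` is a hypothetical helper, not the code that `prefex join` runs.

```python
import json


def patch_settings(settings_text, proxy_url):
    """Return settings JSON with ANTHROPIC_BASE_URL set and other keys untouched."""
    # An empty or missing file is treated as an empty settings object.
    settings = json.loads(settings_text) if settings_text.strip() else {}
    env = settings.setdefault("env", {})
    env["ANTHROPIC_BASE_URL"] = proxy_url
    return json.dumps(settings, indent=2)


print(patch_settings('{"env": {"FOO": "1"}, "theme": "dark"}', "http://your-proxy:8019"))
```

Merging into the existing `env` object, rather than overwriting the file, preserves whatever the developer had already configured.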

SSO alternative

If you have Google Workspace, Okta, or Azure AD, you can enable SSO so developers sign in with their existing identity instead of a join command. See Section 07.

Roles

Each developer gets a role that controls what they can see in the dashboard:

| Role | Can see | Can do |
| --- | --- | --- |
| admin | Everything | All actions including config, member management, budget changes |
| manager | All spend, all members | Set budgets, manage members, generate invites |
| analyst | All spend, audit log | Read-only on all team data |
| developer | Own spend only | None |
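
The role table maps naturally onto a small permission lookup. The sketch below is one possible encoding; the action names (`config`, `manage_members`, and so on) are hypothetical labels chosen for illustration, not identifiers from Prefex.

```python
# Actions each role may perform, following the "Can do" column.
ROLE_ACTIONS = {
    "admin": {"config", "manage_members", "set_budgets", "generate_invites"},
    "manager": {"manage_members", "set_budgets", "generate_invites"},
    "analyst": set(),    # read-only
    "developer": set(),  # no admin actions
}

# Whose spend each role may view, following the "Can see" column.
ROLE_SPEND_SCOPE = {"admin": "all", "manager": "all", "analyst": "all", "developer": "own"}


def can(role, action):
    """True if the role is allowed to perform the action."""
    return action in ROLE_ACTIONS.get(role, set())


def can_view_spend(role, viewer_id, target_id):
    """True if a viewer with this role may see the target developer's spend."""
    scope = ROLE_SPEND_SCOPE.get(role)
    return scope == "all" or (scope == "own" and viewer_id == target_id)
```

Unknown roles fall through to "no access", which is the safer default for a dashboard.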

06

Managing Spend & Access

Budget caps

Set a monthly cap per developer or a team-wide cap. When a developer reaches their cap, requests return 429 until the period resets. You can set both: the team cap fires when aggregate spend crosses it, regardless of individual caps.

# Via API (or use the dashboard UI)
# Per-developer cap: $50/month
POST /api/team/budgets
{ "dev_id": "dev-abc123", "period": "monthly", "limit_usd": 50 }

# Team-wide cap: $500/month
POST /api/team/budget/team
{ "period": "monthly", "limit_usd": 500 }

A warning threshold (default 80%) fires a log entry and optional Slack notification before the hard cap, giving the developer a chance to wrap up before they're blocked.
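
The cap and warning logic can be sketched as a pair of small functions. This is an illustrative model of the behavior described above, not Prefex's code; the function names and the `"ok"` / `"warn"` / `"blocked"` labels are assumptions.

```python
def budget_status(spent_usd, limit_usd, warn_ratio=0.8):
    """Classify spend against a cap: 'ok', 'warn' (threshold crossed), or 'blocked'."""
    if spent_usd >= limit_usd:
        return "blocked"
    if spent_usd >= limit_usd * warn_ratio:
        return "warn"
    return "ok"


def check_request(dev_spent, dev_limit, team_spent, team_limit):
    """Return 429 if either the developer cap or the team cap is reached, else 200."""
    statuses = (
        budget_status(dev_spent, dev_limit),
        budget_status(team_spent, team_limit),
    )
    return 429 if "blocked" in statuses else 200


print(budget_status(45, 50))            # past the 80% threshold of a $50 cap
print(check_request(10, 50, 500, 500))  # team cap reached
```

Checking both caps on every request is what lets the team cap fire even when the individual developer is well under their own limit.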

Rate limits

Rate limits are separate from cost caps: they control velocity, which helps stop runaway agents from hammering the API in a tight loop even when they're within budget.

# 200 requests/day, 500k tokens/day per developer
POST /api/team/rate-limits
{ "dev_id": "dev-abc123", "max_requests": 200, "max_tokens": 500000 }
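
A per-developer daily limit like the one above can be modeled with a counter keyed by developer and day. The sketch below is a simplified illustration: `DailyLimiter` is a hypothetical class, and it assumes the token count of a request is known (or estimated) before the request is admitted.

```python
from collections import defaultdict
from datetime import date


class DailyLimiter:
    """Track requests and tokens per developer per day."""

    def __init__(self, max_requests, max_tokens):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        # (dev_id, day) -> [requests used, tokens used]
        self.usage = defaultdict(lambda: [0, 0])

    def allow(self, dev_id, tokens, today):
        """Admit the request only if both daily limits would still hold."""
        used = self.usage[(dev_id, today)]
        if used[0] + 1 > self.max_requests or used[1] + tokens > self.max_tokens:
            return False
        used[0] += 1
        used[1] += tokens
        return True


limiter = DailyLimiter(max_requests=200, max_tokens=500_000)
print(limiter.allow("dev-abc123", 1_200, date(2024, 5, 1)))
```

Because the key includes the day, counters reset naturally when the date changes, with no cleanup job needed for correctness.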

Billing modes

For each developer you can choose who pays for their API calls:

| Mode | Who pays | When to use |
| --- | --- | --- |
| proxy | Team API key | Default: centralized billing, full visibility |
| passthrough | Developer's own API key | Contractors, personal projects, developers who prefer to pay themselves |
| proxy_then_own | Team key until cap, then developer's own key | Shared cost model: team covers baseline, developer pays for excess |
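
The three modes reduce to a simple decision about which API key a request is forwarded with. The sketch below captures that decision; `billing_key` and its parameters are hypothetical names used for illustration.

```python
def billing_key(mode, cap_reached, team_key, dev_key):
    """Pick the API key that pays for a request under the given billing mode."""
    if mode == "proxy":
        return team_key
    if mode == "passthrough":
        return dev_key
    if mode == "proxy_then_own":
        # Team key covers the baseline; the developer's key covers the excess.
        return dev_key if cap_reached else team_key
    raise ValueError(f"unknown billing mode: {mode}")


print(billing_key("proxy_then_own", cap_reached=True, team_key="team", dev_key="own"))
```

In `proxy` mode the cap itself is enforced elsewhere (the request is rejected with 429), so this function only answers the "who pays" question.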

Revoking access

From the dashboard, click Revoke on any member. The token is invalidated immediately, not after the 30-second cache window. The next request from that developer returns 401.
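
To see why "immediately" matters, consider a proxy that caches token lookups for 30 seconds. Without explicit invalidation, a revoked token would keep working until its cache entry expired. The sketch below shows one way to avoid that; it assumes the 30-second window is a token-lookup cache, and `TokenCache` is a hypothetical class, not part of Prefex.

```python
import time


class TokenCache:
    """Cache token -> developer ID lookups, with explicit eviction on revoke."""

    def __init__(self, lookup, ttl_seconds=30, clock=time.monotonic):
        self.lookup = lookup  # callable: token -> dev_id, or None if unknown/revoked
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries = {}     # token -> (dev_id, time cached)

    def dev_id_for(self, token):
        now = self.clock()
        entry = self.entries.get(token)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]
        dev_id = self.lookup(token)
        if dev_id is None:
            self.entries.pop(token, None)
        else:
            self.entries[token] = (dev_id, now)
        return dev_id

    def revoke(self, token):
        """Drop the cached entry so the next request re-checks the store."""
        self.entries.pop(token, None)
```

A request whose token resolves to `None` would then be answered with 401.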

07

SSO & Compliance

For teams with an identity provider (Google Workspace, Okta, Azure AD, or any OIDC-compliant IdP), you can replace the invite/token flow entirely with SSO login.

How it works

Developers visit the team dashboard and click Sign in with SSO. They authenticate through your IdP. The proxy verifies the ID token, creates a member record automatically, and issues a short-lived session cookie. No token to manage, no invite to generate for returning members.

Configuring SSO

In the team dashboard under Single Sign-On (OIDC):

  1. Set Issuer URL: your IdP's discovery endpoint (e.g. https://accounts.google.com)
  2. Set Client ID and Client Secret from your IdP's app registration
  3. Set Redirect URL: http://your-proxy:8019/auth/callback
  4. Set Admin Emails: these addresses get the admin role on first login
  5. Optionally set Allow Domains: any email from an allowed domain (e.g. yourcompany.com) gets the developer role automatically
  6. Click Save, then Test Connection to verify that IdP discovery succeeds

| Provider | Issuer URL |
| --- | --- |
| Google Workspace | https://accounts.google.com |
| Okta | https://your-org.okta.com |
| Azure AD | https://login.microsoftonline.com/{tenant}/v2.0 |
| Any OIDC IdP | The IdP's /.well-known/openid-configuration base URL |
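
Two pieces of the SSO configuration are easy to illustrate: how a discovery URL is derived from the issuer, and how the Admin Emails and Allow Domains settings could translate into a role on first login. The helpers below are hypothetical, and returning `None` for an email that matches neither list is an assumption about behavior the guide does not specify.

```python
def discovery_url(issuer):
    """OIDC discovery document location for an issuer URL."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def role_on_first_login(email, admin_emails, allow_domains):
    """Return 'admin', 'developer', or None (no automatic role)."""
    email = email.strip().lower()
    if email in {a.lower() for a in admin_emails}:
        return "admin"
    domain = email.rpartition("@")[2]
    if domain in {d.lower() for d in allow_domains}:
        return "developer"
    return None


print(discovery_url("https://accounts.google.com"))
print(role_on_first_login("dev@yourcompany.com", ["cto@yourcompany.com"], ["yourcompany.com"]))
```

Checking the admin list before the domain list ensures an admin whose address is also in an allowed domain still receives the admin role.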

Audit log

Every admin action is recorded: who did it, what changed, when. The audit log is queryable by actor, target, and date range from the dashboard or API. Useful for compliance reviews and incident investigation.

# Export audit log for the last 30 days
GET /api/team/audit/export?format=csv&start_ts=1714521600
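
The `start_ts` parameter in the example is a Unix timestamp. A small helper can compute it for "the last N days"; `audit_export_path` is a hypothetical function, and only the `format` and `start_ts` parameters shown above are used.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def audit_export_path(now, days=30, fmt="csv"):
    """Build the export path for audit entries from the last `days` days."""
    start = now - timedelta(days=days)
    query = urlencode({"format": fmt, "start_ts": int(start.timestamp())})
    return f"/api/team/audit/export?{query}"


print(audit_export_path(datetime(2024, 5, 31, tzinfo=timezone.utc)))
# → /api/team/audit/export?format=csv&start_ts=1714521600
```

Passing a timezone-aware `datetime` avoids ambiguity about which midnight the timestamp refers to.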

08

Team Knowledge

Every Claude Code session your team runs produces observations that are worth keeping: API quirks, architecture decisions, debugging patterns, gotchas that took two hours to find. Normally these live in a transcript nobody re-reads.

Team mode extracts these automatically. After each session, a background job pulls decisions, recurring patterns, and notable activity from the conversation, already PII-scrubbed before anything else touches it, and stores them in a shared knowledge base.

The knowledge tab

In the team dashboard under Team Knowledge, you get a live view of what your team has been learning, filterable by type, developer, and project.

| Type | What it captures | Example |
| --- | --- | --- |
| Decision | Architectural choices made | "Used SQLite for team_budgets, avoids network dependency for team features" |
| Pattern | Recurring approaches observed | "Fail-open pattern: internal errors log and forward to upstream unchanged" |
| Activity | Files touched, tasks completed | "Updated handler.go, recorder.go, added billing_source field to request log" |

Every entry shows which developer it came from and which project it belongs to. A new team member can read three months of team knowledge in minutes and skip weeks of pairing to pick up context that's normally implicit.

Zero maintenance burden

Knowledge accumulates from work your team is already doing. No developer has to write anything down. The more your team uses Claude Code, the richer the knowledge base gets.

09

Secrets Vault: Credential Security for Agents

AI agents that call external APIs need credentials. The naive approach, putting real API keys in prompts or environment variables, means secrets appear in conversation logs, request bodies, and wherever your team's AI sessions are stored.

The secrets vault gives every agent a sealed token instead of a real credential. The proxy substitutes the real value in-flight, so agents and logs only ever see pfx_sealed_….

How the MITM proxy works

The proxy runs on port 8020 alongside the main proxy. It holds a local CA certificate that you install once in the machine's trust store. When an agent makes an HTTPS request to an allowlisted host (e.g. api.github.com), the proxy:

  1. Terminates TLS using the local CA, so the agent's request is decrypted
  2. Scans headers and body for pfx_sealed_… tokens
  3. Replaces each token with its plaintext value from the encrypted vault
  4. Re-encrypts and forwards to the real upstream host

Non-allowlisted hosts are blocked at CONNECT with 403; the proxy never sees their traffic at all.
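
The allowlist check and the token substitution in steps 2 and 3 can be sketched in a few lines. This is a simplified model that leaves out TLS handling entirely. The hex-suffix token pattern, the function names, and the choice to pass unknown tokens through unchanged are all assumptions.

```python
import re

# Assumption: sealed tokens are "pfx_sealed_" followed by hex characters.
SEALED = re.compile(r"pfx_sealed_[0-9a-f]+")


def handle_connect(host, allowlist):
    """Status for a CONNECT request: 200 if the host is allowlisted, else 403."""
    return 200 if host in allowlist else 403


def unseal(text, vault):
    """Replace each sealed token in headers or body with its stored plaintext."""
    def swap(match):
        token = match.group(0)
        # Tokens not in the vault (e.g. deleted secrets) are left untouched.
        return vault.get(token, token)
    return SEALED.sub(swap, text)


vault = {"pfx_sealed_a3f2c8b1": "real-github-token"}
print(handle_connect("api.github.com", {"api.github.com"}))
print(unseal("Authorization: Bearer pfx_sealed_a3f2c8b1", vault))
```

Because substitution happens only in the outbound request, the agent's prompt, transcript, and logs still contain the sealed token rather than the real credential.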

Admin setup

Enable secrets in ~/.prefex/config.yaml:

prefex:
  secrets:
    enabled: true
    proxy_port: 8020
    allowlist:
      - api.anthropic.com
      - api.openai.com
      - api.github.com          # add any host your agents need

Then install the CA once on the machine running agents:

prefex secret install-ca   # prompts for confirmation, then runs sudo security/update-ca-certificates

Managing secrets from the dashboard

In the team dashboard under Secrets Vault:

  • Add a secret (name + value) → dashboard shows the sealed token to copy
  • Click any sealed token to copy it to clipboard
  • Delete secrets by name; the token becomes invalid immediately
  • Manage the allowlist: add or remove interception hosts without restarting
  • Download the CA cert or get per-platform install instructions

Developer workflow

Once an admin has stored credentials, developers use sealed tokens in prompts or CLAUDE.md:

# In CLAUDE.md or system prompt:
Use GitHub API with Authorization header: Bearer pfx_sealed_a3f2c8b1…

# Set HTTPS proxy so agents route through the secrets proxy:
export HTTPS_PROXY=http://127.0.0.1:8020

CLI reference

prefex secret add <name> <value>     # encrypt + store; prints sealed token
prefex secret list                    # show names + tokens (no plaintext)
prefex secret remove <name>           # delete by name
prefex secret show-ca                 # print CA cert PEM to stdout
prefex secret install-ca              # install CA in system trust store

| Without secrets vault | With secrets vault |
| --- | --- |
| Real API keys in prompts and logs | Sealed tokens only; keys never in logs |
| Key rotation requires updating all prompts | Update vault once; all agents pick up immediately |
| Revoke = rotate the key everywhere | Delete secret; token instantly invalid |
| Audit: unknown who used which key | Last-used timestamp per secret |

Vault key backup

The vault encryption key lives at ~/.prefex/keys/vault.key. Back it up securely: all stored secrets are unrecoverable without it. Only the machine running the proxy needs this file.

10

The Numbers at Scale

10-developer team, one month

| Metric | Without Prefex | With Prefex Team |
| --- | --- | --- |
| Monthly API cost | ~$1,800 | ~$630 |
| Cost visibility | Invoice total only | Per-developer, per-project, per-model |
| Budget overruns | Discovered at month end | Alerted at 80%, blocked at the cap |
| Runaway agent incidents | No detection | Rate-limited before they compound |
| Onboarding a new developer | Self-configure, inconsistent | One join command, consistent optimization |
| Audit trail | None | Full log of all admin actions |

Annual savings: ~$14,000

Getting started

The fastest path: install on any machine your team can reach, enable team mode in config, and generate invites for your developers. Total time: under 10 minutes for the admin, under 2 minutes per developer.

  1. Install on your proxy machine
    curl -fsSL https://promptforce.ai/install.sh | bash
  2. Enable team mode in ~/.prefex/config.yaml
    Set team.enabled: true, team.admin_token, and team.public_url
  3. Start the proxy
    prefex start
  4. Open the team dashboard
    open http://localhost:8019/dashboard/team, then enter your admin token
  5. Invite your first developer
    Enter their email label, click Generate Invite, send them the prefex join command

Questions?

Open an issue at github.com/PromptForcePrime/prefex or check the solo guide for the individual-user fundamentals that team mode builds on.