1. Create the guardrail under the workspace
Create guardrails at the workspace level first so the same policy can be reused across multiple keys. Recommended first pass:- add model and provider restrictions before content checks
- keep the first rollout narrow to one key or one environment
- name the guardrail after the policy outcome, not the team name
2. Attach it to a target API key
From the API key settings page, attach the guardrail to the key that should receive enforcement. Use this sequence:- start with a staging or low-risk key
- attach one guardrail at a time when possible
- confirm the key detail dialog shows the applied guardrail
3. Set budget policy and routing restrictions
Common combination:- daily cost limit
- provider blocklist for low-trust providers
- model allowlist for the exact production models that should remain routable
- which providers stay allowed
- which models become blocked
- whether
Only allowis narrower than intended
4. Add prompt-injection and sensitive-info rules
For a practical rollout:- start prompt injection with
flagorredact - move to
blockonly after reviewing false positives - use deterministic sensitive-info rules first:
- phone
- SSN
- credit card
- IP address
5. Test the policy before broad rollout
Use the guardrail preview with:- one benign input
- one clearly disallowed input
- one realistic production-like prompt
- correct redaction text
- correct blocked/allowed behavior
- no accidental matches on unrelated content
6. Verify enforcement in activity and logs
After sending real requests through the guarded key, verify:- the Guardrail Enforcement panel on the activity page
- request detail dialogs for blocked or redacted requests
- the API key detail dialog for per-key guardrail activity
- blocked counts
- redacted counts
- flagged counts
- the specific guardrail and detector details when enforcement happened
7. Expand rollout safely
When the first key behaves correctly:- attach the same guardrail to more keys
- tighten actions from
flagtoredactorblock - add higher-latency name and address detection only where the latency tradeoff is acceptable