Checklist: Securing AI Agents Before They Reach Production – With Auth0

A practical checklist for agent identity, tool permissions, scoped tokens, approval gates, RAG access, memory controls, and audit evidence.

AI agents call APIs, use tools, retrieve documents, trigger workflows, and act for users. That gives them a different risk profile from a chatbot.

An AI assistant with weak controls may produce a poor answer.

An AI agent with weak controls may send an email, read restricted documents, delete data, approve a transaction, or chain several tools before a human reviews the action.

Auth0’s guidance focuses on five agent risks to address before release: over-privileged tools, unscoped third-party access, missing human approval for high-impact actions, poisonable memory, and cascading failures. It maps those risks to four controls: OpenFGA, Token Vault, asynchronous authorization, and authorization at the RAG retrieval layer.

OWASP’s Top 10 for Agentic Applications gives the wider risk frame, including Tool Misuse, Identity and Privilege Abuse, Memory & Context Poisoning, Human-Agent Trust Exploitation, and Cascading Failures.

CISOs, CTOs, IAM leaders, and security architects need answers to four questions: What can the agent access? What can it do? When does it need approval? How is every action traced?

1. Give every agent a governed identity

Every production agent needs a defined identity before it touches tools, APIs, or data.

OWASP’s report describes Identity and Privilege Abuse as the risk that arises when agents use delegation chains, role inheritance, cached credentials, or agent context in ways that can escalate access or bypass controls. The report also states that agents without distinct, governed identities create an attribution gap that makes least privilege difficult.

At minimum, each agent needs:

A unique identity.
A named owner.
A clear business purpose.
Approved actions.
A lifecycle process for creation, review, change, and removal.
Logs that connect actions to the agent, user, task, tool, and system.

If your team cannot identify who owns the agent, why it has access, and what it can do, it should stay out of production.
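The minimum record above can be sketched as a small registry entry plus a release check. This is an illustration only: the `AgentIdentity` type, its field names, and `production_blockers` are assumptions made for this example, not part of Auth0 or OWASP tooling.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AgentIdentity:
    """Hypothetical registry record for one production agent."""
    agent_id: str                     # unique identity
    owner: str                        # named, accountable person or team
    purpose: str                      # business reason the agent exists
    approved_actions: frozenset[str]  # actions the agent may ever request
    next_review: date                 # lifecycle: date of the next access review


def production_blockers(agent: AgentIdentity, today: date) -> list[str]:
    """Return the reasons this agent should stay out of production (empty = ready)."""
    blockers = []
    if not agent.agent_id.strip():
        blockers.append("missing unique identity")
    if not agent.owner.strip():
        blockers.append("no named owner")
    if not agent.purpose.strip():
        blockers.append("no business purpose")
    if not agent.approved_actions:
        blockers.append("no approved actions defined")
    if agent.next_review < today:
        blockers.append("access review overdue")
    return blockers
```

A release pipeline could refuse to deploy any agent for which this check returns a non-empty list.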

2. Start with zero tool permissions

Tool access should be granted by task.

Auth0 points to OpenFGA task-based authorization as a way to keep agents at zero permissions by default. The application grants only the permissions needed for the task, then removes them when the task ends.

This matters because over-privileged tools turn normal agent activity into risk. An email summarizer should not delete emails. A calendar assistant should not inherit access to every connected service. A support agent should not issue refunds unless that action is explicitly allowed.
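The grant-then-remove pattern can be sketched with an in-memory store. This is a stand-in for illustration and does not use the OpenFGA API; the `TaskGrants` class, the permission strings, and the default five-minute lifetime are all assumptions.

```python
import time
from contextlib import contextmanager


class TaskGrants:
    """In-memory stand-in for an authorization store (hypothetical, not OpenFGA)."""

    def __init__(self):
        # (agent_id, permission) -> expiry as a monotonic timestamp
        self._grants = {}

    def allowed(self, agent_id, permission):
        """An agent holds a permission only while an unexpired grant exists."""
        expiry = self._grants.get((agent_id, permission))
        return expiry is not None and time.monotonic() < expiry

    @contextmanager
    def for_task(self, agent_id, permissions, ttl_seconds=300):
        """Grant permissions for one task, then revoke them even if the task fails."""
        expiry = time.monotonic() + ttl_seconds
        for permission in permissions:
            self._grants[(agent_id, permission)] = expiry
        try:
            yield
        finally:
            for permission in permissions:
                self._grants.pop((agent_id, permission), None)
```

Outside a `with grants.for_task(...)` block the agent holds nothing, so an email summarizer granted `email:read` for one task never holds `email:delete`. The sketch does not handle two overlapping tasks that grant the same permission; a real authorization store would need to track grants per task.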

Before production, define:

Which tools the agent can call.
Which resources each tool can reach.
Which actions are read-only, write, delete, send, approve, publish, or transfer.
Which tasks grant temporary access.
When each permission expires.

OWASP recommends per-tool least privilege, action-level authentication, approval for high-impact actions, policy checks before execution, short-lived credentials, and logging for tool invocations.

Treat this as an authorization model, not a developer convention.

3. Replace broad delegated tokens with scoped access

Third-party integrations can quickly create excessive access.

When an agent acts for a user in Google Calendar, Slack, Microsoft 365, Salesforce, or another external service, the fast implementation path is often the riskiest: giving the agent the user’s full token.

Auth0’s Token Vault pattern gives the agent a scoped token at call time. In the example, the agent receives access to Google Calendar events without managing the user’s credentials directly.

Before production, review every third-party connection:

Which user is the agent acting for?
Which external service is accessed?
Which OAuth scopes are requested?
Does the task need read access, write access, or both?
How long does the token last?
Can access be revoked immediately?
Are token requests and tool calls logged?
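One way to keep the scope question honest is an allowlist that ties each task to the narrowest scopes it needs and refuses anything wider. The sketch below does not call Token Vault or any Auth0 API; the task names and the `scopes_for` helper are hypothetical, and only the Google Calendar scope URLs are real identifiers.

```python
# Hypothetical task names mapped to the narrowest Google Calendar scopes they need.
TASK_SCOPES = {
    "summarise_schedule": {"https://www.googleapis.com/auth/calendar.events.readonly"},
    "book_meeting": {"https://www.googleapis.com/auth/calendar.events"},
}


def scopes_for(task: str, requested: set[str]) -> set[str]:
    """Return the scopes to request for a task, refusing anything beyond its allowlist."""
    allowed = TASK_SCOPES.get(task)
    if allowed is None:
        raise PermissionError(f"task {task!r} has no approved scopes")
    excess = requested - allowed
    if excess:
        raise PermissionError(f"scopes not approved for {task!r}: {sorted(excess)}")
    return requested
```

With this in front of the token request, a read-only task that asks for the full `https://www.googleapis.com/auth/calendar` scope fails before any token is issued.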

A production agent should never inherit broad user access because it was easier to build.

4. Add human approval for high-impact actions

Some actions need a human decision before execution.

Auth0 describes asynchronous authorization through CIBA. The agent initiates an authorization request, the user receives a specific approval message on a trusted device, and the protected tool runs only after approval.

Use this pattern for actions such as:

Sending external communications.
Changing customer records.
Moving money.
Granting access.
Deleting data.
Publishing content.
Changing configurations.
Running privileged administration tasks.

The approval request should show the action, resource, requesting user, agent, and business context. It should also create audit evidence.

Keep approval risk-based. Low-risk read actions can run automatically. Sensitive actions need policy checks. High-impact actions need explicit approval.
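The three tiers can be sketched as a lookup that decides which gate an action must pass. The action names and the `may_execute` helper are assumptions for illustration; in a real deployment the `approved` flag would come from an approval flow such as CIBA rather than a function argument.

```python
from enum import Enum


class Gate(Enum):
    AUTO = "run automatically"
    POLICY = "run only if a policy check passes"
    APPROVAL = "run only after explicit human approval"


# Hypothetical action names; anything unlisted falls through to the strictest gate.
ACTION_GATES = {
    "calendar.read": Gate.AUTO,
    "crm.update_record": Gate.POLICY,
    "payments.transfer": Gate.APPROVAL,
    "data.delete": Gate.APPROVAL,
}


def may_execute(action: str, policy_ok: bool = False, approved: bool = False) -> bool:
    """Decide whether an action may run given its gate and the evidence supplied."""
    gate = ACTION_GATES.get(action, Gate.APPROVAL)
    if gate is Gate.AUTO:
        return True
    if gate is Gate.POLICY:
        return policy_ok
    return approved
```

Defaulting unknown actions to `Gate.APPROVAL` means a newly added tool cannot run unattended until someone classifies it.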

5. Authorize RAG retrieval before the agent sees content

RAG can expose sensitive information when retrieval lacks access checks.

Auth0 recommends authorization at retrieval time. Its example uses an OpenFGA check so vector search results are filtered before the agent sees the documents.

If a user cannot access a document in SharePoint, Confluence, Google Drive, or a regulated repository, the agent should not retrieve it through RAG.
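A minimal sketch of retrieval-time filtering follows. It assumes the vector search has already returned candidate hits and that some authorization check is available as a callable; the `Hit` type, the toy `ACL` set, and `can_view` are hypothetical stand-ins for an authorization service such as OpenFGA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    doc_id: str
    score: float
    text: str


def authorized_hits(user_id, hits, can_view):
    """Drop any search hit the user may not view, before it reaches the prompt."""
    return [hit for hit in hits if can_view(user_id, hit.doc_id)]


# Toy access-control list standing in for a real authorization service.
ACL = {("alice", "handbook"), ("alice", "q3-forecast"), ("bob", "handbook")}


def can_view(user_id, doc_id):
    return (user_id, doc_id) in ACL
```

The filter runs on behalf of the requesting user, not the agent, so the agent never sees content that the user could not open directly.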

Before production, confirm:

Which repositories the agent can search.
Which document permissions apply at retrieval time.
How metadata maps to authorization rules.
Whether access is checked by user, group, role, or task.
How expired or restricted documents are removed.
How retrieval is logged.

This is a common gap. Teams may secure the application interface while leaving the retrieval pipeline too open.

6. Separate memory by user, task, and sensitivity

Agent memory needs clear limits.

OWASP lists Memory & Context Poisoning as a top risk for agentic applications. It describes persistent corruption of agent memory and retrievable context, including RAG stores, summaries, embeddings, and stored conversation history.

Memory can also create privilege problems. OWASP calls out memory-based privilege retention and data leakage when agents cache credentials, keys, or retrieved data and reuse them later.

Before production, define:

What the agent can store.
How long memory lasts.
Which data types are excluded.
How memory is separated by user, tenant, task, and business unit.
How poisoned or outdated memory is removed.
Which logs show when memory influenced an action.

Regulated organisations should treat agent memory as a governed data store.
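The definitions above can be sketched as a store that partitions memory by tenant, user, and task, expires entries, and refuses excluded data types. The `PartitionedMemory` class, the data-type labels, and the one-hour default lifetime are assumptions for illustration, not a recommendation from OWASP or Auth0.

```python
import time


class PartitionedMemory:
    """Hypothetical agent memory keyed by (tenant, user, task) with expiry."""

    EXCLUDED = {"credential", "api_key", "payment_card"}  # data types never stored

    def __init__(self, ttl_seconds=3600):
        self._ttl = ttl_seconds
        self._items = {}  # (tenant, user, task) -> list of (expiry, data_type, value)

    def remember(self, tenant, user, task, data_type, value, now=None):
        if data_type in self.EXCLUDED:
            raise ValueError(f"{data_type} must not be stored in agent memory")
        now = time.time() if now is None else now
        key = (tenant, user, task)
        self._items.setdefault(key, []).append((now + self._ttl, data_type, value))

    def recall(self, tenant, user, task, now=None):
        """Return only unexpired values from the caller's own partition."""
        now = time.time() if now is None else now
        key = (tenant, user, task)
        live = [item for item in self._items.get(key, []) if item[0] > now]
        if live:
            self._items[key] = live
        else:
            self._items.pop(key, None)
        return [value for _, _, value in live]

    def purge(self, tenant, user, task):
        """Remove a whole partition, for example after suspected poisoning."""
        self._items.pop((tenant, user, task), None)
```

Because every read names the full partition key, one user's stored context cannot surface in another user's session by accident.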

7. Apply least agency

OWASP uses the term “Least-Agency” for the principle that organisations should avoid granting unnecessary autonomy. It also states that weak visibility into what agents do, why they do it, and which tools they call can expand the attack surface and turn minor issues into wider failures.

Least privilege limits what the agent can access. Least agency limits what the agent can execute without additional checks.

Before production, ask:

Does this workflow need autonomous execution?
Can the agent recommend an action for review?
Can it run in read-only mode?
Which steps need policy checks?
Which steps need human approval?
Which steps should be blocked by default?

Grant autonomy where the operational value is clear and the controls are ready.

8. Build audit evidence before release

AI agent actions need a traceable record.

OWASP recommends immutable logs for tool invocations and parameter changes, with monitoring for unusual execution rates, tool-chaining patterns, and policy violations.

The evidence should answer:

Which agent acted?
Which user or process triggered the task?
Which tools were called?
Which data was retrieved?
Which permissions were granted?
Which tokens were issued?
Which approvals were requested, accepted, or denied?
Which policy checks passed or failed?
Which action was completed?

This evidence supports security operations, incident response, audits, and regulatory review.
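One common way to make an action log tamper-evident is to chain each entry to the hash of the previous one. The sketch below illustrates that idea with assumed field names; it detects edits after the fact but does not prevent them, so it is not a substitute for write-once storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry


def _digest(prev_hash, event):
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log, event):
    """Append an event whose hash covers the previous entry, making edits detectable."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify(log):
    """Recompute the chain and report whether every entry is intact."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Each event would carry the fields from the list above, for example agent, triggering user, tool, permission granted, approval outcome, and result.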

9. Test agent security before production

AI agents should be tested as systems with identity, tool access, data access, and delegated authority.

Security testing should cover:

Prompt injection that changes the agent’s goal.
Calls to unauthorized tools.
Restricted document access through RAG.
Abuse of broad OAuth scopes.
Tool chaining that combines data reads with external transfer.
Memory reuse across users or sessions.
Approval bypass attempts.
Token use after task completion.
Logging gaps.
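Some of these cases can be automated against recorded tool-call traces. As one example, the sketch below flags traces in which a sensitive read is later followed by an external transfer. The tool names and the two sets are hypothetical; a real test suite would derive them from the agent's tool inventory.

```python
SENSITIVE_READS = {"crm.export", "files.read_restricted"}  # hypothetical tool names
EXTERNAL_SINKS = {"email.send_external", "http.post"}      # hypothetical tool names


def exfiltration_chains(tool_calls):
    """Return (read, sink) pairs where a sensitive read precedes an external transfer."""
    findings = []
    reads_seen = []
    for tool in tool_calls:
        if tool in SENSITIVE_READS:
            reads_seen.append(tool)
        elif tool in EXTERNAL_SINKS:
            findings.extend((read, tool) for read in reads_seen)
    return findings
```

A test run could replay prompt-injection attempts against the agent and fail the build whenever this function returns a non-empty list.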

The State of Pentesting Report 2025 shows the urgency. In AI and LLM pentests, 32% of findings were rated serious, compared with 13% across all pentest findings. Only 21% of serious AI and LLM findings were resolved.

Production readiness should include technical testing, access review, policy validation, and evidence review.

Cloudcomputing view: agent security is identity work

AI agent security belongs in the identity programme.

Auth0 gives teams practical patterns for authentication, authorization, scoped delegation, approval flows, and RAG access checks. OWASP gives the risk language and mitigation structure. Cloudcomputing brings these into modern identity design, Zero Trust, and implementation work.

The production checklist:

Give agents governed identities.
Grant tool permissions by task.
Use scoped tokens for third-party access.
Require approval for high-impact actions.
Enforce authorization before RAG retrieval.
Separate memory by user, task, and sensitivity.
Limit autonomy where execution is unnecessary.
Log every material action.
Test agent behaviour before release.

AI agents can create business value when their access is controlled with the same seriousness as privileged identities.

Before agents act in production, their identity, permissions, tokens, approvals, memory, and evidence need to be ready.