May 7, 2026
Enterprise security is increasingly shaped by AI-enabled automation on both sides of the threat boundary. Attackers benefit from scalable reconnaissance, credential abuse, code generation, and social engineering, while defenders are pressured to respond with comparable speed across SaaS platforms, non-human identities, software pipelines, and agentic systems. This paper develops a conceptual research framework for defending the agentic enterprise. Rather than presenting a product evaluation or an empirical benchmark, the paper synthesizes current standards and primary-source guidance from NIST, CISA, OWASP, MITRE ATLAS, the Open Source Initiative, and official platform documentation to answer a narrower question: what architectural properties are necessary if AI is to be used defensively without producing an ungovernable security control plane? The analysis argues that effective AI-native defense rests on four requirements: phishing-resistant identity, constrained agent authority, continuous visibility into assets and permissions, and disciplined use of open or open-weight models inside auditable workflows. The paper’s contribution is a structured governance-oriented architecture that integrates related work on digital identity, agent security, runtime guardrails, and open model ecosystems into a single analytical framework suitable for enterprise deployment and further academic evaluation.
The modern enterprise no longer protects a single perimeter. It protects a distributed environment made up of SaaS tenants, CI/CD systems, workload identities, model gateways, developer tooling, and increasingly autonomous agents. In such an environment, compromise often propagates through legitimate control paths rather than through obviously malicious binaries alone. A stolen session, an over-scoped token, a poisoned dependency, or an agent with excessive authority can all produce effects that resemble authorized work. The security problem is therefore not only detection of malicious code; it is governance of trusted action.
This shift matters because AI is now implicated in both offense and defense. Adversaries can use AI to scale reconnaissance, automate phishing content, generate malicious code variants, and accelerate social engineering. Defenders, meanwhile, are adopting AI for alert triage, code review, classification, summarization, and workflow orchestration. The central research problem is not whether AI should appear in security operations, but under what architectural constraints it can do so safely.
Recent standards and primary-source guidance converge on a common direction. NIST's current digital identity guidance elevates phishing-resistant authentication, while CISA explicitly recommends FIDO/WebAuthn as the most practical phishing-resistant approach for broad deployment [1][2]. OWASP now treats agentic systems as a distinct security class, and MITRE ATLAS provides a structured vocabulary for reasoning about adversarial behavior in AI-enabled systems [3][4]. Recent platform documentation from ServiceNow, Armis, and Veza also reflects a practical market response: visibility into identities, assets, and workflows must increasingly be correlated rather than managed in isolation [5][6][7].
This paper therefore advances a narrow thesis: AI should be used defensively only within a control architecture that binds every material action to verified identity, constrained authority, runtime policy, and an auditable workflow record. The contribution is conceptual rather than experimental. The paper does not claim new benchmark results or a deployed field study. Instead, it offers a synthesis that organizes existing technical guidance into an enterprise defense framework suitable for later implementation and empirical evaluation.
This paper is a design-oriented synthesis. Its method is analytical rather than statistical. The argument is built from publicly accessible primary materials in four categories: identity and security standards, AI-agent security guidance, open-model ecosystem definitions, and official platform documentation. The purpose of this method is not to prove product superiority. It is to derive architectural requirements that remain defensible across vendors and deployment contexts.
The paper addresses three research questions. First, which security failures become structurally more likely as enterprises adopt AI agents, code assistants, and non-human identities at scale? Second, which controls appear repeatedly across current standards and operational guidance, suggesting that they are foundational rather than optional? Third, how should open or open-weight AI systems be positioned within a defensive architecture so that transparency and deployability improve control rather than weaken it?
Two scope limitations should be explicit. First, the analysis focuses on enterprise architecture and governance, not on detection model performance. Second, vendor references are used as examples of architectural roles, not as claims that a single commercial stack is required. This distinction is important for academic rigor because it prevents a conceptual framework from collapsing into a product whitepaper.
Three strands of prior work shape this discussion. The first concerns digital identity and authentication. NIST SP 800-63-4 and CISA's phishing-resistant MFA guidance establish that authentication strength remains central to modern security because compromise of trusted identity is often the shortest path to privilege [1][2]. In practice, this literature reframes authentication not as a narrow login problem but as a control-plane problem for users, services, and federated applications.
The second strand concerns agentic and generative AI security. OWASP's LLM and Agentic guidance identifies prompt injection, excessive agency, tool misuse, supply-chain exposure, and identity abuse as recurring risk classes rather than isolated anomalies [3]. MITRE ATLAS adds a threat-informed vocabulary that allows researchers and practitioners to reason about AI-specific attack behavior in a more standardized way [4]. Together, these sources support the claim that agent security cannot be reduced to prompt engineering alone.
The third strand concerns openness in AI systems. The Open Source Initiative's Open Source AI Definition 1.0 provides a stricter criterion than common industry usage by tying openness to the practical ability to study, modify, and share the system in its preferred form [10]. This matters because the enterprise security literature often treats all accessible models as equally governable, when in fact important differences remain between fully open model flows and open-weight systems. A rigorous defensive architecture should preserve that distinction.
The relevant threat model extends beyond prompt injection. Five risk domains recur across current guidance and practice.
First, identity remains the shortest path to privilege. Weak or phishable authentication, unbounded session reuse, and poorly governed federation continue to make identity compromise operationally decisive [1][2]. This applies not only to human administrators but also to service principals, automation accounts, and agent runtimes that act through delegated authority.
Second, agentic systems create an authority amplification problem. An agent that can inspect repositories, call APIs, manipulate tickets, or trigger automation workflows effectively becomes a policy-bearing actor. OWASP's guidance is important here because it treats tool misuse, excessive agency, and identity abuse as first-class concerns rather than implementation details [3].
Third, AI-assisted software delivery creates a supply-chain acceleration problem. The risk is not merely insecure code generation in isolation. It is the institutional habit of moving machine-authored changes through development pipelines faster than review, dependency verification, provenance, and rollback discipline can keep pace.
Fourth, the rapid growth of non-human identities changes the scale of governance. Recent ServiceNow and Armis materials stress that machine identities already vastly outnumber human identities and that insufficient visibility into those identities increases lateral movement and policy failure [6]. Once AI agents are layered on top of this base, identity sprawl becomes inseparable from enterprise security architecture.
Fifth, open-model ecosystems create a dual-use problem. Greater transparency and local deployability can improve defender control, but they also reduce barriers for adversarial experimentation. The practical implication is that openness should be treated as an operational design variable, not as an intrinsic security guarantee.
The paper proposes a three-plane architecture: an identity and authorization plane, an asset and exposure plane, and an orchestration and governance plane. Each plane answers a distinct security question, and the absence of any one plane produces blind spots that AI automation tends to magnify rather than repair.
| Plane | Primary Question | Representative Capability | Operational Value |
|---|---|---|---|
| Identity and Authorization | Who or what can do what? | Fine-grained permission graphing, non-human identity governance, least privilege | Prevents over-broad access and limits blast radius when credentials or agents are abused |
| Asset and Exposure | What exists and what is exposed? | Continuous discovery across IT, OT, IoT, cloud, code, and connected assets | Reveals unmanaged systems, vulnerable surfaces, and shadow infrastructure in real time |
| Orchestration and Governance | What should happen next under policy? | Incident workflows, approvals, audit trails, remediation routing, AI control tower logic | Turns detection into bounded action with evidence, ownership, and recovery workflows |
ServiceNow, Armis, and Veza map cleanly onto these three planes and are useful as concrete examples because their current documentation makes the architecture legible: ServiceNow as a workflow and governance layer, Armis as a continuous cyber-asset intelligence layer, and Veza as an authorization-graph layer [5][6][7][8]. The academic point, however, is broader than any one stack. A defensible enterprise needs a way to correlate asset truth, permission truth, and workflow truth. Without that correlation, AI automation acts on partial context.
The implication is methodological as well as operational. Security programs that keep identity governance, asset inventory, and response operations in disconnected systems will find it difficult to justify autonomous or semi-autonomous AI action. Shared context is a precondition for bounded machine action.
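The shared-context precondition can be illustrated with a minimal Python sketch. Every name here (`Context`, `build_context`, and the dictionary shapes standing in for the three planes) is hypothetical and does not correspond to any vendor API. The sketch only shows the refusal rule: if any plane has no record for the asset or identity in question, no joined context is produced and automation has nothing to act on.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Context:
    """Joined view of one asset across the three planes (hypothetical shape)."""
    asset_id: str
    exposure: str           # from the asset and exposure plane
    permissions: frozenset  # from the identity and authorization plane
    owner: str              # from the orchestration and governance plane


def build_context(asset_id: str, identity_id: str,
                  assets: dict, grants: dict, owners: dict) -> Optional[Context]:
    """Return a joined context, or None if any plane has no record."""
    asset = assets.get(asset_id)
    perms = grants.get((identity_id, asset_id))
    owner = owners.get(asset_id)
    if asset is None or perms is None or owner is None:
        return None  # partial context: automation should not act on it
    return Context(asset_id, asset["exposure"], frozenset(perms), owner)
```

In this sketch a missing grant and a missing asset are treated identically, which is a deliberately strict choice: the caller must resolve the gap before any machine action is proposed.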
AI has legitimate defensive value when it is used to compress analysis time, summarize context, and improve prioritization. It is much less defensible when it is granted discretionary authority over production systems, credential lifecycles, or access expansion. The correct design principle is therefore bounded delegation: the system may reason, classify, summarize, and recommend, but any high-impact mutation must remain subject to explicit policy checks and, where appropriate, human approval.
Meta's publication on LlamaFirewall is useful in this context because it frames agent safety as a runtime systems problem rather than a one-time model tuning problem [9]. Its value for this paper is not brand-specific. It demonstrates an important architectural principle: scanners, guardrails, and action policies should be layered around the model rather than assumed to be fully embedded within it.
This argument applies equally to code review and operational triage. AI can help identify suspicious dependency additions, anomalous configuration changes, risky infrastructure-as-code diffs, or mismatches between a requested action and a subject's effective permissions. Such use is academically defensible because the model narrows human attention rather than replacing the governance structure in which decisions are made.
A practical design pattern is a bounded agent loop: ingest telemetry, classify risk, correlate against identity and asset context, propose an action, run the proposal through policy gates, request approval if the action is destructive or privilege-altering, and finally emit a complete audit record. This pattern matters because it converts AI from an autonomous actor into a constrained analytic component inside a larger system of control.
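One pass of such a loop might be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the action names in `HIGH_IMPACT`, the `Proposal` fields, and the `classify` and `approve` callables are all hypothetical placeholders for whatever model, policy engine, and approval channel an institution actually uses. The sketch does not execute anything; it only decides whether a proposal is cleared and records why.

```python
from dataclasses import dataclass, asdict
from typing import Callable

# Assumed names for destructive or privilege-altering actions.
HIGH_IMPACT = {"delete", "grant_access", "rotate_credential"}


@dataclass
class Proposal:
    action: str
    target: str
    risk: str  # label produced by the classification step, e.g. "low" or "high"


def run_once(event: dict, context: dict,
             classify: Callable[[dict, dict], Proposal],
             allowed_actions: set,
             approve: Callable[[Proposal], bool]) -> dict:
    """Classify, apply the policy gate, seek approval if needed, return an audit record."""
    proposal = classify(event, context)  # model reasons over telemetry plus context
    record = {"event": event, "proposal": asdict(proposal),
              "policy_ok": False, "approved": None, "cleared": False}
    if proposal.action not in allowed_actions:
        return record                    # policy gate: not on the allow-list
    record["policy_ok"] = True
    if proposal.action in HIGH_IMPACT:
        record["approved"] = bool(approve(proposal))  # explicit human decision
        record["cleared"] = record["approved"]
    else:
        record["cleared"] = True
    return record
```

The audit record is returned on every path, including refusals, so that a denied proposal leaves the same kind of evidence as an approved one.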
Enterprises building defensive AI should preserve a careful distinction between fully open systems and open-weight systems. The Open Source Initiative's Open Source AI Definition 1.0 requires access to code, parameters, and the preferred form needed to study, modify, and share the system [10]. This is stricter than common industry marketing language, and the distinction matters when transparency, reproducibility, and local governance are part of the security objective.
| Family | Type | Defensive Relevance | Source Basis |
|---|---|---|---|
| OLMo (Ai2) | Fully open model flow | Useful for research reproducibility, inspection, fine-tuning, and institution-controlled experimentation | Ai2 describes OLMo as a fully open language model and complete model flow [11] |
| Gemma (Google) | Open model family | Practical for lightweight local evaluation, moderation, classification, and retrieval pipelines | Google launched Gemma as a new generation of open models for developers and researchers [12] |
| Mistral Models | Open or open-weight family | Strong candidates for coding, reasoning, and multimodal defensive assistants under enterprise control | Mistral documents multiple open models and open-model licensing paths [13][14] |
| Llama | Open-weight / openly available ecosystem | Large ecosystem support, local deployment flexibility, and published guardrail work such as LlamaFirewall | Meta positions Llama as openly available and continues to expand the ecosystem [15] |
For enterprise defense, the significance of these model families is not only cost or raw performance. It is governability. Open or open-weight systems can often be deployed locally, instrumented deeply, red-teamed internally, and integrated with institution-specific retention, access-control, and evaluation pipelines. This makes them attractive for defensive tasks involving sensitive telemetry, regulated content, or environments where sending data to a third-party API is undesirable.
Openness, however, is not itself a security control. Open models still require guardrails, evaluation, prompt-injection defenses, secrets handling discipline, and package verification. The academically defensible conclusion is not that open models are inherently safer, but that they may offer a more controllable substrate for defenders when the surrounding operational discipline is sufficiently strong.
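On the basis of the preceding analysis, this paper proposes a six-part framework for defending against AI-enabled threats. The framework is normative: it specifies what a defensible architecture should include, not what every current enterprise already does well. The six parts, taken up in order in the paragraphs that follow, are identity assurance, least agency, continuous asset visibility, verification of AI-authored code, runtime guardrails, and auditable remediation workflows.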
All privileged human roles, service operators, and AI control-plane administrators should use phishing-resistant authentication. Federated and short-lived credentials should replace long-lived secrets wherever possible. Non-human identities must have owners, purpose labels, expiration expectations, and review cycles.
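The four governance attributes named for non-human identities (owner, purpose label, expiration, review cycle) lend themselves to a simple completeness check. The Python sketch below is illustrative only: the `NonHumanIdentity` record, the gap labels, and the 90-day default review interval are assumptions introduced here, not values drawn from the cited guidance.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class NonHumanIdentity:
    name: str
    owner: Optional[str]
    purpose: Optional[str]
    expires_on: Optional[date]
    last_reviewed: Optional[date]


def governance_gaps(nhi: NonHumanIdentity, today: date,
                    review_interval_days: int = 90) -> List[str]:
    """List which governance attributes are missing, expired, or stale."""
    gaps = []
    if not nhi.owner:
        gaps.append("no owner")
    if not nhi.purpose:
        gaps.append("no purpose label")
    if nhi.expires_on is None:
        gaps.append("no expiration")
    elif nhi.expires_on < today:
        gaps.append("expired")
    overdue = timedelta(days=review_interval_days)
    if nhi.last_reviewed is None or today - nhi.last_reviewed > overdue:
        gaps.append("review overdue")
    return gaps
```

An empty list means the identity meets all four expectations; any other result is a candidate for a review ticket rather than for automatic revocation.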
The principle of least privilege is necessary but insufficient for agents. Enterprises also need least agency: limit the categories of actions an agent can request, the tools it can invoke, the networks it can reach, and the data domains it can inspect. Read-only tools, mutation tools, and destructive tools should live in separate trust bands.
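A minimal way to express trust bands in Python is to assign every tool to one band and give each agent a ceiling. The tool names and the `may_invoke` helper below are hypothetical, and modeling the bands as an ordered ceiling is one possible reading of "separate trust bands"; an institution could equally grant bands as an explicit set.

```python
from enum import IntEnum


class Band(IntEnum):
    READ_ONLY = 1
    MUTATION = 2
    DESTRUCTIVE = 3


# Hypothetical tool registry: every tool is assigned to exactly one band.
TOOL_BANDS = {
    "search_logs": Band.READ_ONLY,
    "update_ticket": Band.MUTATION,
    "delete_branch": Band.DESTRUCTIVE,
}


def may_invoke(tool: str, agent_ceiling: Band) -> bool:
    """Allow a call only if the tool is registered and within the agent's ceiling."""
    band = TOOL_BANDS.get(tool)
    return band is not None and band <= agent_ceiling
```

Unregistered tools are denied regardless of the agent's ceiling, so adding a new tool requires an explicit banding decision before any agent can call it.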
Every meaningful asset class should be visible: cloud workloads, endpoints, unmanaged devices, CI runners, model-serving infrastructure, and developer environments. This is one reason the Armis-style asset-intelligence layer is strategically important. It converts discovery from a periodic exercise into a continuously refreshed source of context [6][8].
AI-authored code should be treated as untrusted until it passes the same verification gates as human-authored code: protected branches, human review, SAST, dependency review, provenance, secret scanning, and artifact attestation. The policy question is not whether AI wrote the code. The policy question is whether the code can be defended after deployment.
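The authorship-neutral character of this policy can be made concrete with a short Python sketch. The gate identifiers mirror the list above, but the `results` dictionary and the helper names are assumptions for illustration; in practice each result would come from the institution's own CI and review tooling.

```python
# Gate names follow the verification steps listed in the text.
REQUIRED_GATES = (
    "protected_branch", "human_review", "sast", "dependency_review",
    "provenance", "secret_scanning", "artifact_attestation",
)


def missing_gates(results: dict) -> list:
    """Return gates that did not pass; authorship is deliberately not an input."""
    return [g for g in REQUIRED_GATES if results.get(g) is not True]


def mergeable(results: dict) -> bool:
    """A change is mergeable only when every required gate has passed."""
    return not missing_gates(results)
```

Because the function never asks who or what wrote the change, machine-authored and human-authored code face the same bar, and a gate with no recorded result counts as a failure.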
Agent execution environments should enforce output screening, tool-call validation, argument filtering, prompt isolation where possible, and high-fidelity logs of every material action. Runtime guardrails should be versioned and testable. Guardrails should also be updateable without re-training the base model, which is one of the operational strengths of modular systems such as LlamaFirewall [9].
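Tool-call validation and argument filtering under a versioned policy might look like the following Python sketch. It is not LlamaFirewall's API and makes no claim about how that system is implemented; `ToolPolicy`, `validate_call`, and the example ticket pattern are hypothetical. The point is that the policy is plain data with a version string, so it can be tested and replaced without touching the model.

```python
import re
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ToolPolicy:
    version: str
    # tool name -> {argument name -> regex the value must fully match}
    rules: Dict[str, Dict[str, str]]


def validate_call(policy: ToolPolicy, tool: str,
                  args: Dict[str, str]) -> Tuple[bool, str]:
    """Check a proposed tool call against the policy; return (allowed, reason)."""
    spec = policy.rules.get(tool)
    if spec is None:
        return False, f"tool '{tool}' not in policy {policy.version}"
    extra = set(args) - set(spec)
    if extra:
        return False, f"unexpected arguments: {sorted(extra)}"
    for name, pattern in spec.items():
        value = args.get(name)
        if value is None or re.fullmatch(pattern, value) is None:
            return False, f"argument '{name}' rejected"
    return True, "ok"
```

A usage example, assuming a ticket identifier format of `INC` followed by seven digits: with `ToolPolicy("2026-05-01", {"get_ticket": {"ticket_id": r"INC\d{7}"}})`, a call carrying `"INC0012345"` is allowed, while `"INC0012345; rm -rf /"` is rejected because the whole value must match the pattern.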
Detections that cannot trigger bounded, owned, and auditable remediation remain operationally weak. Security teams need a workflow plane that records who approved a step, which identity was scoped, which asset was affected, what the agent recommended, and how recovery was validated. This is where ServiceNow's AI Control Tower and security workflow model become materially relevant rather than merely administrative [5][7].
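The five facts such a workflow plane should capture can be written down as a record type. The Python sketch below is a hypothetical shape, not ServiceNow's data model: the field names simply restate the items in the paragraph above, and the helpers show one way to flag an incomplete record and serialize a complete one.

```python
import json
from dataclasses import dataclass, asdict, fields


@dataclass(frozen=True)
class RemediationRecord:
    approved_by: str           # who approved the step
    scoped_identity: str       # which identity was scoped
    affected_asset: str        # which asset was affected
    agent_recommendation: str  # what the agent recommended
    recovery_validation: str   # how recovery was validated


def incomplete_fields(rec: RemediationRecord) -> list:
    """Names of fields left blank; a record with any blanks is not audit-ready."""
    return [f.name for f in fields(rec) if not getattr(rec, f.name).strip()]


def to_log_line(rec: RemediationRecord) -> str:
    """Serialize with sorted keys so records can be compared or hashed later."""
    return json.dumps(asdict(rec), sort_keys=True)
```

Treating a blank `recovery_validation` as a defect, rather than as an optional note, reflects the argument that remediation is not complete until recovery has been checked.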
This paper has clear limitations. It does not include a quantitative experiment, a formal threat simulation, or a comparative field deployment across enterprise environments. As a result, its claims are architectural and interpretive rather than causal. It identifies a plausible and standards-aligned defense model, but it does not empirically prove that one implementation of that model will outperform another.
Several research gaps follow. First, future work should test whether bounded-agent workflows measurably reduce security incidents or simply redistribute analyst workload. Second, more empirical work is needed on how non-human identity governance affects real incident containment times. Third, the defensive value of open versus open-weight models in regulated environments deserves more systematic comparison, especially where local deployment, auditability, and operational secrecy intersect.
The argument of this paper is intentionally conservative. The point is not that AI should be granted broad autonomy because attacks are fast. The point is that speed without control is strategically self-defeating. The agentic enterprise is already visible in product design, software engineering practice, and security operations. The substantive question is whether institutions can preserve authority, auditability, and rollback discipline while still benefiting from machine-speed analysis.
That question has particular weight in higher education, public-sector systems, and other complex institutions where third-party trust, identity sprawl, and operational heterogeneity are normal rather than exceptional. Such environments are poorly served by simplistic calls for “more AI.” They require disciplined architectures in which AI operates as an analytic and procedural accelerator inside clearly bounded institutional controls.
AI-enabled threats are best understood as contests over trusted action at machine speed. The strongest defense is therefore not a single model, product, or scanner. It is a control architecture capable of answering three persistent questions: what exists, who can act, and what must happen next under policy. This paper has argued that those questions map to asset visibility, authorization visibility, and workflow governance, and that AI can help only when it is subordinated to that structure rather than allowed to replace it.
The practical conclusion is accordingly narrow but important. Enterprises should use AI to defend AI-enabled systems, but only with phishing-resistant identity, tightly constrained agent permissions, continuous visibility into assets and non-human identities, and deliberate use of open or open-weight models inside auditable workflows. Under those conditions, AI can strengthen institutional resilience. Outside those conditions, it risks becoming another source of ungoverned privilege.
* Third-party names and marks belong to their respective owners. Their mention here is for identification only and does not imply affiliation or endorsement.