This page is the operator-level view of Agnes. It covers the deployed components, the data plane, and the security boundaries that customers care about (where their data goes, what’s persisted, and what crosses external API boundaries). If you only need to call the API, you can safely skip this page; the Quickstart is enough.
Components
Agnes has four runtime components plus a small set of external upstream APIs.
agnes.lasscyber.com — frontend dashboard
A React + TypeScript SPA that customers use to manage tenants, users,
roles, API keys, policies, YARA rules, SDP and safety policies, billing,
and analysis history. It authenticates via Auth0 and never holds API
keys; all programmatic calls go through bearer-token API keys minted
from this UI.
The frontend ships a thin in-app help link to this docs site. It does
not embed docs.
api.lasscyber.com — API service
The customer-facing FastAPI application. Async-first, deployed to Google
Cloud Run with autoscaling. Every product endpoint lives under
/api/v1/. Health endpoints live at /health and /healthz, with a
mirror under /api/v1/.
The API is the only component that talks to:
- the Postgres database,
- the model service,
- Google Cloud DLP / NLP / Web Risk / Vertex,
- Auth0 management,
- Stripe and SendGrid.
model_service — internal inference
A separate Cloud Run service running on L4 16 GB GPU instances.
Hosts:
- The prompt-injection / jailbreak BERT classifiers (Llama-Prompt-Guard-2, DeBERTa-v3 injection v2, ONNX defenders).
- The safety LLM-as-a-judge (google/shieldgemma-2b, google/shieldgemma-9b, google/shieldgemma-27b) via vLLM.
Postgres + pgvector
A single Cloud SQL Postgres instance backs the API:
- Tenants, users, roles, invitations, audit events.
- API keys (hashed, never the raw value).
- Policies, YARA rules and policies, SDP policies, safety policies.
- Threat intelligence embeddings (PromptInjection table, 768-dim pgvector column). Queries use cosine similarity.
- Billing artifacts (subscription state, usage rollups, plan catalog).
- Idempotency keys and rate-limit counters.
Every tenant-owned table has a tenant_id column, and every
query is scoped by that column. Cross-tenant reads are not possible from
the public API surface.
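To make the cosine-similarity lookup concrete, here is a minimal pure-Python sketch. It is not Agnes's query or schema: the function names and the toy 3-dimensional vectors are assumptions for illustration. The real column is 768-dimensional, and the comparison runs inside Postgres via pgvector.

```python
import math
from typing import Dict, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (norm_a * norm_b)


def nearest(query: Sequence[float],
            stored: Dict[str, Sequence[float]]) -> Tuple[str, float]:
    """Return (label, similarity) for the stored vector closest to the query."""
    scored = ((label, cosine_similarity(query, vec)) for label, vec in stored.items())
    return max(scored, key=lambda pair: pair[1])


# Toy data standing in for stored threat-intel embeddings (hypothetical labels).
known = {"known_injection": [1.0, 0.0, 0.0], "benign_sample": [0.0, 1.0, 0.0]}
label, score = nearest([0.9, 0.1, 0.0], known)
```

In pgvector, the `<=>` operator returns cosine distance, which is 1 minus the similarity computed here, so a smaller distance means a closer match.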
External upstream APIs
These are called by the API service when an analyzer needs them. They each have their own pricing, latency, and reliability profile, and Agnes surfaces failures as analyzer_unavailable (HTTP 503) so SDK clients
retry safely.
| Upstream | Used by |
|---|---|
| Google Cloud DLP / SDP | Sensitive Data Protection analyzer. |
| Google Cloud Natural Language | Natural Language analyzer. |
| Google Web Risk | URL Risk analyzer. |
| Google Vertex AI Embeddings (text-embedding-004) | Semantic Threat Intelligence analyzer. |
| Auth0 | Web dashboard authentication; API key auth never touches Auth0. |
| Stripe | Subscription, metered usage, invoices, customer portal. |
| SendGrid | Transactional email (verification, billing alerts, support tickets). |
| Sentry | Error reporting (only to operators; never customer payloads). |
| Better Stack (status page) | Real-time API health at status.lasscyber.com. |
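Because upstream failures surface as HTTP 503, a client can treat that status as retryable. The following is a minimal sketch of retry with exponential backoff, not the behavior of any Agnes SDK: `call_with_retry`, the `send` callable, the `(status, body)` tuple shape, and the default delays are all assumptions made for illustration.

```python
import time
from typing import Callable, Tuple

# (HTTP status, parsed JSON body); a hypothetical shape for this sketch.
Response = Tuple[int, dict]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and retry on HTTP 503, doubling the wait each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body  # success or a non-retryable error
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))  # waits 0.5 s, 1 s, 2 s, ...
    return status, body  # still 503 after the final attempt
```

The `sleep` parameter is injectable so the backoff schedule can be tested without real waiting. A production client would usually add jitter and honor any `Retry-After` header the server sends.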
Where customer data goes
Where text passed to analyze ends up depends on which analyzers run:
| Analyzer | Sends prompt to |
|---|---|
| Prompt Injection & Jailbreak | The internal model_service; never leaves Google Cloud. |
| Safety & Responsible AI | Same — internal model service. |
| Sensitive Data Protection | Google Cloud DLP. |
| Natural Language | Google Cloud Natural Language. |
| URL Risk | Extracts URLs locally, then queries Google Web Risk for each URL (URL only, not the surrounding text). |
| YARA | Stays in the API service — pure local matching against compiled rules. |
| Semantic Threat Intelligence | Sends the prompt to Vertex AI for embedding, compares against pgvector locally. |
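The table above can be read as a lookup from analyzer to destination. The sketch below encodes it that way and also illustrates the URL Risk row, where only extracted URLs leave the service. The dictionary keys, function names, and the deliberately simplified URL regex are assumptions for illustration, not Agnes identifiers or its actual extraction logic.

```python
import re
from typing import Iterable, List, Set

# Destination per analyzer, transcribed from the table above.
# The keys are hypothetical identifiers, not Agnes API names.
DESTINATIONS = {
    "prompt_injection": "internal model_service",
    "safety": "internal model_service",
    "sensitive_data": "Google Cloud DLP",
    "natural_language": "Google Cloud Natural Language",
    "url_risk": "Google Web Risk (URLs only)",
    "yara": "API service (local)",
    "semantic_threat_intel": "Vertex AI Embeddings",
}

# Simplified pattern for this sketch; real URL extraction is more involved.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def destinations_for(analyzers: Iterable[str]) -> Set[str]:
    """Return every place the prompt (or part of it) is sent."""
    return {DESTINATIONS[name] for name in analyzers}


def extract_urls(text: str) -> List[str]:
    """Pull URLs out of the text so only they are sent to a URL-risk lookup."""
    return [url.rstrip(".,;:!?)") for url in URL_PATTERN.findall(text)]
```

For example, a request that enables only the hypothetical `yara` and `prompt_injection` entries resolves to destinations inside the API service and model service, so nothing reaches an external Google API.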
Test-mode API keys (ak_test_*) bypass all upstream calls; the
TestModeStubProvider returns deterministic
canned results.
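The test-key bypass can be pictured as a check on the key prefix before any upstream call is made. This is a minimal sketch under stated assumptions: only the `ak_test_` prefix comes from the text above, while `run_analyzer`, the `call_upstream` callable, and the fields of the canned result are hypothetical and do not describe the real TestModeStubProvider.

```python
from typing import Callable, Dict

TEST_KEY_PREFIX = "ak_test_"  # prefix taken from the text above

# Hypothetical canned result; the real stub's fields are not documented here.
CANNED_RESULT: Dict[str, object] = {"verdict": "allow", "stub": True}


def is_test_key(api_key: str) -> bool:
    """Return True when the key uses the test-mode prefix."""
    return api_key.startswith(TEST_KEY_PREFIX)


def run_analyzer(
    api_key: str,
    text: str,
    call_upstream: Callable[[str], Dict[str, object]],
) -> Dict[str, object]:
    """Return a canned result for test keys; otherwise call the upstream."""
    if is_test_key(api_key):
        return dict(CANNED_RESULT)  # copy, so callers cannot mutate the template
    return call_upstream(text)
```

Because the test branch returns before `call_upstream` is touched, the same input always yields the same output and no text leaves the process.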
Network and tenancy
- Single global region. Customer-facing services run in one Google Cloud region. Multi-region is on the roadmap; ask sales if it matters for compliance.
- TLS everywhere. Public endpoints use Google-managed certificates. Internal calls (API → model_service, API → Cloud SQL, API → Cloud APIs) ride VPC Peering / Private Service Connect where available.
- No customer egress. Agnes never calls back into customer infrastructure. Webhook-style notifications (e.g. policy decision events) are pull-based via the analyzer log API.
- Tenant isolation at every layer. Tenant ID is set by middleware from the API key (or JWT), not from request bodies. The database layer rejects cross-tenant reads by query construction.
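The tenant-isolation bullet combines two ideas: the tenant comes from the credential, and every query filters on it. The sketch below shows both with an in-memory SQLite database standing in for Postgres. The table layout, function names, and the use of SHA-256 for key hashing are assumptions for illustration; the page states only that keys are stored hashed.

```python
import hashlib
import sqlite3
from typing import Dict, List


def resolve_tenant_id(api_key: str, key_to_tenant: Dict[str, str],
                      request_body: dict) -> str:
    """Derive the tenant from the credential only.

    request_body is accepted to make the point explicit: any tenant_id
    it contains is ignored.
    """
    key_hash = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
    return key_to_tenant[key_hash]  # KeyError means an unknown key


def list_policies(conn: sqlite3.Connection, tenant_id: str) -> List[str]:
    """Return policy names for one tenant; the filter is always applied."""
    rows = conn.execute(
        "SELECT name FROM policies WHERE tenant_id = ? ORDER BY name",
        (tenant_id,),
    )
    return [row[0] for row in rows]
```

Since `list_policies` has no code path without the `tenant_id` predicate, a caller cannot widen the query by changing the request body.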
Deployment surface
The repo’s infrastructure/ directory holds Docker, Terraform, and
Cloud Run config. Customers do not deploy Agnes themselves; this is
documented for operator-facing audits and security reviews. Reach out
to security@lasscyber.com if you need
a SOC 2-style packet, signed pentest report, or DPA.
What is not deployed
A few things are intentionally absent so the threat surface stays small:
- No customer model hosting. Agnes does not run your model. Bring your own LLM (OpenAI, Anthropic, Google, self-hosted, …). Agnes’s optional OpenAI integration is a client-side wrapper, not a hosted endpoint.
- No long-lived plaintext storage of analyzed prompts. Structured decision metadata is logged; the raw prompt is not persisted unless you explicitly ingest it into the threat-intel store via Workbench.
- No third-party data brokers. Threat intel comes from public research datasets and customer-supplied data, never resold third-party feeds.
Next
- How Agnes works — request lifecycle.
- Combined analyzer — the hero endpoint.
- Errors — the canonical error envelope and what 503s mean.