| Canonical name | semantic-threat-intelligence |
| Python | semantic_threat_intelligence |
| TypeScript | semanticThreatIntelligence |
| Server key | vector_analyzer |
| Category | Adversarial |
What it detects
Inputs that are semantically close to a prompt already in your threat-intel store. “Close” is measured as cosine similarity in the embedding space:
- `>= 0.90`: high-similarity match (effectively a paraphrase).
- `>= 0.75`: medium similarity (related themes, similar attack pattern).
- `>= 0.50`: low similarity (loosely related; usually not actionable).

The match is reported as Low, Medium, or High, derived from the similarity score.
How it works
- Embed the input with Vertex AI (`text-embedding-004`, 768-dimensional, optimized for semantic similarity).
- Query the tenant’s `prompt_injections` table, and optionally the public threat-intel set, for the nearest neighbour by cosine distance.
- Convert distance to similarity: `similarity = 1.0 - distance`.
- Map the similarity to a severity level using the configured thresholds (defaults: 0.90 / 0.75).

The `include_public_threat_intel` parameter lets the analyzer additionally
match against a curated public set; turning it off limits matches to
your own ingestion.
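
The last three steps can be sketched in plain Python. This is a minimal illustration, not the service's implementation: a brute-force scan stands in for the pgvector query, the function names are hypothetical, and returning level 0 for everything below the medium threshold is an assumption based on the `severity_level` range documented on this page.

```python
import math
from typing import Sequence

HIGH_THRESHOLD = 0.90    # default high threshold quoted on this page
MEDIUM_THRESHOLD = 0.75  # default medium threshold quoted on this page


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # assumption: treat a zero vector as unrelated
    return 1.0 - dot / (norm_a * norm_b)


def nearest_similarity(query: Sequence[float],
                       stored: Sequence[Sequence[float]]) -> float:
    """Similarity to the nearest stored embedding (brute-force stand-in for the vector search)."""
    if not stored:
        return 0.0  # empty store: score 0.0, not an error
    distance = min(cosine_distance(query, vec) for vec in stored)
    return 1.0 - distance  # similarity = 1.0 - distance


def severity_level(similarity: float,
                   high: float = HIGH_THRESHOLD,
                   medium: float = MEDIUM_THRESHOLD) -> int:
    """Map a similarity score to 2 (high), 1 (medium), or 0 (assumed: anything lower)."""
    if similarity >= high:
        return 2
    if similarity >= medium:
        return 1
    return 0
```

In practice the embeddings would be 768-dimensional vectors from the embedding model; the two-dimensional vectors you might try this with are only for checking the arithmetic.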
Parameters
| Key | Type | Required | Default | Notes |
|---|---|---|---|---|
| `include_public_threat_intel` | boolean | No | true | Include the curated public adversarial-prompt corpus in the comparison. |
Outputs and metrics
| Metric | Range | Suggested thresholds |
|---|---|---|
| `similarity_score` | 0.0–1.0 | >= 0.90 (high), >= 0.75 (medium), >= 0.50 (low). |
| `severity_level` | 0 / 1 / 2 | == 2 (high), >= 1 (medium or higher). |
Termination signals
Output match hints:
- `High`: fires on a high-severity match.
- `Medium`: fires on a medium-severity match.

Combine with a `similarity_score` threshold for tighter control:
`output_match: "High" AND similarity_score >= 0.92`.
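
Read as a boolean check, the combined condition could look like the sketch below. `AnalyzerResult` and `should_terminate` are hypothetical names for illustration; the actual rule syntax and result shape are defined by the termination-rule engine, not by this snippet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnalyzerResult:
    """Hypothetical container for the two fields the rule inspects."""
    output_match: str        # "Low", "Medium", or "High"
    similarity_score: float  # 0.0 to 1.0


def should_terminate(result: AnalyzerResult,
                     required_match: str = "High",
                     min_similarity: float = 0.92) -> bool:
    """True only when the match level AND the score threshold both hold."""
    return (result.output_match == required_match
            and result.similarity_score >= min_similarity)
```

With these defaults, a `High` match at 0.91 does not terminate, which is the point of adding the score threshold on top of the match level.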
Limits and cost
| Limit | Value |
|---|---|
| Max input tokens | 100,000 |
| Requests / minute | 50 (per tenant) |
| Embedding model | text-embedding-004 (768-dim) |
Typical latency
1–50 ms total. Embedding generation usually dominates (~10–25 ms); the pgvector cosine search itself is sub-millisecond.
Adding your own threat intel
Two paths to ingest known-bad prompts into the comparison set:
- In the dashboard, use the Threat Workbench to triage and resolve flagged prompts. Resolved threats are written to the `prompt_injections` table for your tenant.
- Programmatically, call `POST /api/v1/vector/add-embedding`. This writes to the generic embedding table; for adversarial signal use the workbench so the entry shows up under `semantic-threat-intelligence`.

Give each entry a category and severity so
termination rules can target specific severities.
When to use it
- Pair with the classifier. The classifier and this analyzer cover different failure modes:
  - The classifier is strong on syntactic shape but weak on novel phrasings.
  - The semantic analyzer is strong on paraphrase and obfuscation but weak on truly novel attacks.
- Inbound primarily. The threat-intel store is keyed off attack prompts, so inbound is where it earns its keep.
- Ramp the threshold over time. Start at `0.90` (high precision); monitor false negatives in the Analysis log; drop toward `0.85` as your corpus grows.
Failure modes
- Vertex AI unreachable → `analyzer_unavailable`, 503 with `Retry-After`.
- Empty threat-intel store → `similarity_score = 0.0`, `severity_level = 0`. Not an error.
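
A caller might honour the `Retry-After` header on a 503 roughly as follows. This is a sketch under assumptions: the helper name and the 1-second fallback are hypothetical, and only the delay-in-seconds form of `Retry-After` is parsed (the HTTP-date form falls back to the default).

```python
from typing import Mapping, Optional


def retry_delay_seconds(status: int,
                        headers: Mapping[str, str],
                        default: float = 1.0) -> Optional[float]:
    """Return how long to wait before retrying, or None if the response is not a 503."""
    if status != 503:
        return None
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    if raw is None:
        return default
    try:
        return max(0.0, float(raw))
    except ValueError:
        # HTTP-date values are not handled in this sketch; use the fallback.
        return default
```

An empty threat-intel store needs no such handling, since it returns a normal response with a zero score.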
Next
- Combined analyzer — wire this analyzer into termination rules.
- Prompt Injection & Jailbreak Detection — the syntactic counterpart.
- Analysis logs — feed the workbench with real-world signal.