Documentation Index
Fetch the complete documentation index at: https://docs.lasscyber.com/llms.txt
Use this file to discover all available pages before exploring further.
The YARA analyzer matches input text against compiled
YARA rule sets. Agnes ships a
catalog of system rules (instruction bypass, secret patterns,
credential leakage, common jailbreak phrases) and lets each tenant
author its own rules and group them into YARA policies.
| |
|---|
| Canonical name | yara |
| Python | yara |
| TypeScript | yara |
| Server key | yara_analyzer |
| Category | Pattern Matching |
What it detects
Anything you can express as a YARA rule:
- Known prompt-injection idioms (“Ignore previous instructions…”,
“Disregard the above…”).
- Secret shapes (API tokens, SSH keys, generic high-entropy
identifiers).
- Internal codenames or document signatures you do not want leaving
your tenancy.
- Output formats you want to enforce (e.g. JSON-only outputs).
YARA is not statistical — a rule either matches or it does not. That
makes it complementary to the ML analyzers: zero false positives in
exchange for narrow coverage.
How it works
Agnes compiles all active rules in the selected YARA policy (or all
active tenant rules if no policy is selected) into a single YARA
matcher. Input text is scanned against the matcher; matches surface as
analyzer findings with the rule name and category.
System rules ship in api/data/yara/
and include things like instruction bypass, generic-secret detection,
SSH-key shapes, and a “fake IT maintenance” social-engineering rule.
Parameters
| Key | Type | Required | Default | Notes |
|---|
yara_policy_id | select | No | tenant default | Reference a YARA policy. Leave empty to use the default (or all active rules if no default exists). |
You can also override per request:
{ "prompt": "...", "policy_slug": "default-inbound", "yara_policy_id": "code-leakage" }
Outputs and metrics
{
"matches": [
{ "rule_name": "InstructionBypass", "category": "Injection",
"strings": ["ignore previous instructions"] },
{ "rule_name": "GenericSecret", "category": "Secrets",
"strings": ["sk-XXXXXXXXX"] }
],
"metrics": {
"processing_time_ms": 1.2,
"matches_found": 2
},
"status": "OK"
}
| Metric | Suggested thresholds |
|---|
matches_found | > 0 (any match), >= 3 (multiple). |
processing_time_ms | Observability only. |
Termination signals
| Signal | What it matches |
|---|
Boolean: match_found | Any rule matched. |
Match: rule_name | A specific rule, by name. |
Match: category | All rules in a meta category = block. |
rule_name and category are dynamic signals — the dashboard policy
editor populates them from the rules in your tenant.
Limits and cost
| Limit | Value |
|---|
| Max input tokens | 1,000,000 |
| Requests / minute | 5,000 (per tenant) |
YARA runs in-process; no external API cost. Compilation is cached
per tenant + policy.
Typical latency
1–50 ms depending on the number of compiled rules. YARA is one of the
cheapest analyzers — put it early in your execution plan to short-circuit
on cheap signals.
A small worked example
The shipped InstructionBypass rule:
rule InstructionBypass: Injection
{
meta:
category = "Instruction Bypass"
description = "Detects phrases used to ignore, disregard, or bypass instructions."
strings:
$bypass_phrase = /(Ignore|Disregard|Skip|Forget|Neglect|Overlook|Omit|Bypass|Pay no attention to|Do not follow|Do not obey)\s*(prior|previous|preceding|above|foregoing|earlier|initial)?\s*(content|text|instructions|instruction|directives|directive|commands|command|context|conversation|input|inputs|data|message|messages|communication|response|responses|request|requests)\s*(and start over|and start anew|and begin afresh|and start from scratch)?/
condition:
$bypass_phrase
}
The meta: category = value (Instruction Bypass) becomes a
termination signal you can match on — “terminate when any rule with
category ‘Instruction Bypass’ fires” — without naming each rule
individually.
When to use it
- Always on as a cheap first line. Put YARA early in your
combined policy so it short-circuits before the GPU classifiers
fire.
- Encode product-specific rules. Internal codenames, regulatory
phrasings, product-specific PII shapes that Cloud DLP does not
ship. YARA is the right place.
- Outbound is high-value. Outbound YARA catches LLMs that have
internalised your codebase or customer data and started reproducing
patterns.
Failure modes
- Rule compilation error → the rule fails to load and is excluded
from the run; surfaced in admin logs but does not break the
analyzer.
- No rules in the selected policy → the analyzer returns
matches_found: 0 and OK status. Fix the policy if you expected
matches.
Next