The URL Risk analyzer extracts every URL from the input text and checks
each one against Google Web Risk. URLs that match Web Risk’s threat
lists are flagged with their threat type.
| | |
|---|---|
| Canonical name | url-risk |
| Python | url_risk |
| TypeScript | urlRisk |
| Server key | url_analyzer |
| Category | Threat Detection |
What it detects
Three Google Web Risk threat types:
- MALWARE: pages distributing malware.
- SOCIAL_ENGINEERING: phishing or social-engineering pages.
- UNWANTED_SOFTWARE: sites distributing unwanted software (toolbars,
bundleware).
It does not classify URLs as “spammy” or “low quality”; Web Risk
covers only the three categories above.
How it works
- Extract URLs from the input using the
urlextract library and a
recent TLD list (refreshed when older than 7 days).
- For each extracted URL, call the Web Risk API.
- Return per-URL verdicts and counts.
If the Web Risk API errors on a particular URL, Agnes treats that URL
as unsafe — the analyzer fails closed. This means the
unsafe_urls_count metric over-reports rather than under-reports
during outages.
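The extract, look up, and aggregate flow, including the fail-closed rule, can be sketched as follows. This is a minimal illustration, not Agnes's implementation: a simple regular expression stands in for urlextract, and `analyze_urls`, `lookup`, and `fake_lookup` are hypothetical names. The real Web Risk call is replaced by a stub so the behavior on errors is easy to see.

```python
import re
from typing import Callable

# Simplified stand-in for urlextract: matches only http(s) URLs.
URL_PATTERN = re.compile(r"https?://[^\s<>]+")


def analyze_urls(text: str, lookup: Callable[[str], list]) -> dict:
    """Extract URLs, look each one up, and fail closed on lookup errors.

    `lookup` is a hypothetical callable that returns the list of threat
    types for a URL and raises an exception if the lookup fails.
    """
    urls = []
    for raw in URL_PATTERN.findall(text):
        url = raw.rstrip(".,;:!?)")  # drop trailing sentence punctuation
        try:
            threats = lookup(url)
            is_safe = not threats
        except Exception:
            # Fail closed: a lookup error marks the URL as unsafe.
            threats, is_safe = [], False
        urls.append({"url": url, "is_safe": is_safe, "threat_types": threats})
    return {
        "urls": urls,
        "metrics": {
            "urls_detected_count": len(urls),
            "unsafe_urls_count": sum(not u["is_safe"] for u in urls),
        },
    }


def fake_lookup(url: str) -> list:
    """Stub lookup used only for this demonstration."""
    if "down" in url:
        raise TimeoutError("simulated Web Risk error")
    return ["SOCIAL_ENGINEERING"] if "phish" in url else []


result = analyze_urls(
    "See https://safe-corp.example and https://phish.bad.example, "
    "or https://down.example.",
    fake_lookup,
)
print(result["metrics"])  # {'urls_detected_count': 3, 'unsafe_urls_count': 2}
```

The third URL is counted as unsafe even though no threat was found for it, which is the over-reporting behavior described above.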
Parameters
This analyzer takes no parameters. Web Risk lookups always check all
three threat types.
Outputs and metrics
{
"urls": [
{ "url": "https://safe-corp.example",
"is_safe": true, "threat_types": [] },
{ "url": "https://phish.bad.example",
"is_safe": false, "threat_types": ["SOCIAL_ENGINEERING"] }
],
"metrics": {
"processing_time_ms": 12.4,
"urls_detected_count": 2,
"unsafe_urls_count": 1
},
"status": "OK"
}
| Metric | Suggested thresholds |
|---|---|
| urls_detected_count | Observability only. |
| unsafe_urls_count | > 0 (any unsafe URL). |
| processing_time_ms | Observability only. |
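As an illustration of the suggested threshold, the sketch below applies `unsafe_urls_count > 0` to a result shaped like the sample output above and lists the offending URLs. The helper names `has_unsafe_url` and `unsafe_urls` are hypothetical and are not part of any SDK; the payload is the example from this page.

```python
def has_unsafe_url(result: dict) -> bool:
    """Apply the suggested threshold: any unsafe URL trips the check."""
    return result["metrics"]["unsafe_urls_count"] > 0


def unsafe_urls(result: dict) -> list:
    """Return (url, threat_types) pairs for every URL marked unsafe."""
    return [
        (entry["url"], entry["threat_types"])
        for entry in result["urls"]
        if not entry["is_safe"]
    ]


# Sample analyzer output, copied from the example on this page.
sample = {
    "urls": [
        {"url": "https://safe-corp.example", "is_safe": True, "threat_types": []},
        {"url": "https://phish.bad.example", "is_safe": False,
         "threat_types": ["SOCIAL_ENGINEERING"]},
    ],
    "metrics": {
        "processing_time_ms": 12.4,
        "urls_detected_count": 2,
        "unsafe_urls_count": 1,
    },
    "status": "OK",
}

if has_unsafe_url(sample):
    for url, threats in unsafe_urls(sample):
        print(url, threats)
```

Because the analyzer fails closed, a URL can appear here with an empty `threat_types` list when its lookup errored, so callers may want to handle that case separately from a confirmed threat.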
Termination signals
| Signal | What it matches |
|---|---|
| Boolean: unsafe_url_found | Any URL flagged unsafe by Web Risk. |
| Match: threat_type | One of MALWARE, SOCIAL_ENGINEERING, UNWANTED_SOFTWARE. |
Output match hints (regex): MALWARE, SOCIAL_ENGINEERING,
UNWANTED_SOFTWARE, and (MALWARE|SOCIAL_ENGINEERING|UNWANTED_SOFTWARE)
for any threat.
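To check a match hint locally before wiring it into a rule, a pattern can be run against a serialized analyzer output. The sketch below assumes the hint is searched against the JSON text of the output; how the service applies it internally is not specified here, and `matched_threats` is a hypothetical helper.

```python
import json
import re

# The "any threat" hint from above, compiled as an ordinary regex.
ANY_THREAT = re.compile(r"(MALWARE|SOCIAL_ENGINEERING|UNWANTED_SOFTWARE)")


def matched_threats(serialized_output: str) -> set:
    """Return the distinct threat types the pattern finds in the text."""
    return set(ANY_THREAT.findall(serialized_output))


# A trimmed analyzer output, serialized the way a rule might see it.
output_text = json.dumps({
    "urls": [
        {"url": "https://phish.bad.example", "is_safe": False,
         "threat_types": ["SOCIAL_ENGINEERING"]},
    ],
    "status": "OK",
})

print(matched_threats(output_text))  # {'SOCIAL_ENGINEERING'}
```

A single-type hint such as `MALWARE` works the same way but returns a match only for that threat type.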
Limits and cost
| Limit | Value |
|---|---|
| Max input tokens | 1,000,000 |
| Requests / minute | 5,000 (per tenant) |
Cost follows Google Web Risk pricing, which is billed per URL lookup.
See the Web Risk pricing page.
Typical latency
1–50 ms total, dominated by the number of URLs found. Web Risk lookups
are fast (a handful of milliseconds each); URL extraction itself is
near-instant.
When to use it
- Always on. URL Risk is one of the cheapest analyzers and catches
a category of attack the ML analyzers do not.
- Both directions. Inbound: catches user-submitted phishing.
Outbound: catches LLMs that hallucinated a malicious link.
- Pair with YARA. YARA is the right tool for URL patterns you
care about beyond Web Risk’s categories (e.g. internal hostnames you
do not want appearing in outputs).
Failure modes
- Web Risk error on a URL → that URL is treated as unsafe; the
metric overcounts. The analyzer does not error; the run continues.
- Web Risk fully unreachable → the analyzer returns
analyzer_unavailable (HTTP 503) with a Retry-After header.
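A client can handle the 503 case by waiting for the advertised interval and retrying. The following is a minimal sketch under stated assumptions: `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, the header key is spelled exactly `Retry-After`, and its value is a whole number of seconds (the HTTP-date form falls back to a default delay).

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], dict]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    default_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Retry a request while the analyzer answers 503, honoring Retry-After."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 503:
            # Any non-503 response is handed back to the caller as-is.
            return body
        if attempt == max_attempts:
            break
        value = headers.get("Retry-After", "")
        # Only the delay-in-seconds form is parsed in this sketch.
        delay = float(value) if value.isdigit() else default_delay
        sleep(delay)
    raise RuntimeError("analyzer_unavailable: retries exhausted")


# Demonstration with canned responses instead of a real HTTP client.
responses = iter([
    (503, {"Retry-After": "2"}, {"error": "analyzer_unavailable"}),
    (200, {}, {"status": "OK"}),
])
waits = []
body = call_with_retry(lambda: next(responses), sleep=waits.append)
print(body, waits)  # {'status': 'OK'} [2.0]
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; production code would typically also cap the total wait time.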
Next
- Combined analyzer: wire url_risk into termination rules.
- YARA: for URL patterns Web Risk does not cover.