The URL Risk analyzer extracts every URL from the input text and checks each one against Google Web Risk. URLs that match Web Risk’s threat lists are flagged with their threat type.
Canonical name: url-risk
Python: url_risk
TypeScript: urlRisk
Server key: url_analyzer
Category: Threat Detection

What it detects

Three Google Web Risk threat types:
  • MALWARE — pages distributing malware.
  • SOCIAL_ENGINEERING — phishing or social-engineering pages.
  • UNWANTED_SOFTWARE — sites distributing unwanted software (toolbars, bundleware).
It does not classify URLs as “spammy” or “low quality”; Web Risk’s job is exclusively the three categories above.

How it works

  1. Extract URLs from the input using the urlextract library and a recent TLD list (refreshed when older than 7 days).
  2. For each extracted URL, call the Web Risk API.
  3. Return per-URL verdicts and counts.
If the Web Risk API errors on a particular URL, Agnes treats that URL as unsafe: the analyzer fails closed. As a result, the unsafe_urls_count metric over-reports rather than under-reports when individual lookups fail.
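
The three steps and the fail-closed rule can be sketched as follows. This is a minimal illustration, not the analyzer's implementation: a simple regex stands in for the urlextract library, `lookup` is a hypothetical callable standing in for the Web Risk API call, and reporting an empty `threat_types` list for a failed lookup is an assumption.

```python
import re
from typing import Callable, Dict, List

# Stand-in extractor (assumption): the real analyzer uses the urlextract library.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def analyze_urls(text: str, lookup: Callable[[str], List[str]]) -> Dict:
    """Extract URLs, look each one up, and fail closed on lookup errors.

    `lookup` is hypothetical: it returns the list of threat types for a URL
    (empty when the URL is clean) and raises on an API error.
    """
    verdicts = []
    for raw in URL_PATTERN.findall(text):
        url = raw.rstrip(".,;:!?")  # drop trailing sentence punctuation
        try:
            threats = lookup(url)
            verdicts.append(
                {"url": url, "is_safe": not threats, "threat_types": threats}
            )
        except Exception:
            # Fail closed: a lookup error marks the URL as unsafe.
            verdicts.append({"url": url, "is_safe": False, "threat_types": []})
    return {
        "urls": verdicts,
        "metrics": {
            "urls_detected_count": len(verdicts),
            "unsafe_urls_count": sum(1 for v in verdicts if not v["is_safe"]),
        },
    }


def fake_lookup(url: str) -> List[str]:
    """Illustrative lookup: one phishing URL, one simulated API failure."""
    if "phish" in url:
        return ["SOCIAL_ENGINEERING"]
    if "flaky" in url:
        raise TimeoutError("simulated Web Risk error")
    return []


result = analyze_urls(
    "See https://safe-corp.example and https://phish.bad.example, "
    "also https://flaky.example.",
    fake_lookup,
)
print(result["metrics"])
```

Here the URL whose lookup raised is counted in unsafe_urls_count alongside the genuine phishing match, which is the over-reporting behaviour described above.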

Parameters

This analyzer takes no parameters. Web Risk lookups always check all three threat types.

Outputs and metrics

{
  "urls": [
    { "url": "https://safe-corp.example",
      "is_safe": true,  "threat_types": [] },
    { "url": "https://phish.bad.example",
      "is_safe": false, "threat_types": ["SOCIAL_ENGINEERING"] }
  ],
  "metrics": {
    "processing_time_ms": 12.4,
    "urls_detected_count": 2,
    "unsafe_urls_count": 1
  },
  "status": "OK"
}

Metric                Suggested thresholds
urls_detected_count   Observability only.
unsafe_urls_count     > 0 (any unsafe URL).
processing_time_ms    Observability only.
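
As a rough sketch of consuming this output, the snippet below parses the sample response and applies the suggested unsafe_urls_count > 0 threshold. The helper names `exceeds_threshold` and `unsafe_urls` are hypothetical and not part of any SDK.

```python
import json
from typing import List

SAMPLE = """
{
  "urls": [
    { "url": "https://safe-corp.example",
      "is_safe": true,  "threat_types": [] },
    { "url": "https://phish.bad.example",
      "is_safe": false, "threat_types": ["SOCIAL_ENGINEERING"] }
  ],
  "metrics": {
    "processing_time_ms": 12.4,
    "urls_detected_count": 2,
    "unsafe_urls_count": 1
  },
  "status": "OK"
}
"""


def exceeds_threshold(result: dict) -> bool:
    """Apply the suggested threshold: any unsafe URL at all."""
    return result["metrics"]["unsafe_urls_count"] > 0


def unsafe_urls(result: dict) -> List[str]:
    """Collect the URLs whose verdict is not safe."""
    return [entry["url"] for entry in result["urls"] if not entry["is_safe"]]


parsed = json.loads(SAMPLE)
if exceeds_threshold(parsed):
    print("unsafe URLs:", unsafe_urls(parsed))
```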

Termination signals

Signal                      What it matches
Boolean: unsafe_url_found   Any URL flagged unsafe by Web Risk.
Match: threat_type          One of MALWARE, SOCIAL_ENGINEERING, UNWANTED_SOFTWARE.

Output match hints (regex): MALWARE, SOCIAL_ENGINEERING, UNWANTED_SOFTWARE, and (MALWARE|SOCIAL_ENGINEERING|UNWANTED_SOFTWARE) for any threat.
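
The regex hints can be tried locally with Python's `re` module. This sketch assumes the hints are matched against the serialized analyzer output; the `matched_threats` helper is hypothetical.

```python
import json
import re
from typing import List

# The "any threat" hint from the list above.
ANY_THREAT = re.compile(r"(MALWARE|SOCIAL_ENGINEERING|UNWANTED_SOFTWARE)")


def matched_threats(output: str) -> List[str]:
    """Return the distinct threat types named anywhere in the output text."""
    return sorted(set(ANY_THREAT.findall(output)))


serialized = json.dumps(
    {
        "urls": [
            {
                "url": "https://phish.bad.example",
                "is_safe": False,
                "threat_types": ["SOCIAL_ENGINEERING"],
            }
        ]
    }
)
print(matched_threats(serialized))
```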

Limits and cost

Limit               Value
Max input tokens    1,000,000
Requests / minute   5,000 (per tenant)

Cost is Google Web Risk pricing — billed per URL lookup. See the Web Risk pricing page.

Typical latency

1–50 ms total, dominated by the number of URLs found. Web Risk lookups are fast (a handful of milliseconds each); URL extraction itself is near-instant.

When to use it

  • Always on. URL Risk is one of the cheapest analyzers and catches a category of attack the ML analyzers do not.
  • Both directions. Inbound: catches user-submitted phishing. Outbound: catches LLMs that hallucinated a malicious link.
  • Pair with YARA. YARA is the right tool for URL patterns you care about beyond Web Risk’s categories (e.g. internal hostnames you do not want appearing in outputs).

Failure modes

  • Web Risk error on a URL → that URL is treated as unsafe; the metric overcounts. The analyzer does not error; the run continues.
  • Web Risk fully unreachable → analyzer_unavailable 503 with Retry-After.
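
A client can honour the Retry-After header on a 503 before retrying. The sketch below is one way to do that under stated assumptions: `send` is a hypothetical callable returning `(status, headers, body)`, the header is either a number of seconds or an HTTP date, and the one-second default and three attempts are illustrative choices.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Dict, Optional, Tuple


def retry_after_seconds(
    header_value: Optional[str],
    default: float = 1.0,
    now: Optional[datetime] = None,
) -> float:
    """Convert a Retry-After header (seconds or HTTP date) to a delay in seconds."""
    if not header_value:
        return default
    value = header_value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default  # unparseable header: fall back to the default delay
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    current = now or datetime.now(timezone.utc)
    return max(0.0, (when - current).total_seconds())


def call_with_retry(
    send: Callable[[], Tuple[int, Dict[str, str], str]],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send`, waiting as instructed by Retry-After after each 503."""
    status, body = 503, ""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 503:
            break
        if attempt < max_attempts - 1:
            sleep(retry_after_seconds(headers.get("Retry-After")))
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.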

Next

  • Combined analyzer — wire url_risk into termination rules.
  • YARA — for URL patterns Web Risk does not cover.