# Natural Language Analysis

> Language detection, sentiment, entities, topics, and content moderation via Google Cloud Natural Language.

This analyzer enriches a prompt with **structured linguistic signals**:
the detected language, sentiment scores, named entities, topic
categories, and a moderation verdict. It is a *signal* analyzer: it
produces metrics you can use as termination thresholds in a combined
policy, but it does not make safety decisions on its own. That is the
job of [Safety & Responsible AI](/analyzers/safe-responsible-ai).

|                    |                    |
| ------------------ | ------------------ |
| **Canonical name** | `natural-language` |
| **Python**         | `natural_language` |
| **TypeScript**     | `naturalLanguage`  |
| **Server key**     | `nlp_analyzer`     |
| **Category**       | Content Analysis   |

## What it provides

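Four independent sub-analyses. Sentiment, entity extraction, and content
moderation can each be toggled; topic classification has no toggle and
runs whenever the input is long enough: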

* **Sentiment** — overall sentiment (`-1.0` very negative to `+1.0`
  very positive) with a magnitude.
* **Entity extraction** — named entities (people, organizations,
  locations, etc.).
* **Topic classification** — high-level categories (requires text
  ≥ 20 bytes).
* **Content moderation** — categories like `Toxic`, `Insult`,
  `Profanity`, `Derogatory`, `Sexual`, `Violent`,
  `Death, Harm & Tragedy`, `Firearms & Weapons`, `Illicit Drugs`.

## How it works

The analyzer calls the Google Cloud Natural Language API:

* **v1** for topic classification and entity extraction.
* **v2** for sentiment and moderation scores.

Each sub-analysis is a separate API call. Disabling sub-analyses you
do not need is the most effective latency lever.

## Parameters

| Key                  | Type    | Required | Default | Notes                                  |
| -------------------- | ------- | -------- | ------- | -------------------------------------- |
| `include_entities`   | boolean | No       | `true`  | Extract named entities.                |
| `include_sentiment`  | boolean | No       | `true`  | Compute sentiment score and magnitude. |
| `include_moderation` | boolean | No       | `true`  | Run the v2 moderation classifier.      |

(Topic classification has no toggle; it always runs when input is
≥ 20 bytes.)
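
The toggle and minimum-size rules above can be sketched as a small helper that predicts which sub-analyses will run for a given input. The helper name `planned_sub_analyses` is hypothetical and not part of any SDK, and the sketch assumes the 20-byte minimum is counted on UTF-8 encoded text.

```python
from typing import List, Optional

# Defaults taken from the parameter table above.
DEFAULT_PARAMS = {
    "include_entities": True,
    "include_sentiment": True,
    "include_moderation": True,
}

TOPIC_MIN_BYTES = 20  # documented minimum for topic classification


def planned_sub_analyses(text: str, params: Optional[dict] = None) -> List[str]:
    """Hypothetical helper: list the sub-analyses expected to run.

    Assumption: the 20-byte minimum is measured on UTF-8 bytes,
    not on characters.
    """
    merged = {**DEFAULT_PARAMS, **(params or {})}
    planned = []
    if merged["include_entities"]:
        planned.append("entities")
    if merged["include_sentiment"]:
        planned.append("sentiment")
    if merged["include_moderation"]:
        planned.append("moderation")
    # Topic classification has no toggle; only input size gates it.
    if len(text.encode("utf-8")) >= TOPIC_MIN_BYTES:
        planned.append("topics")
    return planned
```

Since each entry in the returned list corresponds to one API call, its length is a rough proxy for the latency cost of a request.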

## Outputs and metrics

```json
{
  "language": "en",
  "sentiment": { "score": -0.78, "magnitude": 1.4 },
  "entities": [
    { "name": "OpenAI", "type": "ORGANIZATION", "salience": 0.4 }
  ],
  "topics": [{ "name": "/Computers & Electronics", "confidence": 0.91 }],
  "moderation": [
    { "category": "Toxic",     "confidence": 0.86 },
    { "category": "Profanity", "confidence": 0.71 }
  ],
  "metrics": {
    "processing_time_ms": 480.0,
    "sentiment_score": -0.78,
    "sentiment_magnitude": 1.4,
    "topic_count": 1,
    "entity_count": 1,
    "moderation_category_count": 2,
    "max_moderation_confidence": 0.86
  },
  "status": "OK"
}
```

| Metric                      | Range       | Suggested thresholds                                            |
| --------------------------- | ----------- | --------------------------------------------------------------- |
| `sentiment_score`           | -1.0 to 1.0 | `< -0.5` (negative), `< -0.8` (very negative).                  |
| `sentiment_magnitude`       | ≥ 0.0       | Strength regardless of polarity.                                |
| `topic_count`               | int         | —                                                               |
| `entity_count`              | int         | —                                                               |
| `moderation_category_count` | int         | `> 0` (any flag), `>= 2` (multiple).                            |
| `max_moderation_confidence` | 0.0–1.0     | `>= 0.5` (any flagged), `>= 0.65` (high), `>= 0.8` (very high). |
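
As an illustration of how the suggested thresholds partition the metrics, the sketch below maps a `metrics` object onto named bands. The function name `label_metrics` and the band labels (`very_negative`, `flagged`, and so on) are invented for this example; only the numeric cutoffs come from the table above.

```python
def label_metrics(metrics: dict) -> dict:
    """Hypothetical helper: map raw metrics onto the suggested threshold bands."""
    score = metrics.get("sentiment_score", 0.0)
    if score < -0.8:
        sentiment = "very_negative"
    elif score < -0.5:
        sentiment = "negative"
    else:
        sentiment = "not_negative"

    confidence = metrics.get("max_moderation_confidence", 0.0)
    if confidence >= 0.8:
        moderation = "very_high"
    elif confidence >= 0.65:
        moderation = "high"
    elif confidence >= 0.5:
        moderation = "flagged"
    else:
        moderation = "none"

    return {
        "sentiment": sentiment,
        "moderation": moderation,
        # ">= 2" is the "multiple" threshold from the table.
        "multiple_categories": metrics.get("moderation_category_count", 0) >= 2,
    }
```

Checks run from the strictest band down, so each value lands in exactly one band.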

## Termination signals

| Signal                        | What it matches                                                                                                                                               |
| ----------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Boolean: `moderation_flagged` | Any moderation category with confidence `>= 0.5`.                                                                                                             |
| Match: `moderation_category`  | A specific category — e.g. `Toxic`, `Insult`, `Profanity`, `Derogatory`, `Sexual`, `Violent`, `Death, Harm & Tragedy`, `Firearms & Weapons`, `Illicit Drugs`. |

Combine `moderation_flagged` with `max_moderation_confidence >= 0.65`
for stricter control than the default 0.5 cutoff.
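
The logic of these signals can be reproduced locally against an analyzer result, which is handy for testing thresholds before wiring them into a policy. This is a sketch only: `should_terminate` and its parameters are hypothetical, it is not how the policy engine is configured, and it assumes a category match only counts when that category is itself at or above the 0.5 cutoff.

```python
from typing import AbstractSet

FLAG_CUTOFF = 0.5  # default cutoff behind `moderation_flagged`


def should_terminate(
    result: dict,
    min_confidence: float = 0.65,
    blocked_categories: AbstractSet[str] = frozenset(),
) -> bool:
    """Hypothetical local check mirroring the termination signals above."""
    # Equivalent of `moderation_flagged`: keep categories at or above 0.5.
    flagged = [
        m for m in result.get("moderation", [])
        if m["confidence"] >= FLAG_CUTOFF
    ]
    if not flagged:
        return False
    # Stricter control: require the strongest flag to clear min_confidence.
    if max(m["confidence"] for m in flagged) >= min_confidence:
        return True
    # Equivalent of `moderation_category`: match specific flagged categories.
    return any(m["category"] in blocked_categories for m in flagged)
```

Passing `blocked_categories={"Profanity"}`, for example, terminates on any flagged profanity even when its confidence is below the stricter 0.65 threshold.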

## Limits and cost

| Limit                        | Value            |
| ---------------------------- | ---------------- |
| Max input tokens             | 100,000          |
| Requests / minute            | 100 (per tenant) |
| Topic classification minimum | 20 bytes         |

Cost follows **Google Cloud Natural Language pricing**; see the
[NL pricing page](https://cloud.google.com/natural-language/pricing).
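
To stay under the per-tenant request limit, a caller may want to pace requests on its own side. The sketch below is a minimal sliding-window limiter; the class name `SlidingWindowLimiter` is hypothetical, and whether the server counts requests in a sliding or fixed window is not documented here, so treat the window model as an assumption.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Hypothetical client-side limiter: at most N calls per window."""

    def __init__(
        self,
        max_requests: int = 100,        # documented per-tenant limit
        window_seconds: float = 60.0,   # one minute
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Return True and record the call if a slot is free, else False."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self._window:
            self._stamps.popleft()
        if len(self._stamps) < self._max:
            self._stamps.append(now)
            return True
        return False
```

Call `try_acquire()` before each analyzer request and back off or queue the request when it returns `False`. The injectable `clock` exists only to make the limiter testable without sleeping.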

## Typical latency

Expect 500–3000 ms, depending on which sub-analyses are enabled and on
input length. Because each sub-analysis is its own API call, disabling
the ones you do not need gives the biggest saving.

## When to use it

* **Best as a flag, not a block.** Sentiment and entity counts are
  noisy. Use them to enrich your analysis log; reach for safety
  guardrails or YARA when you actually want to block.
* **Use moderation as a *secondary* signal.** It overlaps with
  ShieldGemma but uses a different model family; agreement between
  the two raises confidence on a violation.
* **Disable what you do not use.** Set the matching `include_*` toggle
  to `false` for sentiment, entities, or moderation when you do not
  need them. Topic classification has no toggle and always runs, but
  it is fast.

## Failure modes

* **NL API error** → analyzer `ERROR` status; surfaced in
  `analyzer_results`.
* **Empty input** → topic classification skipped; other sub-analyses
  produce empty results.
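
A caller can guard against the error case before reading any metrics. The sketch below assumes `analyzer_results` is a mapping keyed by the server key (`nlp_analyzer`) and that each entry carries the `status` field shown in the sample output; that shape and the function name `read_nlp_result` are assumptions made for illustration.

```python
from typing import Optional


def read_nlp_result(analyzer_results: dict) -> Optional[dict]:
    """Hypothetical helper: return the NLP result only when it is usable.

    Assumption: analyzer_results maps server keys to result objects.
    """
    result = analyzer_results.get("nlp_analyzer")
    if result is None:
        return None  # analyzer did not run or was not configured
    if result.get("status") != "OK":
        return None  # e.g. "ERROR" after an NL API failure
    return result
```

Returning `None` instead of raising lets the caller decide whether a missing signal should be ignored or treated as a reason to escalate.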

## Next

* [Combined analyzer](/concepts/combined-analyzer) — wiring NLP into
  termination rules.
* [Safety & Responsible AI](/analyzers/safe-responsible-ai) — the
  primary safety analyzer.
