Classifier Notes

How unknown resources get classified. Agents should check source to distinguish rules from semantic fallback.

RecourseOS is rules-first. Deterministic handlers decide known AWS, GCP, Azure, and Azure AD resource types. The unknown-resource classifier only runs when a resource type does not have a known handler and --classifier is enabled.
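The rules-first dispatch described above can be sketched as follows. This is a minimal illustration, not RecourseOS's actual API: the `evaluate` function, the handler map, and the `source` field values are hypothetical names, and the behavior when no handler exists and the classifier is disabled is an assumption.

```typescript
// Hypothetical shapes; names are illustrative, not RecourseOS's real API.
type Tier = string;
type Verdict = { tier: Tier; source: "rules" | "classifier" | "none" };
type Handler = (resourceType: string) => Tier;

function evaluate(
  resourceType: string,
  handlers: Map<string, Handler>,
  classifierEnabled: boolean, // stands in for the --classifier flag
  classify: Handler
): Verdict {
  const handler = handlers.get(resourceType);
  if (handler !== undefined) {
    // Known type: the deterministic handler is authoritative.
    return { tier: handler(resourceType), source: "rules" };
  }
  if (classifierEnabled) {
    // Unknown type and the classifier is switched on.
    return { tier: classify(resourceType), source: "classifier" };
  }
  // Assumption: with no handler and no classifier, defer to a human.
  return { tier: "needs-review", source: "none" };
}
```

The classifier is never consulted for a type that has a handler, which keeps deterministic verdicts stable regardless of model changes.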

Public Contract

Classifier output uses the same recoverability tiers as deterministic rules:

Unknown-resource classification is conservative. When evidence is weak, ambiguous, or missing, RecourseOS should return needs-review instead of marking a destructive change safe.
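One way to express this conservative default is a guard that escalates whenever the prediction is missing, unsupported, or low-confidence. The sketch below is hypothetical: the `Prediction` shape and the `MIN_CONFIDENCE` threshold are assumptions for illustration, not documented RecourseOS settings.

```typescript
// Hypothetical classifier output; the threshold value is an assumption.
type Prediction = { tier: string; confidence: number; evidence: string[] };

const MIN_CONFIDENCE = 0.8; // illustrative, not a documented RecourseOS setting

function conservativeTier(p: Prediction | null): string {
  // A missing prediction, no supporting evidence, or low (or NaN)
  // confidence all escalate to human review.
  if (p === null || p.evidence.length === 0 || !(p.confidence >= MIN_CONFIDENCE)) {
    return "needs-review";
  }
  return p.tier;
}
```

Writing the comparison as `!(confidence >= threshold)` means a malformed confidence value such as `NaN` also escalates rather than passing through.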

Semantic Signals

The classifier looks for provider-neutral safety signals that commonly affect recoverability:

These signals help RecourseOS generalize across long-tail provider resources without relying only on cloud-specific type names.

Known Limits

Some resources require context that may not exist in a Terraform plan, shell command, or MCP tool call:

In these cases, RecourseOS should escalate or require review.

Backup Topology vs Backup Existence

Recoverability depends not just on whether backups exist, but on whether backups will survive the operation being evaluated. This is the distinction between backup existence and backup topology.

The Problem

A volume with a snapshot is typically recoverable-from-backup. But if the snapshot is co-located with the volume (same region, same account, no external copies), deleting the volume may also delete the snapshot in certain failure scenarios or as part of a multi-step operation.
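The difference between the two questions can be illustrated with a small sketch. The record shapes and function names here are hypothetical and exist only to make the distinction concrete; they do not describe a check that RecourseOS performs.

```typescript
// Hypothetical record shapes for illustration; not RecourseOS data structures.
type Location = { account: string; region: string };
type Volume = Location & { id: string };
type Snapshot = Location & { volumeId: string; externalCopies: number };

// Co-located: same region, same account, and no external copies.
function isCoLocated(volume: Volume, snap: Snapshot): boolean {
  return (
    snap.account === volume.account &&
    snap.region === volume.region &&
    snap.externalCopies === 0
  );
}

// Backup existence: at least one snapshot of the volume.
function hasBackup(volume: Volume, snaps: Snapshot[]): boolean {
  return snaps.some((s) => s.volumeId === volume.id);
}

// Backup topology: at least one snapshot that is not co-located.
function hasIndependentBackup(volume: Volume, snaps: Snapshot[]): boolean {
  return snaps.some((s) => s.volumeId === volume.id && !isCoLocated(volume, s));
}
```

A volume can satisfy `hasBackup` while failing `hasIndependentBackup`, which is exactly the case where a backup-existence check overstates recoverability.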

RecourseOS currently checks:

RecourseOS does NOT currently verify:

Documented in Traces

The EBS and RDS handlers include these limitations in their classification traces:

These trace limitations are visible to agents and can inform human review.

Cross-Action Analysis

Status: Implemented in v0.1.36

RecourseOS now detects action sequences in which individually safe actions become dangerous in combination:

Cross-action analysis runs after per-resource evaluation and returns matches in the crossActionRisks array. Per-resource verdicts remain unchanged; the plan-level worstRecoverability and riskAssessment reflect the elevated risk.
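A minimal sketch of that aggregation follows. Only the field names `crossActionRisks` and `worstRecoverability` come from the text above; the record shapes, the `summarize` function, and the use of a numeric severity in place of a recoverability tier are assumptions made to keep the example small.

```typescript
// Hypothetical shapes. Higher severity means harder to recover (assumption).
type ResourceVerdict = { address: string; severity: number };
type CrossActionRisk = { pattern: string; resources: string[]; severity: number };
type PlanResult = {
  resources: ResourceVerdict[];
  crossActionRisks: CrossActionRisk[];
  worstRecoverability: number;
};

function summarize(
  resources: ResourceVerdict[],
  crossActionRisks: CrossActionRisk[]
): PlanResult {
  // Per-resource verdicts are passed through untouched; only the
  // plan-level summary takes cross-action matches into account.
  const severities = [...resources, ...crossActionRisks].map((x) => x.severity);
  const worstRecoverability = severities.reduce((a, b) => Math.max(a, b), 0);
  return { resources, crossActionRisks, worstRecoverability };
}
```

With this shape, a plan whose individual resources all look benign can still report an elevated plan-level result when a cross-action match is present.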

See [Cross-Action Analysis](/cross-action-analysis.html) for implementation details.

Scope Limitations

Cross-action analysis can only see resources in the current plan. It cannot detect:

When these limitations apply, the scopeWarning field documents what the engine couldn't see.

BitNet Classifier

BitNet is a 1-bit quantized neural network classifier for unknown resource types. It handles the long tail of cloud providers (Scaleway, UpCloud, Exoscale, Hetzner, etc.) that don't have explicit handlers.

Architecture

The classifier uses a three-layer routing system:

1. Exact mappings (confidence 1.0): Manually verified resource → category mappings for ~180 common resources across AWS, GCP, Azure, and OCI. These fire first and short-circuit the model.

2. BitNet model (89% accuracy on resources not in exact mappings): 1-bit quantized neural network trained on 400+ labeled resource types. Handles unknown providers and edge cases.

3. Pattern fallback (variable confidence): Regex-based pattern matching for common suffixes like _bucket, _volume, _policy. Used when BitNet weights aren't loaded.
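The three layers above can be sketched as a single routing function. This is an illustration under stated assumptions: the catalog entry, the category names, the pattern confidence values, and the `Model` signature are all hypothetical, and the real catalog and model interface may differ.

```typescript
type Layer = "exact" | "bitnet" | "pattern" | "none";
type Classification = { category: string; confidence: number; layer: Layer };
// Hypothetical model interface; null means the weights are not loaded.
type Model = (resourceType: string) => { category: string; confidence: number };

// Hypothetical catalog entry and category names, for illustration only.
const EXACT = new Map<string, string>([["aws_ami", "disk-image"]]);

// Suffix patterns from the text; the confidence values are assumptions.
const PATTERNS: Array<[RegExp, string, number]> = [
  [/_bucket$/, "object-storage", 0.6],
  [/_volume$/, "block-storage", 0.6],
  [/_policy$/, "config", 0.6],
];

function route(resourceType: string, model: Model | null): Classification {
  // Layer 1: exact mappings fire first and short-circuit the model.
  const exact = EXACT.get(resourceType);
  if (exact !== undefined) {
    return { category: exact, confidence: 1.0, layer: "exact" };
  }
  // Layer 2: the model, when its weights are available.
  if (model !== null) {
    const p = model(resourceType);
    return { category: p.category, confidence: p.confidence, layer: "bitnet" };
  }
  // Layer 3: regex suffix fallback when the model is unavailable.
  for (const [pattern, category, confidence] of PATTERNS) {
    if (pattern.test(resourceType)) {
      return { category, confidence, layer: "pattern" };
    }
  }
  return { category: "unknown", confidence: 0, layer: "none" };
}
```

Recording which layer produced the result makes it possible for a caller to treat a pattern-based guess more cautiously than an exact mapping.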

Model Characteristics

- Exact mappings: 100% (17/17 test cases covered)
- BitNet alone: 89% (88/99 remaining cases)
- Raw BitNet accuracy varies 2-3% between training runs due to random initialization

To reproduce: npx tsx scripts/measure-production-accuracy.ts
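The figures above can be combined into an end-to-end number with simple arithmetic, assuming the 17 exact-mapping cases and the 99 remaining cases are disjoint and together make up the full test set. The variable names are illustrative only.

```typescript
type Result = { correct: number; total: number };

const exactCases: Result = { correct: 17, total: 17 };
const bitnetCases: Result = { correct: 88, total: 99 };

const accuracy = (r: Result): number => r.correct / r.total;

// Assumption: the two groups are disjoint, so counts can be added.
const combined: Result = {
  correct: exactCases.correct + bitnetCases.correct, // 105
  total: exactCases.total + bitnetCases.total, // 116
};

console.log(accuracy(bitnetCases).toFixed(3)); // prints "0.889"
console.log(accuracy(combined).toFixed(3)); // prints "0.905"
console.log(bitnetCases.total - bitnetCases.correct); // prints 11
```

Under that assumption the combined routing is correct on 105 of 116 cases, and the 11 misses all come from the model layer.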

Known Model Weaknesses

The BitNet model has consistent failure patterns that are covered by exact mappings:

| Pattern | Failure Mode | Fix |
| --- | --- | --- |
| _document suffix | Over-demotes to no-verification | Exact mapping for google_firestore_document |
| _container suffix | Over-demotes to no-verification | Exact mapping for CosmosDB containers |
| _attached suffix | Over-demotes to no-verification | Exact mapping for google_compute_attached_disk |
| serverless_cache | Misclassifies as streaming | Exact mapping for aws_elasticache_serverless_cache |
| ami token | Not recognized as disk image | Exact mappings for aws_ami, aws_ami_copy |
| _ciphertext suffix | Over-demotes to no-verification | Exact mapping for google_kms_secret_ciphertext |

These weaknesses exist because the model learned strong demotion signals (_policy, _configuration) but over-applies them to legitimate data-bearing resources with similar suffixes.

The exact mappings close these gaps with 100% confidence. Resources covered include:

These exact mappings exist because the BitNet model has known weaknesses on these specific patterns. They should not be removed during refactoring without verifying the model handles them correctly.
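A refactoring guard for this rule might look like the sketch below: before removing any exact mapping, confirm the model alone returns the same category. The `Model` signature, the `stillNeeded` helper, and the category names in the usage example are hypothetical.

```typescript
// Hypothetical model signature, for illustration only.
type CategoryModel = (resourceType: string) => string;

// Returns the exact mappings the model still gets wrong, i.e. the
// entries that are not safe to remove.
function stillNeeded(exact: Map<string, string>, model: CategoryModel): string[] {
  const needed: string[] = [];
  exact.forEach((category, resourceType) => {
    if (model(resourceType) !== category) {
      needed.push(resourceType);
    }
  });
  return needed;
}
```

For example, with a stub model that labels everything `streaming`, a catalog mapping `aws_ami` to a disk-image category would be reported as still needed, so the mapping stays.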

Training Discipline

Remaining Weaknesses

The 11 failures not covered by exact mappings are mostly config/data boundary cases:

These could be fixed with additional exact mappings if they prove problematic in production. The cost of adding an exact mapping is near-zero; the cost of a wrong verdict on a real user's resource is high.

When BitNet is Used

BitNet only classifies resources that:

1. Have no exact mapping in the catalog
2. Have no explicit AWS/GCP/Azure handler

For known resources, deterministic handlers remain authoritative. BitNet handles the long tail where manual rules don't exist.

Safety Requirements