Risk model
Four levels. Four escalation rules. One Drift combo detector. No heuristics, no learned weights. Every verdict is derivable from the transaction structure alone.
The four levels
| level | meaning | examples |
|---|---|---|
| low | Benign user action. Safe to sign without further review. | Small SOL transfer, SPL token transfer below 1 000 raw units, SPL SetAuthority to the same account. |
| medium | Human review recommended, but no material loss if the tx is not what the user expected. | Kamino deposit, Jupiter swap, Drift place_order, SPL Mint below the large-mint threshold, unknown program call (this is the fallback for programs not in the registry). |
| high | Material exposure. If the tx is not what the user expected, funds or control move in a way that is costly to reverse. | New borrow opening a debt position, liquidation, durable nonce advance, Drift delegate change, SPL Token SetAuthority on a mint, Token-2022 non-transferable mint init. |
| critical | Do not sign without explicit, step-by-step confirmation of what each inner instruction actually does. Material loss or loss of control is likely. | Squads vault_transaction_execute, config_transaction_execute, MarginFi set_account_authority, Token-2022 permanent delegate init, and the Drift 2026 pattern combo. |
Escalation rules
The overall risk is computed by starting at low and applying four rules in order. Each rule can only raise the verdict, never lower it.
1. Per-instruction max
Every decoded instruction has an intrinsic risk assigned by its decoder. The classifier takes the maximum of every instruction's risk as the starting point.
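A minimal sketch of rule 1, assuming the four levels are modeled as an ordered enum. The names `Risk`, `DecodedInstruction`, and `per_instruction_max` are hypothetical and chosen for illustration; the engine's real types may differ.

```rust
/// Risk levels in ascending severity. Deriving `Ord` makes the
/// declaration order the comparison order, so `max` picks the worst level.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Risk {
    Low,
    Medium,
    High,
    Critical,
}

/// Hypothetical decoded instruction: only the fields this sketch needs.
pub struct DecodedInstruction {
    pub name: &'static str,
    /// Intrinsic risk assigned by the instruction's decoder.
    pub risk: Risk,
}

/// Rule 1: start at `Low` and take the maximum intrinsic risk.
/// An empty instruction list therefore stays at `Low`.
pub fn per_instruction_max(instructions: &[DecodedInstruction]) -> Risk {
    instructions
        .iter()
        .map(|ix| ix.risk)
        .fold(Risk::Low, |acc, risk| acc.max(risk))
}
```

Because the fold only ever calls `max`, this step cannot lower the verdict, which matches the "raise only" property stated above.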
2. Durable nonce
If the first instruction is System.AdvanceNonceAccount, the runtime recognizes the transaction as using a durable nonce and so does the engine. The verdict is raised to at least high and a warning is appended to the human summary.
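Rule 2 could be sketched as follows. The string labels `"System"` and `"AdvanceNonceAccount"`, the warning text, and the function names are assumptions for illustration; the source only states that the check looks at the first instruction and that a warning is appended to the summary.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Risk {
    Low,
    Medium,
    High,
    Critical,
}

/// Hypothetical decoded instruction: program label plus instruction name.
pub struct DecodedInstruction {
    pub program: &'static str,
    pub name: &'static str,
}

/// True only when instruction 0 is System.AdvanceNonceAccount.
/// The same instruction at any later position does not count.
pub fn uses_durable_nonce(instructions: &[DecodedInstruction]) -> bool {
    instructions
        .first()
        .map_or(false, |ix| ix.program == "System" && ix.name == "AdvanceNonceAccount")
}

/// Rule 2: raise the verdict to at least `High` and append a warning.
pub fn apply_durable_nonce(
    verdict: Risk,
    instructions: &[DecodedInstruction],
    summary: &mut Vec<String>,
) -> Risk {
    if !uses_durable_nonce(instructions) {
        return verdict;
    }
    // Placeholder wording; the engine's actual warning text is not shown in the source.
    summary.push("warning: transaction uses a durable nonce".to_string());
    verdict.max(Risk::High)
}
```

A verdict that is already critical stays critical, since `max` never lowers it.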
3. Drift 2026 pattern
If the transaction uses a durable nonce AND contains any instruction whose name is one of:
- config_transaction_execute (Squads v4)
- multisig_set_config (Squads v4)
- vault_transaction_execute (Squads v4)
...the verdict is forced to critical and a dedicated multi-line callout is prepended to the human summary:
[X] CRITICAL — this transaction matches the APRIL 2026
DRIFT EXPLOIT PATTERN:
durable nonce + multisig admin execute. the attacker
that drained $285M from Drift used exactly this shape —
pre-signed governance actions that stay valid indefinitely.
DO NOT SIGN without verifying the inner instructions AND
the nonce account lifecycle.
4. State diff escalation
The classifier also inspects the simulation state diffs:
- Any account losing more than 1 SOL (lamports_delta ≤ -1_000_000_000) bumps the verdict to high.
- Any account whose owner program changes (owner_before ≠ owner_after) bumps the verdict to critical.
In offline mode the state diffs are empty, so rule 4 never fires. The other three rules are fully structural and work without RPC.
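Rules 3 and 4 can be sketched in the same style. The `StateDiff` struct, its string-typed owner fields, and both function names are hypothetical; the instruction names and the lamports threshold are taken from the text above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Risk {
    Low,
    Medium,
    High,
    Critical,
}

/// Instruction names that trigger rule 3 when a durable nonce is in use.
const DRIFT_PATTERN_NAMES: [&str; 3] = [
    "config_transaction_execute",
    "multisig_set_config",
    "vault_transaction_execute",
];

/// Hypothetical per-account simulation diff; owners are plain strings here.
pub struct StateDiff {
    pub lamports_delta: i64,
    pub owner_before: String,
    pub owner_after: String,
}

/// Rule 3: durable nonce AND a listed Squads v4 instruction forces `Critical`.
pub fn apply_drift_pattern(verdict: Risk, uses_durable_nonce: bool, names: &[&str]) -> Risk {
    let hit = uses_durable_nonce && names.iter().any(|n| DRIFT_PATTERN_NAMES.contains(n));
    if hit {
        Risk::Critical
    } else {
        verdict
    }
}

/// Rule 4: inspect simulation diffs. An empty slice (offline mode) is a no-op.
pub fn apply_state_diffs(mut verdict: Risk, diffs: &[StateDiff]) -> Risk {
    for d in diffs {
        // Threshold copied from the rule text: a delta of -1_000_000_000 lamports or lower.
        if d.lamports_delta <= -1_000_000_000 {
            verdict = verdict.max(Risk::High);
        }
        // Any change of owner program is treated as a loss of control.
        if d.owner_before != d.owner_after {
            verdict = verdict.max(Risk::Critical);
        }
    }
    verdict
}
```

Passing an empty `diffs` slice returns the verdict unchanged, which is how the offline-mode behavior falls out of this shape without a separate code path.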
What the model does NOT do
- No learned weights. There is no model file. Every verdict is derived from static rules and the transaction structure. A new engine build produces identical verdicts on identical inputs.
- No signature-based allowlisting. The engine does not keep a list of "trusted" signer addresses. Every signer is treated the same.
- No historical reputation lookups. The engine does not call out to external reputation APIs, block explorers, or community feed URLs. Everything it knows is in the crate.
- No false positives on benign traffic. A plain SOL transfer with no durable nonce and no multisig admin instructions stays at low. The test suite verifies this explicitly.