MODEL

NanoMind Security Classifier

On-device Mamba TME classifier for AI agent security content. 10 classes (9 attack types plus benign). 8.3 MB ONNX. 98.45% eval accuracy on the held-out set. Published to HuggingFace.

Eval Accuracy: 98.45%
Macro F1: 0.978
Classes: 10
Model Size: 8.3 MB

The 10 Classes

Nine attack types plus benign. This is the label set the v0.5.0 classifier emits, taken from the sft-v10 training corpus.

exfiltration: Data forwarding to external endpoints
injection: Instruction override, jailbreak
privilege_escalation: Unauthorized access elevation
persistence: Permanent state manipulation
credential_abuse: Credential harvesting, phishing
lateral_movement: Remote config, C2 communication
social_engineering: Urgency, pressure tactics
policy_violation: Governance bypass
steganography: Zero-width chars, homoglyphs, BIDI
benign: Normal agent behavior
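A minimal sketch of how a 10-way output could be mapped back to these labels. The index order is an assumption (it simply follows the order of the listing here, not a published index map), and `decode` and `softmax` are hypothetical helper names, not part of the released package.

```python
import math

# ASSUMPTION: class indices follow the order of the listing in this section.
LABELS = [
    "exfiltration",
    "injection",
    "privilege_escalation",
    "persistence",
    "credential_abuse",
    "lateral_movement",
    "social_engineering",
    "policy_violation",
    "steganography",
    "benign",
]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]
```

A caller would typically treat any label other than `benign` as a finding and might also apply a confidence threshold before acting on it; whether the shipped classifier does so is not stated here.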

Evaluation (v0.5.0)

Evaluation uses the held-out set (194 samples) plus a 50-fixture oracle (40 malicious plus 10 benign hard negatives). Per-class F1 is tracked as a release gate but is not published per class; macro F1 is the authoritative summary.

Eval accuracy: 98.45%
Macro F1: 0.978
Eval samples: 194
Oracle recall: 100%
Oracle precision: 79.6%
Oracle F1: 0.887
Oracle benign FPR: 9.1%
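For readers who want to see how these summary figures relate to one another, here is a small sketch using the standard definitions. It assumes the oracle metrics are computed as a binary malicious-versus-benign problem, which the page does not state, and the confusion counts in the example are hypothetical, not the project's own.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1 and false-positive rate from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "fpr": fpr,
    }


def macro_f1(per_class_f1):
    """Unweighted mean of per-class F1 scores, so rare classes count equally."""
    return sum(per_class_f1) / len(per_class_f1)


# Hypothetical counts for illustration only.
example = binary_metrics(tp=8, fp=2, fn=0, tn=8)

# Harmonic mean of the published oracle precision and recall.
published_check = f1_from(0.796, 1.0)
```

Plugging the published oracle precision (0.796) and recall (1.0) into the same harmonic mean gives roughly 0.886, which is in line with the reported 0.887 once rounding of the precision figure is allowed for.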

Version History

Version  Architecture         Accuracy  Corpus           Status
v0.5.0   Mamba TME + dropout  98.45%    sft-v10 (3,168)  latest
v0.4.0   Mamba TME            96.73%    sft-v9 (3,337)   stable
v0.2.0   Mamba TME            97.01%    v4 (822)         deprecated
v0.1.0   MLP (3 layers)       86%       v4 (822)         deprecated

Training Data (sft-v10 corpus)

3,168 training samples, 194 held-out eval samples, 10 classes, vocabulary size 6,000. Every label is reviewed by a Claude LLM acting in a chief data scientist role. The sources below are the raw pool from which the sft-v10 split is sampled.

Source               Samples  Type
OASB benchmark       4,151    Real labeled scenarios
Registry (pretrain)  4,885    Real package descriptions
Synthetic            1,029    Template edge cases
DVAA                 88       Vulnerable configs
AgentPwn             68       Real-world captures
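The arithmetic behind the pool and the split can be checked with a few lines. The counts are copied from this section; how the pool is actually sampled (for example, whether the pretrain registry rows enter the supervised split at all) is not stated, so the final ratio is only a size comparison.

```python
# Sample counts copied from the source table in this section.
RAW_POOL = {
    "OASB benchmark": 4151,
    "Registry (pretrain)": 4885,
    "Synthetic": 1029,
    "DVAA": 88,
    "AgentPwn": 68,
}
TRAIN_SAMPLES = 3168
EVAL_SAMPLES = 194

pool_total = sum(RAW_POOL.values())          # 10221
split_total = TRAIN_SAMPLES + EVAL_SAMPLES   # 3362

# Share of the raw pool contributed by each source.
shares = {name: count / pool_total for name, count in RAW_POOL.items()}

# Size of the sft-v10 split relative to the raw pool (a size ratio only).
split_fraction = split_total / pool_total

for name, share in shares.items():
    print(f"{name:<20} {share:6.1%}")
print(f"split / pool: {split_fraction:.1%}")
```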

Architecture Details

Type: Ternary Mamba Encoder (TME)
Blocks: 8 Mamba SSM blocks
d_model: 128
d_state: 64
Dropout: 0.1
Pooling: Mean over sequence
Output: 10-class softmax
Format: ONNX (CPU inference)
Training: Apple Silicon MLX
Loss: Cross-entropy, class-weighted
LR Schedule: Cosine with warmup
Early Stopping: Patience 30 on eval loss
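The three training settings listed last (class-weighted cross-entropy, cosine schedule with warmup, early stopping on eval loss) can be sketched in plain Python as follows. This is a generic illustration, not the project's MLX training code: the function and class names are hypothetical, the peak learning rate, warmup length, and class weights are not given in the source, and only the patience of 30 comes from the table. Whether patience is counted in epochs or in evaluation rounds is also an assumption.

```python
import math


def cosine_with_warmup(step, total_steps, warmup_steps, peak_lr, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay down to min_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def weighted_cross_entropy(logits, target, class_weights):
    """Cross-entropy for one sample, scaled by the weight of the true class."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return class_weights[target] * (log_sum_exp - logits[target])


class EarlyStopper:
    """Signals a stop after `patience` evaluations without a lower eval loss."""

    def __init__(self, patience=30):  # 30 is the value listed in the table
        self.patience = patience
        self.best = math.inf
        self.bad_evals = 0

    def should_stop(self, eval_loss):
        if eval_loss < self.best:
            self.best = eval_loss
            self.bad_evals = 0
        else:
            self.bad_evals += 1
        return self.bad_evals >= self.patience
```

Class weighting raises the loss contribution of under-represented classes, which matters when a few classes dominate the training pool; the weights themselves would normally be derived from class frequencies in the training split.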
