NIST Phish Scale
Basic information
- Name: NIST Phish Scale
- Alias: Phish Scale, NIST TN 2276
- Domain: Cybersecurity, Phishing Detection
- Type: Measurement framework, difficulty assessment tool
Source
- URL: https://www.nist.gov/publications/nist-phish-scale-user-guide
- Paper: NIST Phish Scale User Guide (NIST TN 2276)
- Authors: Shanée Dawkins, Jody Jacobs
- Organization: National Institute of Standards and Technology (NIST)
- Year: 2023 (updated framework); original research 2020
Characteristics
- Size: Framework (not a dataset per se)
- Split: Categorizes phishing lures into difficulty levels
- Classes/Categories: 3x3 matrix (Cues: Few/Some/Many × Alignment: Weak/Medium/Strong)
- Format: Assessment methodology
- License: Public domain (US Government work)
Description
The NIST Phish Scale is a standardized framework for rating how difficult a phishing lure is to detect. It lets organizations benchmark lure complexity and put phishing simulation outcomes in context.
Two-Dimensional Assessment:
- Phishing Cues: observable errors or inconsistencies in an email that can alert users:
  - Spelling mistakes
  - Suspicious URLs
  - Formatting irregularities
  - Sender authenticity markers
  - Scale: Few (hardest to detect) → Many (easiest to detect)
- Premise Alignment: relevance of the email to the recipient's organizational context:
  - Subject matter alignment with the recipient's job
  - Sender plausibility
  - Content alignment with typical communications
  - Scale: Weak (low relevance) → Strong (high relevance)
Difficulty Classification:
- Easy: High cues + Low alignment (Many cues, Weak premise)
- Medium: Moderate on both dimensions (Some cues, Medium premise)
- Hard: Low cues + High alignment (Few cues, Strong premise)
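The three combinations listed above can be written as a small lookup. This is a minimal Python sketch under stated assumptions, not the official NIST scoring procedure: the function name `classify_lure` and the label strings are hypothetical, and the six remaining cells of the 3x3 matrix are left unmapped because this note does not list them.

```python
from typing import Optional

# Only the three combinations listed in this note are mapped.
# The other six cells of the 3x3 matrix are not specified here,
# so the lookup returns None for them instead of guessing.
DIFFICULTY = {
    ("many", "weak"): "easy",
    ("some", "medium"): "medium",
    ("few", "strong"): "hard",
}

CUE_LEVELS = {"few", "some", "many"}
ALIGNMENT_LEVELS = {"weak", "medium", "strong"}


def classify_lure(cues: str, alignment: str) -> Optional[str]:
    """Return the difficulty label for a (cues, alignment) pair, or None if unmapped."""
    cues, alignment = cues.lower(), alignment.lower()
    if cues not in CUE_LEVELS:
        raise ValueError(f"unknown cue level: {cues!r}")
    if alignment not in ALIGNMENT_LEVELS:
        raise ValueError(f"unknown alignment level: {alignment!r}")
    return DIFFICULTY.get((cues, alignment))


print(classify_lure("Few", "Strong"))
print(classify_lure("Few", "Weak"))
```

A `None` result signals that the pair falls outside the combinations described in this note; the User Guide should be consulted for those cells.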
Validation:
- Early validation: Barrientos et al. (2021) in lab settings (n=117)
- Large-scale validation: Rozema & Davis (2025) in an enterprise setting (n=12,511)
Applications
- Benchmarking phishing simulation difficulty
- Standardized reporting of training effectiveness
- Comparing results across organizations
- Calibrating phishing campaigns to appropriate difficulty levels
- Avoiding “gaming” of metrics by using only easy lures
- Research: controlling for lure complexity in experiments
- Vendor accountability: transparent difficulty assessment
Used in publications
- anti-phishing-training-2025 - First large-scale enterprise validation (N=12,511); F(2,12086)=41.415, p<0.001; click rates: 7.0% (easy) → 15.0% (hard)
Benchmarks
Enterprise Validation (Rozema & Davis 2025):
| Difficulty | Click Rate | N | Context |
|---|---|---|---|
| Easy | 7.0% | 5,721 | High cues, low alignment |
| Medium | 8.7% | 2,279 | Some cues, medium alignment |
| Hard | 15.0% | 4,511 | Few cues, high alignment |
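As a quick consistency check on the table above, the following Python sketch recomputes the total sample size, a size-weighted pooled click rate, and the hard-to-easy ratio. The pooled rate is derived here from the rounded percentages in the table, so it is an approximation and not a figure reported by the study; the variable names are illustrative.

```python
# Click rates and group sizes copied from the table above.
groups = {
    "easy": (0.070, 5721),
    "medium": (0.087, 2279),
    "hard": (0.150, 4511),
}

total_n = sum(n for _, n in groups.values())

# Size-weighted average click rate across all difficulty levels
# (approximate, because the input rates are rounded).
pooled_rate = sum(rate * n for rate, n in groups.values()) / total_n

# Ratio of the click rate on hard lures to the click rate on easy lures.
hard_vs_easy = groups["hard"][0] / groups["easy"][0]

print(f"total N = {total_n}")
print(f"pooled click rate = {pooled_rate:.1%}")
print(f"hard / easy ratio = {hard_vs_easy:.2f}")
```

The group sizes sum to 12,511, matching the reported sample, and the ratio of about 2.14 is what the "doubled" claim refers to.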
Statistical Effect:
- F(2, 12086) = 41.415, p < 0.001
- η² = 0.007 (small but meaningful)
- Practical significance: Click rates roughly doubled from easy (7.0%) to hard (15.0%)
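The reported effect size can be cross-checked against the F statistic. The sketch below assumes a one-way ANOVA, for which η² = SS_between / SS_total can be rewritten as F·df₁ / (F·df₁ + df₂); the function name is hypothetical, and the source does not state which formula the authors used.

```python
def eta_squared_from_f(f_stat: float, df_between: int, df_within: int) -> float:
    """Eta squared for a one-way ANOVA, recovered from F and its degrees of freedom."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)


# Values taken from the reported test: F(2, 12086) = 41.415.
eta_sq = eta_squared_from_f(41.415, 2, 12086)
print(f"eta squared = {eta_sq:.4f}")
```

Under that assumption the result rounds to 0.007, in line with the reported value.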
Lab Validation (Canham et al. 2024, n=117):
- Confirmed Phish Scale predicts differential susceptibility
- Validated the framework in a controlled academic setting
Notes
Strengths:
- First standardized, open framework for phishing difficulty
- Two-dimensional assessment (cues × alignment) captures nuance
- Enables cross-organizational comparison
- Prevents “teaching to the test” (vendors gaming metrics)
- Public domain - free to use
- Validated at both lab and enterprise scale
Limitations:
- Subjective assessment (requires expert raters)
- Inter-rater reliability requires multiple assessors
- Time-consuming to apply (expert review needed)
- May not capture AI-generated phishing, which can lack the traditional flaws that count as cues
- Focused on email phishing (unclear applicability to SMS, voice)
Practical Considerations:
- Requires 2-3 trained raters for reliability
- Disagreements resolved through discussion
- Best used with organizational context knowledge
- Should update as phishing tactics evolve
- LLM-generated phishing may challenge framework (perfect grammar, valid certs)
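One way to quantify rater agreement before the discussion step is Cohen's kappa for a pair of raters. The source does not name a specific reliability statistic, so the choice of kappa is an assumption, and the ratings below are made-up illustration data, not study results.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(ratings_a: Sequence[str], ratings_b: Sequence[str]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of lures")
    n = len(ratings_a)
    # Share of lures on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical difficulty ratings for six lures from two raters.
rater_a = ["easy", "easy", "medium", "hard", "hard", "medium"]
rater_b = ["easy", "medium", "medium", "hard", "hard", "hard"]

kappa = cohen_kappa(rater_a, rater_b)
# Indices of lures the raters should discuss to reach consensus.
to_discuss = [i for i, (a, b) in enumerate(zip(rater_a, rater_b)) if a != b]
print(f"kappa = {kappa:.2f}, lures to discuss: {to_discuss}")
```

With three raters, a multi-rater statistic such as Fleiss' kappa would be the analogous choice.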
Future Directions (from Anti-Phishing Training 2025):
- Adaptation for AI-generated phishing attacks
- Extension to non-email modalities (SMS, voice, deepfakes)
- Automated assessment tools (reduce manual rating burden)
- Integration with phishing simulation platforms
Tags
framework phishing-detection nist standardization difficulty-measurement cybersecurity human-factors validation