
GPT-OSS MoE Expert Fingerprinting & Specialized Model Pruning

ExpertFingerprinting: Behavioral Pattern Analysis and Specialization Mapping of Experts in GPT-OSS-20B's Mixture-of-Experts Architecture

Interactive Tools:

  • Main Analysis Dashboard: https://amanpriyanshu.github.io/GPT-OSS-MoE-ExpertFingerprinting/
  • Layer Comparison Tool: https://amanpriyanshu.github.io/GPT-OSS-MoE-ExpertFingerprinting/comparison.html

Model Collections:

Each collection contains 29 models at the following parameter counts: 4.2B, 4.8B, 5.4B, 6.0B, 6.6B, 7.2B, 7.8B, 8.4B, 9.0B, 9.6B, 10.2B, 10.8B, 11.4B, 12.0B, 12.6B, 13.1B, 13.7B, 14.3B, 14.9B, 15.5B, 16.1B, 16.7B, 17.3B, 17.9B, 18.5B, 19.1B, 19.7B, 20.3B, and 20.9B, offering flexibility for different deployment scenarios.

Project Overview

This project investigates expert activation patterns within GPT-OSS-20B's Mixture-of-Experts (MoE) architecture. By analyzing router decisions across diverse evaluation benchmarks, we've created specialized, resource-efficient models through expert pruning.

Key Achievements:

  • 232 Specialized Models Released: 8 domains × 29 expert configurations each
  • Interactive Analysis Tools for real-time expert pattern exploration
  • Domain-Specific Optimization maintaining performance while reducing computational overhead
  • Comprehensive Evaluation across GPQA, MMLU, SORRY-Bench, Tulu3, and Polyglot benchmarks

Methodology

Expert Activation Analysis

Our approach begins with router analysis across the original GPT-OSS-20B model:

  1. Token-Level Tracking: Router decisions are captured across all 24 layers for every generated token (see the sketch below)
  2. Multi-Domain Evaluation: Prompts span scientific reasoning, mathematical computation, legal knowledge, medical understanding, safety evaluation, instruction following, and general capabilities
  3. Pattern Recognition: Statistical aggregation identifies which experts consistently activate for specific task types
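
As an illustration of steps 1 and 3, the minimal sketch below registers forward hooks on the router modules and tallies which experts receive top-4 routing for each generated token. The Hugging Face model identifier, the "router" module-name suffix, and the layer-index parsing are assumptions about the implementation; recorder.py and expert_recorder.py contain the actual recording utilities.

import torch
from collections import defaultdict
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "openai/gpt-oss-20b"  # assumed Hugging Face identifier
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

router_hits = defaultdict(lambda: defaultdict(int))  # layer -> expert -> activation count

def make_hook(layer_idx):
    def hook(module, inputs, output):
        # Assumes the router emits per-token scores over experts; implementations vary.
        scores = output[0] if isinstance(output, tuple) else output
        top4 = scores.topk(4, dim=-1).indices  # top-4 routing
        for expert_id in top4.flatten().tolist():
            router_hits[layer_idx][expert_id] += 1
    return hook

# Attach a hook to every module whose name looks like an MoE router.
for name, module in model.named_modules():
    if name.endswith("router"):  # naming pattern is an assumption
        layer_idx = int(name.split(".")[2])  # assumes "model.layers.<i>.mlp.router"
        module.register_forward_hook(make_hook(layer_idx))

inputs = tokenizer("Explain Noether's theorem.", return_tensors="pt")
model.generate(**inputs, max_new_tokens=32)
print({layer: dict(counts) for layer, counts in sorted(router_hits.items())})

Running this once per domain yields one activation table per evaluation set, which the pruning stage below turns into expert-selection decisions.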

Systematic Expert Pruning

Based on these activation patterns, we implement a data-driven pruning strategy, sketched in the snippets that follow:

Domain Specialization: Eight distinct specialization tracks:

  • General: Broad capability preservation across all domains
  • Science: Physics, chemistry, biology reasoning (GPQA-focused)
  • Mathematics: Quantitative reasoning and problem-solving
  • Health/Medicine: Clinical knowledge and medical reasoning
  • Law: Legal frameworks and jurisprudence
  • Safety: Harm detection and responsible AI patterns
  • Instruction Following: Constraint satisfaction and formatting adherence
  • Harmful: Inverted safety patterns for red-teaming research
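
For each track, a minimal sketch of turning aggregated router counts into per-layer expert rankings is shown below. The toy counts and the output schema are illustrative assumptions; the released rankings live under topical_analytics/.

import json

def rank_experts(router_hits):
    """router_hits: {layer: {expert: activation_count}} collected on one domain's prompts."""
    return {layer: sorted(counts, key=counts.get, reverse=True)
            for layer, counts in router_hits.items()}

# Toy counts standing in for Science-track activations gathered on GPQA-style prompts.
router_hits_science = {0: {7: 912, 30: 845, 3: 12}, 1: {15: 770, 2: 651, 9: 44}}
rankings = rank_experts(router_hits_science)
with open("topical_analytics/science.json", "w") as f:
    json.dump(rankings, f, indent=2)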

Technical Implementation

Model Architecture Preservation:

  • Maintains original 24-layer transformer structure
  • Preserves 128K context length and attention patterns
  • Retains RoPE positional encoding and RMSNorm
  • Uses BF16 precision for optimal memory efficiency

Pruning Process:

  1. Expert Selection: Top-performing experts identified per layer per domain
  2. Weight Extraction: Router and expert weights carefully preserved
  3. Architecture Adjustment: Configuration updated for reduced expert count
  4. Validation: Functionality testing across representative prompts
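
A hedged sketch of steps 1-2 for a single MoE block is given below. The attribute names experts and router follow a Mixtral-style layout and are assumptions about the internals; mini_model_creator.py implements the released pipeline.

import torch
import torch.nn as nn

def prune_moe_block(moe_block, ranked_experts, keep=8):
    """Keep the `keep` top-ranked experts of one MoE block and slice its router to match."""
    keep_ids = sorted(ranked_experts[:keep])  # step 1: expert selection
    # Step 2: weight extraction -- retain the chosen expert modules and the matching router rows.
    moe_block.experts = nn.ModuleList(moe_block.experts[i] for i in keep_ids)
    old_router = moe_block.router
    new_router = nn.Linear(old_router.in_features, len(keep_ids),
                           bias=old_router.bias is not None,
                           dtype=old_router.weight.dtype)
    with torch.no_grad():
        new_router.weight.copy_(old_router.weight[keep_ids])
        if old_router.bias is not None:
            new_router.bias.copy_(old_router.bias[keep_ids])
    moe_block.router = new_router
    return keep_ids

Steps 3-4 then correspond to updating the model configuration (see Technical Details below) and re-running a set of representative prompts.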

Model Collections

We've systematically released 232 specialized models organized into domain-specific collections:

Main Collection

Complete overview of all 232 specialized models across domains and configurations.

Domain-Specific Collections:

  • General Purpose (4.2B-20B): Broad capability models for versatile applications
  • Science (4.2B-20B): Optimized for scientific reasoning and technical knowledge
  • Health & Medicine (4.2B-20B): Specialized for medical and clinical applications
  • Mathematics (4.2B-20B): Enhanced quantitative reasoning and problem-solving
  • Law (4.2B-20B): Legal knowledge and jurisprudential reasoning
  • Safety (4.2B-20B): Harm detection and responsible AI deployment
  • Instruction Following (4.2B-20B): Precise constraint satisfaction and formatting
  • Harmful/Red-team (4.2B-20B): Research models with inverted safety patterns

Each collection contains 29 models ranging from 4.2B to 20.9B parameters, offering flexibility for different deployment scenarios.

Interactive Analysis Tools

We've developed comprehensive web-based tools for exploring expert activation patterns:

  • Token-Level Visualization: Interactive exploration of expert routing decisions
  • Domain Analysis: Compare activation patterns across different task types
  • Statistical Aggregation: View top-performing experts by layer and domain
  • Real-time Filtering: Analyze completed vs. incomplete generations
  • Layer-by-Layer Analysis: Deep comparison of expert patterns between configurations
  • Statistical Significance: Quantify differences in expert usage across domains
  • Visual Charting: Interactive graphs showing expert activation distributions
  • Export Functionality: Download analysis results for further research

Visit these tools at https://amanpriyanshu.github.io/GPT-OSS-MoE-ExpertFingerprinting/ and https://amanpriyanshu.github.io/GPT-OSS-MoE-ExpertFingerprinting/comparison.html to interact with the full dataset and explore expert behavior patterns in detail.

Technical Details

Model Configuration

All pruned models maintain compatibility with the original GPT-OSS architecture:

  • Precision: BF16 for optimal memory/performance balance
  • Top-k Routing: Dynamically adjusted to min(4, num_experts); see the configuration sketch below
  • Context Length: Full 128K token support preserved
  • Attention Pattern: Alternating dense/sliding window maintained
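
As a hypothetical illustration of the architecture-adjustment step, the snippet below updates the expert-count fields of a pruned checkpoint's configuration. The field names follow the Mixtral-style convention used by several Hugging Face MoE models and may not match GPT-OSS exactly; the output path is illustrative.

from transformers import AutoConfig

config = AutoConfig.from_pretrained("openai/gpt-oss-20b")   # assumed identifier
kept_experts = 8                                            # experts retained per layer
config.num_local_experts = kept_experts                     # assumed field name
config.num_experts_per_tok = min(4, kept_experts)           # top-k routing rule from the list above
config.save_pretrained("gpt-oss-pruned-science-8experts")   # illustrative output directory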

Citation

If you use this work in your research, please cite:

@misc{priyanshu2025gptoss,
  title={GPT-OSS MoE Expert Fingerprinting: Analyzing Expert Activation Patterns in Mixture of Experts Models},
  author={Priyanshu, Aman and Vijay, Supriti},
  year={2025},
  howpublished={\url{https://amanpriyanshu.github.io/GPT-OSS-MoE-ExpertFingerprinting/}},
  note={Interactive analysis tool and systematic expert pruning for MoE architectures}
}

Contributing

We welcome contributions to extend this research:

  • Additional Domains: Propose new specialization areas
  • Pruning Strategies: Alternative expert selection methodologies
  • Evaluation Metrics: Novel assessment approaches for MoE models
  • Tool Enhancement: Improvements to analysis interfaces

Repository Structure

├── inference_on_prompts.py    # Batch inference with router analysis
├── mini_model_creator.py      # Expert pruning and model generation
├── recorder.py               # Router activation recording utilities
├── expert_recorder.py        # Specialized expert tracking
├── index.html               # Main analysis dashboard
├── comparison.html          # Layer comparison tool
└── topical_analytics/       # Domain-specific expert rankings
    ├── all.json
    ├── science.json
    ├── math.json
    ├── health_or_medicine.json
    ├── law.json
    ├── safety.json
    ├── instruction_following.json
    └── harmful.json

Acknowledgments

  • OpenAI: For releasing the GPT-OSS-20B model and enabling this research
  • Hugging Face: For hosting infrastructure and model distribution
  • Research Community: For evaluation benchmarks and methodological foundations

Explore the interactive tools at https://amanpriyanshu.github.io/GPT-OSS-MoE-ExpertFingerprinting/ to dive deeper into expert activation patterns and model comparisons.