Crystalite: A Lightweight Transformer for Efficient Crystal Modeling
Paper • 2604.02270 • Published
A Crystalite checkpoint trained for 100K steps on a balanced 32K subset of Alex-MP-20 containing 35% insulators (vs. 2.1% in the full dataset). This is the production model for guided crystal generation.
Architecture: 67.8M-parameter Diffusion Transformer with subatomic tokenizer and GEM attention bias (Crystalite, Hadzi Veljkovic et al.).
| Metric | Value |
|---|---|
| In-window rate (band gap 4-6 eV) | 42.6% |
| Lattice validity | 100% |
| Geometry validity | 99.6% |
| Compositional uniqueness | 78% |
| Metal fraction | 0.2% |
Formation energy probe AUROC: 0.990. Band gap probe AUROC: ~0.95.
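As a sketch of how these two kinds of numbers are computed (illustrative helper functions, not part of the Crystalite codebase): the in-window rate is the fraction of predicted band gaps inside the target window, and AUROC can be computed from probe scores via the Mann-Whitney pairwise-comparison formulation.

```python
def in_window_rate(band_gaps, lo=4.0, hi=6.0):
    """Fraction of predicted band gaps (eV) inside [lo, hi]."""
    return sum(lo <= g <= hi for g in band_gaps) / len(band_gaps)

def auroc(scores, labels):
    """AUROC as the probability that a random positive outscores a
    random negative (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

An AUROC of 0.990 for the formation-energy probe means a randomly chosen positive example is ranked above a randomly chosen negative 99% of the time.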
Hybrid gradient steering plus token masking produces 100% refractory compositions, 0% cobalt/nickel content, 100% insulators, and 30% of samples in the target 4-6 eV window.
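A minimal sketch of the hybrid idea, purely illustrative (the function name, shapes, and guidance scale below are assumptions, not the Crystalite implementation): gradient steering nudges the token logits in the direction preferred by a probe, while token masking hard-excludes forbidden element tokens (e.g. Co/Ni) so they can never be sampled.

```python
import numpy as np

def steer_logits(logits, probe_grad, forbidden_ids, scale=2.0):
    """Hypothetical hybrid guidance step: add a scaled probe gradient
    to the token logits (soft steering), then set forbidden element
    tokens to -inf (hard mask) so their sampling probability is zero."""
    steered = logits + scale * probe_grad   # soft gradient steering
    steered[list(forbidden_ids)] = -np.inf  # hard token mask
    return steered

# Toy vocabulary of 5 element tokens; tokens 1 and 3 are forbidden.
logits = np.zeros(5)
grad = np.array([0.1, 0.0, 0.5, 0.0, -0.2])
out = steer_logits(logits, grad, forbidden_ids={1, 3})
probs = np.exp(out - out.max())
probs /= probs.sum()
```

Masking guarantees the hard constraints (0% cobalt/nickel), while the gradient term only biases sampling toward the target property, which is why the in-window rate is high but not 100%.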
Requires the Crystalite codebase and probe-gradient-guidance scripts.
```python
from scripts.train_probe import load_model

# Load the production checkpoint onto the GPU
model = load_model("final.pt", device="cuda")
```