RESEARCH • INTERPRETABILITY • INTELLIGENCE
"We do not view safety as an adjunct effort; it is the mathematical constraint under which we optimize for intelligence."
Metanthropic Labs is an independent AI research organization founded by Ekjot Singh. We operate on a singular thesis: the path to safe Artificial General Intelligence requires systems that are not just highly capable, but structurally transparent.
Rather than relying purely on brute-force scaling, our lab focuses on high-efficiency architectural innovations, mechanistic interpretability, and controllable reasoning. We build, dissect, and open-source models that push the frontier of what is possible within constrained compute budgets.
Our technical agenda is rigorous, empirical, and built in public. We currently focus on three primary vectors:
Mechanistic Interpretability: We cannot align systems we do not fundamentally understand. Our lab pioneers techniques to map and control the internal representations of large language models (an illustrative sketch follows this list).
Controllable Reasoning: We are pushing the frontier of "System 2" reasoning, moving models beyond probabilistic pattern matching to verifiable, multi-step logical deduction.
Multimodal World Models: We are building grounded world models that natively process text, audio, and visual data without the bloat of traditional pipeline systems.
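As a concrete illustration of what "controlling internal representations" can look like in practice, the sketch below applies activation steering through a plain PyTorch forward hook. The layer choice, output structure, and scale are assumptions for illustration, not a description of our internal tooling.

```python
# Illustrative sketch only: activation steering via a forward hook.
# Assumes a PyTorch transformer whose blocks return the hidden state as the first
# element of their output (or the hidden state directly); not a published Metanthropic API.
import torch

def add_steering_hook(block: torch.nn.Module, direction: torch.Tensor, scale: float = 4.0):
    """Register a forward hook that shifts the block's hidden states along `direction`."""
    unit = direction / direction.norm()

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + scale * unit.to(hidden.device, hidden.dtype)
        return (steered, *output[1:]) if isinstance(output, tuple) else steered

    return block.register_forward_hook(hook)  # call .remove() on the handle to undo the steering
```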
We believe in open-source validation. Our active deployments include:
Metanthropic Chat: chat with AI and extract text from images.
(Note: Experimental and foundational research shards are kept private until they meet our rigorous internal safety and coherence benchmarks.)
To stay perfectly synced with our latest model weights, technical reports, and architectural milestones, subscribe directly to our live feed:
📡 Subscribe to the Metanthropic Labs RSS Feed
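For readers who prefer to poll the feed programmatically, a minimal Python sketch using the feedparser library is shown below; the URL is a placeholder, so substitute the address behind the link above.

```python
# Minimal feed-polling sketch. The URL below is a placeholder, not the real feed address.
import feedparser

feed = feedparser.parse("https://example.org/metanthropic/feed.xml")  # placeholder URL
for entry in feed.entries[:5]:
    print(entry.get("published", "n.d."), "|", entry.title)
```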
Feb 12, 2026 | SPECIFICATION: Metanthropic Neural Ablation via Attention Refraction (M-NAAR). We introduce M-NAAR to resolve the "Unlearning Trilemma." By refracting attention away from high-entropy tokens rather than destroying weights, we achieve 0.00 hallucination rates and robust deletion without lobotomizing the model.
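The announcement does not reproduce the M-NAAR algorithm; as a hedged illustration of the general idea of steering attention away from high-entropy tokens instead of destroying weights, one possible reading is sketched below (the entropy source, scaling, and shapes are all assumptions).

```python
# Hedged sketch, not the published M-NAAR algorithm: a soft negative bias on attention
# logits for high-entropy keys, leaving the weights themselves untouched.
import torch
import torch.nn.functional as F

def entropy_refraction_bias(next_token_logits: torch.Tensor, strength: float = 4.0) -> torch.Tensor:
    """next_token_logits: [batch, seq, vocab]. Returns an additive bias of shape [batch, 1, 1, seq]."""
    probs = F.softmax(next_token_logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1)             # [batch, seq]
    entropy = entropy / entropy.amax(dim=-1, keepdim=True).clamp_min(1e-9)   # normalise to [0, 1]
    return (-strength * entropy)[:, None, None, :]   # high-entropy keys receive less attention

def refracted_attention(q, k, v, bias):
    """q, k, v: [batch, heads, seq, head_dim]; bias broadcasts over heads and queries."""
    scores = q @ k.transpose(-2, -1) / k.shape[-1] ** 0.5
    return F.softmax(scores + bias, dim=-1) @ v
```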
Feb 10, 2026 | Specification for Latent Logic Topology & Soundness-Aware Calibration. We operationalize LLMs as engines of "Latent Causal Chains" to solve the RLVR Convergence Paradox. We introduce the Soundness-Aware Level (SAL), a microscopic metric that predicts post-alignment reasoning performance with 87% accuracy.
Feb 07, 2026 | The Kinetic-Potential Information Disentanglement Protocol (KP-IDP). Standard interpretability relies on a dangerous conflation: that Decodability equals Causality. We introduce KP-IDP to distinguish between "Dark Computation" (Kinetic) and "Phantom Readouts" (Potential), solving the intervention-reversal paradox.
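The full protocol is not reproduced here; the sketch below only illustrates the decodability-versus-causality gap it targets: a linear probe can read a feature out of an activation even when projecting that direction away barely changes the downstream output. All names and inputs are hypothetical.

```python
# Illustration of the decodability-vs-causality gap (not the KP-IDP protocol itself).
import torch

def probe_accuracy(acts: torch.Tensor, labels: torch.Tensor) -> float:
    """Fit a least-squares linear probe; high accuracy only establishes decodability."""
    X = torch.cat([acts, torch.ones(len(acts), 1)], dim=1)
    w = torch.linalg.lstsq(X, labels.float().unsqueeze(1)).solution
    preds = (X @ w).squeeze(1) > 0.5
    return (preds == labels.bool()).float().mean().item()

def causal_shift(head: torch.nn.Module, acts: torch.Tensor, direction: torch.Tensor) -> float:
    """Project the probed direction out of the activations and measure how much the head's output moves."""
    unit = direction / direction.norm()
    ablated = acts - (acts @ unit)[:, None] * unit
    with torch.no_grad():
        return (head(acts) - head(ablated)).abs().mean().item()
```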
Feb 06, 2026 | Module 003-CFG: Chronometric Flux Gating. We introduce Chronometric Flux Gating (CFG), a dynamic regularization protocol that eliminates Latent Manifold Collapse in Sparse Autoencoders. By treating feature importance as a temporal trajectory, we reduce feature absorption by 95%.
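The exact gating rule is in the report; as a hedged stand-in, the sketch below shows a sparse autoencoder whose per-feature sparsity pressure follows an exponential moving average of each feature's activity, i.e. a trajectory rather than a static coefficient.

```python
# Hedged stand-in, not the published CFG rule: per-feature sparsity pressure that tracks
# each feature's activity trajectory, so rarely-firing features are not squeezed into collapse.
import torch
import torch.nn as nn

class TrajectoryGatedSAE(nn.Module):
    def __init__(self, d_model: int = 512, d_feat: int = 4096):
        super().__init__()
        self.enc = nn.Linear(d_model, d_feat)
        self.dec = nn.Linear(d_feat, d_model)
        self.register_buffer("ema_activity", torch.zeros(d_feat))

    def forward(self, x):
        feats = torch.relu(self.enc(x))
        return self.dec(feats), feats

    def loss(self, x, l1: float = 1e-3, decay: float = 0.99):
        recon, feats = self(x)
        # Track each feature's mean activation over time as a trajectory.
        self.ema_activity.mul_(decay).add_((1 - decay) * feats.mean(dim=0).detach())
        # Features with a fading trajectory get less sparsity pressure; dominant features get more.
        gate = self.ema_activity / (self.ema_activity.mean() + 1e-6)
        return ((recon - x) ** 2).mean() + l1 * (feats * gate).mean()
```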
Feb 04, 2026 | PROJECT OBLIQUE-GUARD: Latent Geometry Stabilization. We reject the "Robustness vs. Accuracy" trade-off. We demonstrate that adversarial vulnerability is a deterministic artifact of Superposition and introduce the Oblique-Guard Layer to filter these geometric exploits.
Feb 02, 2026 | Analysing Moral Bias in Finetuned LLMs through Mechanistic Interpretability. We demonstrate that Supervised Fine-Tuning inadvertently introduces the "Knobe Effect," a moral asymmetry where negative outcomes are judged as more intentional. We localize this bias to specific layers and propose a surgical intervention to remove it.
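The paper's prompts, model, and metric are not reproduced here; the sketch below shows a generic layer-wise activation-patching sweep of the kind commonly used to localize such effects, written against a TransformerLens-style HookedTransformer. Everything named here is a placeholder.

```python
# Hedged sketch of a layer-localisation sweep (generic activation patching, not the paper's protocol).
# Assumes a TransformerLens HookedTransformer and two token-aligned prompts of equal length.
import torch

def localise_moral_asymmetry(model, harm_tokens, help_tokens, intent_metric):
    """Patch each layer's residual stream from the 'help' run into the 'harm' run and record
    how much the intentionality judgement shifts; peaks mark layers carrying the asymmetry."""
    _, help_cache = model.run_with_cache(help_tokens)
    effects = []
    for layer in range(model.cfg.n_layers):
        hook_name = f"blocks.{layer}.hook_resid_post"

        def patch(resid, hook, name=hook_name):
            return help_cache[name]   # replace the residual stream at this layer

        patched_logits = model.run_with_hooks(harm_tokens, fwd_hooks=[(hook_name, patch)])
        effects.append(intent_metric(patched_logits))
    return torch.tensor(effects)
```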
Jan 31, 2026 | Arvi 20B: Democratizing Reasoning with Efficient MoEs. Introducing Arvi 20B, an open-weight Mixture-of-Experts reasoning model. With only 3.6B active parameters, it rivals frontier models on math, coding, and agentic benchmarks.
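For readers unfamiliar with how a 20B-parameter Mixture-of-Experts model can run only a few billion parameters per token, the sketch below shows generic top-k expert routing. Arvi's actual router, expert count, and layer sizes are not stated here, so every number below is a placeholder.

```python
# Generic top-k MoE routing sketch (not Arvi's architecture): each token is processed by
# only k experts, so active parameters are a small fraction of total parameters.
import torch
import torch.nn as nn

class TopKMoE(nn.Module):
    def __init__(self, d_model: int = 1024, n_experts: int = 32, k: int = 2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:   # x: [tokens, d_model]
        weights, idx = self.router(x).softmax(dim=-1).topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                          # each token only visits k experts
            for e in idx[:, slot].unique().tolist():
                mask = idx[:, slot] == e
                out[mask] += weights[mask, slot].unsqueeze(-1) * self.experts[e](x[mask])
        return out
```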
Jan 30, 2026 | MahenOCR: Commercial-Grade OCR with a 1B-Parameter VLM. We introduce MahenOCR, a 1B-parameter vision-language model that achieves state-of-the-art performance on OCR tasks through a unified end-to-end architecture and novel reinforcement learning strategies.
Dec 23, 2025 | The Fragility of Guardrails: Cognitive Jamming and Repetition Collapse in Safety-Steered LLMs. We conduct a mechanistic audit of the LLM residual stream, deploying Sparse Autoencoders to reveal how models spontaneously construct internal physics engines, and how fragile these representations are to perturbation.
Dec 17, 2025 | Dataset Distillation for the Pre-Training Era. We introduce Linear Gradient Matching, a method that condenses massive datasets into a single synthetic image per class, revealing the shared representations across different AI models.
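The paper's exact objective is not reproduced here; the sketch below shows the generic gradient-matching step that dataset distillation methods of this family build on: synthetic images are optimized so the gradients they induce align with gradients from real batches. The optimizer is assumed to hold only the synthetic images, e.g. torch.optim.Adam([synth_x]) with synth_x created via requires_grad=True.

```python
# Generic gradient-matching step for dataset distillation (illustrative, not the paper's exact loss).
import torch
import torch.nn.functional as F

def gradient_matching_step(model, synth_x, synth_y, real_x, real_y, synth_opt):
    params = tuple(model.parameters())
    real_grads = torch.autograd.grad(F.cross_entropy(model(real_x), real_y), params)
    synth_grads = torch.autograd.grad(
        F.cross_entropy(model(synth_x), synth_y), params, create_graph=True  # keep graph to reach synth_x
    )
    # Cosine distance between real and synthetic gradients, summed layer by layer.
    loss = sum(
        1 - F.cosine_similarity(s.flatten(), r.flatten(), dim=0)
        for s, r in zip(synth_grads, real_grads)
    )
    synth_opt.zero_grad()
    loss.backward()          # only the synthetic images are updated; the model's weights stay fixed
    synth_opt.step()
    return loss.item()
```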
Nov 10, 2025 | Announcing Metanthropic. We are proud to officially unveil Metanthropic, an independent research organization dedicated to the development of safe and broadly beneficial AGI.
Copyright © 2025-2026 Metanthropic Labs. All rights reserved.
Licensed under the Metanthropic Research License.