Recent Activity
SpecJAX: A Speculative Decoding Library for TPUs (published about 1 hour ago)
We Pitted the Cheapest TPU Against an NVIDIA L4. Here's What 6 Experiments Revealed.
1.37x Faster on Alibaba's 80B Code Model: EAGLE3 for Qwen3-Coder-Next
1.7x Faster on a 218B Model: EAGLE3 Speculative Decoding for GLM-4.7
2x Faster on a 229B MoE: EAGLE3 Speculative Decoding for MiniMax-M2.5
Google Released Gemma-4 Four Days Ago. We Already Made It 1.72× Faster.
Speculative Decoding in Practice: How EAGLE3 Makes LLMs Faster Without Changing Their Outputs