kedar kolluri

kktw

AI & ML interests

None yet

Recent Activity

updated a dataset about 1 month ago

thoughtworks/document-processing-benchmark

published a dataset about 2 months ago

thoughtworks/document-processing-benchmark

updated a dataset about 2 months ago

thoughtworks/agentic-coding-trajectories

Organizations

published an article 2 months ago

Article

SpecJAX: A Speculative Decoding Library for TPUs

lujangusface

•

Apr 20

published an article 2 months ago

Article

We Pitted the Cheapest TPU Against an NVIDIA L4. Here's What 6 Experiments Revealed.

lujangusface

•

Apr 17

• 1

published an article 3 months ago

Article

1.37x Faster on Alibaba's 80B Code Model: EAGLE3 for Qwen3-Coder-Next

lujangusface

•

Apr 15

published an article 3 months ago

Article

1.7x Faster on a 218B Model: EAGLE3 Speculative Decoding for GLM-4.7

lujangusface

•

Apr 15

• 1

published an article 3 months ago

Article

2x Faster on a 229B MoE: EAGLE3 Speculative Decoding for MiniMax-M2.5

lujangusface

•

Apr 9

• 3

published an article 3 months ago

Article

Google Released Gemma-4 Four Days Ago. We Already Made It 1.72× Faster.

lujangusface

•

Apr 7

• 3

published an article 3 months ago

Article

Speculative Decoding in Practice: How EAGLE3 Makes LLMs Faster Without Changing Their Outputs

lujangusface

•

Apr 3

• 9
