Cerebras REAP Collection: Sparse MoE models compressed with the REAP (Router-weighted Expert Activation Pruning) method • 30 items • Updated Feb 25
Llama Nemotron Collection: Open, production-ready enterprise models • 12 items
Gemma 3 QAT Collection: Quantization-Aware Trained (QAT) Gemma 3 checkpoints. These models preserve quality similar to half precision while using 3x less memory • 15 items
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (Paper) • arXiv:2501.12948 • Published Jan 22, 2025
Llama 3.2 Collection: This collection hosts the Transformers-format and original repos of Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024