Shetu Mohanto

shetumohanto

AI & ML interests

GenAI | MLOps | AI agent | Computer Vision

Recent Activity

reacted to mmhamdy's post with 🚀 1 day ago

The new DeepSeek Engram paper is super fun! It also integrates mHC, and I suspect they're probably releasing all these papers to make the V4 report of reasonable length😄 Here's a nice short summary from Gemini

reacted to prithivMLmods's post with 🔥 2 days ago

Now, a collection of various compression schemes for Qwen3.6 and the abliterated version 1 of dense models is available on the Hub. Check it out via the links below. 👇
🔗 Qwen3.6-MoE: https://huggingface.co/collections/prithivMLmods/qwen36-35b-a3b-compressions
🔗 Qwen3.6-27B Compressions: https://huggingface.co/collections/prithivMLmods/qwen36-27b-compressions
🤗 To learn more, visit the app page or the respective model pages.

reacted to burtenshaw's post with ❤️ 8 months ago

Smol course has a distinctive approach to teaching post-training, so I'm posting about how it's different from other post-training courses, including the LLM course that's already available. In short, the smol course is just more direct than any of the other courses, and is intended for semi-pro post-trainers.
- It's a minimal set of instructions on the core parts.
- It's intended to bootstrap real projects you're working on.
- The material hands over to existing documentation for details.
- Likewise, it hands over to the LLM course for basics.
- Assessment is based on a leaderboard, without reading all the material.
To start the smol course, follow here: https://huggingface.co/smol-course

shetumohanto's datasets (1)

shetumohanto/doctor_qa_bangla

Viewer • Updated Apr 27, 2024 • 5.14k • 25 • 1