Someone ran Supra-50M-Instruct on a 1 GHz CPU from 1999
https://www.reddit.com/r/LocalLLM/comments/1tm21ar/i_see_your_strix_halo_and_raise_you_a_vintage/
"As a fun experiment, I decided to try running the recently released Supra-50m on a 26-year-old machine I keep for retro Windows 9.X games. Although the model was somewhat silly and inconsistent, the performance wasn't bad, reaching around 1.3 tok/s with CPU inference alone.
Since this CPU doesn't have SSE2, I changed from llama.cpp to llama2.c and asked Claude to write a custom tokenizer.
It's crazy to think that with the right file size of 200 MB, we could have experienced this magic back in 1999" - u/drone_stonks, r/localllm
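
For what it's worth, the 200 MB figure lines up with a quick back-of-the-envelope check. This is only a sketch: it assumes the model has about 50 million weights (taken from the "50M" in the name) and that they are stored as 32-bit floats, neither of which the post states. The helper names are made up for illustration.

```python
# Rough sanity check of the numbers in the quoted post.
# Assumptions (not stated in the post): ~50 million parameters,
# each stored as a 4-byte float32, and a steady 1.3 tok/s.

PARAMS = 50_000_000      # assumed from the "50M" in the model name
BYTES_PER_FP32 = 4       # size of one 32-bit float
TOK_PER_S = 1.3          # speed reported in the post


def checkpoint_size_mb(params: int, bytes_per_weight: int) -> float:
    """Weight storage in megabytes (1 MB = 1,000,000 bytes)."""
    return params * bytes_per_weight / 1_000_000


def seconds_for_tokens(n_tokens: int, tok_per_s: float) -> float:
    """Time to generate n_tokens at a constant generation speed."""
    return n_tokens / tok_per_s


if __name__ == "__main__":
    size = checkpoint_size_mb(PARAMS, BYTES_PER_FP32)
    wait = seconds_for_tokens(100, TOK_PER_S)
    print(f"weights: {size:.0f} MB")
    print(f"100 tokens: {wait:.0f} s")
```

Under those assumptions the weights alone come to 200 MB, and a 100-token reply would take a bit over a minute at the reported speed.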