Any quantization released to reduce memory to fit into 8 GB of GPU RAM?

#33
by atolfia - opened

Fantastic work!!! But it was impossible for me to fit PersonaPlex into my 8 GB of GPU RAM, even when trying to use Rust (moshi). Are there any plans to release a quantized version that reduces memory enough to fit into 8 GB of GPU RAM?
Thanks!!!

This is great work. Personally, I might try using Rust as well.

Everyone, I am also trying to quantize it to 4-bit and run it on my RTX 3050 with GPU offloading and other methods, but I am still facing issues. If you want, I can give you access to my Google Drive, where I have stored everything, so you can all see what more can be done: nirooph1@gmail.com. Let's connect.
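
For anyone wondering whether 4-bit weights could fit in 8 GB at all, here is a rough back-of-the-envelope sketch. The parameter count below is a hypothetical placeholder, not a figure from this thread, so check the model card for the real number. The estimate covers the weights only; activations, caches, and any audio codec would need extra memory on top.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 2**30


# Hypothetical parameter count, for illustration only (an assumption,
# not taken from this discussion or the model card).
NUM_PARAMS = 7e9

for bits in (16, 8, 4):
    gib = weight_memory_gib(NUM_PARAMS, bits)
    print(f"{bits}-bit weights: ~{gib:.1f} GiB")
```

If the weights-only number is already close to the card's capacity, offloading some layers to CPU RAM is probably still needed even after quantization.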
