Deployment
Can anyone tell me the detailed hardware requirements to host this model on a cloud server for fine-tuning? Also, can this model be fine-tuned, and what kind of dataset would be required (audio or text)?
It's similar to Moshi's requirements.
See threads for example deployments:
https://www.linkedin.com/pulse/running-j-moshi-gpu-machine-bhargav-shah-cn6mc/
https://www.reddit.com/r/LocalLLaMA/comments/1fjwc4l/kyutai_labs_open_source_moshi_endtoend_speech_to/
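To give "similar to Moshi's requirements" some rough numbers, here is a weights-only back-of-envelope VRAM estimate. This is a sketch assuming a 7B-parameter LM; the Mimi codec, KV cache, and activations add several more GB on top, so treat it as a lower bound:

```python
# Weights-only memory estimate for a 7B-parameter model at common precisions.
# Runtime overhead (activations, KV cache, audio codec) is NOT included.
params = 7e9  # assumed parameter count

for name, bytes_per_param in [("fp32", 4), ("bf16/fp16", 2), ("int8", 1)]:
    gib = params * bytes_per_param / 2**30
    print(f"{name}: {gib:.1f} GiB")
# → fp32: 26.1 GiB, bf16/fp16: 13.0 GiB, int8: 6.5 GiB
```

In bf16 the weights alone land around 13 GiB, which is why 24 GB cards like the RTX 3090/4090 come up in the threads above.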
But does it run on MLX yet, like Moshi? If not, are there any plans to make it compatible? (I just can't afford an NVIDIA/CUDA GPU right now, haha.) Love the work you did with this, and I'd love to try implementing it in some personal projects.
MLX Compatibility Status
Hi @pwidenfels,
Currently, PersonaPlex does not officially support MLX (Apple Silicon). The model relies heavily on PyTorch + CUDA for real-time audio streaming inference.
Why MLX is challenging for this model:
- Architecture Complexity: PersonaPlex is built on Moshi, which uses a complex multi-stream audio tokenizer (Mimi) plus a 7B-parameter LM, both optimized for CUDA
- Real-time Requirements: The streaming inference requires very low latency (~80ms per audio frame), which needs careful optimization per platform
- Mimi Codec: The audio encoder/decoder hasn't been ported to MLX yet
Current Options for Non-NVIDIA users:
- Cloud GPU: Use services like RunPod, Vast.ai, or Lambda Labs with NVIDIA GPUs (~$0.30-0.50/hr for RTX 3090/4090)
- Google Colab Pro: T4/A100 access for experimentation
- CPU Offload: The model supports the `--lowvram` flag, which offloads the LM to CPU (works, but with higher latency)
For the NVIDIA team:
If there's community interest, an MLX port would require:
- Porting the Mimi encoder/decoder
- Adapting the streaming LM generation loop
- Testing real-time audio latency on M-series chips
Hope this helps! Feel free to ask if you need help with cloud deployment.
Yesss MLX support would be amazing !!
Hi, why do I keep getting this error?
```
Traceback (most recent call last):
  File "", line 198, in _run_module_as_main
  File "", line 88, in _run_code
  File "/home/administrator/PERSONAPLEX/venv/lib/python3.13/site-packages/moshi/server.py", line 287, in
    main()
    ~~~~^^
  File "/home/administrator/PERSONAPLEX/venv/lib/python3.13/site-packages/moshi/server.py", line 227, in main
    mimi = checkpoint_info.get_mimi(device=args.device)
  File "/home/administrator/PERSONAPLEX/venv/lib/python3.13/site-packages/moshi/models/loaders.py", line 284, in get_mimi
    num_codebooks = max(self.lm_config["dep_q"], self.lm_config["n_q"] - self.lm_config["dep_q"])
                        ~~~~~~~~~~~~~~^^^^^^^^^
KeyError: 'dep_q'
```
It happens when I try to run `python -m moshi.server --hf-repo "nvidia/personaplex-7b-v1"`.
PersonaPlex doesn't run natively on the moshi server. The architecture is very similar, but the audio stack is a little different.
Do you have access to an NVIDIA GPU? If so, running the steps in the official repo should work properly.
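Concretely, the `KeyError: 'dep_q'` in the traceback comes from moshi's loader indexing a config key that the PersonaPlex checkpoint doesn't provide, so it fails before any weights load. A minimal sketch of that failure mode (the config dict here is hypothetical; only the missing key matters):

```python
# moshi's get_mimi() computes the codebook count by indexing the LM config
# directly, so a checkpoint whose config lacks "dep_q" raises immediately.
lm_config = {"n_q": 8}  # hypothetical config without the "dep_q" entry

try:
    num_codebooks = max(lm_config["dep_q"], lm_config["n_q"] - lm_config["dep_q"])
except KeyError as exc:
    print(f"KeyError: {exc}")  # → KeyError: 'dep_q'
```

That is why pointing the stock moshi server at the PersonaPlex repo fails: the error is a config-schema mismatch, not a corrupted download.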
@Prateektg I found this one, but I haven't tested it yet: https://huggingface.co/eastlondoner/personaplex-mlx/tree/main