512 max positional embeddings, but 8192 context length

by Fizzarolli - opened Dec 19, 2024

Dec 19, 2024

hi!! this is fantastic and i love that someone finally made a series of models like this and i love you all
However, the model card notes that it was annealed up to 8192 context length, which is great, but the config.json specifies 512 for the max positional embeddings. Am I missing something obvious? Does RoPE need to be manually configured? I'm unsure.

bwarner

Dec 19, 2024

@Fizzarolli Good catch. That was a mistake made while porting the research code to Hugging Face Transformers, which I fixed in 5756c58 and f87846.
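For anyone who wants to confirm that a local copy of config.json has picked up the fix, here is a minimal sketch. It assumes the value lives under the standard `max_position_embeddings` key; the helper name `check_context_length` and the two JSON strings are hypothetical stand-ins, not the repository's actual files.

```python
import json


def check_context_length(config_text: str, expected: int = 8192) -> bool:
    """Return True if max_position_embeddings in a config.json string equals `expected`."""
    config = json.loads(config_text)
    # A missing key yields None, which never equals `expected`.
    return config.get("max_position_embeddings") == expected


# Illustrative fragments only (assumed shapes, not the real config files).
before_fix = '{"max_position_embeddings": 512}'
after_fix = '{"max_position_embeddings": 8192}'

print(check_context_length(before_fix))
print(check_context_length(after_fix))
```

If you load the model through the transformers library instead, the same value should be readable as `max_position_embeddings` on the object returned by `AutoConfig.from_pretrained(...)`.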

bwarner changed discussion status to closed Dec 19, 2024
