An RLVF pipeline that uses parser oracles to align language models for Icelandic and Danish. GPT-SW3 and Viking-13B are trained with Delta-DPO.
Fakhar
Hodfa71
Recent Activity
updated a model 34 minutes ago: Hodfa71/gpt-sw3-6b7-da-delta-dpo
updated a model about 2 hours ago: Hodfa71/gpt-sw3-356m-is-delta-dpo-nosft-antihack
published a model about 2 hours ago: Hodfa71/gpt-sw3-356m-is-delta-dpo-nosft-antihack