ShethArihant/GPTNeoX-160M-minipile-2048-flash_attention_2_gradient-checkpointing • 0.2B • Updated Nov 8, 2025 • 2
ShethArihant/deepseek-coder-7b-instruct-v1.5_sft-v4-with-setup_3-epochs_ce-0.8_triplet-0.2_lora • Updated Oct 29, 2025
ShethArihant/deepseek-coder-1.3b-instruct_sft-v4-with-setup_3-epochs_ce-0.8_triplet-0.2_lora2 • Updated Oct 29, 2025
ShethArihant/deepseek-coder-1.3b-instruct_sft-v4-with-setup_3-epochs_ce-0.8_triplet-0.2_lora • Updated Oct 29, 2025
ShethArihant/deepseek-coder-1.3b-instruct_sft-v4-with-setup_3-epochs_ce-0.8_triplet-0.2_mean-pool • Text Generation • 1B • Updated Oct 29, 2025 • 1
ShethArihant/deepseek-coder-1.3b-instruct_sft-v4_3-epochs_ce-0.8_triplet-0.2_last-token • Text Generation • 1B • Updated Oct 28, 2025 • 1
ShethArihant/deepseek-coder-1.3b-instruct_sft-v4_3-epochs_ce-0.8_triplet-0.2_mean-pool • Text Generation • 1B • Updated Oct 28, 2025 • 1
ShethArihant/deepseek-coder-1.3b-instruct-seccodeplt-updated-cot-sft-v4-10-epochs_last-token • Text Generation • 1B • Updated Oct 28, 2025 • 1
ShethArihant/deepseek-coder-1.3b-instruct-seccodeplt-updated-cot-sft-v3-10-epochs-no-triplet • Text Generation • 1B • Updated Oct 28, 2025 • 1
ShethArihant/deepseek-coder-1.3b-instruct-seccodeplt-updated-cot-sft-v3-5-epochs • Text Generation • 1B • Updated Oct 28, 2025 • 1
ShethArihant/deepseek-coder-1.3b-instruct-seccodeplt-cot-sft-v2-5-epochs • Text Generation • 1B • Updated Oct 27, 2025 • 2
ShethArihant/deepseek-coder-1.3b-instruct-seccodeplt-cot-sft-5-epochs-with-tag-instr • Text Generation • 1B • Updated Oct 26, 2025 • 4