# Training Information
The model was trained using the Megatron-LM framework on the LUMI HPC supercomputer. The training utilized 64 AMD MI250x nodes, totaling approximately 165000 GPU hours.
## Intermediate Checkpoints
We have released intermediate checkpoints so that the model's training progression can be examined. The checkpoints are available in separate branches, with a new checkpoint released every 10,000 training steps.
The naming convention is `checkpoint_0xxxxx00`. For example, the checkpoint for 50,000 iterations is named `checkpoint_0050000`. The available checkpoints range from `checkpoint_0010000` up to `checkpoint_0953675`. The final checkpoint, `checkpoint_0953675`, is located in the main branch.
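As a rough illustration, the sketch below shows how an individual intermediate checkpoint could be loaded from its branch with the Hugging Face `transformers` library. The repository id is a placeholder and the causal-LM model class is an assumption about this model's architecture; adjust both as needed.

```python
# Minimal sketch: load an intermediate checkpoint by pointing `revision`
# at the branch that holds it. Assumes a causal-LM checkpoint layout.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/your-model"   # placeholder: substitute the actual repository id
revision = "checkpoint_0050000"   # branch containing the 50,000-step checkpoint

tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)

# The final checkpoint (checkpoint_0953675) lives in the default main branch,
# so omitting `revision` loads it.
```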