vitiugin committed · verified
Commit d78ffc9 · Parent(s): 922b775

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -66,6 +66,6 @@ The model utilizes Gemma-3 tokenizer — a SentencePiece tokenizer with a 262K v
  # Training Information
  The model was trained using the Megatron-LM framework on the LUMI HPC supercomputer. The training utilized 64 AMD MI250x nodes, totaling approximately 165000 GPU hours.
  Intermediate Checkpoints
- We have released intermediate checkpoints to provide access to the model's training progression. These checkpoints are available in separate branches, with a new checkpoint released every 5000 training steps.
+ We have released intermediate checkpoints to provide access to the model's training progression. These checkpoints are available in separate branches, with a new checkpoint released every 10000 training steps.
 
  The naming convention is `checkpoint_0xxxxx00`. For example, the checkpoint for 50000 iterations is named `checkpoint_0050000`. The available checkpoints range from `checkpoint_0010000` up to `checkpoint_0953675`. The final checkpoint, `checkpoint_0953675`, is located in the main branch.
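
Since each intermediate checkpoint lives in its own branch, it can be loaded by passing the branch name as the `revision` argument in `transformers`. A minimal sketch, assuming a placeholder repository id (the actual repo id is not shown in this diff) and that branch names match the checkpoint names described above:

```python
# Minimal sketch: loading an intermediate checkpoint from its branch.
# `repo_id` is a placeholder assumption, not the actual repository id.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "org/model-name"        # placeholder repository id (assumption)
revision = "checkpoint_0050000"   # assumed branch name for the 50000-iteration checkpoint

tokenizer = AutoTokenizer.from_pretrained(repo_id, revision=revision)
model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
```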