Update README.md
README.md
CHANGED
@@ -1,3 +1,66 @@
---
language:
- en
license: apache-2.0
pipeline_tag: text-generation
tags:
- chess
- neuron
- aws-trainium
- vllm
- optimum-neuron
- continuous-batching
base_model: karanps/ChessLM_Qwen3
---

# ChessLM Qwen3 - Neuron Traced (AWS Format Structure)

This is a Neuron-traced version of [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3), optimized for AWS Trainium2 (trn2) instances using vLLM.

This model follows the AWS Neuron repository structure, with separate directories for compiled artifacts.

This model is meant to be used from within the [Neuron Workshop](https://github.com/aws-neuron/neuron-workshops).

## Model Details

- **Base Model**: Qwen3-8B fine-tuned for chess
- **Compilation**: optimum-neuron[vllm]==0.3.0
- **Compiler Version**: neuronxcc 2.21.33363.0
- **Target Hardware**: AWS Trainium2 (trn2)
- **Precision**: BF16
- **Tensor Parallelism**: 2 cores
- **Batch Size**: 4 (continuous batching enabled)
- **Max Sequence Length**: 2048

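Because the compiled artifacts were produced with the pinned versions listed above, it can help to confirm that a local environment reports the same versions before loading them. The following is a minimal sketch, not an official check: the helper names `version_matches`, `pip_version`, and `check_pin` are made up for illustration, and it assumes a version string may carry a build suffix such as `+abc123`.

```shell
# Sketch: compare installed package versions with the ones listed above.
# Helper names are illustrative; a build suffix such as "+abc123" is tolerated.

version_matches() {
  # $1 = expected version, $2 = reported version
  case "$2" in
    "$1" | "$1"+*) return 0 ;;
    *) return 1 ;;
  esac
}

pip_version() {
  # Prints the installed version of package $1, or nothing if it is not installed.
  python3 -m pip show "$1" 2>/dev/null | awk '/^Version:/ {print $2}'
}

check_pin() {
  # $1 = package name, $2 = expected version
  if version_matches "$2" "$(pip_version "$1")"; then
    echo "ok: $1 $2"
  else
    echo "mismatch: $1 (expected $2)" >&2
    return 1
  fi
}
```

A possible use is `check_pin optimum-neuron 0.3.0`. Checking the compiler the same way (`check_pin neuronx-cc 2.21.33363.0`) assumes the compiler is installed under the pip package name `neuronx-cc`, which this README does not state.
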
## Compilation Instructions

```
optimum-cli export neuron \
  --model karanps/ChessLM_Qwen3 \
  --task text-generation \
  --sequence_length 2048 \
  --batch_size 4 \
  /home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled
```

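The export command above hard-codes its settings and an absolute output path. As a rough sketch, the same flags can be assembled from variables and printed before anything runs. The names `OUTPUT_DIR`, `DRY_RUN`, and `run_export`, and the default output location, are assumptions introduced here and are not part of `optimum-cli`.

```shell
# Sketch: build the export command from variables and print it before running.
# OUTPUT_DIR, DRY_RUN, and run_export are illustrative names, not optimum-cli options.

MODEL_ID="karanps/ChessLM_Qwen3"
SEQUENCE_LENGTH=2048
BATCH_SIZE=4
OUTPUT_DIR="${OUTPUT_DIR:-./ChessLM_Qwen3_compiled}"  # assumed default location
DRY_RUN="${DRY_RUN:-1}"                               # 1 = print only, do not compile

run_export() {
  # Same flags as the command above, held as positional parameters
  # so that each argument keeps its quoting.
  set -- optimum-cli export neuron \
    --model "$MODEL_ID" \
    --task text-generation \
    --sequence_length "$SEQUENCE_LENGTH" \
    --batch_size "$BATCH_SIZE" \
    "$OUTPUT_DIR"
  echo "$*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

run_export
```

With the defaults, the script only prints the command. Setting `DRY_RUN=0` in the environment would start the actual compilation.
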
### Key Files

- **context_encoding_model/**: Compiled NEFF files for processing initial prompt sequences (up to 2048 tokens)
- **token_generation_model/**: Compiled NEFF files for autoregressive token generation
- **layout_opt/**: Layout optimization artifacts from compilation
- **model.pt**: Main model file containing compiled graphs and embedded weights (17 GB)
- **neuron_config.json**: Neuron compilation configuration

## Model Files

| File | Purpose |
|------|---------|
| model.pt | Main model with embedded weights (17 GB) |
| config.json | Base model configuration |
| neuron_config.json | Neuron compilation settings |
| tokenizer* | Tokenizer files for text processing |

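After downloading or compiling the model, the directories and files listed above can be checked for presence with a short script. This is a sketch only: the function name `check_layout` is made up, and it assumes every listed entry sits at the top level of the model directory.

```shell
# Sketch: verify that a local copy of the model has the layout listed above.
# check_layout is an illustrative name; all entries are assumed to be top-level.

check_layout() (
  dir="$1"
  status=0
  for d in context_encoding_model token_generation_model layout_opt; do
    [ -d "$dir/$d" ] || { echo "missing directory: $d" >&2; status=1; }
  done
  for f in model.pt config.json neuron_config.json; do
    [ -f "$dir/$f" ] || { echo "missing file: $f" >&2; status=1; }
  done
  # tokenizer* is a glob in the table, so any matching file is accepted.
  ls "$dir"/tokenizer* >/dev/null 2>&1 || { echo "missing tokenizer files" >&2; status=1; }
  exit "$status"
)
```

For example, `check_layout ./ChessLM_Qwen3_compiled && echo "layout looks complete"`, where the path is a placeholder for wherever the model was saved.
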
## License

This model inherits the license from the base model [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3).