jburtoft committed on
Commit 1296456 · verified · 1 Parent(s): e158e00

Update README.md

  1. README.md +66 -3
README.md CHANGED
@@ -1,3 +1,66 @@
- ---
- license: apache-2.0
- ---
+ ---
+ language:
+ - en
+ license: apache-2.0
+ pipeline_tag: text-generation
+ tags:
+ - chess
+ - neuron
+ - aws-trainium
+ - vllm
+ - optimum-neuron
+ - continuous-batching
+ base_model: karanps/ChessLM_Qwen3
+ ---
+
+ # ChessLM Qwen3 - Neuron Traced (AWS Format Structure)
+
+ This is a Neuron-traced version of [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3), optimized for AWS Trainium2 (trn2) instances using vLLM.
+
+ This model follows the AWS Neuron repository structure, with separate directories for compiled artifacts.
+
+ This model is meant to be used from within the [Neuron Workshop](https://github.com/aws-neuron/neuron-workshops).
+
+ ## Model Details
+
+ - **Base Model**: Qwen3-8B fine-tuned for chess
+ - **Compilation**: optimum-neuron[vllm]==0.3.0
+ - **Compiler Version**: neuronxcc 2.21.33363.0
+ - **Target Hardware**: AWS Trainium2 (trn2)
+ - **Precision**: BF16
+ - **Tensor Parallelism**: 2 cores
+ - **Batch Size**: 4 (continuous batching enabled)
+ - **Max Sequence Length**: 2048
+
+
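As a rough sanity check on the figures above, the BF16 precision can be related to the 17 GB reported for `model.pt` in the file listing. The sketch below assumes a parameter count of about 8.2 billion for Qwen3-8B; that number is not stated in this README and is an assumption, so treat the result as an estimate only.

```python
# Rough BF16 weight-size estimate for the compiled checkpoint.
# ASSUMPTION: ~8.2e9 parameters for Qwen3-8B (not stated in this README).
PARAM_COUNT = 8.2e9
BYTES_PER_BF16 = 2  # bfloat16 stores each value in 16 bits


def weight_size_gb(param_count: float, bytes_per_param: int = BYTES_PER_BF16) -> float:
    """Return the raw weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return param_count * bytes_per_param / 1e9


estimate = weight_size_gb(PARAM_COUNT)
print(f"Estimated BF16 weights: {estimate:.1f} GB")
```

Under that assumption the raw weights come to roughly 16.4 GB, which is in the same range as the reported file size; the remainder would plausibly be compiled graphs and metadata, though the README does not break this down.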
+ ## Compilation Instructions
+ ```
+ optimum-cli export neuron \
+   --model karanps/ChessLM_Qwen3 \
+   --task text-generation \
+   --sequence_length 2048 \
+   --batch_size 4 \
+   /home/ubuntu/environment/ml/qwen-chess/karanps/ChessLM_Qwen3_compiled
+ ```
+
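Because the export fixes the sequence length and batch size at compile time, it can help to check requests against those limits before sending them. The following is a minimal sketch, not part of vLLM or optimum-neuron: the `Request` type and `fits_compiled_shapes` helper are hypothetical names, and it assumes the sequence length has to cover the prompt plus the generated tokens.

```python
from dataclasses import dataclass

# Values taken from the export command above.
SEQUENCE_LENGTH = 2048
BATCH_SIZE = 4


@dataclass
class Request:
    """Hypothetical request description; not a vLLM or optimum-neuron type."""
    prompt_tokens: int
    max_new_tokens: int


def fits_compiled_shapes(requests: list[Request]) -> bool:
    """Return True if the requests fit the compiled batch size and sequence length."""
    if len(requests) > BATCH_SIZE:
        return False
    # ASSUMPTION: prompt and generated tokens share the same 2048-token budget.
    return all(r.prompt_tokens + r.max_new_tokens <= SEQUENCE_LENGTH for r in requests)


print(fits_compiled_shapes([Request(prompt_tokens=1500, max_new_tokens=256)]))
```

With continuous batching, a serving engine would normally queue extra requests rather than reject them, so the batch-size check here only describes what can run concurrently.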
+ ### Key Files
+
+ - **context_encoding_model/**: Compiled NEFF files for processing initial prompt sequences (up to 2048 tokens)
+ - **token_generation_model/**: Compiled NEFF files for autoregressive token generation
+ - **layout_opt/**: Layout optimization artifacts from compilation
+ - **model.pt**: Main model file containing compiled graphs and embedded weights (17 GB)
+ - **neuron_config.json**: Neuron compilation configuration
+
+ ## Model Files
+
+ | File | Purpose |
+ |------|---------|
+ | model.pt | Main model with embedded weights (17 GB) |
+ | config.json | Base model configuration |
+ | neuron_config.json | Neuron compilation settings |
+ | tokenizer* | Tokenizer files for text processing |
+
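To see which settings a downloaded copy was actually compiled with, the JSON files listed above can be inspected directly. The sketch below only assumes that `neuron_config.json` is valid JSON with a top-level object; its keys are not documented in this README, so the code prints whatever it finds rather than expecting specific fields. The local path is hypothetical.

```python
import json
from pathlib import Path


def load_json_config(path: str) -> dict:
    """Load a JSON config file and return its contents as a dict."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def describe_config(config: dict) -> list[str]:
    """Return one 'key: value' line per top-level entry, sorted by key."""
    return [f"{key}: {config[key]!r}" for key in sorted(config)]


if __name__ == "__main__":
    # Hypothetical local path: adjust to wherever the repository was downloaded.
    config_path = Path("neuron_config.json")
    if config_path.exists():
        for line in describe_config(load_json_config(str(config_path))):
            print(line)
```

The same helper works for `config.json`, which makes it easy to compare the base model configuration with the Neuron compilation settings.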
+ ## License
+
+ This model inherits the license from the base model [karanps/ChessLM_Qwen3](https://huggingface.co/karanps/ChessLM_Qwen3).
+