T145 committed
Commit 672f9f1 · verified · 1 Parent(s): f99ba80

Update README.md

Files changed (1)
  1. README.md +61 -59
README.md CHANGED
@@ -1,59 +1,61 @@
- ---
- base_model:
- - meta-llama/Llama-3.1-8B-Instruct
- - akjindal53244/Llama-3.1-Storm-8B
- - arcee-ai/Llama-3.1-SuperNova-Lite
- - Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2
- library_name: transformers
- tags:
- - mergekit
- - merge
-
- ---
- # Untitled Model (1)
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) as a base.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [akjindal53244/Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)
- * [arcee-ai/Llama-3.1-SuperNova-Lite](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite)
- * [Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2](https://huggingface.co/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- base_model: meta-llama/Llama-3.1-8B-Instruct
- dtype: bfloat16
- merge_method: dare_ties
- parameters:
-   int8_mask: 1.0
- slices:
- - sources:
-   - layer_range: [0, 32]
-     model: akjindal53244/Llama-3.1-Storm-8B
-     parameters:
-       density: 0.7
-       weight: 0.2
-   - layer_range: [0, 32]
-     model: arcee-ai/Llama-3.1-SuperNova-Lite
-     parameters:
-       density: 0.7
-       weight: 0.3
-   - layer_range: [0, 32]
-     model: Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2
-     parameters:
-       density: 0.7
-       weight: 0.5
-   - layer_range: [0, 32]
-     model: meta-llama/Llama-3.1-8B-Instruct
- tokenizer_source: meta-llama/Llama-3.1-8B-Instruct
- ```
+ ---
+ base_model:
+ - meta-llama/Llama-3.1-8B-Instruct
+ - akjindal53244/Llama-3.1-Storm-8B
+ - arcee-ai/Llama-3.1-SuperNova-Lite
+ - Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ - llama3.1
+ language:
+ - en
+ ---
+ # ZEUS
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method using [meta-llama/Llama-3.1-8B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct) as a base.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [akjindal53244/Llama-3.1-Storm-8B](https://huggingface.co/akjindal53244/Llama-3.1-Storm-8B)
+ * [arcee-ai/Llama-3.1-SuperNova-Lite](https://huggingface.co/arcee-ai/Llama-3.1-SuperNova-Lite)
+ * [Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2](https://huggingface.co/Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ base_model: meta-llama/Llama-3.1-8B-Instruct
+ dtype: bfloat16
+ merge_method: dare_ties
+ parameters:
+   int8_mask: 1.0
+ slices:
+ - sources:
+   - layer_range: [0, 32]
+     model: akjindal53244/Llama-3.1-Storm-8B
+     parameters:
+       density: 0.7
+       weight: 0.2
+   - layer_range: [0, 32]
+     model: arcee-ai/Llama-3.1-SuperNova-Lite
+     parameters:
+       density: 0.7
+       weight: 0.3
+   - layer_range: [0, 32]
+     model: Orenguteng/Llama-3.1-8B-Lexi-Uncensored-V2
+     parameters:
+       density: 0.7
+       weight: 0.5
+   - layer_range: [0, 32]
+     model: meta-llama/Llama-3.1-8B-Instruct
+ tokenizer_source: meta-llama/Llama-3.1-8B-Instruct
+ ```
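
For reference: a mergekit config like the one in this README is normally executed with the `mergekit-yaml` CLI (roughly `mergekit-yaml config.yaml ./merged-model`), and the resulting checkpoint behaves like any other Llama-3.1-8B instruct model under `transformers`. The sketch below shows one way such a merged model could be loaded and queried; `T145/ZEUS-8B` is a placeholder repository id (not stated in this commit), and the dtype simply mirrors the `bfloat16` used in the merge config.

```python
# Minimal sketch: load a merged Llama-3.1-8B checkpoint with transformers.
# NOTE: "T145/ZEUS-8B" is a placeholder repo id -- substitute the repository
# that actually hosts the merged weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "T145/ZEUS-8B"  # placeholder, not confirmed by the commit

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype in the merge config
    device_map="auto",
)

# The tokenizer is pinned to meta-llama/Llama-3.1-8B-Instruct via
# tokenizer_source, so the stock Llama 3.1 chat template applies.
messages = [{"role": "user", "content": "Summarize the DARE-TIES merge method in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Because `tokenizer_source` keeps the base model's tokenizer, no custom chat template or special tokens beyond those of meta-llama/Llama-3.1-8B-Instruct should be needed.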