Commit 03658f4 by bubbliiiing · 1 parent: 5c12a7d

Update weights

.gitattributes CHANGED
@@ -1,3 +1,5 @@
 
 
+ *.png filter=lfs diff=lfs merge=lfs -text
+ *.jpg filter=lfs diff=lfs merge=lfs -text
  *.7z filter=lfs diff=lfs merge=lfs -text
  *.arrow filter=lfs diff=lfs merge=lfs -text
  *.bin filter=lfs diff=lfs merge=lfs -text
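The added `*.png` and `*.jpg` rules route the new example images through Git LFS. As a rough illustration of which paths the five patterns visible in this hunk would catch, here is a minimal Python sketch. It assumes that a pattern without a slash is compared against the file's basename, and it approximates Git's matcher with `fnmatchcase`. `is_lfs_tracked` is a hypothetical helper, not part of Git or Git LFS, and the full `.gitattributes` file may list more patterns than this hunk shows.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitattributes hunk above (only the lines shown there).
LFS_PATTERNS = ["*.png", "*.jpg", "*.7z", "*.arrow", "*.bin"]


def is_lfs_tracked(path, patterns=LFS_PATTERNS):
    """Hypothetical helper: approximate gitattributes matching.

    Assumption: a pattern without a slash is matched against the basename,
    so "*.jpg" applies to "asset/pose.jpg" as well as "pose.jpg".
    """
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ["asset/pose.jpg", "results/pose.png", "README.md"]:
        print(candidate, is_lfs_tracked(candidate))
```

On a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually applies, which is the authoritative answer.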
README.md CHANGED
@@ -1,3 +1,100 @@
  ---
  license: apache-2.0
  ---
+ # Z-Image-Turbo-Fun-Controlnet-Union
+
+ ## Model Features
+ - The ControlNet is added on 6 blocks.
+ - The model was trained from scratch for 10,000 steps on a dataset of 1 million high-quality images covering both general and human-centric content. Training was performed at a resolution of 1328 in BFloat16 precision, with a batch size of 64, a learning rate of 2e-5, and a text dropout ratio of 0.10.
+ - It supports multiple control conditions, including Canny, HED, depth maps, pose estimation, and MLSD, and can be used like a standard ControlNet.
+ - You can adjust `controlnet_conditioning_scale` and `control_guidance_end` for stronger control and better detail preservation. For better stability, we highly recommend using a detailed prompt. The optimal range for `controlnet_conditioning_scale` is 0.65 to 0.80.
+
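To make the recommended range concrete, the following Python sketch keeps a requested `controlnet_conditioning_scale` inside 0.65 to 0.80. It is only an illustration: `clamp_conditioning_scale` and the `settings` dictionary are hypothetical names, and how the value is actually passed to the inference script is not specified here. The README calls this range optimal rather than mandatory, so clamping is one possible policy, not a requirement.

```python
# Range the README describes as optimal for controlnet_conditioning_scale.
RECOMMENDED_SCALE_RANGE = (0.65, 0.80)


def clamp_conditioning_scale(scale, bounds=RECOMMENDED_SCALE_RANGE):
    """Hypothetical helper: pull a requested scale into the recommended range."""
    low, high = bounds
    return max(low, min(high, scale))


# Hypothetical settings dictionary; the key mirrors the parameter name
# used in the README, not a confirmed function signature.
settings = {
    "controlnet_conditioning_scale": clamp_conditioning_scale(1.0),
}

if __name__ == "__main__":
    print(settings)
```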
+ ## TODO
+ - [ ] Train on more data and for more steps.
+ - [ ] Support inpaint mode.
+
+ ## Results
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Pose</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/pose2.jpg" width="100%" /></td>
+ <td><img src="results/pose2.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Pose</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/pose.jpg" width="100%" /></td>
+ <td><img src="results/pose.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Canny</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/canny.jpg" width="100%" /></td>
+ <td><img src="results/canny.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>HED</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/hed.jpg" width="100%" /></td>
+ <td><img src="results/hed.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>Depth</td>
+ <td>Output</td>
+ </tr>
+ <tr>
+ <td><img src="asset/depth.jpg" width="100%" /></td>
+ <td><img src="results/depth.png" width="100%" /></td>
+ </tr>
+ </table>
+
+ ## Inference
+ See the VideoX-Fun repository for more details.
+
+ Clone the VideoX-Fun repository and create the required directories:
+
+ ```sh
+ # Clone the code
+ git clone https://github.com/aigc-apps/VideoX-Fun.git
+
+ # Enter VideoX-Fun's directory
+ cd VideoX-Fun
+
+ # Create model directories
+ mkdir -p models/Diffusion_Transformer
+ mkdir -p models/Personalized_Model
+ ```
+
+ Then download the weights into `models/Diffusion_Transformer` and `models/Personalized_Model`.
+
+ ```
+ 📦 models/
+ ├── 📂 Diffusion_Transformer/
+ │   └── 📂 Z-Image-Turbo/
+ ├── 📂 Personalized_Model/
+ │   └── 📦 Z-Image-Turbo-Fun-Controlnet-Union.safetensors
+ ```
+
+ Then run `examples/z_image_fun/predict_t2i_control.py`.
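Before running the script, it can help to confirm that the weights ended up where the directory tree above expects them. The sketch below is a minimal check under the assumption that it is run from the root of the cloned VideoX-Fun repository. `missing_model_paths` is a hypothetical helper and is not part of VideoX-Fun.

```python
from pathlib import Path

# Locations taken from the directory tree in the README.
BASE_MODEL_DIR = Path("models/Diffusion_Transformer/Z-Image-Turbo")
CONTROLNET_WEIGHTS = Path(
    "models/Personalized_Model/Z-Image-Turbo-Fun-Controlnet-Union.safetensors"
)


def missing_model_paths(repo_root="."):
    """Hypothetical helper: list expected model paths that are absent."""
    root = Path(repo_root)
    missing = []
    if not (root / BASE_MODEL_DIR).is_dir():
        missing.append(BASE_MODEL_DIR)
    if not (root / CONTROLNET_WEIGHTS).is_file():
        missing.append(CONTROLNET_WEIGHTS)
    return missing


if __name__ == "__main__":
    problems = missing_model_paths()
    for path in problems:
        print(f"missing: {path}")
    if not problems:
        print("model layout looks complete")
```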
Z-Image-Turbo-Fun-Controlnet-Union.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:86c085c0d7853f12ce5183499934b54d08371c60f549c5a6b20615cd23989388
+ size 3101572408
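The three lines above are a Git LFS pointer: the repository stores only the hash and byte size, and the roughly 3.1 GB weight file itself lives in LFS storage. As a sketch of how a downloaded file could be checked against such a pointer, the Python below parses the pointer text and compares size and SHA-256. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers written for this illustration, not functions from Git LFS.

```python
import hashlib
from pathlib import Path

# Pointer contents copied from the hunk above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:86c085c0d7853f12ce5183499934b54d08371c60f549c5a6b20615cd23989388
size 3101572408
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split 'key value' lines into a small dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Hypothetical helper: compare a local file with a parsed pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap size check before hashing a multi-GB file
    hasher = hashlib.new(pointer["algorithm"])
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)

if __name__ == "__main__":
    print(f"expected size: {pointer['size'] / 1e9:.2f} GB")
```

Checking the size first avoids hashing a multi-gigabyte file when a download was obviously truncated.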
asset/canny.jpg ADDED

Git LFS Details

  • SHA256: 800790ae2e890e99b75dc1fc0a05142d22dbcdd9a961d2bc15222a4356683723
  • Pointer size: 131 Bytes
  • Size of remote file: 278 kB
asset/depth.jpg ADDED

Git LFS Details

  • SHA256: 6e2ba1022bb71d026c764b12e7d6c67a233cfa4c6836616f618a878764fe7a7c
  • Pointer size: 131 Bytes
  • Size of remote file: 106 kB
asset/hed.jpg ADDED

Git LFS Details

  • SHA256: c10f91fe342b439d1e99fe703e313aa09315b59cf7362c43e2e42910f7c681d7
  • Pointer size: 131 Bytes
  • Size of remote file: 188 kB
asset/pose.jpg ADDED

Git LFS Details

  • SHA256: c3543f29a838b77933dc439f8520c5eff1bb2075315afbe6eb4b309c477a31f0
  • Pointer size: 130 Bytes
  • Size of remote file: 43.5 kB
asset/pose2.jpg ADDED

Git LFS Details

  • SHA256: 82005b3e813d714e3a4cf8dddbeddad5047978d6aca78c6a121ad1e7c0ec4b4e
  • Pointer size: 130 Bytes
  • Size of remote file: 94.6 kB
results/canny.png ADDED

Git LFS Details

  • SHA256: 24fe7cafd384e040ddd2b9edfdfbffada39feb002fb170497f1aaae161859eb2
  • Pointer size: 132 Bytes
  • Size of remote file: 1.71 MB
results/depth.png ADDED

Git LFS Details

  • SHA256: 32b7bf025e55136d30f9d4d976547856df8f39dba0616a40e816737bfd6a4a94
  • Pointer size: 131 Bytes
  • Size of remote file: 944 kB
results/hed.png ADDED

Git LFS Details

  • SHA256: 772810bdada9e7e2a89ee8178d700295075f488a3b015ce9a46e6dc3b5047b6e
  • Pointer size: 132 Bytes
  • Size of remote file: 1.31 MB
results/pose.png ADDED

Git LFS Details

  • SHA256: f22a32ed33eae03f10eb6674bd5ff86f207edb6a7979585c7fd2d8ad91f2f54f
  • Pointer size: 132 Bytes
  • Size of remote file: 1.77 MB
results/pose2.png ADDED

Git LFS Details

  • SHA256: fea1a3ed2816449b1a5fe77e5dfc7b498a4a601de4b5ef9b339bf380cdf4cd2a
  • Pointer size: 132 Bytes
  • Size of remote file: 2.14 MB