How to use Hunyuan1.5 for video upscaling

by makisekurisu-jp

How should the Hunyuan1.5 super-resolution model be applied for video upscaling? In my current workflow, I input a 640×640 resolution video, but the resulting output is a 1920×1072 black screen.

workflow

I’m using the model from the link below because the official model provided by Comfy is too large for me.
https://huggingface.co/lightx2v/Hy1.5-Quantized-Models/blob/main/hy15_1080p_sr_cfg_distiled_fp8_e4m3_lightx2v.safetensors

Might be OK, not sure. I'll give it a test too. But the other non-Comfy models do produce black video, so it's likely this model is also for LightX2V's own codebase, and not Comfy.
Maybe the Comfy repository will get some fp8 versions too.

See here also https://huggingface.co/lightx2v/Hy1.5-Quantized-Models/discussions/2#6921bbee921df8e6e4f81d4f

My intended test is to apply the 1080p super-resolution model from hy1.5 video to upscale the original video and then compare its performance with FlashVSR’s current super-resolution upscaling. It is possible that the 1080p upscaling model from lightx2v was not adapted for ComfyUI and is only compatible with their proprietary code.

Yes exactly. The original models posted were not compatible with ComfyUI.
They have since added some that are, but the upscale models have probably not yet been made compatible.
https://huggingface.co/lightx2v/Hy1.5-Quantized-Models/tree/main (if you look at the file names, it says ComfyUI in the ones that are compatible)
And ComfyUI also made compatible ones here: https://huggingface.co/Comfy-Org/HunyuanVideo_1.5_repackaged/tree/main/split_files (but no fp8 for the upscale model... yet at least)

Comfy Org

but no fp8 for the upscale model... yet at least)

Added.

still black screen with both upsamplers hunyuanvideo1.5_720p_sr_distilled_fp8_scaled.safetensors and 1080_sr

Comfy Org

Oh that's weird... same script was used as with the other models, will have to double check what could be different here.

This is the video and workflow, maybe I’m the one doing it wrong
workflow: https://pastebin.com/dUQcM80D

I used Comfy’s official sample workflow, with the 480p GGUF Q8 quantized model, and the LoRA provided by the official Comfy repo. However, the results are terrible — what exactly is going wrong?

https://huggingface.co/jayn7/HunyuanVideo-1.5_T2V_480p-GGUF/blob/main/480p/hunyuanvideo1.5_480p_t2v-Q8_0.gguf

https://huggingface.co/Comfy-Org/HunyuanVideo_1.5_repackaged/blob/main/split_files/loras/hunyuanvideo1.5_t2v_480p_lightx2v_4step_lora_rank_32_bf16.safetensors

workflow

try with cfg 1

Thank you, after setting cfg to 1 the results finally became normal.

Below is the output using the GGUF Q8 model, 4-step LoRA, cfg=1, steps=4

GGUF Q8 model, 4-step LoRA, cfg=1, steps=6

GGUF Q8 model, cfg=6, steps=20
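
For context on why cfg 1 fixes this: the 4-step LightX2V LoRA is a guidance-distilled model, so it is meant to run without classifier-free guidance, and cfg values above 1 over-amplify its predictions. A minimal sketch of the standard CFG mix (illustrative only, not ComfyUI's actual sampler code):

```python
import torch

def cfg_combine(cond: torch.Tensor, uncond: torch.Tensor, cfg: float) -> torch.Tensor:
    # Standard classifier-free guidance mix of the two model predictions.
    # At cfg == 1.0 this reduces to `cond` alone, which is what a distilled
    # model expects; larger cfg exaggerates the (cond - uncond) term and
    # produces the broken results described above.
    return uncond + cfg * (cond - uncond)
```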

still black screen with both upsamplers hunyuanvideo1.5_720p_sr_distilled_fp8_scaled.safetensors and 1080_sr

Same issue.
The Diffusion Model and the Latent Upscale Model I loaded are both the 1080p models from this repo. Since my VRAM is only 12GB, directly using the latent from the first sampling inevitably causes OOM.
Therefore, I instead fed Hunyuan Video 1.5 Latent Upscale With Model a latent made with Load Video + VAE Encode, but the result was just a black screen.

workflow

The issue with the fp8 SR models was that the SR model doesn't actually use the t_embedder layer, but the weights still exist in the model... they're just all zeros, and my fp8 scaling script didn't account for that and stupidly made the scale_weight zeros too.

Uploaded fixed versions now, sorry for the confusion.
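
In other words, if a per-tensor scale is derived from a tensor's max absolute value and the tensor is all zeros, the scale itself comes out zero and everything dequantizes to zero, hence the black output. A sketch of the guard (a hypothetical conversion helper, not the actual script used for these models):

```python
import torch

FP8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3fn

def scale_to_fp8_e4m3(weight: torch.Tensor):
    # Per-tensor fp8 quantization: weight ~= fp8_weight * scale_weight.
    absmax = weight.abs().max().item()
    # Guard for unused, all-zero layers (like the SR model's t_embedder):
    # without it, scale_weight is 0 and dequantization yields all zeros.
    scale = absmax / FP8_E4M3_MAX if absmax > 0.0 else 1.0
    fp8_weight = (weight / scale).to(torch.float8_e4m3fn)
    scale_weight = torch.tensor(scale, dtype=torch.float32)
    return fp8_weight, scale_weight
```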

Thank you, and on behalf of the community, we are all grateful for your wonderful efforts and work!

Thank you, after setting cfg to 1 the results finally became normal.

You're welcome. If you use higher steps, lower the LoRA strength, e.g. LightX2V LoRA strength 0.5 with steps 6 or 8; lowering the shift will also have some effect, especially with fp16 models.

Works perfectly, gave it a try with the new fp8 SR models ;-)

(832x480 => 1280x720p upscale)

Did you connect the latent output from the first KSampler to the hy15 Latent Upscale With Model? Is it possible to connect it using Load Video + VAE Encode? How does the resulting video quality compare with FlashVSR?

Yes, the latent output from the first sampler goes to the Latent Upscale With Model node. Should be possible to use Load Video + VAE Encode, I think. Will give it a try.
FlashVSR is for sure a lot faster, but Hunyuan upscale might be a good alternative.

Speaking of speed, not sure if it's worth converting the lightx2v "tiny vae" to ComfyUI format as well (but maybe it would require more than just a model conversion, and not sure if it's worth the hassle)
https://huggingface.co/lightx2v/Autoencoders/blob/main/lighttaehy1_5.safetensors
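
A quick way to gauge the conversion effort is to compare the checkpoint's tensor names against the taesd-style keys ComfyUI already knows. A small sketch using the safetensors API (the file is the one linked above; whether the keys line up is exactly the open question):

```python
from safetensors import safe_open

# List tensor names and shapes in the lightx2v tiny VAE without loading
# the full weights, to see how far they are from ComfyUI's taesd naming.
with safe_open("lighttaehy1_5.safetensors", framework="pt", device="cpu") as f:
    for key in sorted(f.keys()):
        print(key, f.get_slice(key).get_shape())
```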

It would be great if there were a tiny VAE custom node that could directly replace Comfy’s native Load VAE node, because the hy1.5 VAE is really too large. So far, I haven’t found one. Kijai’s load VAE node can load a tiny VAE, but it can only connect to his WanVideo wrapper nodes and cannot replace Comfy’s original Load VAE node.
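
For what it's worth, ComfyUI's own VAELoader node is tiny: it reads a state dict and hands it to comfy.sd.VAE. A rough sketch of a drop-in custom node along those lines (hypothetical node name, and it assumes comfy.sd.VAE can recognize the tiny VAE's key layout, which is the unproven part):

```python
# Hypothetical custom node: loads a tiny VAE from models/vae and returns
# ComfyUI's native VAE type, so it plugs into any node expecting a VAE input.
import comfy.sd
import comfy.utils
import folder_paths

class TinyVAELoader:
    @classmethod
    def INPUT_TYPES(cls):
        return {"required": {"vae_name": (folder_paths.get_filename_list("vae"),)}}

    RETURN_TYPES = ("VAE",)
    FUNCTION = "load_vae"
    CATEGORY = "loaders"

    def load_vae(self, vae_name):
        path = folder_paths.get_full_path("vae", vae_name)
        sd = comfy.utils.load_torch_file(path)
        # Assumption: comfy.sd.VAE detects this checkpoint's layout; if not,
        # the state dict keys would need remapping first.
        return (comfy.sd.VAE(sd=sd),)

NODE_CLASS_MAPPINGS = {"TinyVAELoader": TinyVAELoader}
```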

Comfy Org

I actually submitted a PR for the tiny VAEs yesterday: https://github.com/comfyanonymous/ComfyUI/pull/10884

Ah very nice, hopefully the PR goes through. The current VAE is a bit slow, so it might be nice to try the tiny one.

Is it possible to connect it using Load Video + VAE Encode?

Tried that, works like a charm ;-)

(input video 480p => upscaled to 720p)
Workflow attached/embedded in the video if you want to play around with it (download the video and drop it into Comfy for the workflow)

Edit: oops, that video was quite large for the thread, I will make a wide 16:9 one instead ;-) hehe

Honestly, I would delete the hy1.5 video base model, because its video quality is inferior to Wan Video. However, its super-resolution upscale model does have genuine native support from Comfy’s official implementation, which makes it much more convenient to use than FlashVSR.

In some areas I feel it's better than Wan.
I am definitely keeping both ;-) both have their strengths.

As for FlashVSR, I use the one in WanVideoWrapper. It's more than good enough for my use (and extremely fast):
https://github.com/kijai/ComfyUI-WanVideoWrapper/blob/main/example_workflows/wanvideo_1_3B_FlashVSR_upscale_example.json

I also feel it's better than Wan (base Wan) with the correct sampler.

makisekurisu-jp changed discussion status to closed
