WAN FLF issue

#77
by Axymeus - opened

I'm trying to use first-last-frame (FLF) diffusion with Wan (both 2.1 and 2.2) using Kijai's wrapper, but my results are a little off. I have a few questions:

  • Naturally I send my last frame to end_image in WanVideoImageToVideoEncode. Do I also add it as image_2 of WanVideoClipVisionEncode?
  • How many frames do you set for such a task? If I leave it at 81 frames, it tries to generate 85 frames, so should I set 77 frames for FLF to end up with 81?
  • Do you need a specific FLF model? I tried using the FLF2V model and checking "fun_or_fl2v_model" in WanVideoImageToVideoEncode, but it wouldn't load at all. Using the regular I2V model works but results aren't great.
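
For reference, here is the arithmetic behind my 77-frame guess, as a minimal sketch. It assumes that Wan frame counts follow a 4n+1 pattern (77, 81 and 85 all fit it) and that the end image adds one extra latent frame on top of what I request. Both are assumptions on my part, not something I have confirmed in the wrapper, and the helper names are made up for illustration.

```python
# Sketch of the frame-count arithmetic. ASSUMPTIONS (not confirmed in the
# wrapper): temporal stride of 4, so valid frame counts are 4n+1, and the
# end image contributes one extra latent frame.

STRIDE = 4  # assumed temporal compression factor


def is_valid_frame_count(frames: int, stride: int = STRIDE) -> bool:
    """True if `frames` has the form stride * n + 1."""
    return frames >= 1 and (frames - 1) % stride == 0


def latent_frames(frames: int, stride: int = STRIDE) -> int:
    """Number of latent frames for a valid pixel-frame count."""
    if not is_valid_frame_count(frames, stride):
        raise ValueError(f"{frames} is not of the form {stride}n+1")
    return (frames - 1) // stride + 1


def frames_from_latents(latents: int, stride: int = STRIDE) -> int:
    """Pixel-frame count produced by `latents` latent frames."""
    return (latents - 1) * stride + 1


def frames_to_request(target: int, extra_latents: int = 1) -> int:
    """Frame count to enter so the output has `target` frames, if the
    end image adds `extra_latents` latent frames (hypothetical)."""
    return frames_from_latents(latent_frames(target) - extra_latents)


print(latent_frames(81))      # 21
print(frames_from_latents(22))  # 85, i.e. 81 frames plus one extra latent
print(frames_to_request(81))  # 77
```

If that extra-latent assumption is wrong, then 77 is just another valid 4n+1 count and the output length would need a different explanation.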

The problems I'm getting in the videos I generate this way differ depending on the Wan version I'm using. With 2.1 it looks alright, but it doesn't reach the last frame exactly; it cuts short. With 2.2 the quality drops severely and the last frame is extremely noisy. It makes me think something is wrong in my setup.

Additionally, the motion is a little underwhelming in FLF, especially when trying to make a loop by using the same frame for both first and last. Any tips to improve that?
