CyFeng16/MVIMP

Future Features

CyFeng16 opened this issue · 22 comments

This issue will remain open to collect requested features and to discuss the priority in which they get merged.
Feel free to share your thoughts; you are welcome to join in and contribute.

Suggestion 1: Detect scene changes and don't interpolate across them (to avoid scene-change artifacts). Instead, copy the original frame into the positions where the interpolated frames would have been.
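A minimal sketch of this idea, assuming frames are NumPy arrays and using a simple mean-absolute-difference cut detector (the threshold value here is a guess and would need tuning per source; real projects often use a dedicated tool such as PySceneDetect instead):

```python
import numpy as np

SCENE_CHANGE_THRESHOLD = 30.0  # mean absolute pixel difference; tune per source

def is_scene_change(frame_a, frame_b, threshold=SCENE_CHANGE_THRESHOLD):
    """Flag a cut when two consecutive frames differ a lot on average."""
    diff = np.abs(frame_a.astype(np.float32) - frame_b.astype(np.float32))
    return float(diff.mean()) > threshold

def interpolate_2x(frames, interpolate):
    """Double the frame count, but across a detected cut copy the
    original frame instead of synthesizing an in-between frame."""
    out = []
    for a, b in zip(frames, frames[1:]):
        out.append(a)
        if is_scene_change(a, b):
            out.append(a)            # copy: avoids scene-change artifacts
        else:
            out.append(interpolate(a, b))
    out.append(frames[-1])
    return out
```

Here `interpolate` stands in for the real model call (e.g. DAIN); it is a placeholder, not MVIMP's actual API.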

Suggestion 2: Detect duplicated frames and interpolate only those. (Good for anime, where there are runs of duplicated frames: sometimes 1 dupe, sometimes 2 or 3, etc. It's very inconsistent throughout the video.)
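A sketch of the detection half, again assuming NumPy frames: a frame whose mean absolute difference from its predecessor falls below a (guessed, tunable) threshold is treated as a duplicate, which naturally handles the inconsistent 1/2/3-dupe runs described above:

```python
import numpy as np

DUPLICATE_THRESHOLD = 1.0  # mean absolute difference; assumed value, tune per source

def find_duplicates(frames, threshold=DUPLICATE_THRESHOLD):
    """Return the indices of frames that (near-)duplicate their predecessor.
    Runs of any length are handled: 1, 2, or 3 dupes in a row all get flagged."""
    dupes = []
    for i in range(1, len(frames)):
        diff = np.abs(frames[i].astype(np.float32) - frames[i - 1].astype(np.float32))
        if float(diff.mean()) < threshold:
            dupes.append(i)
    return dupes
```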

@Brokensilence
Frankly speaking, scene detection is not a familiar area for me. Maybe you could point out some mature solutions we can make use of; I'll consider keeping its interface consistent and integrating it into our project if its license allows.

It is so COOOOOOL that you point out the multi-frame repetition phenomenon that often occurs in anime; I think this is usually overlooked by other users. However, I am thinking more about the whole pipeline we are talking about.

For duplicate frames, duplication detection -> duplication removal -> DAIN is a more intuitive and clear way to make each function responsible for its own part, rather than a complex interaction between functions. With just a little duplication-detection and deletion code, we can fill the function out.

And for the same reason, scene detection -> video split -> DAIN -> fragment fusion seems better than detection during DAIN. What do you think?
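The split-then-fuse pipeline above can be sketched as a small driver in which the detector and the interpolator stay independent stages (both stage functions here are placeholders, not MVIMP's actual API):

```python
def run_pipeline(frames, detect_cuts, interpolate_segment):
    """scene detection -> video split -> DAIN per segment -> fragment fusion.

    detect_cuts(frames)         -> sorted indices where a new scene starts
    interpolate_segment(frames) -> the interpolated frames for one scene
    """
    boundaries = [0] + list(detect_cuts(frames)) + [len(frames)]
    segments = [frames[a:b] for a, b in zip(boundaries, boundaries[1:])]
    fused = []
    for segment in segments:      # each stage only ever sees one scene
        fused.extend(interpolate_segment(segment))
    return fused
```

Because no interpolation call ever straddles a cut, the scene-change artifacts never arise, and each stage can be tested and swapped independently.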

According to #3, model-level DeOldify has been integrated.
Thanks to jantic, the author of DeOldify, for contributing to the open-source AI community.
Open a new issue if you run into problems when using it.

@meguerreroa
Hi, glad you mentioned these models. Super-resolution is such an important topic in image and video processing. I have considered adding some functions from MMSR, but my time has been limited recently.
Judging from your request, you must have some understanding of super-resolution; could you try to summarize the similarities and differences of the repositories/models you mentioned? This will help us decide the priority of adding features. : )

Actually, I'm a total newbie in this area, so I'm unsure which one is better or should be added. But from what I have seen, there are some articles and posts talking about ESRGAN. Sorry I can't be more helpful.

re2cc commented

I do not know much about the subject, but with a little research on the models, I can highlight some points:

ESRGAN:

  • It is an "enhanced" version of SRGAN
  • For video and images
  • It is practically no longer in development.

MMSR:

  • Based on ESRGAN, BasicSR, and EDVR
  • For video and images
  • There is not much more notable information in the README, but since it builds on those earlier projects I assume it is an improvement, and it is still under continuous development.

TecoGAN:

  • It only works on videos
  • I do not fully understand the system, but from the name and a little reading, it seems to work from details that persist across frames (TEmporally COherent GAN).
  • It has not had any updates in about a year
  • Works only with TensorFlow 1.x

SRGAN:

  • It only works on images
  • The README emphasizes that it is designed for realistic images
  • I think it is the only one that works with TensorFlow 2
  • Still under development

SRVAE:

  • The README says it only works with realistic images
  • It only works on images
  • I did not find much else of note in the README
  • Still under development

I do not know if you already knew all this, but I hope it helps. Honestly, I have only tried TecoGAN on Colab, and the results did not seem good (more definition, but many errors); maybe I did something wrong, but I wanted to share my experience. Sorry if there are mistakes in my English; I used a translator.
It would be interesting to have waifu2x here

Edit: Thanks for your work.

@meguerreroa That is all right! "The greatest pleasure in life is doing what people say you cannot do", we can be better. 😃

@EtianAM In my opinion, the open-source field of super-resolution has been quite stagnant in the past two years. Looking at the information listed under Image Super-Resolution on Papers with Code, the open-source repositories from the past two years mainly include ESRGAN, DRCN, ESPCN, SRCNN, and SRRNN, and they face more or less the problems you mentioned.
waifu2x has a great reputation and is high on our priority list; I hope it can be available here as soon as possible.

re2cc commented

Thanks for the answer.
I would like to think that as graphics cards, and in general the hardware required to build these kinds of tools, become more accessible, the algorithms will start to be created and improved.

re2cc commented

I know the question was not addressed to me, but I wanted to raise a question about the removal of frames.
If that were done, would it not cause problems with audio synchronization (assuming the audio is added back later)? Let us suppose that for every 4 frames there is 1 that is repeated; at 30 fps, 6 frames per second would be deleted. I do not know if that estimate is exaggerated, but I think it could cause a problem in long videos, and it is an issue that would have to be resolved.
Also, if there is a scene that is intentionally static, it could be a problem, but I guess you would just have to require a minimum run length before duplicates are interpreted as intentional.

As I said, I do not know much about this, but one solution I can think of is to run DAIN on a small span of frames until it can fill in the missing ones; I wonder, though, whether that would cause a strange effect, besides the problems that this kind of change in file names and numbering implies.
Anyway, I am sorry if my explanation is wrong or my approach just does not make sense, and for the probably poor translation.
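The drift worry above is easy to quantify. Assuming the estimate of one duplicate in every five frames at 30 fps (so 6 frames dropped per second), and that the deduplicated frames were simply replayed at the original rate without re-synthesizing or retiming anything:

```python
FPS = 30
DUPES_PER_SECOND = 6   # one duplicate in every five frames at 30 fps (assumed)

def drift_seconds(video_seconds):
    """How far video would run ahead of audio if duplicates were simply
    dropped and playback continued at the original frame rate."""
    frames_removed = DUPES_PER_SECOND * video_seconds
    return frames_removed / FPS

# a 10-minute clip would finish 2 minutes early: drift_seconds(600) -> 120.0
```

In practice the interpolator would re-synthesize the dropped positions (or the container timestamps would be preserved), so the total duration never changes and no drift occurs.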

@EtianAM @Brokensilence I moved the #DAIN in Anime# discussion into #4 . More discussion is welcome.

Generating Digital Painting Lighting Effects via RGB-space Geometry
Poster: https://lllyasviel.github.io/PaintingLight/
Repo: https://github.com/lllyasviel/PaintingLight

It really impressed me with what it can do!!!

Suggestion:
Make it so we can DAIN interpolate multiple videos in a row.
Example: Have multiple videos in the input folder, and DAIN interpolate them all 1 by 1 automatically.

*I have tried doing this, but I'm not a Python expert, and I'm still struggling to change what's needed for this to work.

This is good for when we have large videos: we split them into parts first, so when using Colab, even if we hit the 12-hour session limit, we don't lose all our progress.

Also good for when we split the video into different scenes using scene-change detection beforehand. (Still trying to implement automatic scene detection in the repo; for now I'm using PySceneDetect on my local computer to split the video into scenes first.)
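A rough sketch of the batch idea: scan an input folder, run the existing single-video routine on each file in turn, and skip files whose output already exists, so a Colab disconnect only loses the video currently in flight. The directory layout and the `run_one` callback are placeholders, not MVIMP's real entry points:

```python
from pathlib import Path

VIDEO_EXTS = {".mp4", ".mkv", ".avi"}

def list_videos(input_dir):
    """Collect videos to process, in a stable (sorted) order."""
    return sorted(p for p in Path(input_dir).iterdir()
                  if p.suffix.lower() in VIDEO_EXTS)

def interpolate_all(run_one, videos, output_dir):
    """Run the single-video pipeline on each file, one by one.
    Outputs that already exist are skipped, so the job can resume
    after hitting the 12-hour Colab session limit."""
    output_dir = Path(output_dir)
    processed = []
    for video in videos:
        target = output_dir / video.name
        if target.exists():
            continue               # already done in an earlier session
        run_one(video, target)     # e.g. the existing DAIN entry point
        processed.append(video)
    return processed
```

The skip-if-exists check is what makes the loop restart-safe: rerunning the same command after a disconnect continues from where it stopped.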

@Brokensilence WIP. But note that if we add such a function, we need stricter checks.

re2cc commented

I had been trying to use Video2x on Linux without success, but about 2 days ago the developer updated the Linux dependencies a bit and fixed a few problems with this OS.
Yesterday I finally managed to run it on Colab, so I can assure you that it is possible to add it.

I should mention that running it requires Python 3.8 and some other dependencies that need to be installed as root, which could be a problem.

Are you still working on this?

@LizandroVilDia Yep. You will find that the waifu2x function is ready to use. Although my time is more or less limited, I keep updating and maintaining the modules in the spare time my startup leaves me.

re2cc commented

AnimeGANv2 ? :)

@EtianAM Done, give it a try :)

Dunno if can be useful for you in any way, but check out SwinIR by @JingyunLiang too:
SwinIR: Image Restoration Using Swin Transformer

@forart Thanks for sharing, I'll take a look. :)