bearprin/neuralpull-pytorch

Surface reconstruction

ThibaultGROUEIX opened this issue · 9 comments

Hi,

I have just tried the code! Thanks a lot for the pytorch version, it is very helpful.

This is the output I get when I fit NeuralPull on the horse for surface reconstruction. There are strange artefacts.

Screenshot 2021-09-29 at 23 50 22

Have you experienced the same thing on your end and do you have a working example?

Best regards,
Thibault


  • I suggest fitting one shape per network, even though the implementation supports fitting several shapes via a condition code (feat). This makes convergence easier. Just put a single .npy file in the 'npy' folder.
  • You can also train for more epochs and export the reconstruction result at each epoch to check whether the network has converged.

Below is my reconstruction of the horse (from the Points2Surf dataset) after 207 epochs.

horse

Hope this helps.
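For readers unfamiliar with the method: the core "pulling" operation that the network is trained on moves a query point q onto the surface by stepping along the normalized gradient of the predicted signed distance, q' = q - f(q) * ∇f(q)/||∇f(q)||. A minimal sketch with an analytic sphere SDF (illustration only, not the repository's code):

```python
import numpy as np

def sphere_sdf(q, radius=0.5):
    # signed distance to a sphere of the given radius centered at the origin
    return np.linalg.norm(q) - radius

def sphere_sdf_grad(q):
    # gradient of the sphere SDF is the unit direction away from the center
    return q / np.linalg.norm(q)

q = np.array([0.3, -0.2, 0.4])          # arbitrary query point
g = sphere_sdf_grad(q)
q_pulled = q - sphere_sdf(q) * g / np.linalg.norm(g)

print(np.linalg.norm(q_pulled))          # lands on the surface: ~0.5
```

For an exact SDF a single pull lands exactly on the zero level set; during training the network's SDF is only approximate, which is why the reconstruction quality depends on how well it has converged.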

Thanks @bearprin this is great!
Let's see if I can get the same results.

  • I was already following suggestion 1, fitting a single shape by keeping only the horse in the npy folder.
  • I think suggestion 2 is natively in the code; at least it returns a mesh for every epoch off-the-shelf.
    My horse after the same number of epochs does not look good: the L1 loss is around 0.12.

Did you keep the default parameters?

parser.add_argument('--bd', type=float, default=0.55)
parser.add_argument('--resolution', type=int, default=128)
parser.add_argument('--thresholds', type=list, default=[0.0])

parser.add_argument('--name', type=str, default='base')
parser.add_argument('--epochs', type=int, default=1000)
parser.add_argument('--batch_size', type=int, default=16)
parser.add_argument('--lr', type=float, default=0.0001)
parser.add_argument('--num_workers', type=int, default=20)
parser.add_argument('--seed', type=int, default=40938661)

On my end, I followed the quickstart instructions. The only modification I made was to remove the fandisk from the npy folder, keeping only the horse, before running python train.py
This is confusing me :)

I have made some modifications to the code:

  • Deleted the fandisk shape (mesh and point cloud).
  • Fixed some issues in env.yaml.

After that, I retrained the network on 'horse' with the default hyperparameters. Below is my result after 212 epochs.

horse_212

Try recloning this repository to get the latest code and retraining.


Huge thanks! Trying now.

Interesting. So I restarted from scratch, reinstalled the conda env and launched. Here is my Tensorboard log.

As you can see, the test losses are very unstable.
Screenshot 2021-10-06 at 17 33 32

After 500 epochs, I still don't have a great horse like yours. If I pick the epoch with the best L1 loss, I get this:
Screenshot 2021-10-06 at 17 38 33

I am not sure how to address this. Do you also observe poor convergence in your curves (ideally, can you share them)?

Thank you for the pytorch version. However, this line of code is slightly different from the original work, which may lead to incomplete structure.
@ThibaultGROUEIX You can try changing it to [0.005] or [0.0, 0.0005, 0.005], then check whether the output mesh looks correct.
Finally, thank you very much for your work, @bearprin.
image
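The intuition behind the suggested thresholds: extracting the iso-surface of a signed distance field at a small positive level tau (e.g. 0.005) instead of exactly 0 slightly inflates the surface, which can close holes left by an imperfectly learned zero level set. A numpy-only sketch with an analytic sphere SDF (illustration only; the actual extraction in the repo uses marching cubes):

```python
import numpy as np

# Sphere SDF of radius 0.5 sampled on a 64^3 grid spanning [-0.55, 0.55]^3,
# matching the default --bd 0.55 bound from the thread.
n = 64
xs = np.linspace(-0.55, 0.55, n)
X, Y, Z = np.meshgrid(xs, xs, xs, indexing="ij")
sdf = np.sqrt(X**2 + Y**2 + Z**2) - 0.5

# Count voxels enclosed by the tau iso-surface: a larger tau encloses
# at least as many voxels, i.e. the extracted surface is inflated.
counts = {tau: int(np.count_nonzero(sdf < tau)) for tau in (0.0, 0.0005, 0.005)}
print(counts)
```

For a well-learned SDF the difference between tau = 0 and tau = 0.005 is a sub-voxel offset; for a noisy one, the positive level can be the difference between a closed mesh and one with missing patches.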


Sorry for this mistake; it is now fixed.

Hi! Thanks a lot @mabaorui and @bearprin,

The 0.005 threshold helped; I get this result:
Screenshot 2021-10-07 at 11 50 36

However, immediately after I get this:
Screenshot 2021-10-07 at 11 49 25

Do you also observe some instability in the results? Is it expected that one has to look for good L1 test values to find a good reconstruction, or should all reconstructions after convergence be good?

To help inform this, can you share your Tensorboard curves @bearprin ?


The programmer in me is still struggling with the fact that we have the same conda env and an experiment with a fixed seed, yet @bearprin gets good results and I don't (with the 0 threshold).
One thing I noticed is that knn_cuda is not in the yaml spec file; I figured it came from https://github.com/unlimblue/KNN_CUDA. Can you confirm, @bearprin?
On which platform are you running the code, @bearprin?

Thanks a lot for the time you take on this,
Thibault
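One caveat worth noting here: a fixed seed pins the Python- and NumPy-side RNG state, but some CUDA kernels are nondeterministic by default, so two GPU training runs with identical seeds can still diverge. A minimal seeding sketch (the torch-specific lines are an assumption about typical PyTorch setups, not code from this repo, and are left as comments):

```python
import random
import numpy as np

def seed_everything(seed=40938661):
    # Seed the CPU-side RNGs the training script touches.
    random.seed(seed)
    np.random.seed(seed)
    # With PyTorch available one would typically also call (assumption):
    # torch.manual_seed(seed); torch.cuda.manual_seed_all(seed)
    # torch.backends.cudnn.deterministic = True
    # torch.backends.cudnn.benchmark = False
    # Even then, some CUDA ops (e.g. atomicAdd-based reductions) remain
    # nondeterministic, so identical seeds do not guarantee identical runs.

seed_everything()
a = np.random.rand(3)
seed_everything()
b = np.random.rand(3)
print(np.array_equal(a, b))  # True: same seed, same CPU-side sequence
```

This may explain why two machines with the same conda env and seed produce different training trajectories.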


The latest env.yaml includes KNN_CUDA, see here.
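For anyone who cannot build KNN_CUDA, the nearest-neighbour query it provides, pairing each query point with its closest point-cloud sample, can be emulated (slowly) on CPU with brute force. This is an illustrative sketch, not the KNN_CUDA API:

```python
import numpy as np

def knn_brute_force(queries, points, k=1):
    # Pairwise squared distances, shape (num_queries, num_points).
    d2 = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    # Indices of the k nearest points for each query.
    return np.argsort(d2, axis=1)[:, :k]

points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
queries = np.array([[0.9, 0.1], [0.1, 0.8]])
print(knn_brute_force(queries, points, k=1).ravel())  # [1 2]
```

The O(Q x P) memory cost makes this impractical for large clouds, which is why the repo uses the CUDA implementation, but it is handy for sanity-checking results on small inputs.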