Question about train_pcl2pcl_gan_3D-EPN.py

Question

Condor-G opened this issue 5 years ago · 4 comments

Hi, Chen~
Sorry to bother you. I have a problem when I run train_pcl2pcl_gan_3D-EPN.py.

train GAN: python train_pcl2pcl_gan_3D-EPN.py
After training, I use MeshLab to view the results in pc2pc/run_3D-EPN/run_car/pcl2pcl/log_car_pcl2pcl_gan_3D-EPN_default_hausdorff/fake_cleans. However, the PLY point clouds in reconstr_x all look like balls of wool. The GAN doesn't seem to work.

I don't know what the problem is. Maybe it's the dataset.
Here is what I did:
I used the dataset "shape_net_core_uniform_samples_2048" (from other projects), and I used MATLAB to make the incomplete point sets.
Then I used pc2pc/data_processing to make the pickle files.
python train_ae_ShapeNet-v1.py
python train_ae_3D-EPN.py
(The PLY point clouds (ShapeNet-v1 and 3D-EPN) in reconstr look correct.)
But after I run "python train_pcl2pcl_gan_3D-EPN.py", the PLY clouds in reconstr are a mess.

I also reduced batch_size to run it. Does that affect the outcome?

If my question isn't clear, tell me what else I should show.
I'm a novice and really hope to get your help.

Answer 1 · 2020-03-16T10:55:58.000Z

Hi @Condor-G, thanks for trying out the code.

Can you please show the training log from TensorBoard?

The way you prepared the data should not cause this GAN failure; I have tried other types of data too.
It looks like the GAN has not started to work during training.

Answer 2 · 2020-03-16T13:29:11.000Z

Thank you very much for your reply!

Here is the gan's log in "log_train.txt":

{'kk': 1, 'd_fc_sizes': [256, 512], 'random_seed': None, 'batch_size': 12, 'latent_dim': 128, 'save_interval': 10, 'point_cloud_shape': [2048, 3], 'beta1': 0.5, 'd_activation_fn': <function leaky_relu at 0x7f9cf94ff840>, 'lambda': 1.0, '3D-EPN_train_point_cloud_dir': '/home/condor/gyf/GAN/pcl/pc2pc/data/3D-EPN_dataset/shapenet_dim32_sdf_pc/02958343/point_cloud', 'clean_ae_ckpt': '/home/condor/gyf/GAN/pcl/pc2pc/run_synthetic/run_car/ae/log_ae_car_ShapeNet-V1_c2c/ckpts/model_0.ckpt', '3D-EPN_test_point_cloud_dir': '/home/condor/gyf/GAN/pcl/pc2pc/data/3D-EPN_dataset/test-images_dim32_sdf_pc/02958343/point_cloud', 'g_bn': False, 'k': 1, 'epoch': 2001, 'noisy_ae_ckpt': '/home/condor/gyf/GAN/pcl/pc2pc/run_3D-EPN/run_car/ae/log_3DEPN_ae_car/ckpts/model_0.ckpt', 'recover_ckpt': None, 'output_interval': 1, 'eval_loss': 'hausdorff', 'g_activation_fn': <function relu at 0x7f9cf99bdf28>, 'loss': 'hausdorff', 'd_bn': False, 'lr': 0.0001, 'point_cloud_dir': '/home/condor/gyf/GAN/pcl/pc2pc/data/ShapeNet_v1_point_cloud/02958343/point_cloud_clean', 'g_fc_sizes': [128], 'exp_name': 'car_pcl2pcl_gan_3D-EPN_default'}
{'activation_fn': <function relu at 0x7f9cf99bdf28>, 'n_filters': [64, 128, 128, 256], 'fc_sizes': [256, 256], 'latent_code_dim': 128, 'point_cloud_shape': [2048, 3], 'encoder_bn': True, 'filter_size': 1, 'stride': 1, 'decoder_bn': False}
pid: 3235
Net layers:
G/fc_0/kernel:0
G/fc_0/bias:0
G/fc_output/kernel:0
G/fc_output/bias:
D/fc_0/kernel:0
D/fc_0/bias:0
D/fc_1/kernel:0
D/fc_1/bias:0
D/output/kernel:0
D/output/bias:
2020-03-10-17-45-05 training 0 snapshot:
G loss: 1.387982 = (g)0.954996, (r)0.432987
D loss: 0.264273 = (f)0.000979, (r)0.527566
Eval loss (hausdorff) on test set: 0.432021
Model saved in file: run_3D-EPN/run_car/pcl2pcl/log_car_pcl2pcl_gan_3D-EPN_default_hausdorff/ckpts/model_0.ckpt
2020-03-10-17-45-16 training 1 snapshot:
G loss: 1.232149 = (g)0.802055, (r)0.430094
D loss: 0.035887 = (f)0.011880, (r)0.059893
2020-03-10-17-45-25 training 2 snapshot:
G loss: 1.171390 = (g)0.744821, (r)0.426569
D loss: 0.011561 = (f)0.021100, (r)0.002023

......

2020-03-10-23-01-38 training 1999 snapshot:
G loss: 0.643860 = (g)0.259829, (r)0.384031
D loss: 0.249035 = (f)0.245988, (r)0.252083
2020-03-10-23-01-47 training 2000 snapshot:
G loss: 0.645519 = (g)0.258738, (r)0.386781
D loss: 0.244853 = (f)0.247335, (r)0.242371
Eval loss (hausdorff) on test set: 0.387690
Model saved in file: run_3D-EPN/run_car/pcl2pcl/log_car_pcl2pcl_gan_3D-EPN_default_hausdorff/ckpts/model_2000.ckpt

Is that what you want? I don't understand what "the training log from the tensorboard" means.
If not, do you mean the file called "events.out.tfevents...." under "summary"? Should I zip the file and send it to you?

BTW, I found that I made a mistake in processing the data. The 3D-EPN dataset I used had only about 1000 samples (the total is about 7000): I split the car dataset first, but then gen_point_cloud_split.py split it again. So I am trying again with the 7k dataset and waiting for the results. It doesn't seem to work so far, though, so maybe that's not the mistake.

Thanks a ton for your help~

Answer 3 · 2020-03-16T13:38:07.000Z

Yes, this log file should contain information for debugging;
it would also be better to show the losses from TensorBoard.
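
If TensorBoard is not at hand, the per-epoch losses can also be pulled out of log_train.txt with a few lines of Python. This is only a minimal sketch, assuming the "training N snapshot:" / "G loss:" / "D loss:" line format seen in the log posted above; the function name `parse_losses` is made up for illustration and is not part of the repository.

```python
import re

# Line formats assumed from the log_train.txt excerpt in this thread.
SNAPSHOT_RE = re.compile(r"training (\d+) snapshot:")
LOSS_RE = re.compile(
    r"^([GD]) loss: ([-+\d.eE]+) = \((\w)\)([-+\d.eE]+), \((\w)\)([-+\d.eE]+)"
)


def parse_losses(lines):
    """Collect one dict per snapshot with the G/D totals and their parts."""
    records = []
    current = None
    for raw in lines:
        line = raw.strip()
        snap = SNAPSHOT_RE.search(line)
        if snap:
            # A new snapshot header starts a new record.
            current = {"epoch": int(snap.group(1))}
            records.append(current)
            continue
        loss = LOSS_RE.match(line)
        if loss and current is not None:
            net, total, key1, val1, key2, val2 = loss.groups()
            current[f"{net}_total"] = float(total)
            current[f"{net}_{key1}"] = float(val1)
            current[f"{net}_{key2}"] = float(val2)
    return records


if __name__ == "__main__":
    sample = [
        "2020-03-10-17-45-05 training 0 snapshot:",
        "G loss: 1.387982 = (g)0.954996, (r)0.432987",
        "D loss: 0.264273 = (f)0.000979, (r)0.527566",
    ]
    for record in parse_losses(sample):
        print(record)
```

Passing `open("log_train.txt")` instead of the sample list gives one record per snapshot, which can then be plotted or compared across runs.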

From the log file you provided,
I guess the problem is that the two AEs are somehow set to the wrong checkpoints in config.py.
See "'clean_ae_ckpt': '/home/condor/gyf/GAN/pcl/pc2pc/run_synthetic/run_car/ae/log_ae_car_ShapeNet-V1_c2c/ckpts/model_0.ckpt'". The clean AE is set to model_0.ckpt; simply changing it to model_2000.ckpt in config.py should work. ;-)
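
To avoid pointing at an early checkpoint by accident, one option is to look up the highest-numbered checkpoint in the ckpts directory before editing config.py. The helper below is a hypothetical sketch, not code from the repository: it assumes checkpoints follow the `model_<epoch>.ckpt` naming seen in the log (TensorFlow stores each one as several files such as `model_2000.ckpt.index`), and the name `latest_ckpt_prefix` is invented here.

```python
import re
from pathlib import Path
from typing import Optional

# Naming scheme assumed from the log: model_0.ckpt, model_2000.ckpt, ...
CKPT_RE = re.compile(r"^model_(\d+)\.ckpt")


def latest_ckpt_prefix(ckpt_dir: str) -> Optional[str]:
    """Return the 'model_<epoch>.ckpt' prefix with the largest epoch, or None."""
    directory = Path(ckpt_dir)
    if not directory.is_dir():
        return None
    best_epoch = -1
    best_prefix = None
    for entry in directory.iterdir():
        match = CKPT_RE.match(entry.name)
        if match is None:
            continue  # skip files such as 'checkpoint'
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best_epoch = epoch
            best_prefix = str(directory / f"model_{epoch}.ckpt")
    return best_prefix


if __name__ == "__main__":
    # Example path taken from the log above; adjust to your own run directory.
    print(latest_ckpt_prefix("run_synthetic/run_car/ae/log_ae_car_ShapeNet-V1_c2c/ckpts"))
```

The returned prefix is the kind of value that the 'clean_ae_ckpt' and 'noisy_ae_ckpt' entries in the log hold; both of them read model_0.ckpt in the run above.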

Answer 4 · 2020-03-16T14:11:11.000Z

Yes!!! You are right!
I changed the checkpoint number in config.py, and now everything is working like a charm!

I will close this issue.

Thank you! :)
Best wishes to you!