svip-lab/PPGNet

The line_pred visualizes nothing.

Closed this issue · 17 comments

Hey, thanks for your nice work.
I'd like to try your work to see whether it works on my pictures. First, I need to train it, so I started with train.sh.
But when I use TensorBoard to visualize the junctions and lines, it seems that line_pred outputs nothing, as shown below. Is that normal?
[image]
[image]

Thanks!

The AMIM, which infers the adjacency matrix of junctions, is hard to train and needs many epochs to output reasonable results. If you observed dense connections between junctions in the first few epochs and then found all line segments just disappeared, the training and visualization parts should be working fine. Because the adjacency matrices are very sparse, the AMIM will first learn to output close-to-zero matrices. If you want to see some results in the early stage of training, you can try lowering the visualization threshold on the adjacency matrices by changing the --vis-line-th parameter in train.sh. But be aware that if you set the threshold too low, you will also see a mass of false-positive lines.
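To illustrate what --vis-line-th controls, here is a hypothetical sketch (not the repo's actual visualization code; the function and variable names are made up): thresholding a predicted junction adjacency matrix to decide which junction pairs get drawn as line segments.

```python
import numpy as np

def lines_from_adjacency(junctions, adj, vis_line_th=0.5):
    """Return junction pairs whose predicted connection score exceeds the
    visualization threshold (the role played by --vis-line-th)."""
    # Only look at the upper triangle so each undirected line is counted once.
    i_idx, j_idx = np.where(np.triu(adj, k=1) > vis_line_th)
    return [(tuple(junctions[i]), tuple(junctions[j])) for i, j in zip(i_idx, j_idx)]

# A close-to-zero matrix (typical of early training) yields no lines at the
# default threshold; lowering the threshold surfaces candidate segments.
junctions = np.array([[10, 20], [100, 20], [100, 80]])
adj = np.array([[0.00, 0.04, 0.01],
                [0.04, 0.00, 0.06],
                [0.01, 0.06, 0.00]])
print(len(lines_from_adjacency(junctions, adj, 0.5)))   # 0
print(len(lines_from_adjacency(junctions, adj, 0.03)))  # 2
```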

Thanks for your reply; it seems to be what you described.
However, early in training the junction_pred moves in the right direction, while line_pred outputs nothing.
Now I'm at epoch 18: line_pred produces a mass of output (most of it incorrect), while the junction branch finds 0 points with probability above the threshold.
It seems this task is difficult and needs a lot of patience to get good results. If it's OK, could you tell me how long your team took to train this network? Thanks again.

I believe that 18 epochs should be enough to get some reasonable results. It seems that the AMIM was somehow not trained correctly, but I don't know where the problem is. I wanted to train the network again using the code in this repo; however, I just found that I can't access the computing resources I used for this project right now... I think @rayryeng has trained the network with the code in this repo, so maybe he can give you some advice on this problem for now? Thanks @rayryeng ...

Yeah... I have trained the network for 30 epochs following train.sh.
But the results do not look really good. It seems that line verification is hard to learn.
Here are the test results on the Wireframe dataset:
PPG_gt

PPG

The top row is the ground truth and the bottom row is the prediction (junctions are predicted by the junction network).
As you can see, the relationships between the junctions are not predicted well.

It's also worth saying that the junction part works well, as it predicts most junctions correctly. (Below are the junction prediction results: the top row is the ground truth and the other is the prediction.)
PPG_real_junc
PPG_pred_junc

Could you (or anyone who has trained this network and got good results) offer any help, or check where the code in this repo differs from your original code?
Many thanks for your work.

Can you tell me your batch size and block inference size in train.sh? Also, did you modify the learning rate, or lambda-heatmap and lambda-adj? Can you provide your loss_adj curve?

@allankevinrichie Thanks for your reply and help!
Here are my params:

  • batch size = 8
  • block-inference-size = 64
  • and I followed the commands below:
    train --end-epoch 9 --solver SGD --lr 0.2 --weight-decay 5e-4 --lambda-heatmap 1. --lambda-adj 5.
    train --end-epoch 15 --solver SGD --lr 0.02 --weight-decay 5e-4 --lambda-heatmap 1. --lambda-adj 10.
    train --end-epoch 30 --solver SGD --lr 0.002 --weight-decay 5e-4 --lambda-heatmap 1. --lambda-adj 10.

Here is my loss_adj training curve, along with the other curves recorded during training. Training restarted at epoch 25 and ended at epoch 55.
[image]

[image]

Finally, the logged loss is:
epoch: [29][624/625], lr: 0.002, time_total: 8.34, time_data: 0.04, time_net: 7.70, time_vis: 0.60, loss: 0.7582, loss_heatmap: 0.0870, loss_adj_mtx: 0.0671
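For what it's worth, the logged total appears to be exactly the weighted sum of the two components under the stage-3 weights (--lambda-heatmap 1., --lambda-adj 10.), which suggests the loss bookkeeping itself is fine. A quick check, assuming the total is computed as that weighted sum:

```python
# Weighted sum of the logged components under lambda-heatmap=1, lambda-adj=10
lambda_heatmap, lambda_adj = 1.0, 10.0
loss_heatmap, loss_adj_mtx = 0.0870, 0.0671
total = lambda_heatmap * loss_heatmap + lambda_adj * loss_adj_mtx
print(round(total, 4))  # 0.758, consistent with the logged 0.7582 (components are rounded)
```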

If I remember correctly, to see some reasonable results the adj loss should be less than 0.0008 (or 0.008). I suggest you try turning on the adj loss only, to see whether the AMIM can be trained correctly. You can do this by setting --is-train-junc to False and --lambda-heatmap to 0. Here is an example.

python main.py \
--exp-name line_weighted_wo_focal_junc --backbone resnet50 \
--backbone-kwargs '{"encoder_weights": "ckpt/backbone/encoder_epoch_20.pth", "decoder_weights": "ckpt/backbone/decoder_epoch_20.pth"}' \
--dim-embedding 256 --junction-pooling-threshold 0.2 \
--junc-pooling-size 64 --attention-sigma 1.5 --block-inference-size 128 \
--data-root /data/path --junc-sigma 3 \
--batch-size 16 --gpus 0,1,2,3 --num-workers 10 --resume-epoch latest \
--is-train-junc False --is-train-adj True \
--vis-junc-th 0.1 --vis-line-th 0.1 \
    - train --end-epoch 9 --solver SGD --lr 0.2 --weight-decay 5e-4 --lambda-heatmap 0. --lambda-adj 5. \
    - train --end-epoch 15 --solver SGD --lr 0.02 --weight-decay 5e-4 --lambda-heatmap 0. --lambda-adj 10. \
    - train --end-epoch 30 --solver SGD --lr 0.002 --weight-decay 5e-4 --lambda-heatmap 0. --lambda-adj 10. \
    - end

@allankevinrichie I used this command to train again. It's really hard to make the adj_mtx loss decrease; 0.038 is the best result...
[image]

Hi, sorry to bother you.
I just finished the training process (30 epochs) and got images containing no lines, while the junction part works fine, as shown below. There are no connections between the junctions.
[image]
[image]
Due to the large GPU memory requirements and the limitations of my computer, I changed --batch-size from 16 to 1. Does this have any effect on the results? Could that be the reason for what I'm seeing?
Thank you for your attention.
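One speculative thing worth checking: if the backbone uses BatchNorm, a batch size of 1 makes its per-batch statistics degenerate (each "batch mean" is a single sample), which by itself can wreck training. A common workaround, sketched below and not something the repo provides, is to freeze BN layers to their running statistics:

```python
import torch.nn as nn

def freeze_batchnorm(model: nn.Module) -> None:
    """Put every BatchNorm layer in eval mode and stop updating its affine
    parameters, so batch-size-1 statistics no longer feed the normalization."""
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()  # use running statistics instead of per-batch ones
            for p in m.parameters():
                p.requires_grad_(False)
```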

I want to know what is inside the '*.lg' files, so I wrote code like this:

    import pickle
    with open("C:\\YMW\\YMWbiye\\PPGNet-master\\SIST (1)\\indoorDist\\train\\00030077.lg", "rb") as f:
        data = pickle.load(f)

but it fails with "ValueError: Buffer dtype mismatch, expected 'ITYPE_t' but got 'long long'".
What is wrong with this, and how can I solve it?

@allankevinrichie Hi, I followed your code and trained the model, but I cannot get results like those you presented when running test.sh.
p2

result

0001

result1

@mingweiY Thanks for your reply. I resized the height and width to the same size, didn't change the loss or the subsequent processing, and used the defaults for everything else. At first I just added the image path in test.sh: python test.py
--exp-name line_weighted_wo_focal_junc --backbone resnet50
--backbone-kwargs '{"encoder_weights": "ckpt/backbone/encoder_epoch_20.pth", "decoder_weights": "ckpt/backbone/decoder_epoch_20.pth"}'
--dim-embedding 256 --junction-pooling-threshold 0.2
--junc-pooling-size 64 --block-inference-size 128
--gpus 0, --resume-epoch latest
--vis-junc-th 0.25 --vis-line-th 0.25
- test $1
--path-to-image PATH_TO_IMAGE /mnt/lustre/chenchang1/data/line_test_data/p2.png
However, an error occurred:
result = fn(*varargs, **kwargs)
File "test.py", line 103, in test
img = cv2.resize(img, (self.img_size, self.img_size))
cv2.error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/resize.cpp:3720: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

Afterwards I set the path directly in imread and got the results above.
Did you manage to reproduce the results the author provided? If so, please share your settings.


Hello, I'm facing the same problem here. Can you tell me how to create a '.lg' file from my own data?


Having the same question here; I'd also like to know how to get a '.lg' file from my own data. Thank you.


You can find a demo of how to create a '.lg' file in tools/rebuild_yorkurban.py.
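tools/rebuild_yorkurban.py is the authoritative reference for the actual '.lg' layout; the sketch below only illustrates the general idea (junction coordinates plus connected index pairs, pickled to disk), and the dict keys here are made up rather than taken from the repo:

```python
import pickle
import numpy as np

# Made-up schema for illustration only; mirror tools/rebuild_yorkurban.py
# for the real field names and types.
junctions = np.array([[10.0, 20.0], [100.0, 20.0], [100.0, 80.0]])  # (x, y) per junction
lines = [(0, 1), (1, 2)]  # pairs of junction indices joined by a line segment

with open("example.lg", "wb") as f:
    pickle.dump({"junctions": junctions, "lines": lines}, f)

# Reading it back recovers the same structure.
with open("example.lg", "rb") as f:
    data = pickle.load(f)
print(sorted(data.keys()))  # ['junctions', 'lines']
```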