abduallahmohamed/Social-STGCNN

max_nodes = 88

arsalhuda24 opened this issue · 7 comments

Very interesting work. I have a question about the seq_to_nodes function: why is max_nodes set to 88, and how did you compute this number? Thanks.

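import numpy as np

def seq_to_nodes(seq_, max_nodes=88):
    # After squeeze, seq_ is expected to have shape (num_peds, 2, seq_len)
    seq_ = seq_.squeeze()
    seq_len = seq_.shape[2]

    # Fixed-size node array; rows past the real pedestrians stay zero
    V = np.zeros((seq_len, max_nodes, 2))
    for s in range(seq_len):
        step_ = seq_[:, :, s]
        for h in range(len(step_)):
            V[s, h, :] = step_[h]

    return V.squeeze()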

Hi,
This number was used in the early stages of developing the model. The maximum number of pedestrians in a single scene across the datasets is 88. In the early trials, as far as I recall, I wanted a placeholder such as zero and a fixed graph dimension. Also, this number doesn't affect the evaluation code. I pushed a fix to remove it and to make the evaluation metrics more optimized.
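
To illustrate the placeholder idea described above, here is a minimal sketch. It is not the code from the fix mentioned here. It assumes the input has already been squeezed to shape (num_peds, 2, seq_len), and the function names `pad_nodes` and `nodes_dynamic` are hypothetical. The first function pads every frame to a fixed node count with zeros; the second sizes the array from the input instead.

```python
import numpy as np

def pad_nodes(seq, max_nodes=88):
    """Return (seq_len, max_nodes, 2), zero-filled past the real pedestrians."""
    num_peds, _, seq_len = seq.shape
    if num_peds > max_nodes:
        raise ValueError("scene has more pedestrians than max_nodes")
    V = np.zeros((seq_len, max_nodes, 2))
    # seq[h, :, s] becomes V[s, h, :]
    V[:, :num_peds, :] = np.transpose(seq, (2, 0, 1))
    return V

def nodes_dynamic(seq):
    """Return (seq_len, num_peds, 2) with no padding."""
    return np.transpose(seq, (2, 0, 1))

# Toy scene: 3 pedestrians, 2 coordinates, 4 time steps (made-up values)
seq = np.arange(3 * 2 * 4, dtype=float).reshape(3, 2, 4)
padded = pad_nodes(seq)
dynamic = nodes_dynamic(seq)
```

Both versions carry the same values for the real pedestrians; the padded one only adds all-zero rows up to `max_nodes`.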

Thanks for the clarification. I used your code to train the model on the Lyft object detection dataset, but the predictions look very weird. I am using pixel coordinates, though. Do you have any idea what the possible reasons could be? I also tried different obs and pred lengths, but no luck.
[Image: Social-STGCNN_lyft]
blue : observed
green: ground truth
red: prediction (one out of 20 samples)

Thanks, Abduallah. Actually, I prepared the Lyft data in exactly the same way as the UCY/ETH datasets; the only difference is that it is in pixel coordinates. So I was wondering: shouldn't the adjacency matrix kernel construct the graph in the same way as it does for pedestrians? Or is it that pedestrians behave much more stochastically than vehicles, and that may be why it cannot capture accurate V2V interactions? I don't know; maybe I am missing something.

I also tried changing the kernel function as described by equations 8 and 9 in the paper, but no luck.

Thanks

I'm not sure about pixel coordinates vs. meters. But I don't think the current adjacency matrix design is suitable for cars, because you think about the cars around you differently than you do about pedestrians. So having a suitable kernel function can help a lot.
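
One concrete way units can interact with a distance-based kernel is sketched below. It assumes a kernel of the form 1/||v_i - v_j||, with 0 for coincident points, and a made-up scale of 50 pixels per meter; neither value comes from the Lyft data, and the function names are hypothetical. Under those assumptions the raw edge weights shrink by the scale factor in pixel units, while a symmetric normalization D^{-1/2} A D^{-1/2} cancels a uniform rescale. The node features themselves would still differ in magnitude, so this is a property of the assumed kernel, not a diagnosis of the results above.

```python
import numpy as np

def inverse_distance_adjacency(points):
    """points: (N, 2). Returns (N, N) weights 1/distance, 0 where distance is 0."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    A = np.zeros_like(dist)
    nonzero = dist > 0
    A[nonzero] = 1.0 / dist[nonzero]
    return A

def symmetric_normalize(A):
    """D^{-1/2} A D^{-1/2}, with D the diagonal matrix of row sums."""
    deg = A.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    return A * inv_sqrt[:, None] * inv_sqrt[None, :]

PIXELS_PER_METER = 50.0  # hypothetical scale, for illustration only
meters = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
pixels = meters * PIXELS_PER_METER

A_m = inverse_distance_adjacency(meters)
A_px = inverse_distance_adjacency(pixels)
A_m_norm = symmetric_normalize(A_m)
A_px_norm = symmetric_normalize(A_px)
```

A non-uniform change, such as perspective distortion in image space, would not cancel this way.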

Thanks

Thanks. I think it makes sense to change the adjacency matrix, because when I use eq. 8 and 9 the results are very different. I need to think about how to find a good kernel function.

Just curious: in the pedestrian case, does your work construct the graph between every pedestrian in the scene, or is there a fixed neighborhood? Is my understanding correct that the graph is dynamic as the frames progress?

Do you think having a fixed frame for graph construction might help in the case of vehicles?
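
The two options raised in this question can be sketched side by side. This is an illustration only: it assumes an inverse-distance kernel, and the `radius` argument and the names `frame_adjacency` and `sequence_adjacency` are hypothetical, not parameters of the repository. With `radius=None` every pair of agents in a frame is connected; with a radius, edges longer than it are dropped. The adjacency is rebuilt for each frame, so it changes as the agents move.

```python
import numpy as np

def frame_adjacency(points, radius=None):
    """points: (N, 2). Inverse-distance weights, optionally cut off at radius."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    A = np.zeros_like(dist)
    mask = dist > 0            # no self-loops, no division by zero
    if radius is not None:
        mask &= dist <= radius  # keep only neighbors within the radius
    A[mask] = 1.0 / dist[mask]
    return A

def sequence_adjacency(seq, radius=None):
    """seq: (seq_len, N, 2). Returns one (N, N) adjacency per frame."""
    return np.stack([frame_adjacency(frame, radius) for frame in seq])

# Two frames, three agents (made-up positions)
seq = np.array([
    [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]],
    [[0.0, 0.0], [2.0, 0.0], [4.0, 0.0]],
])
A_full = sequence_adjacency(seq)
A_local = sequence_adjacency(seq, radius=5.0)
```

In the first frame the third agent is farther than the radius from the other two, so the local graph isolates it; in the second frame all agents are close and both graphs coincide.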