airalcorn2/RankNet

network prediction examples

shadiakiki1986 opened this issue · 1 comment

My 2 cents: the last part of your code that generates scores was unclear to me.
Instead, the examples below show how the trained network outputs something close to 1 when the two input vectors are uncorrelated, and something close to 0.5 when they are correlated.
Feel free to add them to your code.
Also, maybe a Jupyter notebook would make more sense, since GitHub can display the results without you having to copy-paste them into the code as comments.

(Pdb) model.predict([X_1[0:1,:], np.zeros(shape=(1,50))])
array([[0.9999989]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], np.ones(shape=(1,50))])
array([[0.9999285]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], np.random.randn(N, INPUT_DIM)])
array([[0.99992096]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], X_1[0:1,:]])
array([[0.5]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], X_2[0:1,:]])
array([[0.9995178]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], X_1[0:1,:]+0.1])
array([[0.47106665]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], X_1[0:1,:]+0.2])
array([[0.44332358]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], X_1[0:1,:]-0.2])
array([[0.5630561]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], X_1[0:1,:]+0.1*np.random.randn(N, INPUT_DIM)])
array([[0.46504748]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], X_1[0:1,:]+0.1*np.random.randn(N, INPUT_DIM)])
array([[0.5252977]], dtype=float32)
(Pdb) model.predict([X_1[0:1,:], X_1[0:1,:]+0.1*np.random.randn(N, INPUT_DIM)])
array([[0.4684026]], dtype=float32)
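
For anyone reproducing this: below is a minimal, self-contained sketch of a Siamese RankNet-style model in Keras that the predict() calls above assume. The layer sizes and the toy data are assumptions for illustration, not necessarily the repo's exact code; INPUT_DIM, N, X_1, and X_2 mirror the names used in the session.

# A minimal sketch (not necessarily the repo's exact code) of a Siamese
# RankNet-style model in Keras, just so the predict() calls above have a
# concrete model to run against.
import numpy as np
from tensorflow.keras.layers import Activation, Dense, Input, Subtract
from tensorflow.keras.models import Model, Sequential

INPUT_DIM = 50
N = 100

# Shared scoring network: maps a feature vector to a single relevance score.
base_network = Sequential([
    Dense(64, activation="relu", input_shape=(INPUT_DIM,)),
    Dense(1),
])

# Siamese wiring: score both inputs with the *same* network, take the
# difference of the scores, and squash it through a sigmoid to get
# P(item 1 should be ranked above item 2).
input_1 = Input(shape=(INPUT_DIM,))
input_2 = Input(shape=(INPUT_DIM,))
diff = Subtract()([base_network(input_1), base_network(input_2)])
prob = Activation("sigmoid")(diff)

model = Model(inputs=[input_1, input_2], outputs=prob)
model.compile(optimizer="adam", loss="binary_crossentropy")

# Toy data: X_1 rows should outrank X_2 rows, so every label is 1.
X_1 = np.random.uniform(0, 1, size=(N, INPUT_DIM))
X_2 = np.random.uniform(-1, 0, size=(N, INPUT_DIM))
model.fit([X_1, X_2], np.ones((N, 1)), epochs=10, verbose=0)

print(model.predict([X_1[0:1, :], X_2[0:1, :]]))  # should be close to 1 after training
print(model.predict([X_1[0:1, :], X_1[0:1, :]]))  # exactly 0.5: identical inputs, zero score difference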

The purpose of RankNet is to rank things (like documents), so I wanted to demonstrate how you would generate scores for ranking. My goal with the code was to show that, e.g., documents with higher similarity features relative to a query (i.e., X_1 vs. X_2) will produce higher scores and thus be ranked higher, but I realize now that using randn was not actually demonstrating that (it was more showing that things with "extreme" features produce higher scores). I updated my code to use uniform instead of randn to better support the document ranking interpretation.

Your examples show that similar input vectors produce similar scores, i.e., the difference of their scores is close to 0, which corresponds to a probability of 0.5 when passed through a sigmoid because 1 / (1 + e^0) = 0.5. That is exactly what you would expect, because the model is a Siamese network.
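
For completeness, here is a minimal sketch of the ranking step described above, assuming the base_network, X_1, and X_2 from the earlier sketch (illustrative names, not necessarily the repo's exact objects). At inference time you only need the shared scoring network, not the full Siamese pair.

# Minimal sketch of scoring and ranking with the shared sub-network from the
# sketch above (base_network, X_1, and X_2 are assumed from that sketch).
import numpy as np

docs = np.vstack([X_1[:5], X_2[:5]])         # ten candidate "documents"
scores = base_network.predict(docs).ravel()  # one relevance score per document
ranking = np.argsort(-scores)                # indices sorted best-first
print(ranking)                               # the X_1 rows (indices 0-4) should come first

# Why identical inputs give 0.5: the two halves share weights, so the score
# difference is 0 and sigmoid(0) = 1 / (1 + e^0) = 0.5.
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(0.0))  # 0.5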