Validating distances against reference implementations

Question

sdmccabe opened this issue 5 years ago · 10 comments

Answer 1 · 2019-08-23T17:39:54.000Z

There are differences in output between our distance and the reference implementation of Portrait Divergence, but the differences are consistently small (the largest I've seen is 0.005, and it's usually more like 0.001). I'll keep investigating but I'd guess it's nothing.
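One way to keep track of discrepancies like these is to record the largest absolute difference over a batch of graph pairs and compare it against a tolerance. The following is only a minimal sketch: `ours` and `reference` are hypothetical callables standing in for the two implementations (neither is a real netrd or reference API), and the default tolerance of 0.005 is simply the largest gap mentioned above, not an agreed threshold.

```python
def max_abs_difference(ours, reference, pairs):
    """Largest |ours(a, b) - reference(a, b)| over an iterable of input pairs."""
    return max(abs(ours(a, b) - reference(a, b)) for a, b in pairs)


def within_tolerance(ours, reference, pairs, tol=0.005):
    """True if the two implementations never disagree by more than `tol`."""
    return max_abs_difference(ours, reference, pairs) <= tol


# Toy stand-ins (assumptions for illustration only): two "distances" on
# plain numbers that differ by a constant offset of 0.001.
def toy_ours(a, b):
    return abs(a - b)


def toy_reference(a, b):
    return abs(a - b) + 0.001


pairs = [(0.0, 1.0), (2.0, 5.0), (3.0, 3.0)]
print(within_tolerance(toy_ours, toy_reference, pairs))
```

In practice the pairs would be graphs and the callables would wrap each implementation, but the comparison logic stays the same.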

Answer 2 · 2019-08-23T17:52:07.000Z

We should bump the PyPI version after finishing this.

Answer 3 · 2019-08-23T20:13:17.000Z

@leotrs I've checked off NBD because I assume the implementations are the same.

Answer 4 · 2019-08-25T17:57:47.000Z

HIM is producing different outputs from the R NetworkDistance implementation for RGGs (N=200, p=0.26, using the edgelists from the graphwend repo); will need to investigate further.

Answer 5 · 2019-08-26T14:31:32.000Z

> @leotrs I've checked off NBD because I assume the implementations are the same.

At this point I wouldn't be surprised if netrd's implementation is more up to date than mine. In any case, you can skip NBD, since I am the maintainer of the other implementation. If the outputs from the two repos differ, netrd's are probably the correct ones...

Answer 6 · 2019-08-26T14:33:33.000Z

For NetSimile, I found this and this. Haven't compared them yet tho.

Answer 7 · 2019-08-26T14:55:00.000Z

NetSimile is a frustrating one since there isn't a reference implementation in the sense of the authors' own code, so we're assuming the other independent implementations are correct. When I was debugging some NetSimile issues back in the spring, I remember comparing our outputs to those from the netcomp library; I don't know if anything has changed since, but I believe they were producing similar or identical outputs.

Answer 8 · 2019-08-26T15:00:27.000Z

Then we could use it only as a touchstone. As long as we're in their ballpark, we're good.

Answer 9 · 2019-10-09T17:49:10.000Z

Frobenius and Jaccard depend on row ordering, yes?

Unrelatedly, they both seem to be simple enough that we can just check them off?

Answer 10 · 2019-10-09T17:52:18.000Z

They should depend on row ordering; @jkbren would be able to confirm from his experiments.

They're probably simple enough to check off, but simplicity can be deceiving; see the issues we had with Jaccard before in #180.
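To illustrate the row-ordering point, here is a small sketch using textbook definitions, which are assumptions and may differ in detail from what netrd implements: the Frobenius distance as the Frobenius norm of the difference of adjacency matrices, and the Jaccard distance as one minus the ratio of shared edges to the union of edges. Relabelling the nodes of a three-node path gives an isomorphic graph, yet both distances come out nonzero.

```python
import numpy as np


def frobenius_distance(A1, A2):
    """Frobenius norm of the difference of two adjacency matrices."""
    return float(np.linalg.norm(A1 - A2, ord="fro"))


def jaccard_distance(A1, A2):
    """1 - |E1 & E2| / |E1 | E2| for undirected, unweighted graphs."""
    # Use the strict upper triangle so each undirected edge is counted once.
    e1 = np.triu(A1, k=1) > 0
    e2 = np.triu(A2, k=1) > 0
    union = np.logical_or(e1, e2).sum()
    if union == 0:
        return 0.0  # two empty graphs: treat as identical
    return float(1.0 - np.logical_and(e1, e2).sum() / union)


# Path graph 0 - 1 - 2.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])

# Same graph with nodes 1 and 2 swapped (rows and columns permuted together).
perm = [0, 2, 1]
A_relabelled = A[np.ix_(perm, perm)]

print(frobenius_distance(A, A_relabelled))  # nonzero despite isomorphism
print(jaccard_distance(A, A_relabelled))    # nonzero despite isomorphism
```

So any comparison against another implementation would need both sides to see the nodes in the same order.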