Causal adjacency matrix returned in case of insignificant node(s)
geogian opened this issue · 2 comments
I am running multivariate analyses (both TE and MI) on a simulated dataset with 10 variables and a known causal structure. I then retrieve the adjacency matrix of the estimated causal graph with `results.get_adjacency_matrix` (binary weights) and subsequently export it to networkx with `io.export_networkx_graph`. Then I transform it to a numpy matrix with `nx.to_numpy_matrix`.
I have found that if a variable of the dataset is not deemed significant as either a source or a target (i.e. if, as a node, it is disconnected from the graph the method returns), it is not part of the graph visualization and is not included (as a zero row and column) in the corresponding adjacency matrix.
This yields an adjacency matrix of smaller dimension. In my example, using multivariate MI, one node is found insignificant, and the adjacency matrix is 9x9 instead of the 10x10 that would match the dimension of the original data.
When the causal ground truth of the dataset is known, it is of great interest to compare the estimated causal adjacency matrix with the ground truth adjacency matrix - and evaluate the performance of the method e.g. through binary classification metrics.
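To make the comparison concrete, here is a minimal sketch of the kind of evaluation I mean. The function name `edge_scores` is hypothetical, and ignoring the diagonal (self-loops) is my own assumption rather than anything IDTxl prescribes. Note that it only works if both matrices have the same shape.

```python
import numpy as np


def edge_scores(estimated, truth):
    """Return (precision, recall) of estimated edges against ground truth.

    Both inputs are square binary adjacency matrices of identical shape.
    Diagonal entries are ignored (assumption: self-loops are not evaluated).
    """
    est = np.asarray(estimated).astype(bool)
    tru = np.asarray(truth).astype(bool)
    if est.shape != tru.shape:
        raise ValueError(f"shape mismatch: {est.shape} vs {tru.shape}")

    # Keep only off-diagonal entries, i.e. candidate edges between distinct nodes.
    off_diag = ~np.eye(est.shape[0], dtype=bool)
    est, tru = est[off_diag], tru[off_diag]

    tp = int(np.sum(est & tru))    # edges found and present in the truth
    fp = int(np.sum(est & ~tru))   # edges found but absent from the truth
    fn = int(np.sum(~est & tru))   # true edges that were missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy example: truth has edges 0->1 and 1->2; the estimate has 0->1 and 0->2.
truth = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]])
estimated = np.array([[0, 1, 1], [0, 0, 0], [0, 0, 0]])
precision, recall = edge_scores(estimated, truth)
```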
It is also important to be able to iterate the analysis and evaluate the performance of the method over an arbitrary number of such datasets. Due to the aforementioned potential discrepancy in dimensions between the two matrices used to evaluate the performance of the method, this task is hard to automate, as inconsistencies in matrix dimensions will break the evaluation.
Is there a way to always retrieve the full-dimension adjacency matrix with the current IDTxl tools?
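In the meantime, a workaround along the following lines might help. It is only a sketch: `pad_adjacency` is a hypothetical helper, not part of IDTxl, and it assumes the labels of the nodes that survived are known as integer indices into the original variables (for example, taken from the exported graph's node list).

```python
import numpy as np


def pad_adjacency(reduced, present_nodes, n_nodes):
    """Embed a reduced adjacency matrix into an n_nodes x n_nodes zero matrix.

    reduced:       square matrix covering only the nodes in present_nodes
    present_nodes: original integer indices of those nodes, in matrix order
    n_nodes:       number of variables in the original dataset
    """
    reduced = np.asarray(reduced)
    idx = np.asarray(present_nodes, dtype=int)
    if reduced.shape != (idx.size, idx.size):
        raise ValueError("reduced matrix does not match present_nodes")

    full = np.zeros((n_nodes, n_nodes), dtype=reduced.dtype)
    # Write the reduced block into the rows and columns of the surviving nodes;
    # rows and columns of dropped nodes stay zero.
    full[np.ix_(idx, idx)] = reduced
    return full


# Toy example: node 1 of 3 was dropped, leaving a 2x2 matrix for nodes 0 and 2.
reduced = np.array([[0, 1], [0, 0]])
full = pad_adjacency(reduced, present_nodes=[0, 2], n_nodes=3)
```

If the networkx graph object is still at hand, adding the missing nodes with `add_nodes_from` and passing an explicit `nodelist` to `nx.to_numpy_array` should give the same result, but I have not checked how that interacts with the IDTxl export.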
Hi @geogian, this is actually a problem I have also encountered. It is quite easy to fix. I will put it on my to-do list and work on it ASAP. Thanks!
Hi @pwollstadt , I have the same needs as @geogian. I want to know when this bug will be fixed. Thanks a lot!