sfcheung/semptools

Problem with custom node labels

marklhc opened this issue · 10 comments

library(lavaan)
library(semPlot)
library(semptools)
mod_pa <- 
 'x1 ~~ x2
  x3 ~  x1 + x2
  x4 ~  x1 + x3
 '
fit_pa <- lavaan::sem(mod_pa, pa_example)

In semPaths() one can specify custom labels using the nodeLabels
argument.

# Use custom labels
m <- matrix(c("Var1",   NA,  NA,   NA,
              NA, "Var3",  NA, "Var4",
              "Var2",   NA,  NA,   NA), byrow = TRUE, 3, 4)
p_pa <- semPaths(fit_pa, whatLabels = "est",
                 sizeMan = 10,
                 edge.label.cex = 1.15,
                 style = "ram", 
                 nodeLabels = c("Var3", "Var4", "Var1", "Var2"), 
                 layout = m)

However, using custom labels breaks mark_sig() and, similarly,
mark_se(), because the node labels differ from the variable names in
the lavaan object.

p_pa2 <- mark_sig(p_pa, fit_pa)
plot(p_pa2)  # No "***" added

There may not be an easy fix as the qgraph object does not seem to
contain the original variable names. One option is to have users input
the custom labels as a named list. Another is to print out a warning
cautioning that the use of custom labels is not supported.

It is solvable if users specify a named vector as custom labels:

# Use custom labels with named vector
p_pa <- semPaths(fit_pa, whatLabels = "est",
                 sizeMan = 10,
                 edge.label.cex = 1.15,
                 style = "ram", 
                 nodeLabels = c(
                   "x3" = "Var3", 
                   "x4" = "Var4", 
                   "x1" = "Var1", 
                   "x2" = "Var2"), 
                 layout = m)

In that case, the mapping between the variable names and the custom
labels is stored:

p_pa$graphAttributes$Nodes$names
##     x3     x4     x1     x2 
## "Var3" "Var4" "Var1" "Var2"
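
As a rough illustration of how that stored mapping could be used, here is a minimal sketch in base R. It works on a mock list that only imitates the part of the qgraph object shown above; label_to_varname() is a hypothetical helper for illustration, not a function in semptools.

```r
# Mock of the relevant part of a qgraph object.
# Assumption: the structure mirrors the output shown above.
p_mock <- list(graphAttributes = list(Nodes = list(
  names = c(x3 = "Var3", x4 = "Var4", x1 = "Var1", x2 = "Var2"))))

# Hypothetical helper: map displayed labels back to the original
# variable names, using the names of the stored named vector.
label_to_varname <- function(p, labels) {
  node_names <- p$graphAttributes$Nodes$names
  # Without names there is no mapping; return the labels unchanged.
  if (is.null(names(node_names))) return(labels)
  lookup <- setNames(names(node_names), node_names)
  unname(lookup[labels])
}

label_to_varname(p_mock, c("Var1", "Var4"))
## [1] "x1" "x4"
```

Labels that are not in the mapping come back as NA here, so a caller would need to decide how to handle them.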

P.S. A similar issue occurs when semPaths() abbreviates the names of
the variables.

I will draft a function in the branch match_node_names to address this issue.

We decided not to address this issue by matching node names. Instead, another function for changing node names will be written.

I will draft the function in this branch:
change_node_labels

I added an experimental function change_node_label2(), which takes a named list as input, but then I found out your suggestion in #53. The function is not currently exported.

A bug has now been identified: set_curve() and set_edge_label_position() give an error when applied to objects returned by change_node_label(). This still needs more investigation, but it is most likely because these functions do not use the names of the named vector returned. @sfcheung, do you think we should change all functions so that they fall back to the names of the named vector/list for the labels if they cannot find a match?

I drafted the function to_list_of_lists in the branch change_node_labels to make it easier to specify the elements to be changed by supplying a named vector rather than a list of lists. We could have used this approach for all relevant functions, but it is too late to rewrite them now. There may also be situations in which a named vector will not work. Therefore, we can revise the relevant functions such that, if they detect that the input is a named vector rather than a list of lists, it is converted to a list of lists internally. Then all previous examples will still work, while users have two options for specifying the nodes or paths to be changed.
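
To make the idea concrete, here is a minimal sketch of that kind of internal conversion in base R. It is not the code of to_list_of_lists; named_vector_to_list_of_lists() is a hypothetical helper, and the element names "node" and "to" are assumptions chosen for the node-label case.

```r
# Hypothetical helper: accept either a list of lists or a named
# vector/list, and always return a list of lists.
named_vector_to_list_of_lists <- function(x, name1 = "node", name2 = "to") {
  # Already a list of lists: return unchanged so old calls still work.
  if (is.list(x) && all(vapply(x, is.list, logical(1)))) return(x)
  if (is.null(names(x)) || any(names(x) == "")) {
    stop("x must be a list of lists or a fully named vector/list")
  }
  # One inner list per element: the name goes to name1, the value to name2.
  mapply(function(nm, val) setNames(list(nm, val), c(name1, name2)),
         names(x), x, SIMPLIFY = FALSE, USE.NAMES = FALSE)
}

str(named_vector_to_list_of_lists(c(x1 = "Var1", x3 = "Var3")))
```

With this shape, a function could call the helper on its input first and then run its existing list-of-lists logic unchanged.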

P.S.: I forgot which branch is the active branch that we should work on. Given that change_node_labels involves an urgent issue to handle, I drafted that function there.

Agree, @marklhc . I will revise related functions in the branch change_node_labels.

@marklhc, I revised edge_index(), which is used by set_curve() and set_edge_label_position(), following your advice on using the names of the named vector (...$graphAttributes$Nodes$names) if no match is found. This should solve the bug you found (the tests in test-change-node have passed). I will check which other functions need to be revised similarly (I created #58 to keep track of this task).
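
The fallback described here can be sketched roughly as follows in base R. This is not the actual edge_index() code; node_index() is a hypothetical helper, and the mock object assumes that the displayed labels sit in ...$graphAttributes$Nodes$labels as a character vector.

```r
# Mock of a qgraph object after the node labels were changed.
# Assumption: labels are a character vector in the same order as names.
p_mock <- list(graphAttributes = list(Nodes = list(
  labels = c("Var3", "Var4", "Var1", "Var2"),
  names  = c(x3 = "Var3", x4 = "Var4", x1 = "Var1", x2 = "Var2"))))

# Hypothetical helper: find the position of a node by its label and,
# if no match is found, fall back to the names of the named vector.
node_index <- function(p, id) {
  nodes <- p$graphAttributes$Nodes
  i <- match(id, nodes$labels)
  if (is.na(i)) {
    i <- match(id, names(nodes$names))
  }
  i
}

node_index(p_mock, "Var4")  # matched by the displayed label
## [1] 2
node_index(p_mock, "x1")    # matched by the original variable name
## [1] 3
```

An id that matches neither the labels nor the names yields NA, which a caller could turn into an informative error.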

I revised change_node_label() based on change_node_label2(), adopting your approach to finding a match and allowing a named list (rather than a list of lists) as input.

I am closing this issue because it should have been fixed when addressing other issues. The branch https://github.com/sfcheung/semptools/tree/change_node_labels up to 2fb1d0b passed all tests. The remaining tasks are addressed in #58. If necessary, we can reopen this issue later.