davidsjoberg/ggsankey

Add label to flow

jpaulitz opened this issue · 6 comments

Thanks for this great package!
While using ggsankey I sometimes would like to add the numbers of objects per flow to the visualization.
If I understand the documentation correctly this is not currently possible with geom_sankey_text, which is used to label the nodes only.
I am currently trying out some hacky ways to label the flows, but any pointers would be appreciated!

Great that you are working on that option!

I think there are 3 reasonable possibilities for the labels: Directly behind the source node, directly before the end node or in the middle of two nodes. I think the first two options will usually be most reasonable since these areas tend to be more homogeneous and not as messy as the midpoint between two nodes where the flows cross each other. The best option for me would be directly behind the source node in the middle (y-direction) of the flow.

Thanks much for this fantastic package, @davidsjoberg. I'm wondering if you've had a chance to build in this ^ functionality yet; I'm considering MacGyver-ing the code, but figured I'd check in before I do!

Hello @davidsjoberg, thank you for this nice package. I have the same need to label the flows. I tried to retrieve the flow positions from the figure layers and add the corresponding labels to them, but have not succeeded yet.
I'm wondering whether you have already implemented this, or have any suggestions on how to do it properly?

I came back to answer my own question. Here is a way to add the flow size to each flow.
Suppose df is the data returned by the function make_long(), and p is the sankey plot you've made based on df:

library(dplyr)    # %>%, group_by(), tally(), select(), distinct(), arrange()
library(tidyr)    # drop_na()
library(ggplot2)  # layer_data(), annotate()

# get flow size for each flow; rows with NA next_x/next_node (last stage) are dropped
flow_labels <- df %>%
  group_by(x, node, next_x, next_node) %>%
  tally() %>%
  ungroup() %>%
  drop_na()

# get corresponding positions of flows from the sankey plot
# (layer_data(p) reads the first layer, assumed here to be the flow layer)
flow_info <- layer_data(p) %>%
  select(xmax, flow_start_ymax, flow_start_ymin) %>%
  distinct() %>%
  arrange(xmax, flow_start_ymax)  # order the positions to match the order of flow_labels
rownames(flow_info) <- NULL  # drop the row indexes

# bind the flow positions and the corresponding labels; this relies on both
# tables having the same number of rows in the same order, so check the count first
stopifnot(nrow(flow_info) == nrow(flow_labels))
flow_info <- cbind(as.data.frame(flow_info), as.data.frame(flow_labels))

# add labels to the flows, vertically centred at the start of each flow
for (i in seq_len(nrow(flow_info))) {
  p <- p + annotate("text", x = flow_info$xmax[i],
                    y = (flow_info$flow_start_ymin[i] + flow_info$flow_start_ymax[i]) / 2,
                    label = sprintf("%d", flow_info$n[i]), hjust = -1,
                    size = 3)
}

It's not a very elegant solution, but it works for me.
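
To see what the counting step produces before touching the plot, here is a minimal sketch on a hand-made data frame. The toy data and its values are made up for illustration; it only assumes the column layout of make_long() output (x, node, next_x, next_node, with NA in next_x/next_node at the last stage). count() is used here as a shorthand for group_by() + tally().

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data shaped like make_long() output: one row per
# observation per stage; next_x / next_node are NA at the final stage.
df <- data.frame(
  x         = c("stage1", "stage1", "stage1", "stage2", "stage2", "stage2"),
  node      = c("A", "A", "B", "C", "C", "D"),
  next_x    = c("stage2", "stage2", "stage2", NA, NA, NA),
  next_node = c("C", "C", "D", NA, NA, NA)
)

flow_labels <- df %>%
  count(x, node, next_x, next_node) %>%  # one row per distinct flow, size in column n
  drop_na()                              # drop final-stage rows, which have no outgoing flow

print(flow_labels)
```

This leaves two flows, A to C with n = 2 and B to D with n = 1, sorted by the grouping columns. That sort order is what the position table has to match for the labels to land on the right flows.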

Hello @davidsjoberg, wondering if you have been able to build this enhancement? I did try @feizheng0209's workaround, but as they mentioned, it's not very elegant.