const-ae/ggupset

joining two datasets

yusufzhc opened this issue · 6 comments

Hi,

Is it possible to compare two datasets? G and F contain gene names.

setA <- fromList(list(G1,G2,G3,G4))
setB <- fromList(list(F1,F2,F3,F4))

I am trying to make a figure similar to this one:

https://twitter.com/jamesacotton/status/733373416179388416/photo/1

Also, is it possible to get a matrix with the gene names as well? The fromList command returns a matrix without names.

Hi Yusufkhan,

Is it possible to compare two datasets?

Sure, as you noted in #4, I made a ggplot2 plot using geom_bar(position = "dodge") to compare datasets, similar to your example from Twitter :)

However, with the information you have provided so far, I cannot really help you. Could you say a bit more about the specific problem you run into when you try to plot two datasets? Please make a reproducible example using the reprex package.

Best, Constantin

# Letters stand for gene names

G1 <- c("A","B","C","D","E","F","G","H","I","J","K","L","M","N")
G2 <- c("C","D","G","H","I","M","N")
G3 <- c("A","B","C","D","E","F","H","Y","Z","W","Y","P")
G4 <- c("M","N","O","P","E","Q","Z","L","S","I")
F1 <- c("D","G","J","L","P","U","T","E","Q","Z","N")
F2 <- c("Q","W","E","R")
F3 <- c("A","C","D","E","R")
F4 <- c("K","L","M","N","O","Z")
setA <- list(a=G1,b=G2,c=G3,d=G4)
setB <- list(a=F1,b=F2,c=F3,d=F4)
x<-(fromList(setA))
y<-(fromList(setB))
upset(x)
upset(y)

#Now I want to join setA and setB

I find that the data structure is different from #4, and so is the length, which is why I got stuck.

Okay, thanks, that is already helpful :). Next time, remember to copy the output from reprex, which makes it even easier for me to see what is happening:

library(UpSetR)
#> Warning: package 'UpSetR' was built under R version 4.0.2
G1 <- c("A","B","C","D","E","F","G","H","I","J","K","L","M","N")
G2 <- c("C","D","G","H","I","M","N")
G3 <- c("A","B","C","D","E","F","H","Y","Z","W","Y","P")
G4 <- c("M","N","O","P","E","Q","Z","L","S","I")
F1 <- c("D","G","J","L","P","U","T","E","Q","Z","N")
F2 <- c("Q","W","E","R")
F3 <- c("A","C","D","E","R")
F4 <- c("K","L","M","N","O","Z")
setA <- list(a=G1,b=G2,c=G3,d=G4)
setB <- list(a=F1,b=F2,c=F3,d=F4)
x<-(fromList(setA))
y<-(fromList(setB))
upset(x)

upset(y)

Based on your description and the code in the README section on how to reshape rectangular data, I would do something like this:

library(tidyverse)


tidy_pathway1 <- t(x == 1) %>%
  as_tibble(rownames = "Pathway") %>%
  gather(Gene, Member, -Pathway) %>%
  filter(Member) %>%
  select(- Member)

tidy_pathway2 <- t(y == 1) %>%
  as_tibble(rownames = "Pathway") %>%
  gather(Gene, Member, -Pathway) %>%
  filter(Member) %>%
  select(- Member)

summarized_pathways1 <- tidy_pathway1 %>%
  group_by(Gene) %>%
  summarize(Pathways = paste0(sort(Pathway), collapse = "-")) %>%
  mutate(origin = "G")
#> `summarise()` ungrouping output (override with `.groups` argument)

summarized_pathways2 <- tidy_pathway2 %>%
  group_by(Gene) %>%
  summarize(Pathways = paste0(sort(Pathway), collapse = "-")) %>%
  mutate(origin = "F")
#> `summarise()` ungrouping output (override with `.groups` argument)

bind_rows(summarized_pathways1, summarized_pathways2) %>%
  ggplot(aes(x = Pathways)) +
  geom_bar(aes(fill = origin), position = position_dodge2(preserve = "single")) +
  ggupset::axis_combmatrix(sep ="-")

Created on 2020-09-18 by the reprex package (v0.3.0)

I know the result may not be perfect yet, but I hope it gets you started :)
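As noted earlier in the thread, the output of fromList() carries no gene names, so the Gene column in the tables above would presumably hold row indices rather than genes. The following is a minimal sketch, assuming base R only, of building the same kind of table (one row per gene, set names joined by "-") directly from the named lists so that the gene names survive. The helper name summarize_sets and the toy sets are hypothetical and not part of ggupset or UpSetR.

```r
# Hypothetical toy sets; letters stand for gene names.
setA <- list(a = c("A", "B", "C"), b = c("B", "C", "D"), c = c("C", "Y", "Y"))
setB <- list(a = c("A", "D"), b = c("D", "E"))

# Collapse a named list of gene vectors into one row per gene, with the
# names of all sets that contain the gene joined by "-".
summarize_sets <- function(sets, origin) {
  # stack() gives a long table: values = gene, ind = set name.
  # unique() drops genes listed twice within the same set.
  long <- unique(stack(sets))
  combos <- tapply(as.character(long$ind), long$values,
                   function(s) paste(sort(s), collapse = "-"))
  data.frame(Gene = names(combos),
             Pathways = as.vector(combos),
             origin = origin,
             stringsAsFactors = FALSE)
}

# Join the two datasets by stacking their per-gene summaries.
combined <- rbind(summarize_sets(setA, "G"), summarize_sets(setB, "F"))
```

A table like combined has the same three columns (Gene, Pathways, origin) as the bind_rows() result above, so it should be usable in the same ggplot() call.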

Hi,

It worked very well. Thank you.

Is it also possible to get a matrix that way?

fromList() does not produce a matrix from which we can identify which gene is present in which group. I am using Perl for that.

Can I add geom_text(aes(label=y),position=dodged)?

Is it also possible to get a matrix that way?

I am not quite sure I understand your question. However, I am sure that if you can somehow do it in Perl, there is also a way to do it in R :)
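If the goal is a 0/1 membership matrix that keeps the gene names, one possible base R approach is sketched below. It assumes a named list of character vectors like setA in the thread. The toy sets and the variable names genes and membership are made up for illustration.

```r
# Hypothetical toy sets; letters stand for gene names.
sets <- list(a = c("A", "B", "C"), b = c("B", "C", "D"), c = c("C", "Y", "Y"))

# All distinct genes across the sets, in alphabetical order.
genes <- sort(unique(unlist(sets)))

# One column per set, one row per gene: 1 if the gene is in the set, else 0.
membership <- sapply(sets, function(s) as.integer(genes %in% s))
rownames(membership) <- genes

# Example lookup: which genes are members of set "b"?
rownames(membership)[membership[, "b"] == 1]
```

The columns follow the same 0/1 layout as the fromList() output, but each row is labelled with its gene.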

Can I add geom_text(aes(label=y),position=dodged)?

The great thing about ggupset is that you can apply all the tricks from ggplot. For example, take a look at https://github.com/const-ae/ggupset#adding-numbers-on-top
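For the dodged bars used in this thread, a count label per bar could look roughly like the sketch below. It assumes ggplot2 3.3.0 or later (for after_stat()) and an installed ggupset. The data frame combined is a hypothetical stand-in for the bind_rows() result from earlier, and the width and vjust values are guesses that may need tuning to line the labels up with the bars.

```r
library(ggplot2)

# Hypothetical stand-in for the combined table: one row per gene,
# set names joined by "-", plus the dataset each row came from.
combined <- data.frame(
  Pathways = c("a", "a", "a-b", "a-b", "b", "a", "a-b"),
  origin   = c("G", "G", "G", "F", "F", "F", "F")
)

p <- ggplot(combined, aes(x = Pathways)) +
  geom_bar(aes(fill = origin),
           position = position_dodge2(preserve = "single")) +
  # stat = "count" recomputes the bar heights for the labels; group = origin
  # makes the labels dodge in the same way as the bars.
  geom_text(aes(label = after_stat(count), group = origin),
            stat = "count",
            position = position_dodge2(width = 0.9, preserve = "single"),
            vjust = -0.5) +
  ggupset::axis_combmatrix(sep = "-")
```

Printing p draws the plot. The label is mapped to the computed count rather than to a column named y, because the bar heights only exist after stat_count has run.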

If it is okay with you, I will close this issue, as my previous post seems to have answered your question.

Best,
Constantin

Thank you.