How to extract n_neighbors for a sample?

Question

komalsrathi opened this issue 4 years ago · 2 comments

Hi,

I want to know if it is possible to get the n nearest neighbors for a sample of interest.

# reproducible data frame
library(umapr)  # provides umap()
library(dplyr)  # provides %>% and select()
set.seed(100)
df <- matrix(rnorm(100), nrow = 5, ncol = 20)
df <- as.data.frame(df)
rownames(df) <- paste0('row_', rownames(df))
colnames(df) <- paste0('col_', colnames(df))

# umap
ump <- umap(d = t(df), n_neighbors = 3, n_components = 2, metric = "correlation", random_state = 123L)
ump <- ump %>% 
  dplyr::select(UMAP1, UMAP2)
ump

             UMAP1      UMAP2
col_V1    1.248055 -3.2721519
col_V2   -2.732246  0.8900975
col_V3    1.382807 -3.7631097
col_V4   -1.719460  0.7712852
col_V5  -10.159772 -1.7583050
col_V6   -1.908897  0.4350810
col_V7  -11.244816 -1.9610249
col_V8    1.015908 -3.0224593
col_V9  -11.866040 -2.0123785
col_V10  -3.039619  1.3534858
col_V11  -8.347816 -1.2767988
col_V12  -8.580935 -0.3583812
col_V13   1.011420 -3.5656300
col_V14 -10.902308 -2.0594218
col_V15  -8.433976 -0.9522216
col_V16  -8.201790 -0.6048836
col_V17  -1.972616  1.2094533
col_V18  -9.591088 -1.5964160
col_V19 -11.605387 -2.2752790
col_V20  -2.681429  1.3110994

How do I extract, say, the 3 closest neighbors to col_V15?

komalsrathi commented 4 years ago

Thanks!

Answer 1 · 2020-08-06T00:15:17.000Z

Hi there,

If I understand correctly, you would like to calculate the distance from each row to each other row and then find the rows that are nearest each other. This function is not implemented in `umapr`. You can do it with the `dist` function that is in the `stats` package, which comes with base R.

```
x <- rnorm(10)
y <- rnorm(10)
d <- data.frame(x, y)
rownames(d) <- letters[1:10]
distance_matrix <- dist(d)
sort(as.matrix(distance_matrix)[ , "e"])
```

This code will print the distances of each row from row "e" in increasing order of distance. You can subset for the number of rows you want (note that distance to itself is included).

Hope that helps!
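Applied to the example in the question, the same `dist` approach might look like the sketch below. The helper `nearest_neighbors` is a hypothetical name, not part of `umapr`, and the coordinates are copied from the printed embedding in the question so that the block runs without calling `umap()`. It assumes Euclidean distance in the 2-D embedding is the notion of "closest" you want.

```r
# Hypothetical helper (not part of umapr): return the k rows of `mat`
# closest to the row named `target`, excluding the target itself.
nearest_neighbors <- function(mat, target, k = 3, method = "euclidean") {
  mat <- as.matrix(mat)
  stopifnot(target %in% rownames(mat), k < nrow(mat))
  d <- as.matrix(dist(mat, method = method))[, target]
  d <- d[names(d) != target]  # drop the zero distance to itself
  head(sort(d), k)            # k smallest distances, named by row
}

# Embedding coordinates copied from the output printed in the question
ump <- read.table(header = TRUE, row.names = 1, text = "
sample  UMAP1      UMAP2
col_V1    1.248055 -3.2721519
col_V2   -2.732246  0.8900975
col_V3    1.382807 -3.7631097
col_V4   -1.719460  0.7712852
col_V5  -10.159772 -1.7583050
col_V6   -1.908897  0.4350810
col_V7  -11.244816 -1.9610249
col_V8    1.015908 -3.0224593
col_V9  -11.866040 -2.0123785
col_V10  -3.039619  1.3534858
col_V11  -8.347816 -1.2767988
col_V12  -8.580935 -0.3583812
col_V13   1.011420 -3.5656300
col_V14 -10.902308 -2.0594218
col_V15  -8.433976 -0.9522216
col_V16  -8.201790 -0.6048836
col_V17  -1.972616  1.2094533
col_V18  -9.591088 -1.5964160
col_V19 -11.605387 -2.2752790
col_V20  -2.681429  1.3110994
")

# Three nearest neighbors of col_V15 in the embedding
nn <- nearest_neighbors(ump, "col_V15", k = 3)
print(names(nn))  # "col_V11" "col_V16" "col_V12"
```

Note that these are neighbors in the 2-D embedding, which need not match the neighbors UMAP used in the input space. If you want the latter under a correlation-based distance, one option (assuming correlation distance is taken as 1 minus the Pearson correlation) is to sort `1 - cor(df)[, "col_V15"]`, where `df` is the original data frame with samples as columns.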
