How to extract n_neighbors for a sample?
komalsrathi opened this issue · 2 comments
komalsrathi commented
Hi,
I want to know if it is possible to get the n neighbors for a sample of interest.
```
# reproducible data frame
set.seed(100)
df <- matrix(rnorm(1:100), nrow = 5, ncol = 20)
df <- as.data.frame(df)
rownames(df) <- paste0('row_', rownames(df))
colnames(df) <- paste0('col_', colnames(df))

# umap
ump <- umap(d = t(df), n_neighbors = 3, n_components = 2, metric = "correlation", random_state = 123L)
ump <- ump %>%
  dplyr::select(UMAP1, UMAP2)
ump
```
```
             UMAP1      UMAP2
col_V1    1.248055 -3.2721519
col_V2   -2.732246  0.8900975
col_V3    1.382807 -3.7631097
col_V4   -1.719460  0.7712852
col_V5  -10.159772 -1.7583050
col_V6   -1.908897  0.4350810
col_V7  -11.244816 -1.9610249
col_V8    1.015908 -3.0224593
col_V9  -11.866040 -2.0123785
col_V10  -3.039619  1.3534858
col_V11  -8.347816 -1.2767988
col_V12  -8.580935 -0.3583812
col_V13   1.011420 -3.5656300
col_V14 -10.902308 -2.0594218
col_V15  -8.433976 -0.9522216
col_V16  -8.201790 -0.6048836
col_V17  -1.972616  1.2094533
col_V18  -9.591088 -1.5964160
col_V19 -11.605387 -2.2752790
col_V20  -2.681429  1.3110994
```
How do I extract, say, the 3 closest neighbors to col_V15?
seaaan commented
Hi there,
If I understand correctly, you would like to calculate the distance from
each row to each other row and then find the rows that are nearest each
other. This function is not implemented in `umapr`. You can do it with the
`dist` function that is in the `stats` package, which comes with base R.
```
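# Example data: 10 points named "a" through "j"
x <- rnorm(10)
y <- rnorm(10)
d <- data.frame(x, y)
rownames(d) <- letters[1:10]
# Pairwise Euclidean distances between rows
distance_matrix <- dist(d)
# Distances from every row to row "e", nearest first
sort(as.matrix(distance_matrix)[ , "e"])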
```
This code prints the distance from row "e" to every row, in increasing
order. You can subset the result to the number of rows you want (note
that row "e" itself is included, at distance 0, so skip that entry).
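For the example in the question, the same idea could be wrapped up roughly like this. It is only a sketch: `nearest_neighbors` is a hypothetical helper (not something `umapr` provides), and the data frame below copies just a few rows of the embedding printed above so the snippet runs on its own. It uses Euclidean distance in the 2-D embedding, which is not necessarily the same neighborhood UMAP used internally with `metric = "correlation"`.

```r
# Hypothetical helper (not part of umapr): the k rows closest to `target`.
nearest_neighbors <- function(coords, target, k = 3) {
  # Pairwise Euclidean distances between all rows, as a full square matrix
  dm <- as.matrix(dist(coords))
  # Distances from `target` to every other row, leaving out `target` itself
  d_target <- dm[target, setdiff(rownames(dm), target)]
  # Keep the k smallest distances, nearest first
  head(sort(d_target), k)
}

# A few rows of the embedding printed in the question
ump <- data.frame(
  UMAP1 = c(-8.347816, -8.580935, -8.433976, -8.201790, -10.159772, 1.248055),
  UMAP2 = c(-1.2767988, -0.3583812, -0.9522216, -0.6048836, -1.7583050, -3.2721519),
  row.names = c("col_V11", "col_V12", "col_V15", "col_V16", "col_V5", "col_V1")
)

# Named vector of distances; the names are the neighboring samples
nn <- nearest_neighbors(ump, "col_V15", k = 3)
names(nn)
```

With your real data you would pass the full `ump` data frame instead of the subset.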
Hope that helps!
komalsrathi commented
Thanks!