trinker/clustext

Add iterative argument/function

Opened this issue · 2 comments

Add an n-pass iterative partition that splits the large mass into k clusters on each pass. There would be an argument that takes a length-n vector of k values, one per split. Not sure whether this should be a separate function called after xxx_cluster. That seems to make the most sense.
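A rough sketch of what that iterative split could look like, using only base R on a numeric vector. The function name `iterative_split` and its arguments are hypothetical (nothing like this exists in clustext yet), and it assumes "large mass" means the currently largest cluster:

```r
## Hypothetical helper, not part of clustext.
## `k` is a vector of splits: pass i splits the current largest cluster
## into k[i] sub-clusters using hclust() + cutree().
iterative_split <- function(x, k, method = "average") {
    stopifnot(is.numeric(x), length(k) >= 1, all(k >= 2))
    groups <- rep(1L, length(x))          # everything starts in one cluster
    for (ki in k) {
        sizes <- table(groups)
        biggest <- as.integer(names(sizes)[which.max(sizes)])
        idx <- which(groups == biggest)
        if (length(idx) < ki) break       # too few points to split into ki parts
        sub <- cutree(hclust(dist(x[idx]), method = method), k = ki)
        ## sub-cluster 1 keeps the old label; the others get fresh labels
        new_labels <- c(biggest, max(groups) + seq_len(ki - 1))
        groups[idx] <- new_labels[sub]
    }
    groups
}

x <- c(1,2,3,4,5, 7,9,10,11,12, 19,24,28,32,38, 54)
split(x, iterative_split(x, k = c(2, 2)))
```

Each pass adds `k[i] - 1` clusters, so `k = c(2, 2)` ends with three groups. A real version would presumably take the output of an xxx_cluster function rather than a raw vector.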

ALTERNATIVE...

Investigate dynamic tree cut: https://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/WORKSHOP/2013/LectureHierarchicalClusteringLangfelder1.pdf

https://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/BranchCutting/Supplement.pdf

https://cran.r-project.org/web/packages/dynamicTreeCut/

https://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/BranchCutting/

This might require a tree object and may not be usable by all clustering algorithms.
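To illustrate the concern, here is a small check, assuming "tree object" means an `hclust` fit. The helper name `has_tree` is made up for this sketch:

```r
## Hypothetical helper: does this clustering result carry a dendrogram
## that a tree-cutting routine could work on?
has_tree <- function(fit) inherits(fit, "hclust")

x <- c(1,2,3,4,5, 7,9,10,11,12, 19,24,28,32,38, 54)

## hierarchical fit: has a tree
has_tree(hclust(dist(x), method = "average"))

## k-means fit: no tree, so a tree-based cut would not apply
set.seed(1)
has_tree(kmeans(x, centers = 3))
```

A guard like this could be used to skip or error out for the non-hierarchical xxx_cluster methods.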

pacman::p_load(dynamicTreeCut)  # p_load() comes from the pacman package

## make data
data <- c(1,2,3,4,5, 7,9,10,11,12,  19,24,28,32,38, 54)
dim(data) <- c(1, length(data))
dissim <- dist(t(data))

## tree and plot
dendro <- hclust(dissim, method = "average")
plot(dendro)

##==========
## Cutting
##==========
cutree(dendro, h = 12)

## [1] 1 1 1 1 1 1 1 1 1 1 2 2 2 2 3 4

cutreeDynamic(
    dendro = dendro,
    cutHeight = NULL, 
    minClusterSize = 3,
    method = "tree", 
    deepSplit = TRUE
)

## [1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 0

split(data[1,], cutree(dendro, h = 12))

## $`1`
##  [1]  1  2  3  4  5  7  9 10 11 12
## 
## $`2`
## [1] 19 24 28 32
## 
## $`3`
## [1] 38
## 
## $`4`
## [1] 54

cuts <- cutreeDynamic(
    dendro = dendro,
    cutHeight = NULL, 
    minClusterSize = 3,
    method = "tree", 
    deepSplit = TRUE
)

split(data[1,], cuts)

## $`0`
## [1] 54
## 
## $`1`
## [1] 1 2 3 4 5
## 
## $`2`
## [1]  7  9 10 11 12
## 
## $`3`
## [1] 19 24 28 32 38

Learn more about the cutreeDynamic parameters (cutHeight, minClusterSize, method, deepSplit).

Use snake case instead of camel case for these argument names (e.g., min_cluster_size rather than minClusterSize).
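One way the renaming could look is a thin wrapper that maps snake-case arguments onto `cutreeDynamic()`. The wrapper name `cut_tree_dynamic` and its defaults are assumptions taken from the example call above, not an existing clustext function:

```r
## Hypothetical wrapper: snake_case front end for dynamicTreeCut::cutreeDynamic().
## Defaults mirror the example call in this issue, not the package defaults.
cut_tree_dynamic <- function(dendro, cut_height = NULL, min_cluster_size = 3,
    method = "tree", deep_split = TRUE, ...) {

    dynamicTreeCut::cutreeDynamic(
        dendro = dendro,
        cutHeight = cut_height,
        minClusterSize = min_cluster_size,
        method = method,
        deepSplit = deep_split,
        ...
    )
}

## Usage, guarded so the sketch still loads without dynamicTreeCut installed
if (requireNamespace("dynamicTreeCut", quietly = TRUE)) {
    x <- c(1,2,3,4,5, 7,9,10,11,12, 19,24,28,32,38, 54)
    dendro <- hclust(dist(x), method = "average")
    split(x, cut_tree_dynamic(dendro))
}
```

Anything not renamed here can still be passed through `...` under its original camel-case name.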