FoldSets potentially match partial attribute names
jdurbin commented
When one attribute name is a substring of another, FoldSets can return the wrong fold set. For example, TP53 is a substring of TP53A, so if those are two different attributes, FoldSets may match the second when searching for the first. The match needs to be more precise than "contains". The regex matching below may do the trick; need to verify and make the change.
FoldSets getFoldSetsForAttribute(Attribute a){
    err.println "allfoldsets: "+this.map

    // Get foldsets matching fold\d+
    def foldKeys = map.keySet().grep(~/fold\d+/)
    def foldMap = map.subMap(foldKeys)

    // Get foldsets matching attribute name
    def attributeName = a.name()
    // Old match, to be replaced by the pattern match below:
    // def attributeKeys = map.keySet().grep(attributeName)

    // Attribute folds can be attribute name _Rep01, _Rep02, etc.
    // This keeps us from matching attributes whose names are substrings
    // of others, e.g. TP53Syn and TP53SynPlus: the second should not
    // match when the attribute is the first.
    def attributePattern = ~/($attributeName)(_Rep\d+)*/
    def allkeys = map.keySet()
    def attributeKeys = allkeys.grep(attributePattern)

    err.println "DEBUG allkeys = "+allkeys
    err.println "DEBUG attributeName: $attributeName"
    err.println "DEBUG grep out $attributeKeys"

    def attributeMap = map.subMap(attributeKeys)

    // Merge the two maps...
    def newMap = attributeMap + foldMap
    def newData = newMap.values() as ArrayList
    def newFoldSet = new FoldSets(newMap,newData)

    err.println "newfoldset: "+newFoldSet?.map
    return(newFoldSet)
}
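To help with the "need to verify" part, here is a minimal standalone sketch of the pattern match. It assumes that Groovy's `grep` with a `Pattern` keeps only keys that the pattern matches in full, which is how I understand `Pattern.isCase` to behave. The helper `keysForAttribute` and the key names are made up for illustration and are not part of FoldSets. It also wraps the name in `Pattern.quote`, which is an addition to the pattern above, so that names containing regex characters such as `.` or `+` are treated literally.

```groovy
import java.util.regex.Pattern

// Hypothetical helper, not part of FoldSets: returns the keys that are exactly
// the attribute name, optionally followed by one or more _Rep<digits> suffixes.
def keysForAttribute(Collection<String> keys, String attributeName) {
    // Pattern.quote keeps characters such as '.' or '+' in the name literal.
    def pattern = ~/${Pattern.quote(attributeName)}(_Rep\d+)*/
    // grep with a Pattern keeps only the elements the pattern matches in full.
    keys.grep(pattern)
}

// Made-up keys for illustration only.
def keys = ['fold1', 'TP53', 'TP53_Rep01', 'TP53A', 'TP53A_Rep01', 'TP53Syn', 'TP53SynPlus']

assert keysForAttribute(keys, 'TP53') == ['TP53', 'TP53_Rep01']
assert keysForAttribute(keys, 'TP53Syn') == ['TP53Syn']

// A substring ("contains") test, by contrast, also picks up the longer names.
assert keys.findAll { it.contains('TP53') }.size() == 6
```

If these assertions hold against the real key sets, the pattern match should be safe to switch to. The quoting is optional if attribute names are known to be plain alphanumerics.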