open2c/bioframe

strand-aware closest


Example task:

Select CTCF peaks that are upstream of genes, where "upstream" is defined relative to each gene's strand. The peaks do not need to be on the same strand as the gene, so simply adding on=[] to closest() would not resolve this.

Solving this task currently requires multiple calls to closest.

Potential solutions:

  • a convenience function that makes multiple calls to closest and merges the outputs.
  • passing a column that defines the upstream/downstream direction for each interval. The default would stay relative to the reference genome (i.e. non-stranded closest).
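
To make the intended semantics concrete, here is a minimal pure-Python sketch of a strand-aware upstream lookup. It is not bioframe code: `Interval` and `closest_upstream` are hypothetical names used only for illustration, and the sketch assumes half-open coordinates and ignores overlapping peaks. A bioframe-based version would presumably split the genes by strand, call closest on each subset, and merge the results.

```python
from typing import List, NamedTuple, Optional


class Interval(NamedTuple):
    chrom: str
    start: int
    end: int
    strand: str = "."  # "+", "-", or "." when unstranded


def closest_upstream(gene: Interval, peaks: List[Interval]) -> Optional[Interval]:
    """Return the nearest non-overlapping peak upstream of `gene`, or None.

    Hypothetical helper, not part of bioframe. The peak's own strand is
    ignored; only the gene's strand decides which side counts as upstream.
    """
    best, best_dist = None, None
    for peak in peaks:
        if peak.chrom != gene.chrom:
            continue
        if gene.strand == "-":
            # Upstream of a minus-strand gene lies at higher coordinates.
            dist = peak.start - gene.end
        else:
            # Upstream of a plus-strand (or unstranded) gene lies at lower coordinates.
            dist = gene.start - peak.end
        if dist < 0:
            continue  # peak overlaps the gene or lies downstream of it
        if best_dist is None or dist < best_dist:
            best, best_dist = peak, dist
    return best


genes = [
    Interval("chr1", 100, 200, "+"),
    Interval("chr1", 500, 600, "-"),
]
peaks = [
    Interval("chr1", 50, 60),
    Interval("chr1", 450, 460),
    Interval("chr1", 700, 710),
]

upstream = [closest_upstream(gene, peaks) for gene in genes]
```

For the minus-strand gene, a non-stranded closest would pick the peak at 450-460 (40 bp away), whereas the strand-aware lookup skips it as downstream and picks the peak at 700-710.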

Questions:

Is this issue closed, @agalitsyna?