tf_expression: bed file format
I got the get_genes function in tf_expression.py to work, but I don't think it is actually looking for a BED file as defined in the following reference:
https://genome.ucsc.edu/FAQ/FAQformat.html#format1
The file it successfully reads is ALL_ARRAYS_NORMALIZED_MAXPROBE_LOG2_COORDS.sorted.txt, whose format is:
CHR START STOP GENE <sample 1> <sample 2> ...
The BED format does not seem to allow multiple scores; it expects a single score and no more than 12 columns of data. Each column holds a distinctly different piece of information rather than the same kind of value for different samples.
Yeah, it's pretty common to refer to file formats like the one we're using as "BED-like" files, because the genome coordinates occupy the first three columns and follow the BED convention (0-based). I guess I should be more explicit with my word choice; it's probably better to just refer to them as "tab-delimited data files in a BED-like format", since they aren't true BED files. Sorry for the confusion.
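For anyone who wants to read a file in this layout by hand, here is a minimal sketch of a parser for the CHR START STOP GENE <sample 1> <sample 2> ... format described above. It is not the actual get_genes code: parse_bed_like and ExpressionRow are hypothetical names made up for illustration, and the sketch assumes there is no header row (or that any header line starts with "#") and that every sample column is numeric.

```python
from typing import Iterable, Iterator, List, NamedTuple


class ExpressionRow(NamedTuple):
    """One line of a BED-like expression file (hypothetical helper type)."""
    chrom: str
    start: int           # 0-based start, following the BED convention
    stop: int
    gene: str
    values: List[float]  # one value per sample column


def parse_bed_like(lines: Iterable[str]) -> Iterator[ExpressionRow]:
    """Yield one ExpressionRow per data line.

    Assumptions (not taken from tf_expression.py): fields are
    tab-separated, blank lines and lines starting with '#' are skipped,
    and columns 5 onward are numeric sample values.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 5:
            raise ValueError(
                f"expected at least 5 tab-separated columns, got {len(fields)}"
            )
        chrom, start, stop, gene = fields[:4]
        yield ExpressionRow(
            chrom, int(start), int(stop), gene,
            [float(v) for v in fields[4:]],
        )
```

To read from disk you could pass an open file handle, e.g. `with open(path) as fh: rows = list(parse_bed_like(fh))`. Unlike a true BED file, the number of columns here grows with the number of samples, which is why the sample values are collected into a list rather than mapped onto the fixed BED fields.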