sample_status_file_creator.py csv_input_file
Dear @BioinfoUNIBA @claudiologiudice @asilvestris84
Thanks for this great tool. I am keen to run it on my samples but cannot get sample_status_file_creator.py to run. What information is required in the csv_input_file? The example at https://github.com/BioinfoUNIBA/QEdit/blob/master/Example_files/csv_input_file has no column headers. I can see that the first column is the grouping variable, but I cannot tell what the 2nd to 11th columns refer to for each sample. Is only the first column required? Even with the example files I get this error:
Traceback (most recent call last):
  File "sample_status_file_creator.py", line 40, in <module>
    srr = line[3]
IndexError: list index out of range
Many thanks,
Oliver
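One way to see why `line[3]` fails is to count how many fields each row yields once it is split. The sketch below uses hypothetical helpers (`count_fields` and `short_rows` are not part of QEdit), and the delimiter the script actually splits on is an assumption that should be checked against its source. An `IndexError` at index 3 means at least one row yields fewer than four fields.

```python
def count_fields(lines, delimiter=","):
    """Return (line_number, field_count) for every non-empty line.

    The delimiter is an assumption; try a comma and a tab and compare.
    """
    counts = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\r\n")
        if not text:
            continue  # a blank line would split into one empty field; skip it
        counts.append((number, len(text.split(delimiter))))
    return counts


def short_rows(lines, needed=4, delimiter=","):
    """List the line numbers that have fewer than `needed` fields."""
    return [n for n, c in count_fields(lines, delimiter) if c < needed]


# Made-up rows for illustration only, not the real csv_input_file.
rows = ["a,b,c,d\n", "a,b\n", "\n", "a\tb\tc\td\n"]
print(short_rows(rows))
print(short_rows(rows, delimiter="\t"))
```

Running `short_rows(open("csv_input_file").readlines())` with each candidate delimiter would show which rows, if any, are too short for `line[3]`.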
Thanks @claudiologiudice ,
I am not sure what is needed in the file, so I am unable to show it to you. Could you please advise what is required in each column of the file?
I have tried it on the example file at https://github.com/BioinfoUNIBA/QEdit/blob/master/Example_files/csv_input_file which led to the index out of range error above.
Best wishes,
Oliver
If I change line 40 from srr = line[3] to srr = line[1], it selects the second column, which gets around that error and creates a sif file:
Sample,Group,Type
mn_cyt_d35_ctrl_R1
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R2
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R3
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R4,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R1
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R2
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R3
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R4
,GROUPB,ctrl_nuc
This doesn't look right because every other line starts with a comma.
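The stray line breaks in the sif file above are consistent with the selected field being the last one on its line and still carrying the trailing newline. A minimal sketch of stripping the line before splitting, assuming a comma-delimited input with the group in the first column and the sample name in the second (the real column layout is not documented in this thread, and `parse_row` and `sif_line` are hypothetical helpers, not QEdit functions):

```python
def parse_row(raw_line, delimiter=","):
    """Split one input line into fields, dropping the trailing newline first."""
    return [field.strip() for field in raw_line.rstrip("\r\n").split(delimiter)]


def sif_line(sample, group, sample_type):
    """Format one row of a Sample,Group,Type sif file."""
    return ",".join([sample, group, sample_type])


# Assumed layout for illustration: group first, sample name second.
row = parse_row("GROUPA,mn_cyt_d35_ctrl_R1\n")
print(sif_line(row[1], row[0], "ctrl_cyt"))
```

With the newline removed before the split, the sample name and its group and type end up on the same line.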
Running sample_path_folder_creator.py on this leads to the following error:
Traceback (most recent call last):
  File "sample_path_folder_creator.py", line 39, in <module>
    os.makedirs(dna_rna_folder)
  File "python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/editing'
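A path of `/editing` at the filesystem root is what would result if the sample directory in front of it were an empty string, for example if the script builds the path as something like `srr_dir + '/editing/...'` and `srr_dir` is empty because a sif line starts with a comma. This is a guess from the traceback, not a confirmed diagnosis. A minimal sketch of a guard, where `editing_dir` and `make_editing_dir` are hypothetical helpers and not QEdit functions:

```python
import os


def editing_dir(srr_dir):
    """Build '<srr_dir>/editing', refusing an empty sample directory."""
    cleaned = srr_dir.strip()
    if not cleaned:
        raise ValueError("empty sample directory; check how the sif file was parsed")
    return os.path.join(cleaned, "editing")


def make_editing_dir(srr_dir):
    """Create the directory if it does not exist yet and return its path."""
    path = editing_dir(srr_dir)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path
```

Failing early with a clear message makes the malformed sif row visible instead of surfacing later as a permission error at the filesystem root.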
I'm not clear from the README how I should take the overall editing level, ALU editing index and/or recoding index results and point these to sample_status_file_creator.py.
Many thanks for your help with this,
Oliver
Thank you @claudiologiudice,
A generalisable script would really help as well as a brief outline of what is needed in the csv_input_file. Is there any update on this?
Many thanks,
Oliver
I am also curious about this, as the 4th column seems to be needed. But if we are using Ensembl instead of NCBI, what does that column really mean? What do the numbers after the SR id mean? They don't seem to be used in the sif file.
I got it working, but the results are still being evaluated. I don't know yet whether they're correct.
- In sample_path_folder_creator.py, I modified the script to make sure the sif file (control-vs-case.sif) looks like this:
Sample,Group,Type
control-1,GROUPA,control
control-2,GROUPA,control
control-3,GROUPA,control
case-1,GROUPB,case
case-2,GROUPB,case
case-3,GROUPB,case
and made sure that the output file srr_dir + '/editing/DnaRna_' is like this: reditools outputfile*xls
- In get_DE_events.py, I modified the script to make sure the output file looks OK, e.g.:
chromosome position editing_type scramble-1_scramble scramble-2_scramble scramble-3_scramble sh415-1_sh415 sh415-2_sh415 sh415-3_sh415 [groupA_samples/groupB_samples] delta_diff pvalue (Mannwhitney)
chrY 272161 TC 1.00^TC^67 1.00^TC^63 1.00^TC^32 1.00^TC^64 1.00^TC^60 1.00^TC^77 [3,3] 0.0 1.0
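For anyone who would rather write the sif file directly than patch the script, the `Sample,Group,Type` layout shown above can be produced with Python's `csv` module. This is only a sketch: `write_sif` is a hypothetical helper, the sample names are illustrative, and whether QEdit needs anything beyond these three columns is not confirmed in this thread.

```python
import csv
import io


def write_sif(handle, samples):
    """Write (sample, group, type) tuples as a Sample,Group,Type table."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Sample", "Group", "Type"])
    for sample, group, sample_type in samples:
        writer.writerow([sample, group, sample_type])


# Illustrative rows only; replace with your own sample names.
samples = [
    ("control-1", "GROUPA", "control"),
    ("case-1", "GROUPB", "case"),
]
buffer = io.StringIO()
write_sif(buffer, samples)
print(buffer.getvalue())
```

To write to disk, pass a file opened with `open("control-vs-case.sif", "w", newline="")` instead of the in-memory buffer.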