BioinfoUNIBA/QEdit

sample_status_file_creator.py csv_input_file

Opened this issue · 7 comments

Dear @BioinfoUNIBA @claudiologiudice @asilvestris84

Thanks for this great tool. I am keen to run it on my samples but am unable to get sample_status_file_creator.py to run. What information is required in the csv_input_file? In the example at https://github.com/BioinfoUNIBA/QEdit/blob/master/Example_files/csv_input_file there are no column headers. I can see that the first column is the grouping variable, but I cannot tell what the 2nd to 11th columns refer to for each sample. Is only the first column required? Even with the example files I am getting this error:

Traceback (most recent call last):
  File "sample_status_file_creator.py", line 40, in <module>
    srr = line[3]
IndexError: list index out of range
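
For anyone hitting the same traceback: an IndexError on line[3] means that at least one row split into fewer than four fields. The following is a minimal diagnostic sketch, assuming the input is comma-separated. The helper name short_rows and the sample rows are hypothetical and are not part of QEdit or its example file.

```python
import csv
import io

def short_rows(handle, min_fields=4, delimiter=","):
    """Return (line_number, field_count) for rows with too few fields."""
    problems = []
    for number, row in enumerate(csv.reader(handle, delimiter=delimiter), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) < min_fields:
            problems.append((number, len(row)))
    return problems

# Made-up rows for illustration only; this is not the real example file.
sample = io.StringIO("ctrl_cyt,s1,x,y\nctrl_nuc,s2\n")
print(short_rows(sample))  # [(2, 2)]
```

Running it on the real csv_input_file (opened with open(path, newline="")) would list the rows that cannot satisfy line[3], or show that a different delimiter is in use if every row is reported.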

Many thanks,
Oliver

Thanks @claudiologiudice ,

I am not sure what is needed in the file, so I am unable to show it to you. Could you please advise what is required in each column of the file?

I have tried it on the example file at https://github.com/BioinfoUNIBA/QEdit/blob/master/Example_files/csv_input_file which led to the index out of range error above.

Best wishes,
Oliver

If I change line 40 from srr = line[3] to srr = line[1], it selects the second column, which gets round that error and creates a sif file:

Sample,Group,Type
mn_cyt_d35_ctrl_R1
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R2
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R3
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R4,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R1
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R2
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R3
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R4
,GROUPB,ctrl_nuc

This doesn't look right: every other line starts with a comma, so each sample name and its Group,Type fields end up on separate lines.
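
A possible explanation (an assumption, not checked against the script) is that each row is split on commas without first stripping the trailing newline, so the last field of a row carries a "\n" into the output. Taking line[1] from a two-column row would then reproduce the pattern above. A minimal sketch, where sif_row is a hypothetical helper and the column positions (type in column 0, sample in column 1) are assumed:

```python
def sif_row(raw_line, group, sample_col=1, type_col=0):
    """Build one 'Sample,Group,Type' row from a raw CSV line."""
    # Strip the line ending before splitting so no field keeps a newline.
    fields = raw_line.rstrip("\r\n").split(",")
    return ",".join([fields[sample_col], group, fields[type_col]])

raw = "ctrl_cyt,mn_cyt_d35_ctrl_R1\n"
# Without stripping, the sample name keeps its newline:
print(repr(raw.split(",")[1]))  # 'mn_cyt_d35_ctrl_R1\n'
print(sif_row(raw, "GROUPA"))   # mn_cyt_d35_ctrl_R1,GROUPA,ctrl_cyt
```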

Running sample_path_folder_creator.py on this leads to the following error:

Traceback (most recent call last):
  File "sample_path_folder_creator.py", line 39, in <module>
    os.makedirs(dna_rna_folder)
  File "python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/editing'
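
The path '/editing' suggests that the sample directory name was empty when the folder path was built, so the script tried to create a directory at the filesystem root. This is an assumption based only on the traceback. A minimal sketch of a guard, where editing_dir is a hypothetical helper and not part of QEdit:

```python
import os

def editing_dir(sample_dir):
    """Return '<sample_dir>/editing', refusing an empty sample directory."""
    sample_dir = sample_dir.strip()
    if not sample_dir:
        raise ValueError("sample directory is empty; path would be '/editing'")
    return os.path.join(sample_dir, "editing")

# An empty prefix plus '/editing' gives the path seen in the traceback.
print("" + "/editing")
print(editing_dir("mn_cyt_d35_ctrl_R1"))
```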

I'm not clear from the README how I should take the overall editing level, ALU editing index and/or recoding index results and point these to sample_status_file_creator.py.

Many thanks for your help with this,
Oliver

Thank you @claudiologiudice,
A generalisable script would really help as well as a brief outline of what is needed in the csv_input_file. Is there any update on this?
Many thanks,
Oliver

yr542 commented

I am also curious about this, as the 4th column seems to be needed. If we are using Ensembl instead of NCBI, what does that column really mean? And what do the numbers after the SRR ID mean? They don't seem to be used in the sif file.

I got it to run, but I am still evaluating the results and don't yet know whether they are correct.

  1. In sample_path_folder_creator.py, I modified the script so that the sif file (control-vs-case.sif) looks like this:

    Sample,Group,Type
    control-1,GROUPA,control
    control-2,GROUPA,control
    control-3,GROUPA,control
    case-1,GROUPB,case
    case-2,GROUPB,case
    case-3,GROUPB,case

    Also make sure that the REDItools output file under srr_dir + '/editing/DnaRna_' looks like this: outputfile*xls

  2. In get_DE_events.py, I modified the script so that the output file looks like this, e.g.:

    chromosome	position	editing_type	scramble-1_scramble	scramble-2_scramble	scramble-3_scramble	sh415-1_sh415	sh415-2_sh415	sh415-3_sh415	[groupA_samples/groupB_samples]	delta_diff	pvalue (Mannwhitney)
    chrY	272161	TC	1.00^TC^67	1.00^TC^63	1.00^TC^32	1.00^TC^64	1.00^TC^60	1.00^TC^77	[3,3]	0.0	1.0
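
To sanity-check a row like this by hand, the per-sample cells can be split on "^". The sketch below assumes, without confirmation from the QEdit documentation, that the first field is the editing level, the second the substitution type and the third a read count. The names parse_cell and mean_level are hypothetical.

```python
def parse_cell(cell):
    """Split 'level^substitution^count' into typed fields (assumed layout)."""
    level, substitution, count = cell.split("^")
    return float(level), substitution, int(count)

def mean_level(cells):
    """Mean of the first field across a group of cells."""
    levels = [parse_cell(c)[0] for c in cells]
    return sum(levels) / len(levels)

# Cells copied from the chrY row above.
group_a = ["1.00^TC^67", "1.00^TC^63", "1.00^TC^32"]
group_b = ["1.00^TC^64", "1.00^TC^60", "1.00^TC^77"]
print(parse_cell(group_a[0]))                     # (1.0, 'TC', 67)
print(mean_level(group_a) - mean_level(group_b))  # 0.0
```

If delta_diff is the difference between the group means (again an assumption), the 0.0 computed here agrees with the value in that row.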