sample_status_file_creator.py csv_input_file

Question


Opened this issue 4 years ago · 7 comments

Dear @BioinfoUNIBA @claudiologiudice @asilvestris84

Thanks for this great tool. I am keen to run it on my samples but am unable to get sample_status_file_creator.py to run. What information is required in the csv_input_file? In the example at https://github.com/BioinfoUNIBA/QEdit/blob/master/Example_files/csv_input_file there are no column headers. I can see that the first column is the grouping variable, but I am unable to see what the 2nd to 11th columns refer to for each sample. Is only the first column required? Even with the example files I am getting this error:

Traceback (most recent call last):
  File "sample_status_file_creator.py", line 40, in <module>
    srr = line[3]
IndexError: list index out of range

Many thanks,
Oliver

Answer 1 · 2021-06-15T21:01:58.000Z

Dear Oliver, could you please send me the first two rows of your input file?

Best regards,
Claudio


Answer 2 · 2021-06-16T08:25:16.000Z

Thanks @claudiologiudice ,

I am not sure what is needed in the file, so I am unable to show it to you. Could you please advise what is required in each column of the file?

I have tried it on the example file at https://github.com/BioinfoUNIBA/QEdit/blob/master/Example_files/csv_input_file which led to the index out of range error above.

Best wishes,
Oliver

Answer 3 · 2021-06-16T21:23:19.000Z

If I change line 40 from srr = line[3] to srr = line[1], it selects the second column, which gets around that error and creates a sif file:

Sample,Group,Type
mn_cyt_d35_ctrl_R1
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R2
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R3
,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R4,GROUPA,ctrl_cyt
mn_cyt_d35_ctrl_R1
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R2
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R3
,GROUPB,ctrl_nuc
mn_nuc_d35_ctrl_R4
,GROUPB,ctrl_nuc

This doesn't look right, because every other line starts with a comma.
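Both symptoms above (the IndexError on line[3] and output rows broken across two lines) would be consistent with input rows that have fewer columns than the script expects, with the last field still carrying its line ending. This is not confirmed anywhere in the thread. The following is a minimal Python 3 sketch of that reading; `split_row` and `get_field` are hypothetical helpers, not part of QEdit, and the two-column row is made up for illustration.

```python
def split_row(line, delimiter=","):
    """Split one input line, dropping the line ending and surrounding spaces."""
    return [field.strip() for field in line.rstrip("\r\n").split(delimiter)]


def get_field(fields, index, line_no):
    """Return fields[index], or raise a readable error if the row is too short."""
    if index >= len(fields):
        raise ValueError(
            "line %d has %d column(s), but column %d was requested"
            % (line_no, len(fields), index + 1)
        )
    return fields[index]


# Hypothetical two-column row; the real csv_input_file layout is not confirmed here.
line = "GROUPA,mn_cyt_d35_ctrl_R1\n"

# Without stripping, the last field keeps its newline and breaks the output row.
unstripped = line.split(",")[1]
print(repr(",".join([unstripped, "GROUPA", "ctrl_cyt"])))

# With stripping, the row stays on a single line.
sample = get_field(split_row(line), 1, line_no=1)
print(",".join([sample, "GROUPA", "ctrl_cyt"]))
```

Asking for a column that does not exist then reports the line number and the column count instead of a bare IndexError, which makes a delimiter or layout mismatch easier to spot.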

Running sample_path_folder_creator.py on this leads to error:

Traceback (most recent call last):
  File "sample_path_folder_creator.py", line 39, in <module>
    os.makedirs(dna_rna_folder)
  File "python2.7/os.py", line 150, in makedirs
    makedirs(head, mode)
  File "python2.7/os.py", line 157, in makedirs
    mkdir(name, mode)
OSError: [Errno 13] Permission denied: '/editing'

I'm not clear from the README how I should take the overall editing level, ALU editing index and/or recoding index results and point these to sample_status_file_creator.py.

Many thanks for your help with this,
Oliver

Answer 4 · 2021-06-16T21:48:39.000Z

Got it. Probably you don't have the rights to make that directory. I'm trying to generalize the script.
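One possible reading of the path in the error, '/editing', is that the directory prefix was empty, so the folder was requested at the filesystem root, where an ordinary user usually cannot write. This is an assumption, not something confirmed in the thread. A minimal Python 3 sketch of building the path defensively follows; `editing_dir` is a hypothetical helper, not a QEdit function.

```python
import os
import tempfile


def editing_dir(base_dir, sample):
    """Return <base_dir>/<sample>/editing, refusing an empty base or sample."""
    sample = sample.strip()  # drop stray newlines or spaces from the sample name
    if not base_dir or not sample:
        raise ValueError("base_dir and sample must both be non-empty")
    return os.path.join(os.path.abspath(base_dir), sample, "editing")


# An empty prefix turns the concatenated path into one at the filesystem root.
print("" + "/editing")

# Demo in a throwaway directory so nothing is written outside it.
with tempfile.TemporaryDirectory() as tmp:
    target = editing_dir(tmp, "control-1\n")
    os.makedirs(target, exist_ok=True)  # exist_ok requires Python 3.2 or later
    print(os.path.isdir(target))
```

Failing early on an empty prefix gives a clearer message than the Permission denied error raised later inside os.makedirs.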


Answer 5 · 2021-07-01T20:31:44.000Z

Thank you @claudiologiudice,
A generalisable script would really help as well as a brief outline of what is needed in the csv_input_file. Is there any update on this?
Many thanks,
Oliver

Answer 6 · 2023-12-17T18:14:56.000Z

I am also curious about this. The 4th column seems to be required, but what does it really mean, for example if we are using Ensembl instead of NCBI? And what do the numbers after the SR ID mean? They don't seem to be used in the sif file.

Answer 7 · 2024-04-28T03:49:58.000Z

I got it to run, but I am still evaluating the results and don't know yet whether they are correct.

In sample_path_folder_creator.py, I modified the script to make sure the sif file (control-vs-case.sif) looks like this:

Sample,Group,Type
control-1,GROUPA,control
control-2,GROUPA,control
control-3,GROUPA,control
case-1,GROUPB,case
case-2,GROUPB,case
case-3,GROUPB,case

I also made sure that the files found under srr_dir + '/editing/DnaRna_' are the REDItools output files (outputfile*xls).

In get_DE_events.py, I modified the script to make sure the output file looks OK, e.g.:

chromosome position editing_type scramble-1_scramble scramble-2_scramble scramble-3_scramble sh415-1_sh415 sh415-2_sh415 sh415-3_sh415 [groupA_samples/groupB_samples] delta_diff pvalue (Mannwhitney)
chrY 272161 TC 1.00^TC^67 1.00^TC^63 1.00^TC^32 1.00^TC^64 1.00^TC^60 1.00^TC^77 [3,3] 0.0 1.0
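As a rough sanity check on an output row like the one above, the per-sample fields can be split on '^' and the leading values compared between groups. The sketch below (Python 3) rests on two unconfirmed assumptions: that each field has the form value^substitution^count, and that delta_diff is the absolute difference between the two group means. `parse_sample_field` is a hypothetical helper, not part of QEdit.

```python
def parse_sample_field(field):
    """Split 'value^substitution^count' into (float, str, int)."""
    value, substitution, count = field.split("^")
    return float(value), substitution, int(count)


# The example row from this thread, split on whitespace.
row = ("chrY 272161 TC 1.00^TC^67 1.00^TC^63 1.00^TC^32 "
       "1.00^TC^64 1.00^TC^60 1.00^TC^77 [3,3] 0.0 1.0").split()

n_a, n_b = 3, 3  # group sizes, matching the [3,3] column in this row
samples = [parse_sample_field(f) for f in row[3:3 + n_a + n_b]]
group_a = [value for value, _, _ in samples[:n_a]]
group_b = [value for value, _, _ in samples[n_a:]]

# Assumed definition of delta_diff: absolute difference of the group means.
mean_diff = abs(sum(group_a) / n_a - sum(group_b) / n_b)
reported = float(row[-2])  # the delta_diff column of this row
print(mean_diff, reported)
```

For this row all six leading values are 1.00, so a difference of 0.0 agrees with the reported delta_diff under the assumed definition. A mismatch on other rows would suggest either a different definition or a parsing problem.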