yupenghe/methylpy

Call DMR using files in 10 bp bin format


Hi yupeng,

I've obtained a batch of processed WGBS data from the GEO database. The files were generated with the 'methylpy allc-to-bigwig --bin-size 10' command, so they are in 10 bp bin format (e.g., GSM4603053_allc_18.CGN.10bp.bw). How can I use these files for DMR calling? Could you provide some guidance or suggestions?

Attached is an example of a 10 bp bin format file.

Thank you in advance for your help!
[Attachment: bigwig data in 10 bp bin format]

You can use each 10 bp bin as a test unit. Converting the bigwig file to the allc format, with each row corresponding to one 10 bp bin, will allow methylpy to perform differential methylation analysis.

Thank you for your prompt response!

I'm also considering comparing these public data with my own WGBS data. Should I convert my own data into 10 bp bin format and then use each 10 bp bin as a single unit for differential methylation analysis, as you mentioned above?

Thank you once again for your time and expertise.

Yes, I think that would be a valid strategy.

Thanks a lot!

> You can use each 10 bp bin as a test unit. Converting the bigwig file to the allc format, with each row corresponding to one 10 bp bin, will allow methylpy to perform differential methylation analysis.

Regarding the bw file mentioned by Minghui: the fourth column gives the average methylation level for each 10 bp region. However, the allc format requires seven mandatory columns, including the 1-based position of the cytosine (C). How should this position be chosen? How should the sixth column, 'cov', and the seventh column, 'methylated', be derived? And what specific method do you have in mind for converting the bw file into an allc file?

Looking forward to your reply, thank you.

For the position, you can choose the first base of each 10 bp bin. For mc and cov, I would recommend using the sums of methylated bases and total bases, respectively, over all CpGs in each 10 bp bin. For the last column, setting all values to 1 will work.

I don't think there are any specific tools you can use for this. A custom script will be needed.
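
As a rough illustration of such a custom script, here is a minimal Python sketch that collapses per-cytosine allc lines into one allc-style row per 10 bp bin, following the rules above: position is the first base of the bin, mc and cov are summed over the CpGs in the bin, and the last column is set to 1. It assumes per-cytosine counts are available (as with your own allc files); a bigwig that stores only an average level per bin does not carry mc and cov directly. The function name `bin_allc_lines` is hypothetical, the input is assumed to be tab-separated in the order chrom, pos, strand, context, mc, cov, methylated, and the "+" strand and "CGN" context written for each bin are placeholder assumptions rather than anything methylpy prescribes.

```python
def bin_allc_lines(lines, bin_size=10):
    """Aggregate per-cytosine allc lines into one allc-style line per bin.

    Hypothetical helper, not part of methylpy. Assumes tab-separated columns:
    chrom, pos (1-based), strand, context, mc, cov, methylated.
    """
    # (chrom, bin_start) -> [mc_sum, cov_sum]; dicts preserve insertion order
    bins = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6 or not fields[1].isdigit():
            continue  # skip header or malformed lines
        chrom, pos, _strand, context, mc, cov = fields[:6]
        if not context.startswith("CG"):
            continue  # keep CpG sites only
        # 1-based position of the first base of the bin containing this site
        bin_start = (int(pos) - 1) // bin_size * bin_size + 1
        totals = bins.setdefault((chrom, bin_start), [0, 0])
        totals[0] += int(mc)
        totals[1] += int(cov)
    # Strand "+" and context "CGN" are placeholders (assumption);
    # the last column is set to 1 for every bin.
    return [
        f"{chrom}\t{start}\t+\tCGN\t{mc}\t{cov}\t1"
        for (chrom, start), (mc, cov) in bins.items()
    ]


if __name__ == "__main__":
    example = [
        "chr1\t1\t+\tCGA\t3\t5\t1",
        "chr1\t2\t-\tCGT\t1\t4\t1",
        "chr1\t5\t+\tCHH\t0\t7\t0",
        "chr1\t11\t+\tCGG\t2\t2\t1",
    ]
    for row in bin_allc_lines(example):
        print(row)
```

Applying the same binning to both the public data and your own data keeps the test units comparable. Bins with no CpG coverage are simply absent from the output in this sketch.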