imahdimir/2A240317_UKB_imputed_gt_corr

Code Review


Hi @JonJala,

This is one of the Bash scripts I wrote recently as part of a bigger project I am working on with Alex. There are many more in other repos: they all belong to the same project, but because each one is a different step I keep them separate and concise, each in its own repo. Long story short, the other ones have roughly the same structure.

https://github.com/imahdimir/1734_UKB_imputed_gt_corr/blob/main/sh/filter_bgen.sh
Above is the link to the Bash script I would appreciate your input on. Please let me know if there are any improvements I can make so the code is more concise, easier to follow, and faster.

I usually keep Python code in the Py folder of the project, as in the current one. I also keep the file and directory names in a separate Python module, which I import in the other modules. Does that make sense to you? Do you think it is a good way to keep things organized and tidy?
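As a rough illustration of this pattern (a sketch only: the module name, the root directory, and the file naming scheme below are hypothetical and not taken from the repo), a shared paths module could look like this:

```python
from pathlib import Path

# Hypothetical contents of a shared module, e.g. Py/paths.py (name assumed).
# The root below is a placeholder, not the real project location.
PROJECT_ROOT = Path("/tmp/ukb_project")
DATA_DIR = PROJECT_ROOT / "data"
OUT_DIR = PROJECT_ROOT / "out"


def filtered_bgen_path(chromosome: int) -> Path:
    """Build the output path for one chromosome (naming scheme is illustrative)."""
    return OUT_DIR / f"chr{chromosome}_filtered.bgen"


# Other modules would then do something like:
#     from paths import DATA_DIR, filtered_bgen_path
print(filtered_bgen_path(22))
```

Keeping every path in one place means that moving the data only requires changing a single module.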
https://github.com/imahdimir/1734_UKB_imputed_gt_corr/blob/main/Py/a_main.py
This is one of the modules I wrote to prepare the data for the analysis I wanted to do. I create multiple modules that run in order and name them a_.py, b_.py, and so on. The overall structure of each module is as follows: imports at the top, then ENV vars, then classes (if any), then functions, then the main function, and finally the _test function, which I use for testing during development. There are lots of ## markers in my code. They are cell boundaries: I code step by step, run each step, and check the outputs and variables to make sure the code has done exactly what I wanted. PyCharm's cell mode lets me run each cell in the console separately, like #%% in VS Code (I used to use VS Code, but at some point I came to prefer PyCharm Professional; it is more Pythonic in my opinion). I don't remove the ## markers, so that I know where the cells were for future testing and debugging. There might be a better approach; if you know one, please advise me so I can improve my coding conventions.

The R code is in the R folder. I use RStudio, though it has limited functionality. By the way, I tried using RStudio Server over RDP to the server, but it was so slow and annoying that I gave up. Is there a way to speed up the connection to the server so that RStudio Server becomes usable?

Please review my Python, Bash, and R code and let me know wherever I am doing something wrong or could improve my coding.

Thank you for your time and consideration, Jon!