Some notes on working in Bash and useful functions

I started my PhD with no knowledge of bioinformatics. Along the way, I learnt some very basic but useful functions that made my life much easier. This repository shares some of the Bash functions I use for microbial metagenomics, although they can be applied to any field.

This repository will be updated constantly (or that's the intention hehe).

Function 1: rename

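rename 's/-/_/g' *.fna

I used this command to rename all the .fna files, replacing every dash (-) with an underscore (_). The g at the end of the expression makes it replace every dash in a name, not only the first one. It is really useful for homogenizing file names. Note that this syntax belongs to the Perl version of rename; the util-linux version found on some distributions uses a different syntax.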
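If rename is not installed, or it behaves differently on your system, a plain Bash loop can do the same job. This is only a minimal sketch: the function name dashes_to_underscores is made up for illustration, and it assumes you run it inside the folder that contains the .fna files.

```shell
# Hypothetical helper (the name is an assumption, not a standard command):
# replace every dash with an underscore in the names of the .fna files
# of the current folder.
dashes_to_underscores() {
    local file new
    for file in *.fna; do
        # If no .fna file exists, the pattern stays unexpanded: skip it.
        if [ ! -e "$file" ]; then
            continue
        fi
        # ${file//-/_} replaces ALL dashes (a single / would replace only the first).
        new="${file//-/_}"
        # Nothing to do when the name has no dashes.
        if [ "$file" = "$new" ]; then
            continue
        fi
        # Do not overwrite a file that already has the new name.
        if [ -e "$new" ]; then
            echo "skipping $file: $new already exists" >&2
            continue
        fi
        mv -- "$file" "$new"
    done
}
```

After defining it, calling dashes_to_underscores in a folder renames, for example, a-b-c.fna to a_b_c.fna and leaves names without dashes untouched.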

Function 2: ls

We can save a list of all the .fna files in a folder to a text file.

ls *.fna > list_files.txt

We can also count the number of items within a folder (subfolders are counted too, hidden files are not) with:

ls | wc -l
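When the folder also contains subfolders, or when I only want to count one type of file, find gives a more precise number than ls. The sketch below is one possible way to do it: count_files is a hypothetical name, and the defaults (current folder, any file name) are my own assumptions.

```shell
# Hypothetical helper: count the regular files in a folder
# (subfolders are not counted, and the search does not go into them).
#   $1 = folder to look in (default: current folder)
#   $2 = file name pattern (default: any name)
count_files() {
    local dir="${1:-.}"
    local pattern="${2:-*}"
    # -maxdepth 1 keeps find from descending into subfolders;
    # -type f keeps only regular files.
    find "$dir" -maxdepth 1 -type f -name "$pattern" | wc -l
}
```

For example, count_files . '*.fna' prints how many .fna files are in the current folder. The pattern must be quoted so that find receives it unexpanded. File names containing line breaks would be counted more than once, which should be rare in practice.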

Function 3: sed

sed -i 's:"::g' my_file.tsv

This command removes every double quote (") from the file named my_file.tsv. The -i option edits the file in place, so the original content is overwritten.
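Because the edit happens in place, it can be worth keeping a copy of the original file. The sketch below does the same substitution but writes a backup first; strip_quotes is a hypothetical name chosen for this example.

```shell
# Hypothetical helper: delete every double quote from the file given as $1,
# keeping the original content in a backup named "<file>.bak".
strip_quotes() {
    local file="$1"
    # -i.bak edits the file in place and saves the untouched original as $file.bak
    sed -i.bak 's/"//g' "$file"
}
```

Running strip_quotes my_file.tsv leaves the cleaned table in my_file.tsv and the original in my_file.tsv.bak, which can be deleted once the result looks right.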

Function 4: loops

for file in *.fa.protein.translations.faa ; do
    mv -- "$file" "${file/.fa.protein.translations/}"
done

With this simple loop we remove one section (.fa.protein.translations) from the names of all the matching .faa files, making the file names simpler and shorter.
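The part that does the work is the ${file/pattern/replacement} expansion, which here replaces the first match of the pattern with nothing. Before renaming anything, it can help to print what the new names would be. This is a small sketch of that idea; preview_strip is a hypothetical name, not a built-in command.

```shell
# Hypothetical helper: show how file names would look after removing a piece
# of text, without renaming anything.
#   $1   = the text to remove from each name
#   $2.. = the file names to preview
preview_strip() {
    local part="$1"
    local file
    shift
    for file in "$@"; do
        # Quoting "$part" makes Bash treat it as literal text, not as a pattern.
        printf '%s -> %s\n' "$file" "${file/"$part"/}"
    done
}

preview_strip .fa.protein.translations sample1.fa.protein.translations.faa
# prints: sample1.fa.protein.translations.faa -> sample1.faa
```

In real use the file names would usually come from a pattern, for example preview_strip .fa.protein.translations *.faa.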

Similarly, we can also add text:

for file in * ; do
    mv -- "$file" "OUT_$file"
done

Here we add "OUT_" to the name of every file (and subfolder) in the folder.
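Since * matches everything in the folder, running the loop twice would produce names like OUT_OUT_file. A slightly more careful version could skip subfolders and names that already carry the prefix. This is a sketch under those assumptions, with add_prefix as a hypothetical name.

```shell
# Hypothetical helper: add a prefix to the regular files of the current folder.
#   $1 = the prefix to add (for example OUT_)
add_prefix() {
    local prefix="$1"
    local file
    for file in *; do
        # Skip subfolders (and the unexpanded * of an empty folder).
        if [ ! -f "$file" ]; then
            continue
        fi
        # Skip files whose name already starts with the prefix.
        case "$file" in
            "$prefix"*) continue ;;
        esac
        mv -- "$file" "$prefix$file"
    done
}
```

Calling add_prefix OUT_ renames a.faa to OUT_a.faa, while OUT_b.faa and any subfolder keep their names.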