`faops` operates fasta files

faops is a lightweight tool for operating on sequences in the FASTA format.

This tool can be regarded as a combination of faCount, faSize, faFrag, faRc, faSomeRecords, faFilter and faSplit from Jim Kent's UCSC utilities.

Compared to Kent's fa* utilities, faops:

* is much smaller (kilobytes vs. megabytes)
* is easy to compile (only one external dependency)
* is well tested
* consists of a single executable file
* can operate on gzipped (bgzipped) files
* runs on all major OSes (including Windows)

faops is also inspired by, influenced by, and steals from seqtk and ufasta.

```
$ ./faops

Usage:     faops <command> [options] <arguments>
Version:   0.8.21

Commands:
    help           print this message
    count          count base statistics in FA file(s)
    size           count total bases in FA file(s)
    masked         masked (or gaps) regions in FA file(s)
    frag           extract sub-sequences from a FA file
    rc             reverse complement a FA file
    one            extract one fa record
    some           extract some fa records
    order          extract some fa records by the given order
    replace        replace headers from a FA file
    filter         filter fa records
    split-name     splitting by sequence names
    split-about    splitting to chunks about specified size
    n50            compute N50 and other statistics
    dazz           rename records for dazz_db
    interleave     interleave two PE files
    region         extract regions from a FA file

Options:
    There're no global options.
    Type "faops command-name" for detailed options of each command.
    Options *MUST* be placed just after command.
```

Examples

Reverse complement

```
  faops rc test/ufasta.fa out.fa       # prepend RC_ to names
  faops rc -n test/ufasta.fa out.fa    # keep original names
```
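For readers new to the term, reverse complementing means reading a sequence backwards and swapping each base for its partner (A with T, C with G). The following is a minimal sketch in plain shell, not how faops implements `rc`: the helper name `revcomp` is hypothetical, it expects one bare sequence per line (no `>` header lines), and it only translates A, C, G and T, leaving other characters such as N untouched.

```shell
# revcomp: hypothetical helper, not part of faops.
# Reads bare sequences (one per line) on stdin and prints the
# reverse complement of each line.
revcomp() {
    awk '{
        out = ""
        # Walk the line from its last character to its first.
        for (i = length($0); i >= 1; i--) out = out substr($0, i, 1)
        print out
    }' | tr 'ACGTacgt' 'TGCAtgca'   # swap each base for its complement
}

printf 'AACGTN\n' | revcomp   # prints NACGTT
```

For real FASTA files, `faops rc` remains the right tool, since it also deals with headers and multi-line records.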

Extract sequences with names in list.file, one name per line
```
  faops some test/ufasta.fa list.file out.fa
```

Same as above, but from stdin and to stdout

```
  cat test/ufasta.fa | faops some stdin list.file stdout
```

Sort by header strings

```
  faops order test/ufasta.fa \
      <(cat test/ufasta.fa | grep '>' | sed 's/>//' | sort) \
      out.fa
```

Sort by lengths

```
  faops order test/ufasta.fa \
      <(faops size test/ufasta.fa | sort -n -r -k2,2 | cut -f 1) \
      out2.fa
```

Tidy fasta file to 80 characters of sequence per line
```
  faops filter -l 80 test/ufasta.fa out.fa
```

Write each sequence on a single line

```
  faops filter -l 0 test/ufasta.fa out.fa
```

Convert fastq to fasta
```
  faops filter -l 0 in.fq out.fa
```
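
To make the conversion concrete, here is a rough sketch of what turning FASTQ into FASTA amounts to, written in plain shell and independent of faops. It assumes strict four-line FASTQ records (header, sequence, `+`, qualities) with no line wrapping; the helper name `fq2fa` is hypothetical, and faops itself may handle cases this sketch does not. It can be tried with `fq2fa < in.fq > out.fa`.

```shell
# fq2fa: hypothetical helper, not part of faops.
# Assumes every FASTQ record occupies exactly four lines.
fq2fa() {
    awk '
        NR % 4 == 1 { print ">" substr($0, 2) }   # "@name" becomes ">name"
        NR % 4 == 2 { print }                      # keep the sequence line
        # lines 3 ("+") and 4 (qualities) are dropped
    '
}

printf '@r1\nACGT\n+\nIIII\n' | fq2fa
```
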
Compute N50, clean result
```
  faops n50 -H test/ufasta.fa
```
Compute N75
```
  faops n50 -N 75 test/ufasta.fa
```
Compute N90, sum and average of contigs with estimated genome size
```
  faops n50 -N 90 -S -A -g 10000 test/ufasta.fa
```
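
As background for the three commands above: N50 is the length L such that sequences of length L or longer contain at least half of all bases, and N75 or N90 use 75% or 90% instead. The sketch below computes this from a plain list of lengths in shell and awk. It illustrates the standard definition only, not faops's implementation; the helper name `nxx` is hypothetical, and it does not model the `-g` estimated genome size option.

```shell
# nxx PERCENT: hypothetical helper, not part of faops.
# Reads one sequence length per line on stdin and prints the NXX value
# (N50 when PERCENT is 50).
nxx() {
    sort -n -r | awk -v pct="$1" '
        { len[NR] = $1; total += $1 }          # lengths arrive longest first
        END {
            goal = total * pct / 100           # bases that must be covered
            for (i = 1; i <= NR; i++) {
                cum += len[i]
                if (cum >= goal) { print len[i]; exit }
            }
        }'
}

printf '%s\n' 2 3 4 5 6 | nxx 50   # prints 5
```

With lengths 2, 3, 4, 5 and 6 the total is 20, so half is 10; the two longest sequences (6 + 5 = 11) reach that goal, giving an N50 of 5. Assuming `faops size` prints the name and length as tab-separated columns, as the sort-by-length example suggests, real input could be fed in with `faops size test/ufasta.fa | cut -f 2 | nxx 50`.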

Compiling

faops can be compiled under Linux, macOS (gcc or clang) and Windows (MinGW).

```
git clone https://github.com/wang-q/faops
cd faops
make
```

Installing with Homebrew or Linuxbrew

```
brew install wang-q/tap/faops
```

Tests

Tests are written with bats.

```
# brew install bats-core
make test
```

Dependency

* zlib
* kseq.h and khash.h from klib (bundled)

AUTHOR

Qiang Wang <wang-q@outlook.com>

COPYRIGHT AND LICENSE

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
