BFCounter

BFCounter is a memory efficient program for counting k-mers from sequencing files.

Installation
============
Just run make to compile.

% make

The default maximum k-mer size supported is 31 (8 bytes of memory per k-mer). To change this, either
replace MAX_KMER_SIZE in the Makefile with an appropriate number or pass it directly to make with

% make MAX_KMER_SIZE=64

In this case the maximum k-mer size allowed is 63, and each k-mer will use 16 bytes of memory.
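
The two figures above (31 and 8 bytes, 63 and 16 bytes) are consistent with packing each base into 2 bits.
The following is a minimal sketch of that arithmetic, not BFCounter's actual code. It assumes 2-bit packing,
and it assumes that the largest usable k is always MAX_KMER_SIZE - 1, as in the 64 -> 63 case. The helper
names bytes_per_kmer and max_supported_k are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: bytes needed to store one k-mer for a given
// MAX_KMER_SIZE, ASSUMING each base (A, C, G, T) is packed into 2 bits.
inline std::size_t bytes_per_kmer(std::size_t max_kmer_size) {
    // The sketch only covers sizes that fill whole bytes (4 bases per byte).
    assert(max_kmer_size > 0 && max_kmer_size % 4 == 0);
    return max_kmer_size * 2 / 8;
}

// Hypothetical helper: largest k usable for a given MAX_KMER_SIZE,
// ASSUMING the "64 -> 63" relationship from the text holds in general.
inline std::size_t max_supported_k(std::size_t max_kmer_size) {
    assert(max_kmer_size > 0);
    return max_kmer_size - 1;
}
```

Under these assumptions, bytes_per_kmer(64) gives 16 and max_supported_k(64) gives 63, matching the numbers above.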

Running
=======
To list the available commands, run

% ./BFCounter

Running either command on its own

% ./BFCounter count
% ./BFCounter dump

will provide more complete help.

Notes
=====

* BFCounter dump will print a text file with each k-mer and the number of occurrences of the k-mer.
  Only counts of 2 and greater will be printed; there is no limit on the number of occurrences it can handle.
  For each k-mer, only the lexicographically smaller of the k-mer and its reverse complement will be printed,
  and the count will be the sum of the counts of both.

* BFCounter was developed on x86-64 GNU/Linux. Porting to other unix-like
  platforms should be easy, but we haven't done so yet.

* If you run into bugs or problems or have suggestions for future versions 
  please contact me at pmelsted@gmail.com
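
The first note above says that a k-mer and its reverse complement are reported as one entry, under the
lexicographically smaller of the two. The following is a minimal sketch of that rule, not BFCounter's
implementation. It uses plain strings and a std::map purely for illustration, and the function names
complement, reverse_complement, canonical and count_canonical are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Complement of a single uppercase DNA base; any other character maps to 'N'.
inline char complement(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'C': return 'G';
        case 'G': return 'C';
        case 'T': return 'A';
        default:  return 'N';
    }
}

// Reverse the k-mer, then complement every base.
inline std::string reverse_complement(const std::string& kmer) {
    std::string rc(kmer.rbegin(), kmer.rend());
    std::transform(rc.begin(), rc.end(), rc.begin(), complement);
    return rc;
}

// The lexicographically smaller of a k-mer and its reverse complement.
inline std::string canonical(const std::string& kmer) {
    return std::min(kmer, reverse_complement(kmer));
}

// Illustrative counter (an assumption, not how BFCounter stores counts):
// every k-mer in seq is counted under its canonical form, so occurrences
// of a k-mer and of its reverse complement add up in a single entry.
inline std::map<std::string, unsigned> count_canonical(const std::string& seq,
                                                       std::size_t k) {
    assert(k > 0);
    std::map<std::string, unsigned> counts;
    for (std::size_t i = 0; i + k <= seq.size(); ++i) {
        ++counts[canonical(seq.substr(i, k))];
    }
    return counts;
}
```

For example, in the sequence AAATTT with k = 3, the k-mers AAA and TTT are reverse complements of each other,
so both are counted under AAA. Likewise AAT and ATT are both counted under AAT.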

License
=======

* The hash functions used are from the MurmurHash Library, version 3, released under the
  MIT License. http://code.google.com/p/smhasher/

* The kseq functions for reading fast(a|q)(.gz) files are copyrighted by Heng Li and released
  under the MIT License. http://lh3lh3.users.sourceforge.net/kseq.shtml

* The Google sparsehash library is copyrighted by Google and released under a BSD License.
  http://code.google.com/p/google-sparsehash/

*   This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see <http://www.gnu.org/licenses/>.