Primary language: Go. License: MIT.

ntHash

ntHash implementation in Go



Overview

This is a Go implementation of the ntHash recursive hash function for hashing all possible k-mers in a DNA/RNA sequence.

For more information, read the ntHash paper by Mohamadi et al. or check out their C++ implementation.
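As a rough illustration of what "recursive" means here, the sketch below rolls a forward-strand hash along a sequence: each new k-mer hash is derived from the previous one with a rotate and two XORs, rather than being recomputed from scratch. This is a minimal sketch and not the code in this package; the function names (baseSeed, directHash, rollingHashes) are hypothetical, and the per-base constants are assumed for illustration and should be checked against the paper before being relied on.

```go
package main

import (
	"fmt"
	"math/bits"
)

// baseSeed maps a nucleotide to a 64-bit constant.
// Assumption: illustrative values for this sketch, not read from the library.
func baseSeed(b byte) uint64 {
	switch b {
	case 'A':
		return 0x3c8bfbb395c60474
	case 'C':
		return 0x3193c18562a02b4c
	case 'G':
		return 0x20323ed082572324
	case 'T':
		return 0x295549f54be24456
	}
	return 0 // any other byte contributes nothing in this sketch
}

// directHash hashes a single k-mer from scratch: each base's seed is
// rotated left by its distance from the end of the k-mer and XORed in.
func directHash(kmer []byte) uint64 {
	var h uint64
	k := len(kmer)
	for j, b := range kmer {
		h ^= bits.RotateLeft64(baseSeed(b), k-1-j)
	}
	return h
}

// rollingHashes returns the forward hash of every k-mer in seq.
// Only the first k-mer is hashed directly; every later hash is obtained
// from the previous one in constant time.
func rollingHashes(seq []byte, k int) []uint64 {
	if k <= 0 || k > len(seq) {
		return nil
	}
	hashes := make([]uint64, 0, len(seq)-k+1)
	h := directHash(seq[:k])
	hashes = append(hashes, h)
	for i := k; i < len(seq); i++ {
		// rotate the previous hash, cancel the base leaving the window,
		// then add the base entering the window
		h = bits.RotateLeft64(h, 1) ^
			bits.RotateLeft64(baseSeed(seq[i-k]), k) ^
			baseSeed(seq[i])
		hashes = append(hashes, h)
	}
	return hashes
}

func main() {
	seq := []byte("ACGTCGTCAGTCGATGCAGT")
	for i, h := range rollingHashes(seq, 11) {
		fmt.Printf("k-mer %d: %016x\n", i, h)
	}
}
```

The rolled value for each window is identical to what directHash gives for that window, which is what makes the constant-time update safe to use.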

This implementation was inspired by Luiz Irber and his recent blog post on his cool Rust ntHash implementation.

I wrote this in Go so that ntHash can be used in my HULK and GROOT projects, but feel free to use it in your own.

Installation

go get github.com/will-rowe/nthash

Example usage

Range over the ntHash values for a sequence:

package main

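import (
    "log"

    // explicit alias so the ntHash identifier used below resolves
    // whatever name the package declares
    ntHash "github.com/will-rowe/nthash"
)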

var (
    sequence = []byte("ACGTCGTCAGTCGATGCAGTACGTCGTCAGTCGATGCAGT")
    kmerSize = 11
)

func main() {

    // create the ntHash iterator using a pointer to the sequence and a k-mer size
    hasher, err := ntHash.New(&sequence, kmerSize)

    // check for errors (e.g. bad k-mer size choice)
    if err != nil {
        log.Fatal(err)
    }

    // collect the hashes by ranging over the hash channel produced by the Hash method
    canonical := true
    for hash := range hasher.Hash(canonical) {
        log.Println(hash)
    }
}
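
To give a feel for what the canonical flag is for, here is a small standalone sketch of strand-independent hashing. It assumes that "canonical" means hashing both the k-mer and its reverse complement and keeping the smaller value, which is how I understand the C++ implementation to behave; it does not call this package, and the helper names (baseSeed, forwardHash, reverseComplement, canonicalHash) and the per-base constants are hypothetical choices for the example.

```go
package main

import (
	"fmt"
	"math/bits"
)

// baseSeed maps a nucleotide to a 64-bit constant.
// Assumption: illustrative values for this sketch, not read from the library.
func baseSeed(b byte) uint64 {
	switch b {
	case 'A':
		return 0x3c8bfbb395c60474
	case 'C':
		return 0x3193c18562a02b4c
	case 'G':
		return 0x20323ed082572324
	case 'T':
		return 0x295549f54be24456
	}
	return 0 // any other byte contributes nothing in this sketch
}

// forwardHash hashes a k-mer as written, without any rolling update.
func forwardHash(kmer []byte) uint64 {
	var h uint64
	k := len(kmer)
	for j, b := range kmer {
		h ^= bits.RotateLeft64(baseSeed(b), k-1-j)
	}
	return h
}

// reverseComplement returns the opposite-strand reading of a k-mer.
// Bytes other than A, C, G and T become 'N'.
func reverseComplement(kmer []byte) []byte {
	rc := make([]byte, len(kmer))
	for i, b := range kmer {
		var c byte
		switch b {
		case 'A':
			c = 'T'
		case 'C':
			c = 'G'
		case 'G':
			c = 'C'
		case 'T':
			c = 'A'
		default:
			c = 'N'
		}
		rc[len(kmer)-1-i] = c
	}
	return rc
}

// canonicalHash keeps the smaller of the two strand hashes, so a k-mer
// and its reverse complement always produce the same value.
func canonicalHash(kmer []byte) uint64 {
	f := forwardHash(kmer)
	r := forwardHash(reverseComplement(kmer))
	if r < f {
		return r
	}
	return f
}

func main() {
	kmer := []byte("ACGTCGTCAGT")
	rc := reverseComplement(kmer)
	fmt.Printf("%s -> %016x\n", kmer, canonicalHash(kmer))
	fmt.Printf("%s -> %016x\n", rc, canonicalHash(rc))
}
```

Both lines printed by main carry the same hash, which is the property you want when the strand a read came from is unknown.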